Study design and participants
The detailed design and baseline characteristics of the JECS cohort have been published previously [26, 27]. The JECS is a nationwide government-funded birth cohort study that focuses on various environmental factors and child health and development. In the JECS, 103,062 pregnancies were registered via recruitment at 15 regional centers in both rural and urban locations throughout Japan. The sample size was determined in advance to maintain adequate statistical power for evaluating conditions with a prevalence of ≤1%. The eligibility criteria for the pregnant women were as follows: 1) they resided in the study areas at recruitment and were expected to reside continually in Japan for the foreseeable future, 2) the expected delivery date was approximately between August 1, 2011 and mid-2014, and 3) they were capable of comprehending and completing the self-administered questionnaire. Women were excluded if they resided outside the study areas, even if they visited cooperating healthcare providers within the study areas. The study protocol was approved by the Ministry of the Environment’s Institutional Review Board on Epidemiological Studies and by the ethics committees of all participating institutions. All women provided written informed consent prior to participation.
The recruitment was performed between January 2011 and March 2014. Follow-up evaluation was primarily conducted at 1 month postpartum via mailed letters and scheduled in-hospital check-ups, and at 6 months postpartum via mailed letters. Data were acquired using self-administered questionnaires or medical record transcriptions performed by physicians, midwives/nurses, and/or research coordinators. The dataset that was used in the present study is named jecs-an-20180131 (released in March 2018) and contains data from the first trimester, second/third trimester, 1-month follow-up, and 6-month follow-up.
Among the 103,062 pregnancies in the dataset, 5647, 949, and 3676 were excluded owing to multiple registrations, multiple births, and miscarriage or stillbirth, respectively. Among the remaining 92,790 unique mothers with singleton live births, 1727 were excluded owing to completely missing data or no response to the 1- and 6-month EPDS questionnaires; 869 mothers were excluded owing to missing data regarding the highest education level during pregnancy. Therefore, the present study analyzed data from 90,194 unique mothers with singleton live births (Fig. 1).
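The participant flow described above can be checked with simple arithmetic. The following minimal sketch uses only the counts reported in the text; the dictionary keys are shorthand labels introduced here for illustration.

```python
# Participant flow, using the counts reported in the text.
registered = 103_062

# Exclusions applied at the pregnancy level.
pregnancy_exclusions = {
    "multiple registrations": 5_647,
    "multiple births": 949,
    "miscarriage or stillbirth": 3_676,
}

# Exclusions applied to unique mothers with singleton live births.
mother_exclusions = {
    "missing data or no EPDS response at 1 and 6 months": 1_727,
    "missing highest education level": 869,
}

singleton_live_births = registered - sum(pregnancy_exclusions.values())
analyzed = singleton_live_births - sum(mother_exclusions.values())

print(singleton_live_births)  # 92790
print(analyzed)               # 90194
```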
Measures
Exposure
Socioeconomic status was evaluated based on the women’s highest education level, as this factor is a more stable proxy for socioeconomic status than occupation or income, which frequently change during childbearing years [28]. The highest education level was categorized as ≥16 years (bachelor’s degree or postgraduate degree), >12–<16 years (technical junior college, technical/vocational college, or associate degree), or ≤12 years (junior high school or high school) of education. The data were collected during the second/third trimesters.
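As an illustration of the exposure coding, the sketch below maps questionnaire responses to the three education categories. The groupings follow the text, but the response-option strings and the function name are assumptions made here and may differ from the actual JECS questionnaire wording.

```python
# Response-option labels are illustrative assumptions; the groupings follow the text.
EDUCATION_CATEGORY = {
    "junior high school": "<=12 years",
    "high school": "<=12 years",
    "technical junior college": ">12-<16 years",
    "technical/vocational college": ">12-<16 years",
    "associate degree": ">12-<16 years",
    "bachelor's degree": ">=16 years",
    "postgraduate degree": ">=16 years",
}

# The group with >=16 years of education serves as the reference group.
REFERENCE = ">=16 years"


def categorize_education(response):
    """Return the exposure category, or None if the response is missing or unknown."""
    if response is None:
        return None
    return EDUCATION_CATEGORY.get(response.strip().lower())


print(categorize_education("High school"))        # <=12 years
print(categorize_education("Bachelor's degree"))  # >=16 years
```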
Outcomes
Postpartum depression and its symptoms were assessed using the EPDS [18] at 1 and 6 months postpartum. The EPDS is a 10-item self-administered questionnaire that is used to screen for postpartum depression; each item is scored from 0 to 3 (four-point scale), and the total score ranges from 0 to 30. This tool is widely used and has been translated into >50 languages; the Japanese version developed by Okano et al. [29] using a back-translation technique provides good internal consistency (Cronbach’s alpha = 0.78) [30], test-retest reliability (r = 0.92), and an optimal cut-off score of 8/9 (75% sensitivity and 93% specificity). The present study also used the 8/9 cut-off point, which was validated in the study by Yamashita et al. [31] (82% sensitivity and 95% specificity) and has since been widely used to identify postpartum depression in Japan [20, 23, 24, 25, 32, 33].
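The scoring and case definition can be sketched as follows. This is a minimal illustration that assumes each item response has already been converted to its 0 to 3 score; the function names are hypothetical.

```python
def epds_total(items):
    """Sum ten EPDS item scores, each of which must be 0, 1, 2, or 3."""
    if len(items) != 10:
        raise ValueError("the EPDS has exactly 10 items")
    if any(score not in (0, 1, 2, 3) for score in items):
        raise ValueError("each item score must be 0, 1, 2, or 3")
    return sum(items)


def is_case(items, cutoff=9):
    """Apply the 8/9 cut-off: a total score of 9 or more is a positive screen."""
    return epds_total(items) >= cutoff


print(epds_total([1, 1, 1, 1, 1, 1, 1, 1, 0, 0]))  # 8
print(is_case([1, 1, 1, 1, 1, 1, 1, 1, 0, 0]))     # False
print(is_case([1, 1, 1, 1, 1, 1, 1, 1, 1, 0]))     # True
```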
Previous studies have evaluated the factor structure of the EPDS [21, 34, 35], and the Japanese version of the EPDS also likely has a three-factor structure that includes anxiety, depression, and anhedonia [19, 20]; however, there is some ambiguity regarding this structure. Therefore, we conducted factor analysis using the maximum likelihood method and promax rotation, setting the number of predetermined factors to 3; this is consistent with the methods used in previous studies [19, 20]. We then defined the sum of the relevant items as “anxiety” (EPDS items 3 = self-blame, 4 = anxious, and 5 = scared), “depressive symptoms” (items 7 = hard to sleep, 9 = crying, and 10 = self-harm), and “anhedonia” (items 1 = laugh and 2 = enjoyment), based on the subscale rule of items having a factor loading of ≥0.4 for a particular factor and <0.3 for the other factors. The results of the confirmatory factor analysis are provided in Additional file 1: Figure S1.
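The subscale rule and the resulting sub-scores can be illustrated as below. The item-to-subscale mapping is taken from the text; the function names and the loading values in the usage lines are hypothetical and are not results from this study.

```python
# Item numbers per subscale, as defined in the text.
SUBSCALES = {
    "anxiety": (3, 4, 5),
    "depressive symptoms": (7, 9, 10),
    "anhedonia": (1, 2),
}


def assign_factor(loadings, primary=0.4, cross=0.3):
    """Return the index of the factor an item is assigned to, or None.

    An item is assigned to a factor when its loading on that factor is
    >= primary and its loadings on all other factors are < cross.
    """
    loadings = list(loadings)
    for i, value in enumerate(loadings):
        others = loadings[:i] + loadings[i + 1:]
        if value >= primary and all(other < cross for other in others):
            return i
    return None


def subscale_scores(items):
    """Sum item scores per subscale; items[0] is EPDS item 1."""
    return {
        name: sum(items[number - 1] for number in numbers)
        for name, numbers in SUBSCALES.items()
    }


# Hypothetical loadings, for illustration only.
print(assign_factor([0.62, 0.10, 0.05]))  # 0
print(assign_factor([0.45, 0.35, 0.00]))  # None (cross-loading too high)
print(subscale_scores([1, 0, 2, 3, 1, 0, 2, 0, 1, 0]))
```

Items that satisfy the rule for no factor (here, items 6 and 8) contribute to the total score but to none of the sub-scores.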
Covariates
We selected both potential confounders, defined as variables affecting both exposure and outcome, and potential mediators, defined as variables mediating the association between exposure and outcome. First, we chose physician-diagnosed histories of depression (yes or no), anxiety disorder (yes or no), dysautonomia (yes or no), and schizophrenia (yes or no) as potential confounding covariates. These are known risk factors for postpartum depression [36] and could also interrupt academic learning. Second, we chose the following variables as potential mediating covariates: maternal age (continuous years), body mass index (<18.5, 18.5–<25, and ≥25 kg/m²), smoking status (never, former, and current), alcohol intake (never, former, current at the rate of 1–3 times/month, and current at the rate of ≥1 time/week), physical activity (continuous METs × h/day), employment status (yes or no), parity (primipara or multipara), marital status (married, single, and divorced or widowed), passive smoking status (never, pre-pregnancy, and during pregnancy), annual household income (<4, 4–<6, ≥6 million Japanese yen), and feeding method at 1 month (exclusive breastfeeding, mixed feeding, or only formula feeding). These covariates could be affected by the education level and are regarded as risk factors for postpartum depression [36].
The variables were categorized according to standard medical practice, common practice in Japan, and/or based on previous studies [32, 37, 38].
Statistical analysis
The outcome variables at 1 and 6 months postpartum were cases of postpartum depression (defined as any woman with an EPDS score of ≥9), the total EPDS score (sum of the scores for items 1–10), and the sub-scores for anxiety (EPDS items 3, 4, and 5), depressive symptoms (items 7, 9, and 10), and anhedonia (items 1 and 2). As mentioned previously, the exposure variable was defined as the mother’s highest education level (≥16 years, >12–<16 years, and ≤12 years).
Logistic regression analysis was used to calculate the crude and adjusted odds ratios (CORs and AORs) and their corresponding 95% confidence intervals (CIs) for the cases. For the EPDS scores (i.e., total, anxiety, depression, and anhedonia), generalized linear models with a logit link function were fitted after transforming each score into a ratio value (e.g., dividing the total score by 30 and the depression subscale by 9), and were used to calculate the CORs and AORs and their 95% CIs. This analysis corresponds to an extension of logistic regression to outcomes that are counts; that is, to the case in which the EPDS score is distributed binomially rather than normally. In both analyses, the group with ≥16 years of education was used as the reference group.
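For intuition, the crude ORs can be computed by hand. With a single categorical exposure and no covariates, a logit-link model reproduces the group-wise odds, so the crude OR equals the ratio of odds computed from group counts (for cases) or from group mean proportions (for rescaled scores). The sketch below is an illustration under that assumption and does not reproduce the SAS procedures used in the study. The function names and the toy numbers are hypothetical, and the maximum scores of 9 for anxiety and 6 for anhedonia are inferred from the item counts rather than stated in the text.

```python
import math

# Maximum possible score per outcome (items are scored 0-3).
# 30 and 9 (depression) are stated in the text; 9 for anxiety (3 items)
# and 6 for anhedonia (2 items) are inferred from the item counts.
MAX_SCORE = {"total": 30, "anxiety": 9, "depressive symptoms": 9, "anhedonia": 6}


def crude_or_cases(cases_exp, noncases_exp, cases_ref, noncases_ref, z=1.959964):
    """Crude OR and Wald 95% CI for a binary outcome from group counts."""
    odds_ratio = (cases_exp * noncases_ref) / (noncases_exp * cases_ref)
    se_log_or = math.sqrt(
        1 / cases_exp + 1 / noncases_exp + 1 / cases_ref + 1 / noncases_ref
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def crude_or_scores(scores_exp, scores_ref, max_score):
    """Crude OR for a score outcome rescaled to a 0-1 proportion."""
    p_exp = sum(scores_exp) / (max_score * len(scores_exp))
    p_ref = sum(scores_ref) / (max_score * len(scores_ref))
    return (p_exp / (1 - p_exp)) / (p_ref / (1 - p_ref))


# Hypothetical toy numbers, not study data.
print(crude_or_cases(20, 80, 10, 90)[0])  # 2.25
print(crude_or_scores([6, 9, 15], [3, 6, 9], MAX_SCORE["total"]))
```

Adjusted ORs and CIs for the score outcomes require fitting the full model and cannot be obtained from group means alone.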
The forced entry method was used to include covariates in the multivariate analysis. In model 1, the regression models were adjusted only for the potential confounding covariates; the AOR from this model is referred to as AOR1. In model 2, the models were adjusted for the potential mediating covariates in addition to the covariates used in model 1; the AOR from this model is referred to as AOR2.
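The nesting of the two models can be written out explicitly. The covariate names below are shorthand labels introduced here for the variables listed under Covariates; they are not the variable names in the JECS dataset.

```python
# Potential confounders (model 1).
CONFOUNDERS = [
    "history of depression",
    "history of anxiety disorder",
    "history of dysautonomia",
    "history of schizophrenia",
]

# Potential mediators (added in model 2).
MEDIATORS = [
    "maternal age",
    "body mass index",
    "smoking status",
    "alcohol intake",
    "physical activity",
    "employment status",
    "parity",
    "marital status",
    "passive smoking status",
    "annual household income",
    "feeding method at 1 month",
]

# Forced entry: every listed covariate enters the model, with no selection step.
MODEL_COVARIATES = {
    "crude (COR)": [],
    "model 1 (AOR1)": CONFOUNDERS,
    "model 2 (AOR2)": CONFOUNDERS + MEDIATORS,
}

for name, covariates in MODEL_COVARIATES.items():
    print(name, len(covariates))
```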
All analyses were performed using SAS software (version 9.4; SAS Institute Inc., Cary, NC).
Missing data
The response rates were 99.57% at 1 month postpartum (n = 89,803) and 94.72% at 6 months postpartum (n = 85,431), with only 0.43% (n = 391) of the women responding to the 6-month follow-up but not the 1-month follow-up. Among the 90,194 included mothers, the missing data rate was <1% for most covariates, with the exceptions of physical activity (4.94%, n = 4457) and annual household income (7.17%, n = 6470). The missing data rate for the exposure measure (highest education level) was 0.57% (n = 517). Each of the 10 items from the Japanese EPDS had a missing data rate of <0.90% at 1 month (maximum n = 809), although 1.95% of the women (n = 1756) had at least one missing item. At 6 months, the EPDS items had average missing data rates of up to 5.70% (maximum n = 5253), and 6.66% of the women (n = 6007) had at least one missing item. A total of 18,167 (20.14%) mothers had at least one missing value.
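Most of the percentages above follow directly from the reported counts and the analytic sample of 90,194 mothers. The short check below recomputes them; the dictionary keys are shorthand labels introduced here.

```python
N = 90_194  # mothers included in the analysis

counts = {
    "responded at 1 month": 89_803,
    "responded at 6 months": 85_431,
    "responded at 6 months only": 391,
    "missing physical activity": 4_457,
    "missing household income": 6_470,
    "missing education level": 517,
    "any EPDS item missing at 1 month": 1_756,
    "any EPDS item missing at 6 months": 6_007,
    "any missing value": 18_167,
}

# Percentage of the analytic sample, rounded to two decimals as in the text.
rates = {label: round(100 * n / N, 2) for label, n in counts.items()}

for label, rate in rates.items():
    print(f"{label}: {rate:.2f}%")
```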
Data imputation was performed using chained equations [39] to create 10 imputed datasets, with the data imputed simultaneously irrespective of the measurement time points. When conducting the multiple imputation, auxiliary variables that were related to the analyzed variables were also included to make the assumption of data missing at random more plausible.
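Estimates from multiply imputed datasets are conventionally combined with Rubin's rules. The text does not state the pooling method, so the sketch below is an assumption about a typical workflow rather than a description of the procedure used here. It pools log odds ratios and their standard errors; the function name and the input values are hypothetical.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool per-imputation estimates (e.g., log ORs) with Rubin's rules.

    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean([se ** 2 for se in std_errors])  # average within-imputation variance
    between = variance(estimates)                  # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)


# Hypothetical log ORs and standard errors from 3 imputed datasets.
log_or, se = pool_rubin([0.10, 0.20, 0.30], [0.10, 0.10, 0.10])
print(math.exp(log_or))  # pooled OR
print(se)                # pooled standard error on the log scale
```

Pooling on the log scale and exponentiating afterwards keeps the resulting confidence interval symmetric on the scale where the Wald approximation is applied.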
Sensitivity analysis
The patterns of the resulting ORs for the complete datasets (n = 76,716 at 1 month and n = 72,809 at 6 months) were compared to those from the multiply imputed datasets (both n = 90,194) to assess the differences between the strategies for addressing missing values.