Machine learning and Bayesian network analyses identify associations with insomnia in a national sample of 31,285 treatment-seeking college students

Abstract

Background

A better understanding of the relationships between insomnia and anxiety, mood, eating, and alcohol-use disorders is needed given the prevalence of insomnia among young adults. Supervised machine learning provides the ability to evaluate which mental disorder is most associated with heightened insomnia among U.S. college students. Combined with Bayesian network analysis, probable directional relationships between insomnia and interacting symptoms may be illuminated.

Methods

The current exploratory analyses utilized a national sample of college students across 26 U.S. colleges and universities collected during population-level screening before entering a randomized controlled trial. We used a 4-step statistical approach: (1) at the disorder level, an elastic net regularization model examined the relative importance of the association between insomnia and 7 mental disorders (major depressive disorder, generalized anxiety disorder, social anxiety disorder, panic disorder, post-traumatic stress disorder, anorexia nervosa, and alcohol use disorder); (2) this model was evaluated within a hold-out sample; (3) at the symptom level, a completed partially directed acyclic graph (CPDAG) was computed via a Bayesian hill-climbing algorithm to estimate potential directionality among insomnia and its most associated disorder [based on SHAP (SHapley Additive exPlanations) values]; (4) the CPDAG was then tested for generalizability by assessing (in)equality within a hold-out sample using structural Hamming distance (SHD).

Results

Of 31,285 participants, 20,597 were women (65.8%); mean (standard deviation) age was 22.96 (4.52) years. The elastic net model demonstrated clinical significance in predicting insomnia severity in the training sample [R2 = .44 (.01); RMSE = 5.00 (0.08)], with comparable performance in the hold-out sample (R2 = .33; RMSE = 5.47). SHAP values indicated that the presence of any mental disorder was associated with higher insomnia scores, with major depressive disorder as the most important disorder associated with heightened insomnia (mean |SHAP|= 3.18). The training CPDAG and hold-out CPDAG (SHD = 7) suggested depression symptoms presupposed insomnia with depressed mood, fatigue, and self-esteem as key parent nodes.

Conclusion

These findings provide insights into the associations between insomnia and mental disorders among college students and warrant further investigation into the potential direction of causality between insomnia and depression.

Trial registration

The trial was registered on the National Institutes of Health RePORTER website (R01MH115128 || 23/08/2018).

Sleep disturbance is often conceptualized as a transdiagnostic mechanism observed across a range of mental disorders [1] and, in some cases, was even included as a diagnostic criterion [2]. Unlike earlier conceptualizations of insomnia as merely a symptom or consequence of other mental health issues, an emerging perspective suggests that sleep and other mental disorders are intricately intertwined and bidirectional [3,4,5]. Such findings are not unexpected given that patients who reported sleep–wake disorders, notably and most commonly, insomnia, exhibited higher rates of comorbidity (e.g., 40% of those with insomnia vs. 16.4% with no sleep difficulties reported having additional disorders) [6]. Systematic reviews have also found associations between sleep disturbance and most mental disorders, including all anxiety disorders, depression, alcohol use disorder, and eating disorders [7,8,9,10]. Similarly, meta-analytic evidence found comorbidity was associated with increased rapid eye movement pressure alterations and inhibited sleep depth [11]. However, there is a relative lack of representation of college samples despite their high prevalence of sleep difficulties. For instance, over 60% of U.S. students were categorized as having poor sleep quality [12], with an estimated 7.7% meeting criteria for insomnia [13]. Consequently, shorter durations of sleep and irregular sleep–wake schedules were significantly associated with a lower grade point average [14], along with poor sleep quality and increased odds of developing a mental disorder (e.g., OR = 2.36 [15]). Given the interrelationships between insomnia and several mental disorders within this population, further teasing apart of its relations may help us understand which mental disorder is most associated with heightened insomnia among U.S. college students and their probable directional relationships.

Embracing such complexity requires a nuanced approach. Unlike unregularized approaches (e.g., linear regression), supervised machine learning methods allow the opportunity to aggregate disparate small variable effects to inform clinical outcomes while also accounting for complex, interactive, or non-linear effects [16]. Elastic net regularization (elastic net), in particular, is preferable to unregularized approaches given its ability to minimize adverse effects of multicollinearity and reduce the probability of spurious, false positive associations [17]. Therefore, elastic net has led to parsimonious models with greater stability and accuracy and higher out-of-sample predictive performance relative to linear regression [18, 19]. Consequently, such models were used in recent studies to understand the relationship between insomnia and interacting symptoms. For example, utilizing baseline data from a controlled trial, Bard et al. [20] evaluated functional impairment and the relative importance of depressive and anxiety symptoms among insomnia patients. Also, Lyall et al. [21] employed actigraphy and mental health data from the UK Biobank to determine the most important sleep features (e.g., sleep duration, chronotype) related to depression and whether patients with poorer outcomes could be identified.

Further evidence is thus warranted in determining whether higher levels of insomnia are most related to the presence or absence of some disorders compared to others. McCallum et al. [22] used simple regression and found that both generalized anxiety disorder (GAD) and major depressive disorder (MDD) maintained independent associations with sleep disturbance, even after correcting for confounds, and were found to hold the greatest contribution among nine mental disorders. However, adopting a supervised machine learning approach using elastic net, which balances the bias-variance trade-off, may be a better approach.

Despite the advent of machine learning frameworks such as the seminal SHAP (SHapley Additive exPlanations) [23], which allow for interpretation of the magnitude of a variable’s impact on model predictions, revealing the structure of relations is a formidable challenge. Network analysis is one methodological approach suited for such an endeavor, given its aim of disentangling the complex dynamics of self-reinforcing causal interactions between symptoms [24]. Broadly, in this approach, a network comprises symptoms (nodes) and the associations between them (edges). In other words, an edge between nodes represents a conditionally dependent relationship between two symptoms while keeping all other symptoms in the network constant [25]. Within this approach, hypotheses posit symptoms as causal agents that promote the development of other symptoms and, when unabated, go beyond a critical threshold and develop into a new harmful equilibrium known as a mental disorder [26, 27].

Insomnia as a node or a set of nodes has appeared in many prior cross-sectional network analyses, providing snapshots of associations between symptoms. Extant studies have examined insomnia's network structure [28,29,30,31,32,33,34] as well as insomnia’s relationships with single disorders, such as MDD [35,36,37,38], post-traumatic stress disorder (PTSD) [39], psychosis [40], and schizophrenia [41], or with transdiagnostic factors, such as hyperarousal [42] or personality traits [43]. Cross-sectional networks have also been developed between insomnia and multiple disorders, most commonly MDD and GAD [20, 44,45,46,47,48,49] or with the further addition of PTSD [50], but also prolonged grief disorder (PGD) and PTSD [51]. Most studies utilized the graphical Gaussian model (GGM), in other words, an undirected network of partial correlation coefficients, along with the graphical LASSO (least absolute shrinkage and selection operator [52]) as a regularization technique to avoid spurious, false-positive edges. However, as pointed out by Williams and Rast [53] and further highlighted by McNally et al. [54], graphical LASSO was developed and optimized for high-dimensional settings with more variables than participants, which is often not the case in typical network structures, thus leading to unwarranted sparsity. Moreover, despite efforts, the conventional GGM approach employed in such studies makes few inferences on potential directionality.

Conversely, Bayesian network analysis, such as directed acyclic graphs (DAGs), allows for estimating directed networks built on cross-sectional data. Although DAGs cannot confirm temporal precedence, they can provide preliminary clues to identify the direction of probabilistic dependence between nodes [55]. In other words, if an edge originates from node X and connects to node Y (i.e., X → Y), node Y's presence suggests node X's presence more strongly than vice versa. Whereas the node considered the "parent" (X) might be present without its "offspring" (Y), the presence of the offspring indicates the presence of the parent. However, the assertion of causality is predicated on multiple conditions: first, the absence of any bidirectional causal relations (such as X causing Y and Y causing X) or causal loops (such as X causing Y, Y causing Z, and Z causing X); and second, the absence of any significant variables missing from the dataset [54]. To our knowledge, two studies on insomnia and common comorbidities have taken such a Bayesian approach. In one of these studies, Zhang et al. [56] elucidated associations between insomnia and depression and health-related behaviors (e.g., internet use, physical inactivity, smoking, alcohol consumption) among adolescents in China. In the other study, Yu et al. [57] examined the relationships between sleep disturbance and mental health (e.g., anxiety, depression, loneliness, well-being, health attitudes) among adults in China. However, whether such associations can be generalized to other demographic groups or other mental disorders requires further evaluation.

The current exploratory study thus intended to fill these gaps by first examining the relative importance of the associations between 7 disorders (MDD, GAD, social anxiety disorder [SAD], panic disorder [PD], PTSD, anorexia nervosa [AN], and alcohol use disorder [AUD]) and insomnia within a nationally representative sample of treatment-seeking U.S. college students. We were then interested in exploring a Bayesian network analysis of the most strongly associated disorder at a symptom level. Thus, we used a straightforward 4-step statistical approach: (1) at the diagnostic level, an elastic net model, along with SHAP, was employed to illuminate the relative importance of mental disorders associated with insomnia and functioned as variable selection for further modeling; (2) the model was tested for generalizability in a hold-out sample; (3) at the symptom level, DAGs characterized the structure, relations, potential importance, and probable direction among insomnia and its most associated disorder; (4) inspired by Bard et al. [20], who randomly partitioned their data into training and hold-out samples to evaluate the replicability of their GGMs, we tested for generalizability of the DAGs by assessing (in)equality within a hold-out sample using structural Hamming distance (SHD).

Methods

Participants

The current study was a secondary analysis of 39,194 treatment-seeking participants across 26 U.S. colleges and universities who participated in screening for an ongoing randomized controlled trial investigating the effectiveness of a transdiagnostic coached mobile mental health intervention that used population-level screening for engaging college students in tailored services for preventing and treating anxiety, depression, and eating disorders (ClinicalTrials.gov; ID: NCT04162847). Participants were eligible for the screen if they were ≥ 18 years of age, enrolled at one of the 26 participating universities, provided informed consent to participate, and passed a one-item attention check. See Fitzsimmons-Craft et al. [58] for a more detailed description of eligibility criteria. Participants were excluded for only previewing the survey (n = 1), not responding to (n = 5,513) or denying (n = 503) consent for screening, being under 18 years of age (n = 63), not reporting their age (n = 1,154), not being an undergraduate student (n = 629), or not reporting their year in school (n = 46). The final sample was a nationally representative sample of 31,285 undergraduate students. All data for the present study were collected prior to selection for the randomized controlled trial or intervention delivery.

Procedures

Students enrolled at participating universities received an email invitation to complete a brief survey on health and well-being between October 2019 and November 2021. Emails were sent to either the entire student population or a random subset of the student population and either to undergraduate students from all years (17 schools) or only years 1 or 2 (9 schools). Emails informed students that based on their responses, they may be eligible for a subsequent study involving random assignment to conditions designed to support mental health. Emails included a link to an online screening survey via Qualtrics. Participating students were entered into a raffle to win one of several $100 gift cards. The study was approved by the institutional review board of all authors’ universities and administrators at each participating school.

Measures

All models were based on data captured pre-intervention delivery and included insomnia, MDD, GAD, PTSD, SAD, PD, AN, and AUD.

Insomnia was assessed using the Insomnia Severity Index (ISI; [59]). The ISI has seven questions with 5-point Likert scale responses, which are summed to produce a total score between 0 and 28, with higher scores indicating greater insomnia severity. Cronbach’s alpha for the present study was .884. Its internal consistency, concurrent validity, and sensitivity to clinical improvements in insomnia patients are well established [60].

MDD was assessed using the Patient Health Questionnaire-9 (PHQ-9; [61]). Participants reported frequency of depressive symptoms over the past two weeks on 9 items with four-point scales ranging from 0 (“Not at all”) to 3 (“Nearly every day”). Total score ranges from 0 to 27. Cronbach’s alpha for the present study was .877. Participants screened positive for probable MDD if they scored 10 or higher, maintaining a sensitivity of .88 and specificity of .85 [62].
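The PHQ-9 screening rule described above can be sketched in a few lines. This is a minimal illustration only: the function names are hypothetical, and the study itself was run in R rather than Python.

```python
def phq9_total(items):
    """Sum nine PHQ-9 item responses, each scored 0 to 3 (total range 0 to 27)."""
    if len(items) != 9 or any(score not in (0, 1, 2, 3) for score in items):
        raise ValueError("PHQ-9 requires nine item scores between 0 and 3")
    return sum(items)


def screens_positive_mdd(items, cutoff=10):
    """Apply the cut-off of 10 or higher for probable MDD."""
    return phq9_total(items) >= cutoff


# Hypothetical respondent answering "several days" (1) on every item.
print(screens_positive_mdd([1] * 9))  # total of 9 falls below the cut-off
```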

PTSD was assessed using the Primary Care PTSD Screen (PC-PTSD; [63]), which has total scores ranging from 0 to 4. Participants screened positive for probable PTSD if they scored three or higher, which demonstrated a sensitivity of .78 and a specificity of .89 [63]. Cronbach’s alpha for the present study was .806.

GAD was assessed using the Generalized Anxiety Disorder Questionnaire-IV (GAD-Q-IV; [64]), maintaining a .82 specificity and .89 sensitivity, and has a total score ranging from 0 to 12. SAD was assessed using the Social Phobia Diagnostic Questionnaire (SPDQ; [65]), maintaining a .85 specificity and .82 sensitivity, and has a total score ranging from 0 to 27 [65]. PD was assessed using the Panic Disorder Self-Report (PDSR; [66]), maintaining a 1.00 specificity and .89 sensitivity, and has a total score ranging from 0 to 24 [66]. Cronbach’s alpha for the present study was .856, .97, and .959, respectively. These measures all assessed full diagnostic criteria based on the Diagnostic and Statistical Manual of Mental Disorders, 5th edition [2]. Participants screened positive for a disorder if they endorsed all diagnostic criteria. GAD-Q-IV, SPDQ, and PDSR demonstrate strong retest reliability, good convergent and discriminant validity, and a kappa agreement of .67, .66, and .93, with structured interviews, respectively.

AN was assessed by the Weight and Shape Concerns Scale (WCS; [67]). Total scores for the weight/shape concerns scale range from 0 to 100. Participants screened positive for probable AN if they scored 59 or higher and had a current body mass index ≤ 18.45, based on self-reported height and weight. Cronbach’s alpha for the present study was .797. These criteria have been used in prior online screening studies [68].

AUD was assessed using the Alcohol Use Disorders Identification Test-Consumption (AUDIT-C; [69]). The instrument contains three questions about alcohol consumption, each scored from 0 to 4, which are summed to obtain a total score ranging from 0 to 12. Cronbach’s alpha for the present study was .85. To identify probable AUD, we used the clinical cut-off of 4 or higher for participants assigned male at birth and 3 or higher for participants assigned female or intersex. This system had .88 sensitivity and .75 specificity for males and .87 sensitivity and .85 specificity for females [70].

Statistical analysis

Pre-processing

The data were randomly partitioned into a 70% split as a training set and a 30% hold-out set to evaluate the final models in completely unseen new cases. Missing values for the included variables in our sample were low and were similar across sets (6.26% and 6.41%, respectively).
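As a rough illustration of the 70/30 partition, the sketch below shuffles participant indices and cuts them at 70%. The seed and function name are assumptions made for the example; the original analysis was conducted in R and its exact partitioning call is not given in the text.

```python
import random


def split_train_holdout(ids, train_fraction=0.70, seed=2024):
    """Randomly partition IDs into training and hold-out lists."""
    rng = random.Random(seed)  # hypothetical seed, used only for reproducibility
    shuffled = list(ids)
    rng.shuffle(shuffled)
    n_train = int(train_fraction * len(shuffled))  # rounds down
    return shuffled[:n_train], shuffled[n_train:]


train_ids, holdout_ids = split_train_holdout(range(31285))
print(len(train_ids), len(holdout_ids))
```

With 31,285 participants, rounding down gives 21,899 training and 9,386 hold-out cases, which matches the sample sizes reported in the Results.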

Nonetheless, to tackle missing data for all analyses, a machine learning approach for imputation was employed, specifically utilizing nonparametric missing value imputation via random forests facilitated by the R package mice [71]. Imputations were aggregated across 10 multiple imputed datasets, each with 100 iterations, to minimize biased error calculations and produce stable estimates. Random forest imputations were done separately for the training and hold-out sets. Minimal recoding adjustments were made before each imputation to maintain the inherent relationships between variables (as recommended by van Ginkel et al. [72]). Moreover, to prevent “data leakage” of variable distributions between sets, all pre-processing steps were done separately for training and hold-out sets. Topological overlap between node pairs was also screened for and removed if found via the “goldbricker” function within the R package networktools [73].

Supervised machine learning (Elastic net regularization)

Elastic net development

Elastic net regularization is an extension of conventional regression that combines both ridge and LASSO norms. It adds a penalization term to balance stability and parsimony, with a lambda hyperparameter determining the magnitude of the penalty and an alpha hyperparameter regulating the balance between the two norms. Tuning of alpha and lambda was conducted using a resampling grid search. The final model was selected using repeated tenfold cross-validation to minimize biased estimates of the true error and assess the stability of model performance [74]. Tenfold cross-validation partitions the sample into 10 subsets, 9 of which are used in the training process and then tested on the remaining subset [75]. This process is iterated for the remaining 9 subsets, building new models until each of the 10 subsets has been used exactly once as the test set. This procedure is then repeated 10 times (10 folds by 10 repeats) for a total of 100 models. Performance is then averaged across these models to produce a single estimate. Final alpha and lambda values were selected based on the smallest value of root mean square error (RMSE) and were used to estimate model coefficients.
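To make the two hyperparameters and the resampling scheme concrete, the following sketch shows an elastic net penalty and a repeated tenfold split generator. It assumes the glmnet-style parameterization of the penalty, in which alpha = 1 gives the LASSO and alpha = 0 gives ridge; the text does not state which parameterization was used. All function names are hypothetical, and no model is fitted here.

```python
import random


def elastic_net_penalty(coefs, alpha, lam):
    """Penalty = lam * (alpha * L1 norm + (1 - alpha) / 2 * squared L2 norm).

    Assumed glmnet-style form: lam sets the overall magnitude of the
    penalty, alpha sets the mix between the LASSO and ridge norms.
    """
    l1 = sum(abs(b) for b in coefs)
    l2_squared = sum(b * b for b in coefs)
    return lam * (alpha * l1 + (1.0 - alpha) / 2.0 * l2_squared)


def repeated_kfold(n, k=10, repeats=10, seed=1):
    """Yield (train_idx, test_idx) pairs: k folds for each of `repeats` shuffles."""
    rng = random.Random(seed)
    for _ in range(repeats):
        order = list(range(n))
        rng.shuffle(order)
        folds = [order[i::k] for i in range(k)]  # k roughly equal subsets
        for i in range(k):
            test_idx = folds[i]
            train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
            yield train_idx, test_idx


splits = list(repeated_kfold(n=200))
print(len(splits))  # 10 folds x 10 repeats
```

In a full tuning loop, each candidate (alpha, lambda) pair would be fitted on every training split, and the pair with the lowest average RMSE across test splits would be retained.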

In the current study, the elastic net model considered 7 disorders (MDD, GAD, PTSD, SAD, PD, AUD, AN) as binary variables (i.e., presence vs absence) and insomnia as a continuous outcome (i.e., total ISI score). This approach was used because we were first interested in understanding which clinical disorder would be most associated with insomnia symptoms. Imbalance within the outcome was also addressed by applying the synthetic minority over-sampling technique for regression with Gaussian noise (SMOGN; [76]), which randomly undersamples high-frequency cases and oversamples rare cases using SmoteR and Gaussian noise to generate a more balanced proportion of cases within the continuous outcome and improve prediction accuracy. Imbalance is problematic because machine learning models tend to favor predictions from high-frequency cases and ignore rare cases, given preferences for high accuracy even if purely by chance. All analyses were conducted in R version 4.3.1 using the caret package [77].

Elastic net evaluation

The cross-validated elastic net model built from the training sample was evaluated by being applied to individuals within the hold-out sample to predict insomnia severity. Importantly, individuals within the hold-out sample were not utilized as part of the development and tuning of the elastic net model. RMSE determined the accuracy of the model, i.e., the magnitude of error. Lower values represent higher accuracy. The coefficient of determination (R2) was also used, given evidence of R2 being the most informative metric within regression-based supervised machine learning [78]. R2 determined predictability, i.e., the proportion of variance within the outcome explained by the elastic net model. Values range from 0 to 1 and are interpreted as percentages, with higher values representing higher predictability. The current study adopted the benchmark set by Uher et al. [79], who found that an R2 of 6.3% or higher inferred clinical significance.
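The two evaluation metrics can be written out as follows. This is a sketch under one assumption: R2 is computed here as one minus the ratio of residual to total sum of squares, whereas some software (including caret by default) reports the squared correlation between observed and predicted values. The scores in the example are invented for illustration.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error: typical size of a prediction error."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(y_true, y_pred):
    """Proportion of outcome variance explained (1 - SS_res / SS_tot)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted ISI totals for five respondents.
observed = [0, 7, 14, 21, 28]
predicted = [2, 7, 12, 21, 28]
print(round(rmse(observed, predicted), 3), round(r_squared(observed, predicted), 3))
```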

Elastic net variable importance

Methods for explainable artificial intelligence were run using SHAP (SHapley Additive exPlanations) values [23] to facilitate the interpretability of the elastic net model. SHAP values assign a value to each variable that represents the average contribution of that variable across all possible combinations of variables. The average SHAP value across all participants is 0, but the average absolute SHAP value informs relative variable importance. We are not aware of any thresholds that could serve as an empirical cut-off for variable selection using SHAP values. Given our use of SHAP values in determining subsequent modeling, we selected any variable with a mean absolute SHAP value above 1 to be included in the network analyses.
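For a linear model such as the elastic net, SHAP values have a simple closed form when features are treated as independent: each variable's SHAP value for a participant is its coefficient multiplied by the deviation of that participant's value from the sample mean. The sketch below uses this form with invented coefficients and data; it is not the SHAP implementation used in the study, and the function names are hypothetical.

```python
def linear_shap(rows, coefs):
    """SHAP values for a linear model assuming independent features:
    phi_j(x) = coef_j * (x_j - mean(x_j))."""
    n = len(rows)
    means = [sum(row[j] for row in rows) / n for j in range(len(coefs))]
    return [[coefs[j] * (row[j] - means[j]) for j in range(len(coefs))]
            for row in rows]


def mean_abs_shap(shap_rows):
    """Average absolute SHAP value per variable (relative importance)."""
    n = len(shap_rows)
    return [sum(abs(row[j]) for row in shap_rows) / n
            for j in range(len(shap_rows[0]))]


# Hypothetical binary disorder indicators (two disorders, four participants)
# and hypothetical regression coefficients.
rows = [[1, 0], [1, 1], [0, 0], [0, 1]]
coefs = [6.0, 2.0]
shap_rows = linear_shap(rows, coefs)
importance = mean_abs_shap(shap_rows)
selected = [j for j, value in enumerate(importance) if value > 1]
print(importance, selected)
```

In this toy example the per-variable SHAP values average to zero across participants, while the mean absolute values (3.0 and 1.0) rank the variables; only the first exceeds the cut-off of 1.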

Bayesian networks (directed acyclic graphs)

Network estimation

DAG analyses were run via the hill-climbing algorithm from the R package “bnlearn” [80] to determine the potential directionality and conditional dependencies among symptoms. DAGs return a network comprising symptoms (nodes) and the relations between them (edges). To create the DAGs, a bootstrap function computes the structural aspect of a network by adding edges, removing them, and reversing their direction to optimize a goodness-of-fit score (i.e., Bayesian information criterion [BIC]). This step determines whether an edge exists; however, it does not calculate the weights of edges. To do so, we randomly restarted the process with different candidate edges linking different symptom pairs, perturbing the system. To ensure robustness, we used 50 restarts (as per Briganti et al. [55]) and 100 perturbations (as implemented by McNally et al. [81, 82]). In the current study, we employed a Bayesian network via a completed partially directed acyclic graph (CPDAG), a type of Markov equivalence class that encodes identical conditional dependencies between DAGs and accounts for drawbacks of equivalent separate DAGs [83]. Insomnia was included in the DAG analyses as a single-sum score derived from the ISI representing insomnia severity, with all items of the PHQ-9 minus item 3 [insomnia/hypersomnia] to prevent multicollinearity.

Network stability

To verify the stability of the resultant network, we bootstrapped 10,000 samples, computed a network for each sample, and averaged all 10,000 networks to obtain the final network. Following the reasoning of Briganti et al. [55], we first determined the structure of the network and then ascertained the direction of each edge. The bnlearn program computes a BIC value for each edge. The thickness of an edge corresponds to its absolute BIC value and, hence, its importance to model fit. The larger the absolute BIC value, the more damaging it would be to the model fit if one were to remove the edge from the network. Accordingly, high absolute BIC values indicate how important an edge is to the model that best characterizes the data structure. In line with Sachs et al. [84], if an edge ran from symptom X to symptom Y in at least 85% of the bootstrapped networks, this edge appeared in the final, averaged network. Subsequently, if a retained edge ran from symptom X to symptom Y in at least 51% of the bootstrapped networks, its direction was depicted using an arrow pointing from node X to node Y. Accordingly, such significance thresholds promoted the stability of the final averaged network and led to sparse networks that ensured genuine edges. Lastly, we computed the identical network but had edge thickness reflect the probability with which the depicted direction of the edge occurred.
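The thresholding of bootstrapped networks can be sketched as below. Two assumptions are made that the text does not spell out: edge strength is taken as the share of bootstrapped networks containing the edge in either direction, and directional probability is computed among the networks that contain the edge. The function name and toy data are hypothetical; structure learning itself is not shown.

```python
from collections import Counter


def average_network(boot_arcs, strength_min=0.85, direction_min=0.51):
    """Summarize bootstrapped DAGs into retained directed and undirected edges.

    boot_arcs: list of networks, each a list of (parent, child) tuples.
    Returns (directed, undirected): directed maps (parent, child) to
    (strength, direction probability); undirected is a set of frozensets.
    """
    n_boot = len(boot_arcs)
    pair_count = Counter()  # networks containing the edge in either direction
    arc_count = Counter()   # networks containing the edge in a given direction
    for arcs in boot_arcs:
        unique_arcs = set(arcs)
        arc_count.update(unique_arcs)
        pair_count.update({frozenset(arc) for arc in unique_arcs})

    directed, undirected = {}, set()
    for pair, count in pair_count.items():
        strength = count / n_boot
        if strength < strength_min:
            continue  # edge too unstable to appear in the averaged network
        x, y = sorted(pair)
        p_xy = arc_count[(x, y)] / (arc_count[(x, y)] + arc_count[(y, x)])
        if p_xy >= direction_min:
            directed[(x, y)] = (strength, p_xy)
        elif 1.0 - p_xy >= direction_min:
            directed[(y, x)] = (strength, 1.0 - p_xy)
        else:
            undirected.add(pair)  # retained, but no direction is favored
    return directed, undirected


# Toy input: 100 hypothetical bootstrapped networks over three symptoms.
boot = []
for i in range(100):
    arcs = []
    if i < 60:
        arcs.append(("mood", "fatigue"))
    elif i < 90:
        arcs.append(("fatigue", "mood"))
    if i % 2 == 0:
        arcs.append(("mood", "insomnia"))
    boot.append(arcs)

directed, undirected = average_network(boot)
print(directed, undirected)
```

Here the mood-fatigue edge appears in 90 of 100 networks and is retained with an arrow from mood to fatigue (60 of the 90), while the mood-insomnia edge appears in only half of the networks and is dropped.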

Network confirmatory analysis

Often, the reliability and replicability of parameter estimates in cross-sectional network analyses are not considered and are, at the least, questionable (e.g., [85,86,87,88]). For the current investigation, three steps were taken to ensure model stability: (1) random perturbations to avoid local maxima and optimize goodness-of-fit index (i.e., BIC values); (2) bootstrapping 10,000 different DAGs to determine strength and direction of the edges; (3) using significance thresholds outlined in Sachs et al. [84]. Supplementing our bootstrapped stability tests, a confirmatory analysis was run by repeating steps 1–3 for a CPDAG built on a hold-out sample and then assessing for (in)equality between it and the CPDAG developed on the training sample. To test for the similarity between the training and hold-out CPDAGs, structural Hamming distances (SHD) were used, which quantified the total number of changes between nodes, arcs, and directions that were required for one network to turn into the one with which it was being compared [89]. In other words, true positive, false positive, and false negative arcs were calculated by comparing the hold-out network to the training network, which was considered the "true" standard network. This allowed for testing whether the network estimation was roughly consistent across both data subsets, further suggesting generalizability.
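A minimal sketch of the structural Hamming distance follows. It assumes each graph is stored as a set of directed arcs, with an undirected edge represented by including both directions, and it counts one unit for every pair of nodes whose connection differs between the two graphs (an edge missing in one graph, or an edge oriented differently). The two toy graphs are hypothetical and are not the study's networks.

```python
from itertools import combinations


def shd(arcs_a, arcs_b):
    """Count node pairs whose edge status differs between two graphs.

    Each graph is a set of (from, to) tuples; an undirected edge is
    represented by including both (x, y) and (y, x).
    """
    nodes = sorted({node for arc in arcs_a | arcs_b for node in arc})
    distance = 0
    for x, y in combinations(nodes, 2):
        state_a = ((x, y) in arcs_a, (y, x) in arcs_a)
        state_b = ((x, y) in arcs_b, (y, x) in arcs_b)
        if state_a != state_b:
            distance += 1  # one insertion, deletion, or reversal
    return distance


# Hypothetical training and hold-out graphs over four symptoms.
training = {("mood", "fatigue"), ("fatigue", "insomnia"),
            ("psychomotor", "insomnia")}
holdout = {("mood", "fatigue"), ("fatigue", "insomnia"),
           ("insomnia", "psychomotor"), ("suicidal", "insomnia")}
print(shd(training, holdout))
```

The toy graphs differ by one reversed arrow and one added arrow, so two operations separate them; identical graphs give a distance of zero.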

Results

Sample characteristics

Screening sample characteristics for the entire sample are presented in Table 1, and diagnostic severity, split between training and hold-out samples, may be found in Table 2. Most participants identified as female (63.4%), heterosexual (72.7%), white (65.7%), and non-Hispanic (67.5%).

Table 1 Sample distribution across demographic characteristics
Table 2 Diagnostic severity by data partition

Elastic net regularization

A total of 21,899 participants were included in the training models, and 9,386 participants were included in the hold-out models. Run on the full training sample, the elastic net model derived from repeated tenfold cross-validation was associated with an optimal alpha parameter of 0.1 and a lambda parameter of 0.008 (via RMSE criterion). The final elastic net model demonstrated clinical significance in predicting insomnia severity in the training sample [R2 = .44 (.01), RMSE of 5.00 (0.08)], with comparable variance explained in the hold-out sample (i.e., completely unseen new cases; R2 = .33, RMSE of 5.47). Results of the SHAP variable importance ranking are displayed in Fig. 1, in which SHAP values illustrated that MDD (SHAP = 3.18, cf. β = 6.39) was the most important disorder associated with insomnia, thus apt for further modeling, followed by GAD (SHAP = 0.96, cf. β = 2.30) and PTSD (SHAP = 0.96, cf. β = 2.18). It should be noted, however, that the presence of any of the seven mental disorders was associated with higher insomnia scores.

Fig. 1 SHAP variable importance. Note. To determine relative importance, we used a gold standard interpretability method termed SHapley Additive exPlanations (SHAP). SHAP values provide a more comprehensive understanding of each variable’s contribution to the model's predictions.

Directed acyclic graphs

The CPDAG built on training data (N = 21,899), as displayed in Fig. 2, shows a chain of symptoms dependent on the parent node of depressed mood, which directly predicted fatigue, anhedonia, poor self-esteem, concentration problems, eating problems, psychomotor disturbance, suicidal ideation, and insomnia. That is, depressed mood had no incoming edges (i.e., in-degree = 0) but had eight outgoing edges (i.e., out-degree = 8). The most important arrows connected depressed mood to fatigue (with a change in BIC of -4067.81 and a directional probability of 50.52%) and depressed mood to poor self-esteem (with a change in BIC of -3294.17 and a directional probability of 55.71%). Accordingly, fatigue emerged as a key cascading node with one incoming arrow (i.e., in-degree = 1) and five direct descendants (out-degree = 5): anhedonia, poor self-esteem, concentration problems, eating problems, and insomnia. There were seven total paths into insomnia (from depressed mood, fatigue, anhedonia, poor self-esteem, concentration problems, eating problems, and psychomotor disturbance). In other words, all depression symptoms, except for suicidal ideation, presupposed insomnia. That is, insomnia was more likely when depressed mood, fatigue, anhedonia, poor self-esteem, concentration problems, eating problems, or psychomotor disturbance were present than vice versa. Suicidal ideation occurred only through depressed mood, poor self-esteem, or psychomotor disturbance. This could have arisen from eating problems or concentration problems, and depressed mood or poor self-esteem. Suicidal ideation and insomnia were the only symptoms without any descendants and, thus, were not a prerequisite for any other symptoms. Additional DAGs with arrow thickness denoting directional probability using Sachs et al.'s [84] approach were also run and can be found in Fig. 3. BIC values employed in Fig. 2 and directional probability values in Fig. 3 are listed in Table 3.

Fig. 2 CPDAG importance. Note. Built on the training sample (n = 21,899). Arrow thickness denotes a change in the Bayesian Information Criterion (BIC; a relative measure of a model’s goodness-of-fit) arising from the proportion of the averaged 10,000 bootstrapped networks wherein that arrow is removed from the network. In other words, the more an arrow contributed to the model fit, the thicker it is.

Fig. 3 CPDAG directional probability. Note. Built on the training sample (n = 21,899). Edge thickness signifies directional probabilities arising from the proportion of the averaged 10,000 bootstrapped networks wherein that arrow was pointing in that direction, or, in other words, confidence that the direction of prediction flows in the direction depicted in the graph.

Table 3 Model fit importance and directional probability of CPDAG

Structural distance

As observed in Fig. 4, seven operations (SHD = 7; i.e., adding or deleting an undirected edge, and adding, removing, or reversing the orientation of an edge; [89]) were needed for the hold-out CPDAG (N = 9,386) to match the training CPDAG (N = 21,899). The parent node of depressed mood and fatigue, as a cascading node, along with its five direct descendants (anhedonia, poor self-esteem, concentration problems, eating problems, and insomnia), remained the same across networks. However, there were false positives for which directions switched within the hold-out network as compared to the training network or the “true network.” These arrows were concentration problems related to anhedonia and insomnia related to psychomotor disturbance. Accordingly, within the hold-out network, insomnia attained one direct descendant, signifying that psychomotor disturbance was more likely when insomnia was present than vice versa. Suicidal ideation also gained two descendants: psychomotor disturbance and insomnia. In other words, suicidal ideation occurred only through depressed mood or poor self-esteem and directly predicted insomnia and psychomotor disturbance. Thus, nodes without any descendants switched from suicidal ideation and insomnia to psychomotor disturbance within the hold-out sample, implying that suicidal ideation was not a prerequisite for other symptoms.

Fig. 4

Similarity between training and hold-out CPDAGs. Note. Training sample (n = 21,899); hold-out sample (n = 9,386). CPDAG = completed partially directed acyclic graph. Green arrows = true positives; red arrows = false positives; SHD = structural Hamming distance. SHD assesses the similarity between two CPDAGs and represents the number of edge insertions, deletions, or flips needed to transform one graph into the other. Lower SHD values represent higher similarity
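The SHD definition in this note can be sketched in a few lines. The sketch assumes a CPDAG is stored as a pair (set of directed arcs, set of undirected edges) and counts one operation for every node pair whose edge state differs between the two graphs. The two toy graphs and the function names are hypothetical and are not the networks estimated in the study.

```python
from itertools import combinations

def edge_state(graph, a, b):
    """Classify the connection between a and b in a (directed, undirected) graph."""
    directed, undirected = graph
    if (a, b) in directed:
        return "a->b"
    if (b, a) in directed:
        return "b->a"
    if frozenset((a, b)) in undirected:
        return "a--b"
    return "none"

def structural_hamming_distance(g1, g2, nodes):
    """Count node pairs whose edge state (missing, undirected, or oriented) differs."""
    return sum(
        edge_state(g1, a, b) != edge_state(g2, a, b)
        for a, b in combinations(sorted(nodes), 2)
    )

# Hypothetical toy graphs over four symptoms.
nodes = {"mood", "fatigue", "insomnia", "psychomotor"}
training = (
    {("mood", "fatigue"), ("fatigue", "insomnia"), ("psychomotor", "insomnia")},
    set(),
)
holdout = (
    {("mood", "fatigue"), ("insomnia", "psychomotor")},
    {frozenset(("fatigue", "insomnia"))},
)

shd = structural_hamming_distance(training, holdout, nodes)
print(shd)
```

Here one arc loses its orientation and one arc is reversed, so the toy distance is 2; identical graphs would give 0.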

Discussion

The present study set out to investigate associations between insomnia and common mental disorders within a large nationally representative sample of treatment-seeking U.S. college students to (a) determine whether higher levels of insomnia were more strongly related to the presence of some disorders than others and (b) tease apart the potential directionality between insomnia and symptoms of its most associated disorder.

We first used a broad range of disorders (MDD, GAD, PTSD, SAD, PD, AUD, AN) to predict insomnia severity using elastic net regularization. The elastic net model accounted for 33% (R2 = .33) of the variance in insomnia in the hold-out sample, in part due to the inclusion of MDD, which SHAP values identified as the disorder most associated with insomnia. GAD and PTSD were the second and third most important disorders, respectively, contributing to the model’s performance to a lesser degree. These findings parallel those of Bard et al. [20], who found MDD symptoms (e.g., low energy, depressive affect via PHQ-9) to be key features across multiple domains of sleep functioning and impairment as compared to anxiety (GAD-7; [90]) and insomnia symptoms (SCI-9; [91]). Our results also converge with McCallum et al. [22], who found GAD, MDD, and PTSD, respectively, as the top contributors to sleep disturbance, although the order of the first and second contributors was reversed. Discrepancies may be due in part to sample differences, as we utilized a representative sample of college students in the American population, whereas McCallum et al. [22] noted self-selection bias within their Australian general community sample. They may also reflect measurement error, given the prior study's use of non-validated self-report checklists based on DSM-5 criteria, whereas the present study used valid and reliable diagnostic self-report measures with adequate kappa agreement with structured interviews (e.g., GAD-Q-IV, SPDQ, PDSR, PC-PTSD-5). In addition, our analytic approach diverged from theirs: the present study derived variable importance from the explanatory power of a machine learning model containing all disorders, whereas theirs relied on p values from separate regressions for each disorder tested.
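To make the link between an elastic net model and mean |SHAP| importance concrete, the following is a minimal NumPy sketch, not the study's pipeline. It assumes a glmnet-style penalty fitted by coordinate descent, simulated binary disorder indicators with made-up effect sizes, and the closed-form SHAP value for a linear model under feature independence, phi_ij = beta_j (x_ij - mean(x_j)). All names, the penalty settings, and the simulated data are hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator used by the L1 part of the penalty."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def elastic_net(X, y, lam=0.1, alpha=0.5, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam*(alpha*|b|_1 + (1-alpha)/2*|b|_2^2)."""
    n, p = X.shape
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean  # centering absorbs the intercept
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            partial_resid = yc - Xc @ beta + Xc[:, j] * beta[j]
            rho = Xc[:, j] @ partial_resid / n
            denom = Xc[:, j] @ Xc[:, j] / n + lam * (1 - alpha)
            beta[j] = soft_threshold(rho, lam * alpha) / denom
    return y_mean - x_mean @ beta, beta

def linear_shap(X, beta):
    """SHAP values of a linear model, assuming independent features."""
    return (X - X.mean(axis=0)) * beta

# Simulated data: three hypothetical disorder indicators with invented effects.
rng = np.random.default_rng(0)
n = 2000
X = rng.binomial(1, 0.3, size=(n, 3)).astype(float)
true_effects = np.array([6.0, 2.0, 0.5])
y = 8.0 + X @ true_effects + rng.normal(0.0, 3.0, size=n)

intercept, beta = elastic_net(X, y)
shap_values = linear_shap(X, beta)
mean_abs_shap = np.abs(shap_values).mean(axis=0)
print(np.round(mean_abs_shap, 2))
```

Ranking predictors by mean |SHAP| within one jointly fitted model is what distinguishes this kind of importance from comparing p values across separate regressions.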

DAG analyses were conducted to offer additional insight as to how MDD symptoms may have led to insomnia. Depressed mood was found to be the most important parent symptom, directly predicting fatigue, anhedonia, poor self-esteem, concentration problems, eating problems, psychomotor disturbance, suicidal ideation, or insomnia. Stated differently, the presence of fatigue, anhedonia, poor self-esteem, concentration problems, insomnia, eating problems, or psychomotor disturbance each presupposed the presence of depressed mood more than vice versa. In a typical DAG structure, higher upstream nodes are given greater predictive priority, whereas downstream nodes carry less activation potential and are less likely to influence other symptoms in the network. These findings suggested that insomnia was seemingly dependent on other downstream symptoms in the network, indicating that the occurrence of insomnia more likely depended on the presence of MDD symptoms rather than vice versa. Notably, network estimation related to parent nodes was consistent across both training and hold-out samples, further suggesting replicability. However, caution is warranted when inferring nodes with no descendants (i.e., not a prerequisite for other symptoms) as discrepancies between samples were observed. Future simulation studies are needed to determine the typical conditions when differences in network estimations arise between data subsets and their implications on validity.
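The vocabulary of parent nodes, descendants, and nodes without descendants can be illustrated with a small sketch. The toy graph is a hypothetical simplification loosely inspired by the description above, not the estimated CPDAG, and the helper names are illustrative only.

```python
def descendants(dag, node):
    """Return every node reachable from `node` by following arcs downstream."""
    seen, stack = set(), list(dag.get(node, ()))
    while stack:
        child = stack.pop()
        if child not in seen:
            seen.add(child)
            stack.extend(dag.get(child, ()))
    return seen

def parents(dag, node):
    """Return the nodes with an arc pointing directly into `node`."""
    return {p for p, children in dag.items() if node in children}

# Hypothetical toy DAG stored as {parent: [children]}.
toy_dag = {
    "depressed_mood": ["fatigue", "poor_self_esteem", "insomnia"],
    "fatigue": ["insomnia"],
    "poor_self_esteem": ["insomnia", "suicidal_ideation"],
    "insomnia": [],
    "suicidal_ideation": [],
}

# Nodes with no descendants are not a prerequisite for any other symptom.
leaves = sorted(n for n in toy_dag if not descendants(toy_dag, n))
print(sorted(parents(toy_dag, "insomnia")))
print(leaves)
```

In this toy graph, depressed mood sits upstream of every other node, while insomnia and suicidal ideation have no descendants.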

Nonetheless, our findings are consistent with DSM-5 guidelines on MDD [2], suggesting that depressed mood is a hallmark feature of MDD and is one of the two main symptoms required for assigning a positive diagnosis [92]. Moreover, findings of depressed mood as a parent symptom aligned with extant network reviews on MDD [87, 93,94,95] and investigations that set out to identify MDD’s most central symptoms (e.g., [96,97,98,99]). Results also aligned with studies on the association between MDD and insomnia [34, 37, 50, 57]. Insomnia was commonly found to be a robust risk factor for both first-episode and recurrent depressive episodes [100]. Mechanistically, Harvey [101] noted that such associations occurred due to the presence of a bidirectional cycle. Disturbances in mood and symptoms during the day disrupt nighttime sleep, whereas sleep deprivation worsens mood regulation and symptoms the following day [4], creating a vicious cycle. Such cycles further persist, given that individuals with mood disorders are vulnerable to disruptions in biological rhythms and that external stressors can lead to such disruptions [102]. Accordingly, college populations may be prone to such cycles, considering their increased physiological changes, heavy academic workload, and psychosocial stressors [15].

DAG analyses also implicated the presence of insomnia as probabilistically dependent on the presence of either fatigue or poor self-esteem. These findings are in line with existing centrality findings of depressed mood, fatigue, and self-esteem symptoms emerging across western [97, 103,104,105] and eastern cultures [106,107,108]. Furthermore, other findings implicated depressed mood directly leading to fatigue [57] or indirectly impacting insomnia through fatigue [109]. In fact, fatigue has been reported as the highest bridge symptom linking depression and insomnia symptoms [35, 36, 48, 110].

Our findings also have implications for treatment targets among patients with comorbidities. Results suggested that the interrelationships of depressed mood, fatigue, and self-esteem presupposed insomnia. Untreated comorbid insomnia with depression has been shown to maintain the risk of relapse due to both disorders’ links with mood dysregulation [101, 111]. As such, the presence of both disorders should be assessed during population-level screening and patient management. Notably, however, depression treatment does not inherently ameliorate insomnia. For example, sleep-related complaints are often the most common residual symptoms after antidepressant treatment [112], warranting targeted insomnia treatment [113,114,115]. Interventions for insomnia often necessitate specific behavioral strategies (e.g., sleep hygiene), which are not inherent to pharmacotherapy or cognitive behavioral therapy (CBT) for MDD (for review, see [112]). As CBT for insomnia (CBT-I) has also been shown to be effective in treating both insomnia and depressive symptoms among those who have both, further randomized clinical trials are needed to determine whether treatment combinations are better than either approach alone, for example, by comparing combined CBT-I and CBT for depression with CBT-I alone and with CBT for depression alone.

The current study is not without caveats, which deserve careful consideration. All analyses were based on observational and exploratory data rather than experimental data. Although Bayesian learning methods can enable probabilistic causal inferences, networks derived from cross-sectional data cannot support strong inferences of causation. To make such inferences within the network paradigm requires additional assumptions (e.g., [116,117,118]). Also, our CPDAG models rested on several key assumptions inherent to Bayesian networks, including the assumption of causal relations among symptoms, acyclicity, and that no important variables were excluded from the network. There were reasons to suspect that the acyclicity assumption may have been violated, given the degree of potential reverse directionality. Here, arrows deemed most important seemed relatively thin, indicating that the arrow was pointing in both directions in a substantial percentage of bootstrapped networks. For example, depressed mood and fatigue almost certainly had a bidirectional influence on one another. Accordingly, the edge connecting depressed mood to fatigue pointed in that direction in 50% of the 10,000 bootstrapped networks. The direction of the association between these two variables may thus have tipped in both directions, implying a possible 'hidden' cycle within an acyclic graph. The impact of violating the assumption of acyclicity is unknown but, at a minimum, implies that the current DAG analyses failed to detect feedback loops. Accordingly, failing to meet acyclicity may have also led to skewed interpretations of true positive, false positive, and false negative arc comparisons between training and hold-out samples, given that there could have been bidirectionality. Hence, a major limitation of the present findings is that they should be treated only as a simplified snapshot of probabilistic causal relations.

We are not aware of any studies that used temporal network analyses to investigate directional relationships between MDD and insomnia. Although Jordan et al. [119] found evidence of longitudinal associations between sleep processes and depressed mood within a naturalistic setting, results may not generalize to patients with MDD due to sample recruitment strategies (i.e., homogeneous sample of nurses) and measure choices (e.g., a single-item indicator of depressed mood). Future studies could improve upon our approach by gathering time-series data to detect feedback loops and elucidate potential bidirectional dependencies between depression and insomnia.

The present study reveals associations related to insomnia and common comorbidities within a nationally representative sample of treatment-seeking U.S. college students. Results illuminated MDD as the most important association that contributed to heightened insomnia, with the interrelationships of depressed mood, fatigue, and self-esteem presupposing insomnia. These findings serve as a foundation for generating hypotheses rather than as conclusive causal evidence, emphasizing the need for further research into the temporal precedence between insomnia and common comorbidities. The presented approach to computing Bayesian network analyses may be helpful when developing supervised machine learning models, in teasing apart the structure of relations and probable directional relationships between features and predicted outcomes.

Availability of data and materials

The datasets used and/or analyzed during the current study are available from the corresponding author upon reasonable request.

Notes

  1. MDD remained the most important predictor when elastic net models were run as a classification problem, with predictors transformed into continuous variables representing sum scores and insomnia as a binary outcome representing caseness (i.e., presence vs. absence): ROC = .826 (.007), Sensitivity = .746 (.010), Specificity = .748 (.020), Accuracy = .742 (95% CI: .733, .751).
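For readers less familiar with the classification metrics in this note, the sketch below shows how sensitivity, specificity, and accuracy follow from a confusion matrix. The labels are a hypothetical ten-person example (1 = insomnia caseness present, 0 = absent) and bear no relation to the study data.

```python
def classification_metrics(y_true, y_pred):
    """Compute sensitivity, specificity, and accuracy for binary labels (1 = case)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)  # cases correctly flagged
    tn = sum(t == 0 and p == 0 for t, p in pairs)  # non-cases correctly cleared
    fp = sum(t == 0 and p == 1 for t, p in pairs)  # non-cases wrongly flagged
    fn = sum(t == 1 and p == 0 for t, p in pairs)  # cases missed
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }

# Hypothetical toy labels: 4 true cases and 6 true non-cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```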

Abbreviations

AN:

Anorexia nervosa

AUD:

Alcohol use disorder

AUDIT:

Alcohol use disorders identification test

BIC:

Bayesian information criterion

CBT-I:

Cognitive behavioral therapy for insomnia

CPDAG:

Completed partially directed acyclic graph

DAG:

Directed acyclic graph

GAD:

Generalized anxiety disorder

GAD-Q-IV:

Generalized Anxiety Disorder Questionnaire-IV

GGM:

Graphical Gaussian model

ISI:

Insomnia Severity Index

LASSO:

Least absolute shrinkage and selection operator

MDD:

Major depressive disorder

PC-PTSD-5:

Primary Care PTSD Screen

PD:

Panic disorder

PDSR:

Panic Disorder Self-Report

PGD:

Prolonged grief disorder

PHQ-9:

Patient Health Questionnaire-9

PTSD:

Post-traumatic stress disorder

R2:

Coefficient of determination

RMSE:

Root mean square error

SAD:

Social anxiety disorder

SHAP:

SHapley Additive exPlanations

SHD:

Structural Hamming distance

SMOGN:

Synthetic minority over-sampling technique for regression with Gaussian noise

SPDQ:

Social Phobia Diagnostic Questionnaire

WCS:

Weight and Shape Concerns Scale

References

  1. Harvey AG, Murray G, Chandler RA, Soehner A. Sleep disturbance as transdiagnostic: consideration of neurobiological mechanisms. Clin Psychol Rev. 2011;31(2):225–35.

  2. American Psychiatric Association. Diagnostic and statistical manual of mental disorders. 5th ed. Arlington: American Psychiatric Association; 2013.

  3. Freeman D, Sheaves B, Waite F, Harvey AG, Harrison PJ. Sleep disturbance and psychiatric disorders. Lancet Psychiatry. 2020;7(7):628–37.

  4. Barber KE, Rackoff GN, Newman MG. Day-to-day directional relationships between sleep duration and negative affect. J Psychosom Res. 2023;172:111437.

  5. Nguyen VV, Zainal NH, Newman MG. Why sleep is key: Poor sleep quality is a mechanism for the bidirectional relationship between major depressive disorder and generalized anxiety disorder across 18 years. J Anxiety Disord. 2022;90:102601.

  6. Roth T. Insomnia: definition, prevalence, etiology, and consequences. J Clin Sleep Med. 2007;3(5 suppl):S7–10.

  7. Cox RC, Olatunji BO. A systematic review of sleep disturbance in anxiety and related disorders. J Anxiety Disord. 2016;37:104–29.

  8. Alvaro PK, Roberts RM, Harris JK. A systematic review assessing Bidirectionality between sleep disturbances, anxiety, and depression. Sleep. 2013;36(7):1059–68.

  9. Kwon M, Park E, Dickerson SS. Adolescent substance use and its association to sleep disturbances: a systematic review. Sleep Health. 2019;5(4):382–94.

  10. da Luz FQ, Sainsbury A, Salis Z, Hay P, Cordas T, Morin CM, Paulos-Guarnieri L, Pascoareli L, El Rafihi-Ferreira R. A systematic review with meta-analyses of the relationship between recurrent binge eating and sleep parameters. Int J Obes (Lond). 2023;47(3):145–64.

  11. Baglioni C, Nanovska S, Regen W, Spiegelhalder K, Feige B, Nissen C, Reynolds CF, Riemann D. Sleep and mental disorders: a meta-analysis of polysomnographic research. Psychol Bull. 2016;142(9):969–90.

  12. Lund HG, Reider BD, Whiting AB, Prichard JR. Sleep patterns and predictors of disturbed sleep in a large population of college students. J Adolesc Health. 2010;46(2):124–32.

  13. Schlarb AA, Kulessa D, Gulewitsch MD. Sleep characteristics, sleep problems, and associations of self-efficacy among German university students. Nat Sci Sleep. 2012;4:1–7.

  14. Gaultney JF. The prevalence of sleep disorders in college students: impact on academic performance. J Am Coll Health. 2010;59(2):91–7.

  15. Byrd K, Gelaye B, Tadessea MG, Williams MA, Lemma S, Berhanec Y. Sleep disturbances and common mental disorders in college students. Health Behav Policy Rev. 2014;1(3):229–37.

  16. Chekroud AM, Hawrilenko M, Loho H, Bondar J, Gueorguieva R, Hasan A, Kambeitz J, Corlett PR, Koutsouleris N, Krumholz HM, et al. Illusory generalizability of clinical prediction models. Science. 2024;383(6679):164–7.

  17. Friedman J, Hastie T, Tibshirani R. Regularization paths for generalized linear models via coordinate descent. J Stat Softw. 2010;33(1):1–22.

  18. Zou H, Hastie T. Addendum: Regularization and variable selection via the elastic net. J R Stat Soc Ser B Stat Methodol. 2005;67(5):768–768.

  19. Webb CA, Cohen ZD, Beard C, Forgeard M, Peckham AD, Björgvinsson T. Personalized prognostic prediction of treatment outcome for depressed patients in a naturalistic psychiatric hospital setting: A comparison of machine learning approaches. J Consult Clin Psychol. 2020;88(1):25–38.

  20. Bard HA, O’Driscoll C, Miller CB, Henry AL, Cape J, Espie CA. Insomnia, depression, and anxiety symptoms interact and individually impact functioning: A network and relative importance analysis in the context of insomnia. Sleep Med. 2023;101:505–14.

  21. Lyall LM, Sangha N, Zhu X, Lyall DM, Ward J, Strawbridge RJ, Cullen B, Smith DJ. Subjective and objective sleep and circadian parameters as predictors of depression-related outcomes: a machine learning approach in UK Biobank. J Affect Disord. 2023;335:83–94.

  22. McCallum SM, Batterham PJ, Calear AL, Sunderland M, Carragher N, Kazan D. Associations of fatigue and sleep disturbance with nine common mental disorders. J Psychosom Res. 2019;123:109727.

  23. Lundberg SM, Lee SI. A unified approach to interpreting model predictions. Adv Neural Inf Process Syst. 2017;30:4768–77.

  24. Borsboom D. Psychometric perspectives on diagnostic systems. J Clin Psychol. 2008;64(9):1089–108.

  25. Epskamp S, Borsboom D, Fried EI. Estimating psychological networks and their accuracy: a tutorial paper. Behav Res Methods. 2018;50(1):195–212.

  26. Robinaugh DJ, Hoekstra RHA, Toner ER, Borsboom D. The network approach to psychopathology: a review of the literature 2008–2018 and an agenda for future research. Psychol Med. 2020;50(3):353–66.

  27. Borsboom D. A network theory of mental disorders. World Psychiatry. 2017;16(1):5–13.

  28. Yuksel D, Kiss O, Prouty DE, Baker FC, de Zambotti M. Clinical characterization of insomnia in adolescents - an integrated approach to psychopathology. Sleep Med. 2022;93:26–38.

  29. Chen P, Zhang L, Sha S, Lam MI, Lok KI, Chow IHI, Si TL, Su Z, Cheung T, Feng Y, et al. Prevalence of insomnia and its association with quality of life among Macau residents shortly after the summer 2022 COVID-19 outbreak: a network analysis perspective. Front Psychiatry. 2023;14:1113122.

  30. Chen P, Zhao YJ, An FR, Li XH, Lam MI, Lok KI, Wang YY, Li JX, Su Z, Cheung T, et al. Prevalence of insomnia and its association with quality of life in caregivers of psychiatric inpatients during the COVID-19 pandemic: a network analysis. BMC Psychiatry. 2023;23(1):837.

  31. Cha EJ, Bang YR, Jeon HJ, Yoon IY. Network structure of insomnia symptoms in shift workers compared to non-shift workers. Chronobiol Int. 2023;40(3):246–52.

  32. Cha EJ, Hong S, Kim S, Chung S, Jeon HJ. Contribution of dysfunctional sleep-related cognitions on insomnia severity: a network perspective. J Clin Sleep Med. 2024;20:743–51.

  33. Takano Y, Ibata R, Nakano N, Sakano Y. Network analysis to estimate central insomnia symptoms among daytime workers at-risk for insomnia. Sci Rep. 2023;13(1):16406.

  34. Bai W, Zhao Y, An F, Zhang Q, Sha S, Cheung T, Cheng CP, Ng CH, Xiang YT. Network Analysis of Insomnia in Chinese mental health professionals during the COVID-19 pandemic: a cross-sectional study. Nat Sci Sleep. 2021;13:1921–30.

  35. Zhao N, Zhao YJ, An F, Zhang Q, Sha S, Su Z, Cheung T, Jackson T, Zang YF, Xiang YT. Network analysis of comorbid insomnia and depressive symptoms among psychiatric practitioners during the COVID-19 pandemic. J Clin Sleep Med. 2023;19(7):1271–9.

  36. Ma Z, Chen XY, Tao Y, Huang S, Yang Z, Chen J, Bu L, Wang C, Fan F. How to improve the long-term quality of life, insomnia, and depression of survivors 10 years after the Wenchuan earthquake? A network analysis. Asian J Psychiatr. 2022;73:103137.

  37. Ma Z, Wang D, Chen XY, Tao Y, Yang Z, Zhang Y, Huang S, Bu L, Wang C, Wu L, Fan F. Network structure of insomnia and depressive symptoms among shift workers in China. Sleep Med. 2022;100:150–6.

  38. Zhang N, Ma S, Wang P, Yao L, Kang L, Wang W, Nie Z, Chen M, Ma C, Liu Z. Psychosocial factors of insomnia in depression: a network approach. BMC Psychiatry. 2023;23(1):949.

  39. Pavlova I, Rogowska AM. Exposure to war, war nightmares, insomnia, and war-related posttraumatic stress disorder: A network analysis among university students during the war in Ukraine. J Affect Disord. 2023;342:148–56.

  40. Misiak B, Gaweda L, Moustafa AA, Samochowiec J. Insomnia moderates the association between psychotic-like experiences and suicidal ideation in a non-clinical population: a network analysis. Eur Arch Psychiatry Clin Neurosci. 2024;274(2):255–63.

  41. Peng P, Wang Q, Zhou Y, Hao Y, Chen S, Wu Q, Li M, Wang Y, Yang Q, Wang X, et al. Inter-relationships of insomnia and psychiatric symptoms with suicidal ideation among patients with chronic schizophrenia: a network perspective. Prog Neuropsychopharmacol Biol Psychiatry. 2024;129:110899.

  42. Zhao W, Van Someren EJW, Xu Z, Ren Z, Tang L, Li C, Lei X. Identifying the insomnia-related psychological issues associated with hyperarousal: A network perspective. Int J Psychophysiol. 2024;195:112276.

  43. Dekker K, Blanken TF, Van Someren EJ. Insomnia and personality-A network approach. Brain Sci. 2017;7(3):28.

  44. Bai W, Zhao YJ, Cai H, Sha S, Zhang Q, Lei SM, Lok GKI, Chow IHI, Cheung T, Su Z, et al. Network analysis of depression, anxiety, insomnia and quality of life among Macau residents during the COVID-19 pandemic. J Affect Disord. 2022;311:181–8.

  45. Zhang L, Tao Y, Hou W, Niu H, Ma Z, Zheng Z, Wang S, Zhang S, Lv Y, Li Q, Liu X. Seeking bridge symptoms of anxiety, depression, and sleep disturbance among the elderly during the lockdown of the COVID-19 pandemic-A network approach. Front Psychiatry. 2022;13:919251.

  46. Chattrattrai T, Blanken TF, Lobbezoo F, Su N, Aarab G, Van Someren EJW. A network analysis of self-reported sleep bruxism in the Netherlands sleep registry: its associations with insomnia and several demographic, psychological, and life-style factors. Sleep Med. 2022;93:63–70.

  47. Tao Y, Hou W, Niu H, Ma Z, Zhang S, Zhang L, et al. Centrality and bridge symptoms of anxiety, depression, and sleep disturbance among college students during the COVID-19 pandemic-A network analysis. Curr Psychol. 2024;43:13897–908.

  48. Peng P, Liang M, Wang Q, Lu L, Wu Q, Chen Q. Night shifts, insomnia, anxiety, and depression among Chinese nurses during the COVID-19 pandemic remission period: a network approach. Front Public Health. 2022;10:1040298.

  49. Cha EJ, Jeon HJ, Chung S. Central symptoms of insomnia in relation to depression and COVID-19 anxiety in general population: a network analysis. J Clin Med. 2022;11(12):3416.

  50. Li W, Zhao N, Yan X, Xu X, Zou S, Wang H, Li Y, Du X, Zhang L, Zhang Q, et al. Network analysis of depression, anxiety, posttraumatic stress symptoms, insomnia, pain, and fatigue in clinically stable older patients with psychiatric disorders during the covid-19 outbreak. J Geriatr Psychiatry Neurol. 2022;35(2):196–205.

  51. Xu X, Xie T, Zhou N, Shi G, Wen J, Wang J, Li X, Poppen PJ. Network analysis of PGD, PTSD and insomnia symptoms in Chinese shidu parents with PGD. Eur J Psychotraumatol. 2022;13(1):2057674.

  52. Friedman J, Hastie T, Tibshirani R. Sparse inverse covariance estimation with the graphical lasso. Biostatistics. 2008;9(3):432–41.

  53. Williams DR, Rast P. Back to the basics: Rethinking partial correlation network methodology. Br J Math Stat Psychol. 2020;73(2):187–212.

  54. McNally RJ, Robinaugh DJ, Deckersbach T, Sylvia LG, Nierenberg AA. Estimating the symptom structure of bipolar disorder via network analysis: Energy dysregulation as a central symptom. J Psychopathol Clin Sci. 2022;131(1):86–97.

  55. Briganti G, Scutari M, McNally RJ. A tutorial on bayesian networks for psychopathology researchers. Psychol Methods. 2023;28(4):947–61.

  56. Zhang Y, Ma Z, Chen W, Wang D, Fan F. Network analysis of health-related behaviors, insomnia, and depression among urban left-behind adolescents in China. Child Psychiatry Hum Dev. 2023:1–12.

  57. Yu B, Fu Y, Dong S, Reinhardt JD, Jia P, Yang S. Identifying potential action points for improving sleep and mental health among employees: a network analysis. Sleep Med. 2024;113:76–83.

  58. Fitzsimmons-Craft EE, Taylor CB, Newman MG, Zainal NH, Rojas-Ashe EE, Lipson SK, Firebaugh ML, Ceglarek P, Topooco N, Jacobson NC, et al. Harnessing mobile technology to reduce mental health disorders in college populations: a randomized controlled trial study protocol. Contemp Clin Trials. 2021;103:106320.

  59. Morin CM, Belleville G, Bélanger L, Ivers H. The Insomnia Severity Index: Psychometric indicators to detect insomnia cases and evaluate treatment response. Sleep. 2011;34(5):601–8.

  60. Bastien CH, Vallières A, Morin CM. Validation of the Insomnia Severity Index as an outcome measure for insomnia research. Sleep Med. 2001;2(4):297–307.

  61. Kroenke K, Spitzer RL. The PHQ-9: a new depression diagnostic and severity measure. Psychiatr Ann. 2002;32(9):509–15.

  62. Manea L, Gilbody S, McMillan D. Optimal cut-off score for diagnosing depression with the Patient Health Questionnaire (PHQ-9): a meta-analysis. CMAJ. 2012;184(3):E191–196.

  63. Prins A, Bovin MJ, Smolenski DJ, Marx BP, Kimerling R, Jenkins-Guarnieri MA, Kaloupek DG, Schnurr PP, Kaiser AP, Leyva YE, Tiet QQ. The Primary Care PTSD Screen for DSM-5 (PC-PTSD-5): development and evaluation within a veteran primary care sample. J Gen Intern Med. 2016;31(10):1206–11.

  64. Newman MG, Zuellig AR, Kachin KE, Constantino MJ, Przeworski A, Erickson T, Cashman-McGrath L. Preliminary reliability and validity of the Generalized Anxiety Disorder Questionnaire-IV: a revised self-report diagnostic measure of generalized anxiety disorder. Behav Ther. 2002;33(2):215–33.

  65. Newman MG, Kachin KE, Zuellig AR, Constantino MJ, Cashman-McGrath L. The social phobia diagnostic questionnaire: preliminary validation of a new self-report diagnostic measure of social phobia. Psychol Med. 2003;33(4):623–35.

  66. Newman MG, Holmes M, Zuellig AR, Kachin KE, Behar E. The reliability and validity of the Panic Disorder Self-Report: a new diagnostic screening measure of panic disorder. Psychol Assess. 2006;18(1):49–61.

  67. Killen JD, Taylor CB, Hayward C, Wilson DM, Haydel KF, Hammer LD, Simmonds B, Robinson TN, Litt I, Varady A, Kraemer H. Pursuit of thinness and onset of eating disorder symptoms in a community sample of adolescent girls: a three-year prospective analysis. Int J Eat Disord. 1994;16(3):227–38.

  68. Fitzsimmons-Craft EE, Balantekin KN, Eichen DM, Graham AK, Monterubio GE, Sadeh-Sharvit S, Goel NJ, Flatt RE, Saffran K, Karam AM, et al. Screening and offering online programs for eating disorders: Reach, pathology, and differences across eating disorder status groups at 28 U.S. universities. Int J Eat Disord. 2019;52(10):1125–36.

  69. Bush K, Kivlahan DR, McDonell MB, Fihn SD, Bradley KA. The AUDIT alcohol consumption questions (AUDIT-C): an effective brief screening test for problem drinking. Arch Intern Med. 1998;158(16):1789–95.

  70. Bradley KA, DeBenedetti AF, Volk RJ, Williams EC, Frank D, Kivlahan DR. AUDIT-C as a brief screen for alcohol misuse in primary care. Alcohol Clin Exp Res. 2007;31(7):1208–17.

  71. van Buuren S, Groothuis-Oudshoorn K. mice: Multivariate imputation by chained equations in R. J Stat Softw. 2011;45(3):1–67.

  72. van Ginkel JR, Linting M, Rippe RCA, van der Voort A. Rebutting existing misconceptions about multiple imputation as a method for handling missing data. J Pers Assess. 2020;102(3):297–308.

  73. Jones PJ, Heeren A, McNally RJ. Commentary: A network theory of mental disorders. Front Psychol. 2017;8:1305.

  74. Kim J-H. Estimating classification error rate: Repeated cross-validation, repeated hold-out and bootstrap. Comput Stat Data Anal. 2009;53(11):3735–45.

  75. Kuhn M, Johnson K. Applied predictive modeling. New York: Springer; 2013.

  76. Branco P, Torgo L, Ribeiro RP. SMOGN: a pre-processing approach for imbalanced regression. In: First International Workshop on Learning with Imbalanced Domains: Theory and Applications: 2017: Proceedings of Machine Learning Research; 2017. p. 36–50.

  77. Kuhn M. Building predictive models in R using the caret package. J Stat Softw. 2008;28(5):1–26.

  78. Chicco D, Warrens MJ, Jurman G. The coefficient of determination R-squared is more informative than SMAPE, MAE, MAPE, MSE and RMSE in regression analysis evaluation. PeerJ Comput Sci. 2021;7:e623.

  79. Uher R, Tansey KE, Malki K, Perlis RH. Biomarkers predicting treatment outcome in depression: what is clinically significant? Pharmacogenomics. 2012;13(2):233–40.

  80. Scutari M. Learning Bayesian Networks with the bnlearn R Package. J Stat Softw. 2010;35(3):1–22.

  81. McNally RJ, Heeren A, Robinaugh DJ. A Bayesian network analysis of posttraumatic stress disorder symptoms in adults reporting childhood sexual abuse. Eur J Psychotraumatol. 2017;8(sup3):1341276.

  82. McNally RJ, Mair P, Mugno BL, Riemann BC. Co-morbid obsessive-compulsive disorder and depression: A Bayesian network approach. Psychol Med. 2017;47(7):1204–14.

  83. Scutari M, Denis JB. Bayesian Networks: With Examples in R. 2nd ed. New York: Taylor & Francis Group; 2021. p. i–274.

  84. Sachs K, Perez O, Pe’er D, Lauffenburger DA, Nolan GP. Causal protein-signaling networks derived from multiparameter single-cell data. Science. 2005;308(5721):523–9.

  85. Forbes MK, Wright AGC, Markon KE, Krueger RF. Evidence that psychopathology symptom networks have limited replicability. J Abnorm Psychol. 2017;126(7):969–88.

  86. Forbes MK, Wright AGC, Markon KE, Krueger RF. Quantifying the reliability and replicability of psychopathology network characteristics. Multivar Behav Res. 2021;56(2):224–42.

  87. Guloksuz S, Pries LK, van Os J. Application of network methods for understanding mental disorders: pitfalls and promise. Psychol Med. 2017;47(16):2743–52.

  88. Funkhouser CJ, Correa KA, Gorka SM, Nelson BD, Phan KL, Shankman SA. The replicability and generalizability of internalizing symptom networks across five samples. J Abnorm Psychol. 2020;129(2):191–203.

  89. Tsamardinos I, Brown LE, Aliferis CF. The max-min hill-climbing Bayesian network structure learning algorithm. Mach Learn. 2006;65(1):31–78.

  90. Spitzer RL, Kroenke K, Williams JB, Löwe B. A brief measure for assessing generalized anxiety disorder: the GAD-7. Arch Intern Med. 2006;166(10):1092–7.

  91. Espie CA, Kyle SD, Hames P, Gardani M, Fleming L, Cape J. The Sleep Condition Indicator: a clinical screening tool to evaluate insomnia disorder. BMJ Open. 2014;4(3):e004183.

  92. Kennedy SH. Core symptoms of major depressive disorder: relevance to diagnosis and treatment. Dialogues Clin Neurosci. 2008;10(3):271–7.

  93. Wichers M, Riese H, Hodges TM, Snippe E, Bos FM. A narrative review of network studies in depression: What different methodological approaches tell us about depression. Front Psychiatry. 2021;12:719490.

  94. Malgaroli M, Calderon A, Bonanno GA. Networks of major depressive disorder: A systematic review. Clin Psychol Rev. 2021;85:102000.

  95. Contreras A, Nieto I, Valiente C, Espinosa R, Vazquez C. The study of psychopathology from the network analysis perspective: a systematic review. Psychother Psychosom. 2019;88(2):71–83.

  96. Huey NS, Guan NC, Gill JS, Hui KO, Sulaiman AH, Kunagasundram S. Core symptoms of major depressive disorder among palliative care patients. Int J Environ Res Public Health. 2018;15(8):1758.

  97. Moradi S, Falsafinejad MR, Delavar A, Rezaeitabar V, Borj’ali A, Aggen SH, Kendler KS. Network modeling of major depressive disorder symptoms in adult women. Psychol Med. 2023;53(12):5449–58.

  98. Castellanos MA, Ausin B, Bestea S, Gonzalez-Sanguino C, Munoz M. A network analysis of major depressive disorder symptoms and age- and gender-related differences in people over 65 in a Madrid community sample (Spain). Int J Environ Res Public Health. 2020;17(23):8934.

  99. Berlim MT, Richard-Devantoy S, Dos Santos NR, Turecki G. The network structure of core depressive symptom-domains in major depressive disorder following antidepressant treatment: a randomized clinical trial. Psychol Med. 2021;51(14):2399–413.

  100. Perlis ML, Smith LJ, Lyness JM, Matteson SR, Pigeon WR, Jungquist CR, Tu X. Insomnia as a risk factor for onset of depression in the elderly. Behav Sleep Med. 2006;4(2):104–13.

  101. Harvey AG. Insomnia, psychiatric disorders, and the transdiagnostic perspective. Curr Dir Psychol Sci. 2008;17(5):299–303.

  102. Ehlers CL, Frank E, Kupfer DJ. Social zeitgebers and biological rhythms. A unified approach to understanding the etiology of depression. Arch Gen Psychiatry. 1988;45(10):948–52.

  103. Fried EI, Epskamp S, Nesse RM, Tuerlinckx F, Borsboom D. What are “good” depression symptoms? Comparing the centrality of DSM and non-DSM symptoms of depression in a network analysis. J Affect Disord. 2016;189:314–20.

  104. Mullarkey MC, Marchetti I, Beevers CG. Using network analysis to identify central symptoms of adolescent depression. J Clin Child Adolesc Psychol. 2019;48(4):656–68.

  105. Ramos-Vera C, Banos-Chaparro J, Ogundokun RO. The network structure of depressive symptomatology in Peruvian adults with arterial hypertension. F1000Res. 2021;10:19.

  106. Jo D, Kim H. Network analysis of depressive symptoms in South Korean adults: similarities and differences between women and men. Curr Psychol. 2023;43(8):7193–204.

  107. Cheung T, Jin Y, Lam S, Su Z, Hall BJ, Xiang YT, International Research Collaboration on COVID-19. Network analysis of depressive symptoms in Hong Kong residents during the COVID-19 pandemic. Transl Psychiatry. 2021;11(1):460.

  108. Xie T, Wen J, Liu X, Wang J, Poppen PJ. Utilizing network analysis to understand the structure of depression in Chinese adolescents: Replication with three depression scales. Curr Psychol. 2023;42(25):21597–608.

  109. Roland A, Windal M, Briganti G, Kornreich C, Mairesse O. Intensity and network structure of insomnia symptoms and the role of mental health during the first two waves of the COVID-19 pandemic. Nat Sci Sleep. 2023;15:1003–17.

  110. Cai H, Zhao YJ, Xing X, Tian T, Qian W, Liang S, Wang Z, Cheung T, Su Z, Tang YL, et al. Network analysis of comorbid anxiety and insomnia among clinicians with depressive symptoms during the late stage of the COVID-19 pandemic: a cross-sectional study. Nat Sci Sleep. 2022;14:1351–62.

  111. Kurian BT, Greer TL, Trivedi MH. Strategies to enhance the therapeutic efficacy of antidepressants: targeting residual symptoms. Expert Rev Neurother. 2009;9(7):975–84.

  112. Asarnow LD, Manber R. Cognitive Behavioral Therapy for Insomnia in Depression. Sleep Med Clin. 2019;14(2):177–84.

  113. Carney CE, Edinger JD, Kuchibhatla M, Lachowski AM, Bogouslavsky O, Krystal AD, Shapiro CM. Cognitive behavioral insomnia therapy for those with insomnia and depression: A randomized controlled clinical trial. Sleep. 2017;40(4):zsx019.

  114. Manber R, Buysse DJ, Edinger J, Krystal A, Luther JF, Wisniewski SR, Trockel M, Kraemer HC, Thase ME. Efficacy of cognitive-behavioral therapy for insomnia combined with antidepressant pharmacotherapy in patients with comorbid depression and insomnia: a randomized controlled trial. J Clin Psychiatry. 2016;77(10):e1316–23.

  115. Manber R, Edinger JD, Gress JL, San Pedro-Salcedo MG, Kuo TF, Kalista T. Cognitive behavioral therapy for insomnia enhances depression outcome in patients with comorbid major depressive disorder and insomnia. Sleep. 2008;31(4):489–95.

  116. Huang D, Susser E, Rudolph KE, Keyes KM. Depression networks: A systematic review of the network paradigm causal assumptions. Psychol Med. 2023;53(5):1665–80.

  117. Ryan O, Bringmann LF, Schuurman NK. The Challenge of Generating Causal Hypotheses Using Network Models. Struct Equ Modeling. 2022;29(6):953–70.

  118. Dablander F, Hinne M. Node centrality measures are a poor substitute for causal inference. Sci Rep. 2019;9(1):6846.

  119. Jordan DG, Slavish DC, Dietch J, Messman B, Ruggero C, Kelly K, Taylor DJ. Investigating sleep, stress, and mood dynamics via temporal network analysis. Sleep Med. 2023;103:1–11.

Acknowledgements

Not applicable.

Funding

The current study was funded by the National Institute of Mental Health (R01MH115128; 23/08/2018).

Author information

Contributions

AC: Conceptualization, Methodology, Software, Formal analysis, Data curation, Visualization, Writing – original draft preparation, Writing – review and editing. SYB, MHSN: Data curation, Writing – original draft preparation. EEFC, DE, DEW, CBT: Investigation, Writing – review and editing, Project administration, Funding acquisition. MGN: Conceptualization, Methodology, Validation, Investigation, Writing – review and editing, Supervision, Project administration, Funding acquisition. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Adam Calderon.

Ethics declarations

Ethics approval and consent to participate

All participants gave informed consent to participate in the study. The protocol was approved by the Institutional Review Boards (IRB) at the coordinating universities (IRB ID #: 201901073).

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Calderon, A., Baik, S.Y., Ng, M.H.S. et al. Machine learning and Bayesian network analyses identifies associations with insomnia in a national sample of 31,285 treatment-seeking college students. BMC Psychiatry 24, 656 (2024). https://doi.org/10.1186/s12888-024-06074-7

  • DOI: https://doi.org/10.1186/s12888-024-06074-7

Keywords