Validation of the German version of the insomnia severity index in adolescents, young adults and adult workers: results from three cross-sectional studies

Background A variety of objective and subjective methods exist to assess insomnia. The Insomnia Severity Index (ISI) was developed to provide a brief self-report instrument useful to assess people’s perception of sleep complaints. The ISI was developed in English, and has been translated into several languages including German. Surprisingly, the psychometric properties of the German version have not been evaluated, although the ISI is often used with German-speaking populations. Methods The psychometric properties of the ISI are tested in three independent samples: 1475 adolescents, 862 university students, and 533 police and emergency response service officers. In all three studies, participants provide information about insomnia (ISI), sleep quality (Pittsburgh Sleep Quality Index), and psychological functioning (diverse instruments). Descriptive statistics, gender differences, homogeneity and internal consistency, convergent validity, and factorial validity (including measurement invariance across genders) are examined in each sample. Results The findings show that the German version of the ISI has generally acceptable psychometric properties and sufficient concurrent validity. Confirmatory factor analyses show that a 1-factor solution achieves good model fit. Furthermore, measurement invariance across gender is supported in all three samples. Conclusions While the ISI has been widely used in German-speaking countries, this study is the first to provide empirical evidence that the German version of this instrument has good psychometric properties and satisfactory convergent and factorial validity across various age groups and both men and women. Thus, the German version of the ISI can be recommended as a brief screening measure in German-speaking populations. Electronic supplementary material The online version of this article (doi:10.1186/s12888-016-0876-8) contains supplementary material, which is available to authorized users.


Background
Poor sleep is being increasingly recognised as a widespread and persistent health complaint. Estimates of prevalence of at least one symptom of insomnia stand at between 20 and 30 % in adult populations [1], with women typically having an increased risk of insomnia [2,3]. A close relationship exists between people's sleep, their daily well-being [4], memory [5], and daytime performance [6]. Moreover, chronic sleep complaints negatively impact on physical and psychological functioning in children [7], adolescents [8], adults [4], and the elderly [9]. For instance, a wealth of studies shows that sleep complaints are closely associated with depressive symptoms, with correlations ranging between r = .50 and .60 [10,11].
A variety of objective and subjective methods exist to assess insomnia. Polysomnography and actigraphy are objective methods. They are considered reliable and valid techniques to assess sleep duration and efficiency [12]. However, polysomnography is costly, typically takes place in artificial sleep environments, and is not capable of detecting insomnia against subjective diagnostic criteria [9]. Among the subjective methods, sleep diaries are the most frequently used form of assessment [12,13]. Although this method is cost-effective and correlates reasonably well with objective methods, sleep diaries depend on the willingness of participants to provide daily reports immediately after awakening over longer periods of time. In contrast, self-report questionnaires can be used to collect data on sleep problems with minimal effort and cost, while providing information about subjectively perceived consequences linked with sleep problems [9], such as irritability or difficulty concentrating. Given that insomnia constitutes a significant health hazard, which is often undetected and therefore undertreated, such reliable and economic tools are essential in facilitating early recognition and treatment of sleep complaints [14].
Because insomnia is a subjective disorder, Morin [15] developed the Insomnia Severity Index (ISI) to provide a brief self-report instrument useful to assess people's subjective perception of sleep complaints in various populations. Thanks to its brevity (7 items, max. 2 min for completion and 1 min for scoring), the ISI can be used as a screening measure in clinical practice, allows for the assessment of change following treatment, and is also useful for epidemiological research. ISI scores enable a clinical evaluation with regards to insomnia symptoms. The ISI includes both nighttime and daytime components of insomnia and measures (a) the subjective symptoms and consequences of insomnia and (b) the concerns that result from these difficulties. In part, the ISI takes into account the diagnostic criteria of insomnia as formulated in the DSM-IV (Diagnostic and Statistical Manual of Mental Disorders, 4th Edition) by the American Psychiatric Association [16]. The instrument consists of seven items concerning the severity of sleep-onset (item 1) and sleep maintenance difficulties (including nocturnal and early morning awakening, items 2 and 3 respectively), current sleep satisfaction (item 4), interference of sleep difficulties with daily functioning (item 5), apparentness of impairment due to sleep complaints (item 6), and concerns or distress attributable to sleep complaints (item 7). All items are answered on a 5-point Likert scale ranging from 0 (not at all) to 4 (extremely), with a reference period of the previous 2 weeks. The ISI exists in three different versions: self-administered (patient), administered by a significant other (e.g., spouse, parent), and administered by a clinician [17,18]. ISI scores range from 0 to 28 and can be interpreted as follows: 0-7 = no clinically significant insomnia, 8-14 = sub-threshold insomnia, 15-21 = clinical insomnia (moderate severity), and 22-28 = clinical insomnia (severe) [17,19]. 
The suitability of these cut-offs has been tested both with classical test theory [17,20] and item response theory [19]. The ISI has been used in clinical research and practice for almost 30 years.
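The scoring rule described above (seven items scored 0 to 4, summed to a total of 0 to 28 and mapped onto four severity bands) can be sketched in a few lines. This is a minimal illustration only: the function names `isi_total` and `isi_category` and the example responses are assumptions made for this sketch and are not part of the published instrument.

```python
# Severity bands as given in the text: (lower bound, upper bound, label).
ISI_BANDS = [
    (0, 7, "no clinically significant insomnia"),
    (8, 14, "sub-threshold insomnia"),
    (15, 21, "clinical insomnia (moderate severity)"),
    (22, 28, "clinical insomnia (severe)"),
]


def isi_total(items):
    """Sum the seven ISI item responses (each scored 0-4)."""
    if len(items) != 7:
        raise ValueError("the ISI has exactly seven items")
    if any(not isinstance(x, int) or x < 0 or x > 4 for x in items):
        raise ValueError("each item must be an integer from 0 to 4")
    return sum(items)


def isi_category(total):
    """Map a total score (0-28) to its severity band."""
    for low, high, label in ISI_BANDS:
        if low <= total <= high:
            return label
    raise ValueError("total score must lie between 0 and 28")


# Hypothetical respondent (illustrative values, not study data).
responses = [2, 1, 1, 3, 2, 0, 1]
score = isi_total(responses)
print(score, isi_category(score))
```

Keeping the bands in a single table makes it easy to swap in alternative cut-offs should a different population require them.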
The first systematic validation study was published in 2001 by Bastien et al. [17] showing that in a population of 145 patients suffering from clinical sleep disorders, the ISI had good internal consistency and concurrent validity with sleep diaries. As a result of treatment, the changes over time in the ISI were positively correlated with changes in sleep diaries, polysomnography and changes observed by the clinician.
Currently, several studies provide systematic support for the psychometric properties and the validity of the ISI in different populations such as insomnia patients [17,19], young adults meeting the DSM-IV criteria for primary insomnia versus normal sleepers [20], community samples of adults [19,21,22], non-clinical child and adolescent populations [22,23], elder care community day centre visitors [9], and cancer patients [18].
With regards to external construct validity, prior research showed that ISI individual items correlated reasonably well with the corresponding variables on sleep diaries for sleep onset latency, waking after sleep onset and early morning awakening, with correlations between .11 and .91 [17,19,23]. As expected, significant, albeit weaker associations were found between the individual ISI variables and polysomnographic variables, with correlations ranging from .07 to .45 [17,19]. Furthermore, moderate-to-strong relationships exist between the ISI and other self-report sleep questionnaires [9,20,23]. For instance, in Spanish adolescents, the correlation of the ISI total score with the Pittsburgh Sleep Quality Index (PSQI) was r = .68 [22]. Moreover, a good degree of convergence was found between the patient and clinician versions of the ISI, with correlations between r = .50 and .73 [17,21,23]. Additionally, significant correlations were observed between the ISI and other psychological constructs such as depressive symptoms [19,22], anxiety [19,22], general fatigue [19,22], and psychological wellbeing [18,19,23].
The factorial validity of the ISI has been examined in several studies, mostly using exploratory factor analysis. These studies have provided inconsistent findings with solutions suggesting 1 to 3 factors [17-19]. So far, only the Spanish version of the ISI has been validated with confirmatory factor analysis (CFA) [22]. Fernandez-Mendoza et al. [22] compared three alternative models: Model 1 posited that all items load on a single factor (default model) [9,25]. Model 2 assumed a two-factor structure with two correlated factors (factor 1 = nighttime sleep difficulties; factor 2 = daytime impact of insomnia) [9,18,26]. Model 3 postulated a three-factor structure with three correlated factors (factor 1 = nighttime sleep difficulties; factor 2 = sleep dissatisfaction; factor 3 = impact of insomnia). According to their data, Model 3 achieved the best model fit.
Given this background, the main objective of the present article was to validate the German version of the ISI across three age groups with a specific focus on gender invariance. This study is warranted for at least four reasons: First, the ISI has been widely used in sleep research during the last 15 years, including many studies with German-speaking samples [30,32-35]. Second, German is one of the most frequently spoken languages worldwide with approximately 90 to 95 million first and 10 to 25 million second language speakers [36]. Third, although gender differences regarding the prevalence of insomnia symptoms are consistently reported in the literature [2,3,37], none of the previous studies has examined whether the psychometric properties of the ISI apply equally for women and men. Fourth, few validation studies have focused on younger people [22], and none of them has compared the validity of the ISI across different age groups. Nevertheless, such a comparison seems crucial to assure that an instrument is suitable for both younger and older populations.
Based on previous research, four hypotheses were formulated: Our first hypothesis was that female participants would report higher insomnia scores than their male counterparts [2,3,37]. Our second hypothesis was that the ISI would have adequate homogeneity and internal consistency across all study populations, and both males and females [9,17,19,21-23]. More specifically, we expected inter-item correlations ≥ .20, Cronbach's alpha coefficients ≥ .70, and item-total correlations ≥ .30 [38]. Our third hypothesis was that adequate convergent validity would exist in male and female participants across all study populations. That is, we hypothesized that the individual items and the total ISI score would be at least moderately and positively correlated with the corresponding items of the PSQI [17,19,23].
Our fourth and last hypothesis was that the ISI total score would be at least moderately and positively correlated with indicators of psychological functioning [18,19,22,23]. With regard to factorial validity, we did not have a clear-cut hypothesis. Nevertheless, we expected that adequate model fit would be found with either a 1-, 2- or 3-factor model [17,22]. For the best fitting model, we assumed fair factor loadings on the corresponding factors (≥ .45) [39], and at least weak measurement invariance across genders (see methods of study 1 for more information regarding types of measurement invariance).

Participants and procedures
The sample consisted of 1475 adolescents (M age = 13.4 years, SD = 1.4; range: 11-16 years; 49 % males) who were recruited from five middle schools in the German-speaking, north-western part of Switzerland. Data with this sample have been published previously [40]. Written informed consent was obtained from participants and parents, and the local ethics committee approved the study.

Sleep
To assess insomnia, the participants filled in the 7-item Insomnia Severity Index (ISI) [17], which has been described in detail in the introduction section. To ensure optimal translation, we rigorously followed the procedure set out by Brislin [41]. English items were translated into German, and then back-translated into English by an independent translator (see Additional file 1 for the wording of the items in German). To assess sleep quality, participants filled in a German adaptation [42] of the Pittsburgh Sleep Quality Index (PSQI) [43]. The PSQI includes several indicators to assess both sleep quality and sleep disturbances. The psychometric properties of the instrument are convincing [43]. The German version consists of 11 items, which concern two typical weekdays. The participants answered questions anchored on an 8-point Likert scale concerning sleep-related factors just after waking up in the morning (three items: perceived quality of sleep, restoration, and mood), during the daytime (two items: sleepiness and concentration), and before going to bed (two items: sleepiness and mood). Possible answers ranged from 1 (e.g., very bad sleep quality) to 8 (e.g., very good sleep quality). In addition, sleep onset latency (min), sleep duration and the number of awakenings during nighttime were assessed. Detailed information about bedtime and waking up allowed for the calculation of total sleep duration.

Psychological functioning
Psychological functioning was assessed with the KIDSCREEN-52 [44]. The questionnaire consists of 52 items focusing on 10 different domains of children's and adolescents' psychological functioning (e.g., psychological well-being, parent relation, etc.). Answers were given on a 5-point Likert scale, with the anchor points 1 (not at all) and 5 (extremely/always). The various domains can be aggregated to a global quality of life index, with higher mean scores reflecting higher psychological functioning (Cronbach's alpha for the overall index = .92).

Statistical analyses
Univariate analyses of variance (ANOVA) were run to test gender differences. Product-moment correlations were calculated to examine homogeneity and item-total correlations. Cronbach's alpha coefficients were obtained to test internal consistency. Correlational analyses were used to test convergent validity. Finally, CFA was used to test factorial validity. Although several models were tested in prior research [17,22], we assumed that all items would load on the same factor. Thus, the 1-factor CFA model was based on seven observed measures and one latent construct. A default model was used, in which all parameters were freely estimated. This default model was then tested against a model in which all free factor loadings were constrained to be equal across both genders. Parameter estimation was conducted using maximum likelihood (ML), and multiple goodness-of-fit indexes were considered to examine how well the theoretical model fitted the empirical data [45]. Measurement invariance of the measurement model across gender was tested via simultaneous multiple group comparison. The normed fit index (NFI) should be ≥ .95, probability of close fit (PCLOSE) ≥ .50, comparative fit index (CFI) ≥ .95, Tucker Lewis Index (TLI) ≥ .95, and root mean square error of approximation (RMSEA) ≤ .05. We first examined the most parsimonious 1-factor model. In case of unsatisfactory model fit, we continued with the more complex 2- and 3-factor models. According to Comrey, factor loadings ≥ .45 can be considered fair [39]. To test measurement invariance across gender, we compared the default model against models that assumed configural (same pattern of fixed and free factor loadings across groups), weak (invariant factor loadings across groups), strong (invariant factor loadings and intercepts across groups) and strict measurement invariance (invariant factor loadings, intercepts, and unique factor variances across groups) [46].
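The internal-consistency and item-total statistics named above can be sketched as follows. This is an illustrative sketch rather than the authors' analysis code: it uses only the Python standard library, the item-total correlations are computed in their corrected form (each item against the sum of the remaining items), which the text does not specify and is therefore an assumption, and the small data set is invented.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha; rows is a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def pearson(x, y):
    """Product-moment correlation between two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def corrected_item_total(rows):
    """Correlate each item with the sum of the remaining items (assumed variant)."""
    k = len(rows[0])
    result = []
    for i in range(k):
        item = [r[i] for r in rows]
        rest = [sum(r) - r[i] for r in rows]
        result.append(pearson(item, rest))
    return result


# Invented responses of six respondents to seven items (not study data).
data = [
    [0, 1, 0, 1, 0, 0, 1],
    [1, 1, 2, 2, 1, 0, 1],
    [2, 2, 1, 3, 2, 1, 2],
    [3, 2, 3, 3, 2, 1, 3],
    [1, 0, 1, 2, 1, 0, 0],
    [4, 3, 3, 4, 3, 2, 4],
]
print(round(cronbach_alpha(data), 2))
print([round(r, 2) for r in corrected_item_total(data)])
```

The resulting values can then be compared against the thresholds stated in the hypotheses (alpha ≥ .70, item-total correlations ≥ .30).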

Results
As shown in Table 1, the ISI mean score was 6.67 (SD = 4.39). In the total sample, 65 % (n = 963) of the participants reported no insomnia, 29 % (n = 431) sub-threshold insomnia, 4.7 % (n = 69) clinical insomnia of moderate severity, and 1 % (n = 12) severe clinical insomnia. Table 1 shows that girls (M = 7.23, SD = 4.46) reported more severe insomnia symptoms than boys (M = 6.08, SD = 4.24). Similarly, the χ² test showed that girls were overrepresented in the groups with sub-threshold insomnia (32 % vs. 26 %) and moderate clinical insomnia (6 % vs. 3 %), whereas they were underrepresented in the group classified as having no insomnia (61 % vs. 70 %). The majority of inter-item correlations exceeded the critical value of .20 in boys and girls (Table 1). Similarly, most item-total correlations were satisfactory, with an average correlation of .48. The lowest inter-item and item-total correlations were found for item 6 (r_it = .28 to .32). Cronbach's alpha was .76 for the total sample, for boys, and for girls. Regarding the associations between the ISI and PSQI items, significant correlations were found between item 1 and sleep onset latency, and item 2 and number of awakenings (Table 2). Contrary to our expectation, item 3 was not associated with sleep duration. Furthermore, a strong correlation was found between item 4 and sleep quality. Finally, significant correlations were found for items 5-7 with feeling restored and mood after awakening in the morning, as well as daytime sleepiness and concentration. The correlations between the ISI total score and the PSQI items show similar associations between the two instruments for boys and girls (Table 2). The moderate (negative) correlation between the ISI and psychological functioning provides further support for convergent validity of the ISI.
With regard to factorial validity, as shown in Table 3, the model fit of the initial 1-factor model was excellent. Furthermore, Table 3 provides support for configural and weak measurement invariance (invariant factor loadings) across genders, p(Δχ²) = .28. Most of the factor loadings were fair to excellent, except for items 5 and 6 with loadings ranging from .17 to .40. Figure 1 provides the measurement coefficients of the hypothesized 1-factor model (after testing for configural and weak invariance), separately for boys and girls.
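The fit criteria listed under the statistical analyses (NFI, CFI and TLI ≥ .95, PCLOSE ≥ .50, RMSEA ≤ .05) can be applied mechanically once a CFA program has returned the indices. The sketch below is only an illustration of that check: the helper `check_fit` is hypothetical, and the example values are invented and are not taken from Table 3.

```python
# Cut-offs as stated in the text: index name -> (comparison, threshold).
FIT_CRITERIA = {
    "NFI": (">=", 0.95),
    "PCLOSE": (">=", 0.50),
    "CFI": (">=", 0.95),
    "TLI": (">=", 0.95),
    "RMSEA": ("<=", 0.05),
}


def check_fit(indices):
    """Return {index name: True/False} for every criterion that was supplied."""
    verdict = {}
    for name, (op, cut) in FIT_CRITERIA.items():
        if name not in indices:
            continue  # skip indices the CFA output did not report
        value = indices[name]
        verdict[name] = value >= cut if op == ">=" else value <= cut
    return verdict


# Invented fit indices for illustration only (not the values in Table 3).
example = {"NFI": 0.97, "PCLOSE": 0.62, "CFI": 0.98, "TLI": 0.96, "RMSEA": 0.04}
result = check_fit(example)
print(result)
print("all criteria met:", all(result.values()))
```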

Participants and procedures
Sample 2 consisted of 862 students (M age = 24.7, SD = 5.9) from the German-speaking, north-western part of Switzerland (223 men, 639 women), who were recruited from the University of Basel (n = 556) and from the Northwestern University of Applied Sciences (n = 306). Data with this sample have been published previously [31,47,48]. Participants provided informed consent and the local ethics committee approved the study.

Sleep
As in study 1, sleep complaints were assessed with the ISI, quality of sleep with the 11-item German adaptation of the PSQI.

Psychological functioning
Psychological functioning was assessed via depressive symptoms, using the Depression Scale (DS) [49]. This scale consists of 16 items, answered on a scale ranging from 1 (not at all true) to 4 (definitely true), and concerns decreased mood, lack of satisfying social and leisure activities, thoughts about suicide, and hopelessness. The internal consistency of the DS proved to be good in the present sample (Cronbach's alpha = .86).
Note (table footnote): AGFI = adjusted goodness of fit index; CFI = comparative fit index; TLI = Tucker Lewis index; RMR = root mean square residual; PClose = probability of close fit; RMSEA = root mean square error of approximation. (a) Error terms e1-e2, e1-e3, e2-e3, e5-e6, e5-e7 and e6-e7 were allowed to correlate.

Statistical analyses
The same statistical procedures as those from study 1 were used.

Results
The ISI mean score was 6.56 (SD = 4.31) in the total sample. Moreover, 68 % (n = 585) were categorized as having no insomnia, 26 % (n = 224) as having sub-threshold insomnia, 6 % (n = 49) as having moderate clinical insomnia, and 1 % (n = 4) as having severe clinical insomnia. The ANOVA (Table 4) shows that women (M = 6.80, SD = 4.37) reported more insomnia symptoms than men (M = 5.91, SD = 4.13). Gender differences were found for items 1 and 2, but not for the other items. The χ² test did not detect differences with regard to insomnia categories between men and women. As shown in Table 4, the inter-item correlations were mostly above the critical value of .20. Acceptable item-total correlations were also found for men and women (with mean correlations of .51 for men and .49 for women). Cronbach's alpha was .77 in the total sample, .78 for men and .76 for women.
With regard to the ISI-PSQI correlations, significant associations occurred for item 1 and sleep onset latency, and item 2 and number of awakenings (Table 5). As in study 1, no significant relationship was found between item 3 and sleep duration. Furthermore, a strong correlation was identified between item 4 and sleep quality. Significant correlations also existed between items 5 and 7 and feeling restored in the morning, as well as daytime sleepiness and concentration. The correlations for item 6 pointed in the same direction, but were generally weak. The correlations between the PSQI items and the ISI revealed similar relationships for men and women. Finally, Table 5 revealed that insomnia is strongly associated with depressive symptoms, independent of participants' gender.
The findings of the CFA confirm that the 1-factor model fits the empirical data well (Table 3). Strong measurement invariance (invariant factor loadings and intercepts) across genders was supported. Most factor loadings were fair to excellent. However, the factor loadings of items 5 (.29 to .31) and 6 (.11) were poor (Fig. 1).

Participants and procedures
Sample 3 consisted of 533 employees of the police force and emergency response service corps in the German-speaking, north-western part of Switzerland (M age = 41.2, SD = 9.8 years, 411 men and 122 women), who responded to a written questionnaire (45 % return rate). Data with this sample have been published previously [50,51]. Participants gave informed consent and the study was performed in accordance with the ethical standards laid down in the Declaration of Helsinki.

Sleep
The same instruments as in studies 1 and 2 were used to measure sleep complaints (ISI) and quality of sleep (PSQI).

Psychological functioning
To assess psychological functioning, participants completed the 12-Item Short Form Health Survey (SF-12) [52]. The composite score for the psychological subscale was obtained by weighting each item as described in the SF-12 manual. Higher scores reflect increased health functioning.

As shown in
Most inter-item correlations exceeded the critical value of .20, independent of participants' gender. Similarly, most item-total correlations were satisfactory (with average correlations of .55 for men and .56 for women). Cronbach's alpha was .81 in the total sample, .81 for men and .82 for women.
With regard to the associations between the ISI and PSQI items, significant correlations existed between item   (Table 7). Furthermore, a strong correlation was found between item 4 and sleep quality. Significant correlations also existed between items 5 and 7 and feeling restored and mood after awakening in the morning, as well as daytime sleepiness and concentration. The correlations for item 6 pointed in the same direction, but were generally weak. Similar associations were found among men and women with regard to the correlations between the PSQI items and the ISI total score. The significant negative correlations between the ISI and the SF-12 psychological functioning scale further support the convergent validity of the instrument (Table 7).
The CFA corroborated that a 1-factor model had excellent model fit. Furthermore, strict measurement invariance (invariant factor loadings, intercepts, and unique factor variances across groups) was supported across genders (Table 3). Five of seven factor loadings were very good or excellent. Nevertheless, the factor loadings of items 5 and 6 were poor (between .13 and .26; see Fig. 1).

Discussion
The key findings of the present study are that the German version of the ISI has generally acceptable psychometric properties and sufficient concurrent validity to be recommended as a brief screening measure in both adolescents and adults. Moreover, the factor structure of the ISI proved to be invariant across gender. Four hypotheses were formulated and each of these will now be discussed in turn.
Our first hypothesis was that female participants would report higher insomnia scores than their male counterparts. This hypothesis was supported in adolescents and young adults, which is consistent with the majority of previous studies [2,3]. Contrary to our hypothesis, no gender differences were found in police and emergency response service officers. A meta-analysis of 29 studies showed that a female predisposition to insomnia is consistent and progressive across age [53]. Thus, increasing age does not provide a plausible explanation. Most likely, this unexpected finding can be attributed to the fact that in study 3 men were overrepresented among shift workers in this specific professional group [50], which might contribute to increased insomnia mean scores among male participants [54]. Further, it is also conceivable that the job of police officer per se led to a selection bias.
Our second hypothesis was that the ISI would have adequate homogeneity and internal consistency across all study populations, and both men and women [9,17,19,21-23]. This hypothesis was generally supported. First, all Cronbach's alpha coefficients were ≥ .70 for both men and women [38]. Second, across all three samples, the majority of inter-item correlations were ≥ .20, and most item-total correlations were ≥ .30. The lowest coefficients were consistently shown for item 6. This is congruent with previous research, and most likely due to the fact that this item refers to the opinion of others about one's own sleep. Thus, information about what others think might not be as subjectively relevant as individual perceptions about one's own sleep. Not surprisingly, therefore, item 6 also had somewhat suboptimal factor loadings in the CFA, both in male and female participants. Nevertheless, we decided not to exclude this item since the low factor loading did not negatively affect the general model fit, and because exclusion of item 6 would not have resulted in substantial improvements in the Cronbach's alpha coefficient.
In our third hypothesis, we assumed that adequate convergent validity would exist for male and female participants throughout all study populations. This hypothesis was fully supported. In line with previous research, our findings show that the individual items and the total ISI score correlate at least moderately and positively with the corresponding PSQI items [17,19,23]. Moreover, our data corroborate previous research showing that the ISI is associated with impaired psychological functioning [18,19,22,23]. With regard to factorial validity, the findings of our studies indicate that a 1-factor model provides an excellent model fit across all age groups. We acknowledge that the 2- and 3-factor models also had very good model fit (data not shown). For several reasons, however, the 1-factor model seems the most suitable one: First, it is generally recommended to use the most parsimonious model if goodness-of-fit indices are satisfied by several alternative models [55]. Second, the 2- and 3-factor models showed very high interfactor correlations with coefficients ranging between .60 and .98. This reveals a great overlap between the latent factors, which might pose problems associated with multicollinearity when using the separate factors as independent predictors. Third, the loadings of items 5 and 6 on the latent factor did not improve substantially in the multifactorial models. Fourth, in the 3-factor model, one factor consists of one item only (factor 2: satisfaction), which precludes a test of internal consistency. Furthermore, this single item factor proved to have substantial cross-loadings on the other two factors in previous research [22]. Fifth, cut-off scores to establish the severity of insomnia only exist for the total score [19].
Finally, weak-to-strict measurement invariance across gender was supported in all three age groups. This is an important finding because Widaman et al. [46] emphasized that if participants' answers vary so much with age that significant differences emerge in the factor structure of an instrument, or that relevant ceiling or floor effects occur at different ages, the measuring devices must change.
The present analyses suggest that the ISI is equally suitable to detect insomnia symptoms in adolescents, young adults and adult workers.
The findings of the present study need to be interpreted in light of several limitations: First, the findings are based on cross-sectional data. Thus, we were unable to test predictive validity and the test-retest reliability. Nevertheless, a previous study with a German-speaking sample of high school students showed that the ISI score improved after 3 weeks of daily morning running [56]. Moreover, we were not able to test measurement invariance across time within the same individuals [46]. Second, the findings are entirely based on self-reported data. Thus, we did not test the degree to which the ISI items correlated with objective sleep measures. Nevertheless, the German version of the ISI proved to be weakly but significantly associated with 1-channel EEG measures in previous studies [34], with a magnitude of relationships similar to that reported in the international literature [17,19]. Third, all three studies used nonclinical populations, which precludes the establishment of the discriminant validity of the ISI. Nevertheless, previous research showed that the ISI discriminates well between patients with insomnia and controls without sleep disorders [19-21,23], and offers a good balance between sensitivity (likelihood of detecting insomnia in a subject from the insomnia group) and specificity (likelihood of rejecting insomnia in a subject from the control group) [20-22]. Fourth, we acknowledge that a clinical evaluation should be seen as the gold standard for the detection of sleep difficulties [19]. However, a clinical evaluation is time-consuming, which may discourage general practitioners from systematically inquiring about sleep in their patients [19]. Thus, the ISI provides a time- and cost-effective alternative, which seems feasible for general practitioners and for public health screening purposes.