The standardized assessment of symptoms, syndromes, and diagnoses of mental disorders is essential for estimating the prevalence, onset, and course of mental disorders and for determining their risk factors in epidemiologic research. Case definition in clinical and experimental studies also relies on reliable diagnoses. The standardized and fully computerized DIA-X-5 shows good test-retest reliability for most DSM-5 diagnoses, stem items, and time-related information in adolescents and adults.
Test-retest reliability of diagnoses and stem items of the DIA-X-5
Although most of the DIA-X-5 diagnoses showed good to very good test-retest reliability, some diagnoses showed relatively low reliability. For these diagnoses we examined in more detail the response patterns on the level of diagnostic criteria and individual questions.
The summary category of any anxiety disorder (without panic attack) shows relatively low reliability because of low reliability indices for a few specific anxiety diagnoses. For panic disorder, a comparison of each diagnostic criterion revealed no specific response pattern that changed from the first to the second interview. However, a change in the order of questions in this section may have affected subjects’ overall response behavior. In the DIA-X/M-CIDI, the panic attack stem question was followed by the panic disorder questions, and only afterwards were the panic attack symptoms assessed. The DIA-X-5 reverses this order and probes the panic attack symptoms before the panic disorder criteria because of the prominent role of the panic attack specifier in DSM-5 [12]. Reliability was good for both the panic stem item and panic attack.
For social anxiety disorder, participants’ responses varied particularly for avoidance (5 of 12 discordant cases) and duration of anxiety (6 of 12 discordant cases). Social anxiety disorder has a long criterion list, and 9 of 12 discordant cases were discordant in only one criterion. Agoraphobia and separation anxiety disorder had few cases overall in this study, which hampers reliability estimates.
For obsessive-compulsive disorder, the criteria for thoughts and behavior showed divergent response patterns between the two interviews. For obsessive thoughts, mainly the response to the A criterion changed between interviews. For compulsive behavior, the list of items/behaviors was expanded in the DIA-X-5 to also probe for OCD-related syndromes including nail biting, hair pulling, skin picking, and mirror checking. Although the mere presence of these symptoms was not counted toward the standard diagnosis of OCD, their inclusion may have affected the responses for OCD, given that symptoms such as nail biting were quite prevalent in the sample. Most discordant OCD cases were due to the B criterion, which refers to distress/impairment (mostly rated “some” in the second interview instead of “much”).
As already noted, the stem items of most diagnoses showed high reliability, even for disorders with relatively low reliability indices at the diagnostic level. It should be noted, though, that the stem question for the use of legal drugs, i.e., medication, shows low reliability indices. This might depend on the type of substances listed and the broad open category of “other medications”. Unfortunately, no cases of medication use disorder were found in the current study, so we could not test whether retest reliability for medication would be higher at the diagnostic level.
When comparing the retest reliability of diagnoses and stem items of the DIA-X-5 with previous results for the DIA-X/M-CIDI [4], kappa values at the diagnostic level are similar for depressive disorders, alcohol and illicit drug use disorder, and any DSM disorder. For PTSD, tobacco use disorder, and any eating disorder, the DIA-X-5 shows higher kappa values (kappa differing by at least 0.1), whereas the DIA-X/M-CIDI showed higher kappa values for obsessive-compulsive disorder and any somatoform disorder (relative to somatic symptom disorder). Results were mixed for anxiety disorders: the DIA-X-5 had higher kappa values for panic attack and generalized anxiety disorder, whereas the DIA-X/M-CIDI had higher kappa values for most anxiety disorder categories, such as any anxiety disorder, social anxiety disorder, and panic disorder. Most of these differences even out, however, when the stem items are considered: kappa values are comparable between the DIA-X-5 and the DIA-X/M-CIDI for panic disorder, social anxiety disorder, agoraphobia, specific phobia, major depressive disorder, eating disorder, and PTSD. Relevant differences in kappa remain for the stem items of generalized anxiety disorder (GAD) and OCD, with higher kappa values in the DIA-X-5, and for dysthymia and manic/hypomanic episode, with higher kappa values in the DIA-X/M-CIDI. The differences between the DIA-X/M-CIDI and the DIA-X-5 likely depend on the number of available cases, which was higher in the DIA-X/M-CIDI retest study and resulted in higher kappa values (first kappa paradox). As previously mentioned, the changed order of questions for panic disorder might also have lowered the kappa values for panic disorder in the DIA-X-5 relative to the DIA-X/M-CIDI.
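The dependence of kappa on the number of cases can be illustrated with a minimal sketch. The function name and all counts below are hypothetical and are not taken from either retest study; the two tables share the same observed agreement (90%) and differ only in how many participants receive the diagnosis.

```python
def cohens_kappa(a, b, c, d):
    """Cohen's kappa for a 2x2 test-retest table.

    a: diagnosis present at both interviews
    b: present at the first interview only
    c: present at the second interview only
    d: absent at both interviews
    """
    n = a + b + c + d
    p_observed = (a + d) / n
    # Chance agreement from the marginal proportions of each interview.
    p_both_yes = ((a + b) / n) * ((a + c) / n)
    p_both_no = ((c + d) / n) * ((b + d) / n)
    p_expected = p_both_yes + p_both_no
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical counts for 100 participants, 90% observed agreement in both tables.
kappa_common = cohens_kappa(40, 5, 5, 50)  # diagnosis is frequent
kappa_rare = cohens_kappa(2, 5, 5, 88)     # diagnosis is rare

print(f"frequent diagnosis: kappa = {kappa_common:.2f}")
print(f"rare diagnosis:     kappa = {kappa_rare:.2f}")
```

With these illustrative numbers, kappa is roughly 0.80 for the frequent diagnosis and roughly 0.23 for the rare one, although raw agreement is identical. This is the pattern usually referred to as the first kappa paradox.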
Test-retest reliability of time-related questions
High test-retest reliability was found for the age of onset and age of recency questions in the DIA-X-5; the persistence questions generally showed slightly lower reliability. Compared with the previous DIA-X/M-CIDI, the ICCs for age of onset in the DIA-X-5 were similar or somewhat higher for most disorders. For most specific phobia subtypes, however, ICCs for age of onset were substantially higher in the DIA-X/M-CIDI, which could be due to the greater number of specific phobia cases in the DIA-X/M-CIDI retest study. In the current study, low ICCs for age of onset resulted from an overall small number of cases. For the specific phobia subtype, this was combined with an outlier who reported extremely different ages of onset in the two interviews. Low ICCs for age of recency in disruptive mood dysregulation disorder may also be due to few overall cases.
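How strongly a single outlier can depress an ICC when cases are few can be shown with a short sketch. It assumes a one-way random-effects ICC(1,1), which may differ from the ICC form used in this study, and the ages of onset below are invented for illustration only.

```python
def icc_oneway(pairs):
    """One-way random-effects ICC(1,1) for paired test-retest values.

    pairs: list of (first_interview, second_interview) values per participant.
    """
    n = len(pairs)
    k = 2  # two interviews per participant
    grand_mean = sum(x + y for x, y in pairs) / (n * k)
    subject_means = [(x + y) / k for x, y in pairs]
    # Between-subject and within-subject mean squares.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 + (y - m) ** 2 for (x, y), m in zip(pairs, subject_means)
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical ages of onset reported by six participants in both interviews.
consistent = [(5, 5), (6, 7), (8, 8), (10, 9), (12, 12), (7, 7)]
# Same data, but the last participant reports a very different age at retest.
with_outlier = consistent[:-1] + [(7, 30)]

icc_consistent = icc_oneway(consistent)
icc_outlier = icc_oneway(with_outlier)
print(f"without outlier: ICC = {icc_consistent:.2f}")
print(f"with outlier:    ICC = {icc_outlier:.2f}")
```

In this toy example the ICC is close to 1 without the outlier and falls to near zero once a single discrepant report is included, because with only six cases one large within-subject difference dominates the within-subject variance.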
Persistence showed greater variability than the other two time-related measures. For specific phobias (situational/natural environment/other type), participants reported diverging onset/recency, most likely because these disorders often manifest early in development and take a waxing and waning course. However, overall persistence, meaning the total number of years affected, is remembered similarly in both interviews. For illness anxiety and somatic symptom disorder, a slow development and varying intensity levels can be assumed, leading to difficulties in estimating persistence as the number of years affected. For separation anxiety disorder, there were too few cases in this study to draw reliable conclusions.
Limitations
This test-retest study has limitations. First, the number of subjects assessed is at the lower bound for a test-retest reliability study, which mainly affects the reliability estimates for conditions diagnosed less frequently in the sample. However, previous studies on the retest reliability of structured clinical interviews included samples of comparable or even smaller size, ranging from 43 to 60 participants, for the M-CIDI [4], the Structured Clinical Interview for DSM-5 disorders (DSM-5 SCID [25]), and the Spanish version of the Kiddie Schedule for Affective Disorders and Schizophrenia, present and lifetime version, DSM-5 (K-SADS-PL-5 [26]). Second, the test-retest interval varied between one and 36 days. Although an average retest interval of nine days is appropriate considering the variability of, for example, depressive symptoms, shorter intervals (below 7 days) could increase kappa estimates. Third, a convenience sample was recruited for this study. Thus, the sample is community based and therefore useful for an instrument designed for representative studies. Fourth, the DIA-X-5 version applied in this study included a range of additional nested questionnaires, lists, and screening modules, which increased the length of the DIA-X-5 sections. This may have affected the response behavior of the participants. The relatively high agreement on the stem questions, however, argues against a systematic response behavior bias. Finally, the reliability coefficients of some disorders are at a lower bound. These include panic disorder, social anxiety disorder, and obsessive-compulsive disorder. In addition, some disorders show a fair Cohen’s kappa with a CI higher than 0.40 but do not meet the suggested criterion of ten cases [19]: PTSD, anorexia nervosa, intermittent explosive disorder, and cannabis use disorder.
Other disorders show a lower kappa CI bound below 0.40 or a kappa CI range wider than 0.50; these include persistent depressive disorder, panic disorder, obsessive-compulsive disorder, any adjustment disorder, somatic symptom disorder, and alcohol use disorder. For the diagnoses of these disorders, the DIA-X-5 should be used with caution. Of note, the reliability of the stem items for these disorders is sufficient.
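The thresholds named above (lower CI bound of 0.40, CI range of 0.50, ten cases) can be applied as a simple screening rule. The following is only a sketch of one possible reading of these criteria; the function name, the labels, and the example values are hypothetical and do not reproduce the study's results.

```python
def reliability_flag(ci_lower, ci_upper, n_cases,
                     min_lower=0.40, max_width=0.50, min_cases=10):
    """Label a diagnosis from its kappa confidence interval and case count.

    Thresholds follow the limitations paragraph; the labels are illustrative.
    """
    if ci_lower < min_lower or (ci_upper - ci_lower) > max_width:
        return "use with caution"
    if n_cases < min_cases:
        return "too few cases"
    return "acceptable"


# Hypothetical (ci_lower, ci_upper, n_cases) values, not taken from the study.
examples = {
    "diagnosis A": (0.55, 0.85, 25),
    "diagnosis B": (0.45, 0.80, 6),
    "diagnosis C": (0.20, 0.60, 15),
}
for name, (lower, upper, n) in examples.items():
    print(name, "->", reliability_flag(lower, upper, n))
```

The CI-based check is applied first so that an imprecise estimate is flagged regardless of the number of cases; a diagnosis with an adequate CI but fewer than ten cases is reported separately.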