Diagnostic criteria and instruments
Psychiatric classification systems such as DSM-IV and ICD-10 are based on the presence or absence of operationalized diagnostic criteria. In a structured interview, an interviewer asks the patient about each criterion. In this study, by contrast, the patients themselves rated the DSM-IV diagnostic criteria for GAD on the Generalized Anxiety Questionnaire (GAS-Q) and for MDE on the Depression Screening Questionnaire (DSQ), and these patient ratings served as the diagnostic reference standard.
The GAS-Q is a modification of the Anxiety Screening Questionnaire, and is a self-rating questionnaire developed to diagnose GAD according to DSM-IV and ICD-10. The GAS-Q consists of 20 items covering the diagnostic criteria for GAD in the DSM-IV. Test-retest reliability of the GAS-Q over a two-day retest period showed a kappa value of 0.74 for the diagnosis of GAD. Congruent validity, comparing the GAS-Q diagnosis with the DSM-IV algorithm for GAD of the Composite International Diagnostic Interview, showed a kappa of 0.72.
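The kappa values quoted for the questionnaires are chance-corrected agreement coefficients. As a minimal sketch of how such a value is obtained, assuming Cohen's kappa for two binary diagnoses (the function name and the toy data are illustrative and are not taken from the cited validation studies):

```python
def cohen_kappa(a, b):
    """Cohen's kappa for two equally long sequences of binary (0/1) diagnoses."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(a)
    # Observed proportion of agreement between the two ratings.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance, from each rating's marginal case rate.
    p_a = sum(a) / n
    p_b = sum(b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical data: questionnaire diagnosis vs. interview diagnosis (1 = case).
questionnaire = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
interview = [1, 1, 0, 0, 0, 0, 1, 0, 0, 1]
kappa = cohen_kappa(questionnaire, interview)
```

With these toy ratings the two sources agree on 8 of 10 patients, chance agreement is 0.52, and kappa is therefore about 0.58.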
The DSQ was developed for patient self-rating of MDE according to DSM-IV and ICD-10 and was chosen as our reference standard. The DSQ is an 11-item questionnaire in which diagnostic criteria are rated on a three-point scale, supplemented by three questions to assess the age at first and current episode, and the number of episodes according to criterion A of MDE in DSM-IV. Consistent with the DSM-IV criteria, a diagnosis of MDE was assigned when at least five of the items were rated as positive by the patient. In the German part of the European study, the internal consistency of the DSQ showed a Cronbach's coefficient alpha of 0.83. Test-retest reliability over a two-day period showed a kappa value of 0.82 for MDE. Comparison of the DSQ diagnosis with a diagnosis of MDE based on structured interview showed a kappa of 0.89.
The HADS consists of seven items for anxiety (HADS-A) and seven for depression (HADS-D). The items are scored on a four-point scale from zero (not present) to three (considerable). The item scores are added, giving sub-scale scores on the HADS-A and the HADS-D from zero to 21. In this study, HADS subscale scores were considered valid when at least five of the seven items had been answered on both the HADS-A and the HADS-D. In order to be valid in patients with somatic problems, the HADS items were based on the psychological aspects of anxiety and depression. The anxiety items concentrate on general anxiety, and five of the items are close to the diagnostic criteria of GAD. The depression items are based on anhedonia, which is considered to be one of the essential criteria of depression. The concurrent validity of the HADS compared to other questionnaires for anxiety and depression has been reported to be between 0.60 and 0.80 for both sub-scales.
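The scoring and validity rule for the HADS can be sketched as follows. This is an illustration only: the text does not say how unanswered items were handled in valid questionnaires, so the sketch simply sums the answered items, which is an assumption; the function names are hypothetical.

```python
from typing import Optional, Sequence, Tuple

ITEMS_PER_SUBSCALE = 7
MIN_ANSWERED = 5  # validity rule stated in the text


def subscale_score(items: Sequence[Optional[int]]) -> Optional[int]:
    """Sum of the answered items (0-3 each); None if fewer than five are answered."""
    if len(items) != ITEMS_PER_SUBSCALE:
        raise ValueError("a HADS subscale has exactly seven items")
    answered = [x for x in items if x is not None]
    if any(x not in (0, 1, 2, 3) for x in answered):
        raise ValueError("item scores must be 0, 1, 2 or 3")
    if len(answered) < MIN_ANSWERED:
        return None
    # ASSUMPTION: missing items are ignored rather than imputed.
    return sum(answered)


def hads_scores(anxiety, depression) -> Optional[Tuple[int, int]]:
    """Return (HADS-A, HADS-D), or None unless both subscales are valid."""
    a = subscale_score(anxiety)
    d = subscale_score(depression)
    if a is None or d is None:
        return None
    return a, d


complete = hads_scores([3, 2, 1, 0, 1, 2, 3], [0, 0, 0, 0, 0, 0, 0])
partial = hads_scores([1, 1, 1, 1, 1, None, None], [2, 2, 2, 2, 2, 2, 2])
invalid = hads_scores([1, 1, 1, 1, None, None, None], [2, 2, 2, 2, 2, 2, 2])
```

Here `None` marks an unanswered item; the third call returns `None` because only four anxiety items were answered.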
The CGI-S is a standardized assessment tool that is widely used as an outcome measure in research. The CGI-S had the following wording: "In your clinical judgement how severely does this patient suffer from MDE/GAD?" The ratings of the CGI-S were: 1 = not ill at all, 2 = a borderline case, 3 = only mildly ill, 4 = moderately ill, 5 = seriously ill and 6 = extremely seriously ill. The CGI-S scale was dichotomised into 1–2 = not ill and 3–6 = ill, but we also explored the frequency of cases at a CGI-S cut-off of ≥ 2 (borderline case).
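The dichotomisation of the CGI-S amounts to a simple threshold. A minimal sketch, with a hypothetical function name, where the default threshold of 3 reflects the main split and a threshold of 2 reflects the exploratory borderline cut-off:

```python
def cgi_s_is_ill(score: int, threshold: int = 3) -> bool:
    """True when a CGI-S rating (1-6) is at or above the chosen threshold."""
    if score not in range(1, 7):
        raise ValueError("CGI-S ratings run from 1 to 6")
    return score >= threshold


ratings = [1, 2, 3, 4, 5, 6]
main_split = [cgi_s_is_ill(r) for r in ratings]          # 1-2 not ill, 3-6 ill
exploratory = [cgi_s_is_ill(r, threshold=2) for r in ratings]  # >= 2 counts as a case
```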
The statistical analyses were carried out with SPSS for Windows, version 11.0. Principal Component Analysis (PCA) with oblique rotation was performed to explore the factor structure of the HADS. Internal consistency of the HADS-A and the HADS-D was tested using Cronbach's coefficient alpha. Pearson's correlation coefficient was used to estimate the overlap between the subscales. Sensitivity and specificity were calculated for different cut-off values for the HADS-A, the HADS-D, and the CGI-S in relation to the prevalence rate of GAD identified with the GAS-Q and the rate of MDE identified with the DSQ. Sensitivities and specificities at the optimal cut-off were used to calculate the rates of true and false positive and negative cases. Receiver Operating Characteristic (ROC) curves were plotted, and the Area Under the Curve (AUC) was calculated for the HADS-A, the HADS-D and the CGI-S against the GAS-Q and the DSQ as reference standards. The associations of age and gender with caseness on the instruments were examined by logistic regression analyses. All significance tests were two-tailed, and p-values <.05 were considered significant.
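The analyses themselves were run in SPSS. Purely as an illustration of the quantities involved, the following Python sketch computes Cronbach's alpha, sensitivity and specificity at a given cut-off, and the AUC by pairwise comparison (the Mann-Whitney formulation). The function names, the convention that a score at or above the cut-off counts as test-positive, and the toy data are all assumptions and do not reproduce the study's results.

```python
from statistics import variance


def cronbach_alpha(rows):
    """rows: one list of item scores per respondent, with no missing values."""
    k = len(rows[0])
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def sens_spec(scores, reference, cutoff):
    """Sensitivity and specificity; reference is 1 (case) or 0 (non-case)."""
    pairs = list(zip(scores, reference))
    tp = sum(1 for s, r in pairs if r == 1 and s >= cutoff)
    fn = sum(1 for s, r in pairs if r == 1 and s < cutoff)
    tn = sum(1 for s, r in pairs if r == 0 and s < cutoff)
    fp = sum(1 for s, r in pairs if r == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores, reference):
    """Probability that a random case scores higher than a random non-case."""
    pos = [s for s, r in zip(scores, reference) if r == 1]
    neg = [s for s, r in zip(scores, reference) if r == 0]
    # Ties between a case and a non-case count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical subscale scores and reference-standard diagnoses (1 = case).
scores = [2, 5, 9, 8, 10, 12, 4, 7]
reference = [0, 0, 0, 1, 1, 1, 0, 1]

sensitivity, specificity = sens_spec(scores, reference, cutoff=8)
area = auc(scores, reference)
alpha = cronbach_alpha([[1, 1], [2, 2], [3, 3]])
```

Repeating `sens_spec` over a range of cut-offs gives the points of the ROC curve; the optimal cut-off is then chosen by whatever criterion the analyst prefers, which the text does not specify.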