Validation of the Greek version of the distress thermometer compared to the clinical interview for depression

Background The Distress Thermometer (DT) is worldwide the most commonly used instrument for quick screening of emotional burden in patients with cancer. In order to validate the Greek version of the DT in the Greek population we aimed to explore the capacity of the DT to identify patients with comorbid depressive diagnosis. Methods We analyzed the routinely collected clinical data from 152 patients with cancer who had been evaluated by the consultation-liaison psychiatric service and had received a diagnosis of either depressive disorder or no psychiatric diagnosis. The score of the DT accompanied by the list of problems in the Problem List, the depression status, and the clinical and demographic characteristics entered the data sheet. Results The ROC analysis revealed that the DT achieved a significant discrimination with an area under the curve of 0.79. At a cut-off point of 4, the DT identified 85% of the patients with an ICD-10 depressive diagnosis (sensitivity) and 60% of the patients without a psychiatric diagnosis (specificity). The positive predictive value was 44%, the negative predictive value 92% and the diagnostic odd ratio 8.88. Fatigue and emotional difficulties were the most commonly reported problems by the patients. Conclusion The Greek version of the DT has a sufficient overall accuracy in classifying patients regarding the existence of depressive disorders, in the oncology setting. Therefore, it can be considered as a valid initial screening tool for depression in patients with cancer; patients scoring ≥4 should be assessed by a more thorough mental evaluation.

experience of a psychological (i.e., cognitive, behavioral, emotional), social, spiritual, and/or physical nature that may interfere with the ability to cope effectively with cancer, its physical symptoms, and its treatment. Distress extends along a continuum, ranging from common normal feelings of vulnerability, sadness, and fears to problems that can become disabling, such as depression, anxiety, panic, social isolation, and existential and spiritual crisis" [3]. Distress is a universal phenomenon in patients with cancer; everyone at some point will experience some level of distress [4]. Roughly, a third of patients is estimated to experience high distress [5] although others suggest that the percentage of highly distressed patients is even larger [3,6]. Most importantly, high distress interferes with patients' decision making, and is associated with non-compliance and poor quality of life [3].
While many patients with cancer will affront mood fluctuations and a significant proportion will develop transient adjustment or mood disorders, around 10%, or more, will suffer from major depressive disorder (MDD) [7][8][9]; MDD can be considered as an extreme point on the distress continuum. The prevalence of MDD in patients with cancer is higher than this observed in the general population [7][8][9]. Of note, MDD is not the only nor the most prevalent depressive disorder in patients with cancer [9,10]. Depression exacerbates anxiety, pain, and fatigue, reduces normal functioning and quality of life, and undermines adherence as well as the trusting relationship between the patient and the oncology team. It has also been hypothesized that depression is a significant negative predictor of survival [11,12]. However, depression is both under-recognized and undertreated in the oncology setting [3,7]. In a Scottish study of 21,151 cancer patients, out of 1538 patients who met the diagnostic criteria for clinical depression, 1130 (73%) did not receive any "potentially effective treatment" [7].
The pressing need to detect and reduce the emotional burden in patients with cancer can be partially addressed by the availability of screening tools. An accurate tool can help the oncologists expedite the diagnosis and can lessen the work-load of the limited psychosocial services. The Distress Thermometer (DT) is the most widely used instrument for quick screening of emotional burden in patients with cancer [13]. The DT is a visual analog scale (VAS) ranging from 0 to 10, resembling the format of the well-known Pain Thermometer. Usually, a score higher than 4 (or 5) is considered as high distress and is indicative of an increased probability for a diagnosis of a mental disorder necessitating a thorough psychosocial evaluation [9,13]. There are currently 10 translations of the DT available through the NCCN® website (including the Greek translation) [14]. In a Greek study, the DT was compared to the Hospital Anxiety and Depression Scale (HADS) in elderly patients with colorectal cancer hospitalized in a surgical ward; the authors proposed 7 as the preferred cut-off point [15].
In this study, we aimed to explore the capacity of the Greek version of the DT to discriminate patients with cancer who suffer from clinical depression as defined by the International Classification of Diseases-10th edition (ICD-10).

Study participants
The participants were patients receiving chemotherapy for solid tumours at the Outpatient Clinic in the Department of Medical Oncology at Papageorgiou Hospital in Thessaloniki. Data from the evaluation of 152 patients with cancer in active treatment were used for this report.

Procedure
We analyzed clinical data collected from patients who had been evaluated by the consultation-liaison (C-L) psychiatric service. Written consent was obtained from the participants. Participants consist of patients who had received a psychiatric diagnosis of either depressive disorder or no psychiatric disorder. Interviews were conducted by a resident and two skilled C-L nurses supervised by two licensed psychiatrists. A structured interview (the Mini International Neuropsychiatric Interview, MINI) was used and the diagnosis was reached as a result of a unanimous clinical decision of the C-L team. Patients were also asked by the nurses of the Oncology Department to complete the Distress Thermometer. The clinicians who made the psychiatric diagnosis where blinded to the results of the distress thermometer.

Measures
Demographic and clinical information was obtained by the medical records.

Distress thermometer
The distress thermometer is a single item questionnaire presented by Roth et al. [16]. It is based on a numeric scale ranging from 0 to 10, where 0 indicates no distress and 10 indicates extreme distress. The DT is accompanied by a comprehensive list of problems faced by patients, known as "Problem List", in which common problems/symptoms are presented in five dimensions: practical problems, family problems, emotional problems, spiritual/religious concerns and physical problems.
With the approval and collaboration of the NCCN, the Greek edition of the Distress Thermometer [17] was developed. We used forward-translation and back-translation process and the draft was sent to the NCCN team for review and verification. The translation team of the NCCN requested some revisions which were all addressed reaching the final version of the Greek edition of the Distress Thermometer [translated by Kyranou, Varvara, Syngelakis, Dec 2014].

Psychiatric clinical interview
MINI is a structured psychiatric evaluation for diagnosing mental disorders according to the DSM-IV/ICD-10 criteria. MINI is very convenient in clinical/research settings since it can be completed in less than 30 min, and has sufficient psychometric properties [18,19].

Data analysis
The Statistical Package for Social Science (SPSS), version 25, for Mac was used for the analysis of the data. Descriptive statistics were used for the calculation of frequencies of the demographic characteristics and the problems faced by patients.
Receiver operating characteristic (ROC) analysis was used to define the DT cut off score in order to discriminate the depressed from the non-depressed patients. To identify the best cut-off point we used the Younden index, the empirical criterion of estimating the maximum distance from the diagonal reference line, and the diagnostic odd ratio (DOR) criterion. Sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) were also calculated.
In order to discriminate between groups of patients and their characteristics and to examine the correlation between groups of patients, chi-square tests were used. Independent t-tests were performed to compare between groups of patients and the number of problems they reported on the Problem List, and between the characteristics of the participants and the number of problems they reported.

Descriptive statistics
The characteristics of the participants are presented in Table 1. The mean score of the DT in our sample was 3.60; SD: 3.15.

ROC analysis
The ROC analyses showed that the DT had a fair/good discrimination capacity compared to the clinical interview with an area under the curve (AUC) of 0.79 [SE = 0.04, 95% (0.71, 0.87), p < 0.01] as shown in Fig. 1.
The Youden index and the empirical criterion for locating the maximum distance from the diagonal reference line are presented in Table 2. The point 5 (Youden Index = 0.461, Maximum Distance = 0.326) seems to be the best balanced cut-off point; the results for a cut-off point of 5 are slightly better than those corresponding to a cut-off point of 4 (Youden Index = 0.456, Maximum Distance = 0.322).
On the contrary, the cut-off point of 4 functioned better than the cut-off point of 5 when the DOR was applied for the estimation of the discriminatory capacity (DOR for 4 = 8.883, DOR for 5 = 7.924). In this frame, sensitivity was increased up to 0.85 at the expense of a lower specificity ( Table 3). The importance of the DT from a clinical point of view is that it can identify as many patients with clinical depression as possible; in this case also the choice of the point 4 is the most appropriate cut-off point.

Sociodemographic characteristics of groups of patients
The relationship between distressed / non-distressed patients and their gender was significant [x 2 (1, 152) = 8.032, p ≤ .005] women scored higher than men. The groups did not differ in any of the other patients' sociodemographic characteristic (age, place of living, family status, occupation, education). There was no significant relationship between depressed/non-depressed patients and any sociodemographic characteristic.

Problems reported in the problem list
Problems reported in the Problem List are presented in Table 4.

Discussion
In a convenient sample of patients with cancer, with mixed diagnosis, receiving chemotherapy in an outpatient clinic, the Greek version of the DT compared to the clinical psychiatric interview demonstrated sufficient accuracy in classifying patients with depressive disorders.
The AUC was 0.79. Searching for the optimal cut-off point we faced a dilemma since 4 and 5 had similar operating characteristics. At a cut-off point of ≥4 the sensitivity was 0.85, the specificity 0.60, the PPV 0.44, the NPV 0.92 and the DOR 8.88. At a cut-off point of ≥5 the sensitivity was 0.81, the specificity 0.66, the DOR 7.92 whereas the Youden Index was slightly higher. We decided to choose 4 as the proposed cut-off point because from a clinical point of view at this cut-off point the test performs better (higher DOR, sensitivity exceeding the 0.85 level).
Our decision is not only clinically relevant but also recommended. Ma et al. [13] in their meta-analysis faced the same dilemma in the comparison of the DT to the DSM-IV. From their part, they chose a higher DOR and a higher sensitivity instead of a slightly better Youden Index. Thus, they recommended 4 as the optimal cut-off score "in order to rule in as many cases". Furthermore, consistency on a global scale was an additional important criterion for adopting 4 as the cut-off score in the Greek Version as 4 is the preferred cut-off point worldwide [13].
The psychometric properties of the DT have been examined during the last 20 years, compared to several different tools. Paradoxically, in examining a screening test  which of course attempts to detect firstly the most severely distressed people, i.e. those with a psychiatric diagnosis, the gold standard, the clinical interview, has not been commonly utilized. Ma et al. [13] in their meta-analysis examining the accuracy of the DT included 42 eligible studies from 20 counties in which 10 different reference standards were used. Only 8 of the 42 studies used "the real standard (the clinical interview)" while the others used questionnaires, mainly the HADS; researchers consider this finding as a limitation in their meta-analysis. Accordingly, Donovan et al. [20], in their research for translated versions of the DT, presented 23 publications describing the use of a non-validated foreign language version of the DT. Only in four of them mental diagnosis, following clinical interview, was utilized as a criterion in the ROC analysis. In our study, DT showed a good sensitivity of 85% but a relatively low specificity of 60%. According to Ma et al. [13], when all the results were pooled together the DT, at the cut-off point of 4, demonstrated "a good balance between pooled sensitivity (0.81, 95% CI 0.79-0.82) and pooled specificity (0.72, 95% CI 0.71-0.72)". When DT was compared to HADS-Total "the balance between pooled sensitivity (0.82, 95% CI 0.80-0.84) and pooled specificity (0.73, 95% CI 0.72-0.74) was maximized". At the same cut-off point, in the comparison of the DT to the clinical interview/DSM-IV, the pooled sensitivity was 0.84 (95% CI 0.80-0.88) but the pooled specificity dropped to 0.63 (95% CI 0.61-0.66). Finally, in the comparison of the DT to the clinical interview/ICD-10 the pooled sensitivity was 0.79 (95% CI 0.60-0.87) and the pooled specificity 0.60 (95% CI 0.52-0.68). It is worth mentioning that in a previously published meta-analysis the psychometric properties were found even lower [21], while there are some studies that failed to find a link between the DT and the clinical interview [22,23].
Few studies have focused in the ability of the DT to identify depressive disorders compared to the clinical interview. Akizuki et al. [24] reported that DT revealed a sensitivity of 84% and specificity of 61% for detection of adjustment disorders and major depression. Grassi et al. [25] found a sensitivity of 79.5% and specificity of 75.4%, following an ICD-10 diagnosis of affective syndrome. Rooney et al. [26] reported a sensitivity of 94 to 67% (at different time points) and specificity of 69 to 75% for MDD; researchers investigated the operating characteristics of HADS, PHQ-9 and DT and they concluded that "due to a modest positive predictive value of either instrument, patients scoring above these thresholds need a clinical assessment to diagnose or exclude depression". On the other hand, in the Wagner et al. study [27] where DT, Hopkins Symptom Check List-25 (HSCL-25), PHQ-9/ PHQ-2 and Structured Clinical Interview (SCID) for major depression, dysthymia, and adjustment disorders were usedthe DT showed a sensitivity of 0.80% and a specificity of 52%; the authors underlined that: "The NCCN®-DT (AUC=0.59) indicated poor accuracy in classifying patients with regard to the presence of mood disorders." Our results are in agreement with those derived by most researchers who used the psychiatric interview as the gold standard to validate the DT's accuracy; the Greek version of the NCCN®'s Distress Thermometer exhibited at least similar psychometric properties to previous reports from other international studies. Additionally, our results support a 2-step process; patients scoring ≥4 should undergo a more thorough mental evaluation.
The psychometric properties of the DT have raised a debate regarding its usefulness. Recklitis et al. [28] in their study of the DT compared to a psychiatric interview reported a sensitivity of 68.18%, and a specificity of 78.33%; they emphasized that "The DT … failed to identify 31.81% of survivors with a SCID diagnosis. No alternative DT cut-off score met criteria for acceptable sensitivity (≥.85) and specificity (≥.75)." Wagner et al. [27] extend similar concerns to an extreme by questioning the DT as useless.
Given the necessity of detecting mental problems in patients with cancer, various instruments are offered to clinicians to assist them identify patients in need for psychosocial support. Oncologists seem to face difficulties in recognizing the psychiatric morbidity [22,29]. The nurses are often the first point of encounter with the patient and as such can be extremely assisted by a brief measure of psychological distress screening [30]. The DT belongs in the category of Ultra Short Term Questionnaires; in the clinical setting these tests are very easy to administer, quick and inexpensive. Nevertheless, their feasibility is counterbalanced by a modest accuracy and a poor specificity [21]. It would be worth noting that short tests do operate better when applied to rule out  non-depressed patients [9,31] In a busy oncology department, it would be extremely useful for the clinicians to be aware of the patients not suffering from depression. As for those highly distressed, a more thorough assessment of a possible diagnosis of depression can be utilized [9,26,31,32]. As expected distressed/depressed patients reported more problems on the Problem List compared with non-distressed/non-depressed patients. The most frequent problems reported by the distressed/depressed patients were fatigue, followed by emotional problems, more specifically worries and nervousness; while pain and sleep were reported at a high percentage, spiritual/religious concerns, child care and sexual problems were in contrary at a low percentage. 'Sexual problems' was the only item in which more non-depressed than depressed patients expressed concerns to a significant point. However, the lack of randomization cannot exclude the possibility of selection bias in our sample.
In a previous Greek study, Antoniadis et al. [15] compared the DT with the HADS in elderly (mean age: 70, SD: 9.5) patients with colorectal cancer who were admitted for surgery in a period of surgical treatment; the researchers excluded patients with major health problems as well as those with a psychiatric history during the past 5 years. "Compared to cancer patients from other countries the mean HADS score of [their] sample was significantly higher" [15]. The mean score of DT was 5.7 (sd 2.7), the AUC 0.805 and for the cut-off point of 7, sensitivity was 0.73, specificity 0.80. In the Problem List worries (81.0%), nervousness (78.6%), fears (70.2%), treatment decisions (69.0%), sleep (67.9%), sadness (65.5%), child care (59.5%) and fatigue (52.4%) were the most reported. According to the authors, cultural factors may have contributed to the differences, especially for the high cut-off score; they also speculated that the socioeconomic condition in Greece and the economic crisis may have had an impact. Our results are not in agreement with these assumptions. Greek cultural factors or socioeconomic condition did not differentiate our results, which are similar to those reported from other countries [13]. Possibly, the sampling procedure and the treatment phase had a crucial influence on the differences reported by Antoniadis et al. Of note, DT scores may differ at different time points on the cancer trajectory [26,33].
Several limitations to this study need to be acknowledged. This was a single-center study, at a University Hospital with patients in active treatment. We used a non-random sample and the numbers do not allow for comparisons between patients suffering from different types of cancer or being on different chemotherapy regimens. Finally, we did not search for possible subtypes within the construct of depression. A multi-center study, with a large heterogeneous sample will allow for more detailed comparisons between subgroups of patients on different points within the illness trajectory.

Conclusions
The Greek version of the Distress Thermometer performs well, at the cut-off point of 4, in classifying patients regarding the existence of depressive disorders; it can be utilized in a 2-step approach to diagnose MDD or related disorders. We consider the validation of the Greek version of the DT, a well-known international screening tool, as our contribution to lessen the emotional burden of our patients, and as a step forward in the underserved area of psychosocial interventions.