The validity and reliability of the PHQ-9 on screening of depression in neurology: a cross sectional study
BMC Psychiatry volume 22, Article number: 98 (2022)
This study aimed to explore the validity and reliability of the Patient Health Questionnaire-9 (PHQ-9) on screening of depression among patients with neurological disorders, and to explore factors influencing such patients.
In this study, 277 subjects who were admitted to the department of neurology of our hospital due to different neurological disorders completed the PHQ-9 questionnaire. The Mini-International Neuropsychiatric Interview (MINI) and Hamilton Rating Scale for Depression (HAMD) were employed to evaluate the depressive symptoms of patients who completed the PHQ-9 questionnaire. The internal consistency, criterion validity, structural validity, and optimal cut-off values of PHQ-9 were evaluated, and the consistency assessment was conducted between the depression severity as assessed by PHQ-9, HAMD and MINI. Logistic regression analysis was used to calculate the risk factors of depression.
The Cronbach’s α coefficient of the PHQ-9 was 0.839. The Pearson’s correlation coefficient among the 9 items of the PHQ-9 scale was 0.160 ~ 0.578 (P < 0.01), and the Pearson’s correlation coefficient between each item and the total score was at the range of 0.608 ~ 0.773. Taking the results of MINI as the gold standard, the area under the receiver operating characteristic (ROC) curve of the PHQ-9 results for all the subjects (n = 277) was 0.898 (95% confidence interval (CI): 0.859 ~ 0.937, P < 0.01). When the cut-off score was equal to 5, the values of sensitivity, specificity, and the Youden’s index were 91.2, 76.6%, and 0.678, respectively. Multivariate logistic regression analysis showed that the influence of unemployment on the occurrence of depression was statistically significant (P = 0.027, OR = 3.080, 95%CI: 1.133 ~ 8.374).
The application of PHQ-9 for screening of depression among Chinese patients with neurological disorders showed a good reliability and validity.
Mental disorder, also called mental illness or psychiatric disorder, is a behavioral or mental pattern that causes significant distress or impairment of personal functioning . Depression, as an important mental disorder, was ranked as the third cause of burden of disease worldwide in 2008 and may rank first by 2030 . Depression, which often accompanies multiple diseases, imposes serious health and economic burdens to society [3, 4]. It is highly prevalent among patients suffering from various chronic conditions . There are multiple ways in which depression can be identified. As for mild forms of depression, it may recover without much clinical assistance or only need primary care. However, major depression, especially severe depression, requires advanced care and early identification . Identifying cases with depression that require advanced care is not only a main challenge to primary care, but also for clinicians, especially for non-psychiatric physicians.
Neurology and psychiatry are often closely related. There are several factors influencing the incidence of depression on patients with neurological disorders, and controversial results were reported. A previous study showed that epilepsy was an independent risk factor for depression . Scholars found that severe motor function, dyskinesia, poor sleep quality, and cognitive impairment were independent predictors of depression in Parkinson’s disease (PD) patients who were admitted to department of neurology [8, 9]. However, depression in patients with neurological disorders and associated risk factors need further clinicians’ and scholars’ attention. Training non-psychiatric doctors to successfully identify patients with severe depression through the method of mental examination may resolve the mentioned challenge, while it is costly and time-consuming [10, 11]. A large number of health care systems have employed screening tools, such as the Self-rating Depression Scale (SDS) , the Structured Clinical Interview for DSM-IV Axis I Disorders (SCID-I) , Composite International Diagnostic Interview (CIDI) , the Mini-International Neuropsychological Interview (MINI) , the Cornell Scale for Depression in Dementia (CSDD) , and the Hamilton Rating Scale for Depression (HAMD) [17, 18] to evaluate severity of depressive symptoms. However, such tools are not optimal as they (1) tie up significant resources, such as trained professionals [14, 15, 17], (2) cannot be used for diagnosis but with many items needed to be evaluated , or (3) can only be used for diagnosis of specific patients .
Patient Health Questionnaire-9 (PHQ-9) was derived from the depression part in the Patient Health Questionnaire (PHQ) compiled by Spitzer et al. in 1999 . PHQ-9 was recommended by the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5). Response options on the items range from ‘not at all’ (0-point) to ‘nearly every day’ (3-point). The scale can not only screen for depression, but also show the severity of depression . Because of its convenient use and good reliability and validity, it has been widely used for depression screening in the internal medicine department of primary hospitals. The depression screening of the elderly, patients with epilepsy, and stroke patients also had good reliability and validity [21,22,23].
However, there still lies some uncertainties to be explored. Different studies have shown that the optimal cut-off value of PHQ-9 varies in different populations. The PHQ-9 maker used a cut-off value of 10, with the sensitivity 88% and the specificity 88% . In 2012, a meta-analysis showed that the optimal cut-off value of PHQ-9 was 8-11 . The best cut-off value of PHQ-9 for diagnosing depression still needs further discussion. The original researchers of PHQ-9 used 5, 10, 15, 20 as the demarcation values for mild depression, moderate depression, severe depression, and very severe depression . If PHQ-9 is used in different populations, the screening cut-off value changes, then the corresponding evaluation of depression severity may also change, which has certain guiding significance for treatment. What’s more, the PHQ-9 may lack some symptoms that are meaningful to the depressive patient, and the description of the symptoms is not clear enough . For example, patients with depression will regard abnormal perception, depersonalization, isolation, loneliness, and physical sensations (such as tremor, fatigue, restlessness, nausea, inability to relax, etc.) as meaningful or strong feelings of their depression. However, these symptoms are not reflected in the PHQ-9. As a self-rated scale, the PHQ-9 still needs to be completed by doctor involved in assisting patients to reduce confusion and to express their inner feelings accurately. Regarding the use of the PHQ-9, there lies several problems, such as the cut-off value, the inconsistency of reliability and validity when use in neurology, and the language expression which may need to be adjusted. PHQ-9 needs to be further explored in patients with neurological disorders in order to improve its diagnostic value.
In the present study, general data of patients who were admitted to the Department of Neurology of an affiliated hospital of Peking University due to different neurological disorders were collected, and the PHQ-9 questionnaire was distributed among those patients. One trained psychiatrist used the Mini-International Neuropsychiatric Interview (MINI) to evaluate the depressive symptoms of patients who completed the PHQ-9 questionnaire. Two senior psychiatrist used the HAMD to assess the severity of depression. The internal consistency, criterion validity, structural validity, and optimal cut-off values of PHQ-9 were evaluated, and the consistency assessment was conducted between the depression severity as assessed by PHQ-9 and HAMD. We also explored factors (e.g., age, gender, medical insurance, course of disease, work conditions, etc.) influencing such patients and discussed their influences comprehensively.
The aim, design and setting of the study
We aimed to explore the validity and reliability of the Patient Health Questionnaire-9 (PHQ-9) in the neurology ward when screening depression. This is a cross-sectional study. We hoped to screen all inpatients in neurology for depression and its severity using the PHQ-9. This study was approved by the Ethics Committee of Peking University Six Hospital (No.2009025).
From January 2016 to June 2016, patients with depression who suffered from neurological disorders were admitted to the Neurology Department of an affiliated hospital of Peking University (Beijing, China). Inclusion criteria were as follows: i) patients (age ≥ 18 years old) from the Neurology Department of an affiliated hospital of Peking University, ii) absence of a significant cognitive impairment (Mini-Mental Status Examination > 21) [27, 28], and iii) patients who signed the written informed consent form prior to commencing the study. Exclusion criteria were as follows: i) patients with speech dysfunction and hearing impairment, who could not complete the questionnaire, or ii) patients who aged < 18 years old. A total of 300 questionnaires were distributed among eligible patients, and a total of 290 questionnaires were returned, accounting for 96.7%. Those patients received MINI and Hamilton Rating Scale for Depression (HAMD), and the total number of patients who completed all the survey was 277. A self-edited questionnaire was designed to collect patients’ general data, including patients’ name, gender, age, ethnicity, marriage status, work experience, treatment costs, course of disease, diagnostic method, etc. The flowchart of patients’ selection is shown in Fig. 1.
The PHQ-9 is a 9-question instrument given to patients in a primary care setting to screen the presence and severity of depression. This is a self-rating scale. The results of the PHQ-9 are used to make a depression diagnosis according to the DSM-IV (Diagnostic and Statistical Manual of Mental Disorders–Fourth Edition) criteria. Here, the PHQ-9 was formulated based on DSM-IV to understand how often patients have been bothered by symptoms of depression in the period of two weeks (0 point = never, 1 point = a few days, 2 point = more than half of the days, 3 point = almost every day). Each item was scored on a scale of 0-3, with a total score ranging from 0 to 27. Based on these scores, depressive symptoms could be divided into “none or minimum” (0-4), “mild” (5-9), “moderate” (10-14), “moderately severe” (15-19), and “severe” (20-27).
MINI (Chinese version)
As a semi-fixed diagnostic tool developed by a number of Chinese scholars, the MINI is a short structured interview used to diagnose 16 axis I DSM-IV and ICD-10 (International Classifications of Diseases and Related Health Problems, Tenth Revision) disorders . A previous research showed that the Chinese version of MINI had good reliability and validity, as well as high sensitivity and specificity for depressive disorders. The current study used the evaluation results of depression in the MINI as the “gold standard” to assess the validity of PHQ-9. We defined “1” = have depression and “0” = have no depression. This scale was completed through interviews. The depression diagnosis was made by 1 psychiatrist, deputy chief physician.
The HAMD  is a 17-item instrument that was designed to measure frequency and intensity of depressive symptoms in individuals with major depressive disorders. HAMD possesses a good reliability and validity. It comprises of 17 items, and was previously grouped into 5 structural factors (i.e., anxiety/somatization, mental disorders, retardation symptoms, sleep disturbances, and weight loss) by Cleary and Guy . The higher the score, the more severe the symptoms. The following ranges for the HAMD were recommended: no depression (0–7); mild depression (8–16); moderate depression (17–23); and severe depression (≥24). In our research, two physicians assessed the severity of the patients’ depression using HAMD.
Explanation: Zα/2 is the Z value of cumulative probability in normal distribution (Z0.05 / 2 = 1.960); δ is the allowable error; α is the inspection level; P is sensitivity or specificity.
Taking the 10 points recommended by the original maker of PHQ-9 as the screening cut-off value, a study covering 6000 subjects reported that the sensitivity was 88% and the specificity was 88% . Therefore, it was expected that the sensitivity and specificity of this test would be 88% both. We took 0.05 as the significance level α, 0.08 as the allowable error. According to the formula, the number of samples in the case group was 63 and that in the control group was 63, too. The incidence of depression in hospitalized patients in neurology department was 25.0-50% [9, 31, 32]. This experiment predicted that the prevalence rate was 25%. It was estimated that the number of PHQ-9 questionnaires issued at least should be 63 / 0.25 = 252. Taking into account the 10% loss to follow-up rate, we set the sample size as 300 cases.
SPSS 22.0 statistical software (IBM, Armonk, NY, USA) was used to perform statistical analysis, and descriptive statistics were used for expressing general data and other related descriptions. The receiver operating characteristic (ROC) curve was employed to analyze the validity, sensitivity, specificity, positive predictive value, negative predictive value, and Youden’s index of the PHQ-9, so as to find the best diagnostic cut-off score. Based on the cut-off points, consistency analysis between the severity of depression obtained by PHQ-9 and HAMD revealed a Kappa score. The linear regression analysis of PHQ-9 and HAMD was performed to obtain the PHQ-9 cut-off score for depressive symptoms with different diversities. The intraclass correlation coefficient (ICC) and Cronbach’s alpha coefficient were used to assess internal consistency. The confirmatory factor analysis was employed to analyze the structural validity of the scale. We used logistic regression analysis to explore risk factors of depression.
The participants’ general data
A total of 277 participants were involved (Table 1), including 181 (65.3%) male and 96 (34.7%) female cases. The participants aged 18-88 years old, with a mean age of 60.56 ± 15.53 years old. Regarding ethnicity, 264 (95.3%) were Han people and 13 (4.7%) cases were from other ethnicities. Table 1 presented the participants’ demographic characteristics. In terms of participants’ marital status, 252 (91.0%), 2 (0.7%), 4 (1.4%), and 19 (6.9%) cases were married, divorced, widowed, and unmarried, respectively. Regarding occupation, 83 (30%), 112 (40.4%), and 29 (10.5%) patients were in-service, retired, and unemployed, respectively, and 53 (19.1%) patients had other professions. As for medical expenses, 271 (97.8%) cases were covered by health insurance, while 6 (2.2%) cases were at their own expenses. There were 209 (75.4%) cases of cerebrovascular diseases, 14 (5.1%) cases of peripheral neuropathy, and 54 (19.5%) cases with other diseases (e.g., cervical spondylosis, epilepsy, etc.). Besides, 205 (74.0%) and 30 (10.8%) patients’ course of disease was within 1 month, over 12 months, and that of 42 (15.2%) patients varied between 1 and 12 months.
PHQ-9 scores, HAMD-17 scores and MINI results
The mean score of PHQ-9 in 277 cases was 5.27 ± 1.86. It was revealed that 166 (59.9%) patients had no depression, and 111 (40.1%) patients had depressive symptoms. Among them, 61 (22.0%), 22 (8.0%), 16 (5.8%), and 12 (4.3%) cases had mild, moderate, severe, and extremely severe-depression, respectively. The mean HAMD-17 score of 277 cases was 7.75 ± 2.83. According to HAMD-17 scores, 158 (57.0%) patients had no depression. Among 119 (43.0%) patients with depression, 82 (29.6%), 27 (9.8%), and 10 (3.6%) patients had mild, moderate, and severe-depression, respectively. MINI is fully structured to allow administration in about 15 to 20 min even by nonspecialized interviewers. It demonstrates good sensitivity, specificity, validity and reliability in the assessment of psychiatric disorders. The major depressive episode module was used in this study as a gold standard. Among 277 subjects who completed the MINI, 68 (24.5%) cases were diagnosed with depression, while 209 (75.5%) cases had no depression.
In order to investigate the reproducibility and consistency of PHQ-9, reliability coefficients as measured by Cronbach’s alpha were calculated. The Cronbach’s α coefficient for PHQ-9 was 0.839. When one of the items of PHQ-9 was deleted, the α coefficient was still between 0.806 ~ 0.839. The Pearson’s correlation coefficient among the 9 items of the PHQ-9 scale was at the range of 0.160 ~ 0.578 (P < 0.01), and the Pearson’s correlation coefficient between each item and the total score was within 0.608 ~ 0.773. The above-mentioned coefficients were statistically significant (P < 0.01) (Table 2).
In this research, the eigenvalues of factor-1, factor-2, and factor-3 were 3.385, 1.248, and 1.050, with the corresponding the explanatory variances of 37.615, 13.868, and 11.661%, respectively. The cumulative interpretation variance of the three factors was 63.114%. For rotated component matrix of factor analysis, the coefficients of interest decline, fatigue, mental motor delay, difficulty in paying attention, emotional depression, and factor-1 were 0.736, 0.717, 0.701, 0.694, and 0.563, respectively. The coefficients of suicide and self-injury, inferiority and factor-2 were 0.806 and 0.758, respectively. The matrix coefficients of sleep disorder, eating disorder and factor-3 were 0.828 and 0.781, respectively (Table 3).
Criterion validity was assessed by ROC curve. The PHQ-9 score simultaneously showing the highest sensitivity and specificity was evaluated using the ROC curve. PHQ-9’s accuracy was estimated by the area under the ROC curve (AUC). As shown in Fig. 2, the results of ROC curve analysis indicated that the AUC for PHQ-9 was 0.898 (95% confidence interval (CI): 0.859 ~ 0.937), which indicated that PHQ-9 possessed a good ability to identify depressive symptoms. When the cut-off scores were 3, 4, 5, 6, 7, 8, 9, and 10, the rates of sensitivity were 95.6, 94.1, 91.2, 82.4, 77.9, 69.1, 61.8, and 54.4%, the rates of specificity were 55.5, 68.4, 76.6, 80.4, 85.2, 87.1, 90.9, and 93.8%, and the values of Youden’s index were 0.511, 0.625, 0.678, 0.628, 0.631, 0.562, 0.527, and 0.482, respectively. When the cut-off score was equal to 5, the values of sensitivity, specificity, and the Youden’s index were 91.2, 76.6%, and 0.678, respectively (Table 4).
Cut-off scores of PHQ-9 for depression with standard of HAMD-17
The consistency analysis between PHQ-9 and HAMD showed a Kappa coefficient of 0.423. Using total score of HAMD as the independent variable, linear regression analysis of total score of HAMD and total score of PHQ-9 were performed (Fig. 3). Using the total score of HAMD as independent variable X and the total score of PHQ-9 as the dependent variable Y, the regression equation was Y = 0.719X - 0.299. The t-test was conducted on regression coefficient of 0.719 (P < 0.01), and regression relation was observed between the total HAMD score and total PHQ-9 score. The coefficient of determination R2 was equal to 0.701, and the regression model showed a good fit. Cut-off points of 7, 17, and 24 on HAMD scale represented mild, moderate, and severe symptom levels; the corresponding cut-off points on PHQ-9 scale were 5, 12, and 17, respectively.
Consistency analysis of PHQ-9 and MINI
It was previously reported that in Chinese version of the PHQ-9, a threshold of 10 or more is an accurate, reliable, and valid measure for screening depressive symptoms. Thus, taking 10 as cut-off score of the PHQ-9, a consistency analysis of the results of PHQ-9 and the MINI was conducted, and the Kappa value was 0.529, P < 0.01. However, with taking 5 as the cut-off score of the PHQ-9, the consistency analysis with the MINI showed that the Kappa value was 0.558, P < 0.01.
Consistency analysis of PHQ-9 and HAMD assessment
Severe and extremely severe cases of depression, as rated by the PHQ-9, were unified as severe. We used the cut-off values of 5, 10, and 15 for mild, moderate, and severe depression, and the depression rating scores derived from PHQ-9 and from HAMD were evaluated for consistency. The consistency analysis between PHQ-9 and HAMD showed a Kappa coefficient of 0.423, P < 0.01. In this study, we used the cut-off scores 5, 12, 17, for mild, moderate, and severe depression derived from PHQ-9 as variables, Kappa = 0.465, P < 0.01.
Analysis of depression-associated factors
Univariate analysis of depressive patients with neurological disorders who were hospitalized in department of neurology was carried out by using the Chi-square test, and the results are summarized in Table 5. The effects of gender, age, marital status, ethnicity, work, expenses of hospitalization, course of disease, and major diseases in the depression and non-depression groups were not statistically significant.
Multivariate logistic regression analysis showed that the influence of unemployment on the occurrence of depression was statistically significant (P = 0.027, odds ratio (OR) = 3.080, 95%CI: 1.133 ~ 8.374). As shown in Table 6, unemployed patients were at a high risk of depression compared with employed patients.
Depression is a widespread mental disorder that can pose threat to thoughts, mood, and physical health . Depression severity was classified into three levels, including mild, moderate, and severe. Individuals with depression not only often experience sadness, but also a lack of interest or enjoyment in activities, decreased energy, insomnia, weight changes, feelings of loss and worthlessness, and recurrent thoughts of death or suicide. The prevalence of depressive disorders was higher in neurology inpatients [34, 35]. Our study found that a Chinese version of the MINI was used to assess the status of inpatients with neurological disorders admitted to the Department of Neurology of Peking University Third Hospital, and the results showed that the prevalence of depression was 24.5%, which was similar to outpatients in different clinical specialties, but significantly higher than outpatients in healthy controls . This indicates that further attention should be paid to depression in non-psychiatric departments (e.g., department of neurology) of general hospitals.
PHQ-9, a universal community screening tool for depression, was herein used, and it was revealed that it had a good reliability and validity when it was applied to depressed patients with neurological disorders who were hospitalized at the department of neurology. It is noteworthy that the DSM-5 also recommends use of PHQ-9 as a tool for evaluating the severity of depression.
Studies conducted in China as well as overseas have consistently shown that PHQ-9 has an I-factor structure, i.e., affective factor; in other words, all items in PHQ-9 measure the same concept [36, 37]. Other studies have reported that PHQ-9 has II-factor structure: cognitive-affective factor and somatic factor . In the current research, the structure validity of the PHQ-9 was analyzed by principal component analysis, and the results extracted three main factors contributing to a cumulative explained variance of 63.114%. The analysis of the three main factors was mainly related to low mood, lack of motivation and somatic symptoms. When the PHQ-9 was compared with the MINI, it outperformed with a reasonable accuracy in identifying cases of depression. The value of AUC was 0.898, suggesting a promising diagnostic ability of the PHQ-9. In a systematic review of PHQ-9, Kroenke et al. showed that the sensitivity was 77 - 88% and the specificity was 88 - 94% with 10 points as the cut-off value . Importantly, the values of sensitivity obtained in this study was not as high as those reported by Kroenke et al. .There lies several reasons. (1) It may be related to the different source of subjects. (2) there may have another reason that when PHQ-9 is used to screen depression with patients in neurology, its sensitivity may be suboptimal and still needs further evaluation by related professionals. (3) What’s more, it might be related to the use of the MINI as a gold standard. A recent meta-analysis showed that the sensitivity of the PHQ-9 was lower (0.77 versus 0.88) when using the MINI as the gold standard compared to semi-structured interviews . In the present study, there was a strong correlation between the total scores of HAMD-17 and PHQ-9, which was consistent with previous findings [41, 42]. These findings support the validity and feasibility of the use of PHQ-9 for assessing depression severity. In the current study, we used PHQ-9 scale scores of 5, 12, and 17 as cut-off scores to designate mild, moderate, and severe symptoms of depression, respectively. This is slightly different from the cut-off scores used by the original developers of the scale. They recommended cut-off scores of 5, 10, 15, and 20 to designate mild, moderate, moderately severe, and severe depression, which is also more easily remembered by clinicians.
Consistent with previous studies, the results of the present study revealed that the PHQ-9 has a high reliability evidenced by the Cronbach’s α coefficient. The internal consistency of the PHQ-9 was assessed by using the Cronbach’s alpha coefficient, and it was found to be 0.839. The correlation coefficients between the nine entries of the scale were 0.160 ~ 0.578 (P < 0.01), and the correlation coefficients between each entry and the total score of the scale were 0.608 ~ 0.773, all of which had a significant correlation relationship (P < 0.01). This indicates that the PHQ-9 has an acceptable predictive performance. Our findings were similar to those observed in validation studies whose Cronbach alpha values were found to be 0.8 in a study of Mexico , 0.74 in Australia , and 0.78 in Thailand .
The present study analyzed factors influencing depression in patients with neurological disorders who were hospitalized in department of neurology. We compared the factors of gender, age, marital status, ethnicity, work experience, hospitalization expenses, course of disease, number of patients, and major disorders in the depression and non-depression groups. The results of univariate analysis did not indicate any statistical significance. Previous studies reported that age and gender are significantly correlated with the occurrence of depression [46, 47]. It was previously found that the scores of depressive symptoms in stroke patients who aged 25-54 to 55-64 years old were significantly higher than those in other age-based groups . It should be noted that the results of the current study did not reveal any significant correlation between depression and age/sex. The influences of age and gender on the depression of patients with neurological disorders who admitted to the department of neurology need further discussion. The current study did not make a detailed classification and comparison of various domestic reimbursement methods. The multivariate logistic regression analysis showed that unemployed cases were at a higher risk of depression. Previous studies showed that depression is closely correlated to unemployment. Scholars  pointed out that nearly one fifth of long-term unemployed men were diagnosed with major depressive disorders. Unemployment may be a potential predictor of depression, weakening the work productivity, thereby increasing the risk of long-term unemployment [50,51,52].
The application of PHQ-9 scale on such patients showed a good reliability and validity. However, the current study contains a number of limitations. First, the samples were only patients with neurological disorders from one general hospital. Second, due to the short period of hospitalization, no retesting of reliability was undertaken. Last but not least, this study did not analyze the effects of various neurological diseases on depression. Thus, further studies need to be carried out to confirm our findings and eliminate the above-mentioned deficiencies.
In summary, depressive disorders are more common among patients with neurological disorders. Since depression can bring many adverse prognoses to patients, even lead to suicide, early identification of depression needs the attention of non psychiatrists. Our study demonstrated good reliability and validity of the PHQ-9 by applying this questionnaire to screen depressed patients in a neurology department of general hospital. PHQ-9 is worth promoting and applying in the general hospital department of neurology.
Availability of data and materials
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.
Patient Health Questionnaire-9
Mini-International Neuropsychiatric Interview
Hamilton Rating Scale for Depression
Receiver Operating Characteristic
Self-rating Depression Scale
Structured Clinical Interview for DSM-IV Axis I Disorders
Composite International Diagnostic Interview
Cornell Scale for Depression in Dementia
Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition
Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition
International Classifications of Diseases and Related Health Problems, Tenth Revision
Intraclass Correlation Coefficient
Area under the ROC curve
Clark LA, Cuthbert B, Lewis-Fernández R, Narrow WE, Reed GM. Three approaches to understanding and classifying mental disorder: ICD-11, DSM-5, and the National Institute of Mental Health's research domain criteria (RDoC). Psychol Sci Public Interest. 2017;18:72–145.
WHO. The global burden of disease: 2004 update. Geneva: World Health Organization; 2008.
Malhi GS, Mann JJ. Depression. Lancet. 2018;392:2299–312.
Vandeleur CL, Fassassi S, Castelao E, et al. Prevalence and correlates of DSM-5 major depressive and related disorders in the community. Psychiatry Res. 2017;250:50–8.
Read JR, Sharpe L, Modini M, Dear BF. Multimorbidity and depression: a systematic review and meta-analysis. J Affect Disord. 2017;221:36–46.
Kasthurirathne SN, Biondich PG, Grannis SJ, Purkayastha S, Vest JR, Jones JF. Identification of patients in need of advanced Care for Depression Using Data Extracted from a statewide health information exchange: a machine learning approach. J Med Internet Res. 2019;21(7):e13809.
Keezer MR, Sisodiya SM, Sander JW. Comorbidities of epilepsy: current concepts and future perspectives. Lancet Neurol. 2016;15:106–15.
Marinus J, Zhu K, Marras C, Aarsland D, van Hilten JJ. Risk factors for non-motor symptoms in Parkinson's disease. Lancet Neurol. 2018;17:559–68.
Marsh L. Depression and Parkinson's disease: current knowledge. Curr Neurol Neurosci Rep. 2013;13:409.
Zhang J, Wei J, Shi LL, et al. Zhonghua Yi Xue Za Zhi. 2007;87:889–93.
Finney GR, Minagar A, Heilman KM. Assessment of mental status. Neurol Clin. 2016;34:1–16.
Zung WW. A self-rating depression scale. Arch Gen Psych. 1965;12:63–70.
First MB, Spitzer RL, Gibbon M, Williams J. Structured clinical interview for DSM-V-TR Axis I disorders-non-patient ed. (SCII/ NP-2/2001 Revision). New York: Biometrics Research Department; 2001.
Wittchen H-U, Semler G. Composite international diagnostic interview. CIDI Interviewerheft (deutsche Bearbeitung) (World Health Organization, Ed.). Weinheim: Beltz Test; 1991.
Sheehan D. The Mini International Neuropsychiatric Interview (M. I. N. I.) 5.0 [J]. https://www.medical-outcomes.com/HTMLFiles/MINI/MINI.htm.
Ben Jemaa S, Marzouki Y, Fredj M, Le Gall D, Bellaj T. The adaptation and validation of an Arabic version of the Cornell scale for depression in dementia (A-CSDD). J Alzheimers Dis. 2019;67:839–48.
Hamilton M. A rating scale for depression. J Neurol Neurosurg psychiatry. 1960;23:56-62. 18. Hamilton M. development of a rating scale for primary depressive illness. Br J Soc Clin Psychol. 1967;6:278–96.
Hamilton M. Development of a rating scale for primary depressive illness. Br J Soc Clin Psychol. 1976;6:278–96.
Spitzer RL, Kroenke K, Williams JB. Validation and utility of a self-report version of PRIME-MD: the PHQ primary care study. Primary care evaluation of mental disorders. Patient health questionnaire. JAMA. 1999;282:1737–44.
Kroenke K, Spitzer RL, Williams JB. The PHQ-9: validity of a brief depression severity measure. J Gen Intern Med. 2001;16:606–13.
Haihong Y. Value of 9 items patients' health questionnaire depression scale in screening of post-stroke depression. J Clin Med Pract. 2016;8:28–30.
Phelan E, Williams B, Meeker K, et al. A study of the diagnostic accuracy of the PHQ-9 in primary care elderly. BMC Fam Pract. 2010;11:63.
Rathore JS, Jehi LE, Fan Y, et al. Validation of the patient health Questionnaire-9 (PHQ-9) for depression screening in adults with epilepsy. Epilepsy Behav. 2014;37:215–20.
Manea L, Gilbody S, McMillan D. Optimal cut-off score for diagnosing depression with the patient health questionnaire (PHQ-9): a meta-analysis. CMAJ. 2012;184:E191–6.
Kroenke K, Spitzer RL. The PHQ-9: a new depression diagnostic and severity measure. Psychiatr Ann. 2002;32:509–21.
Malpass A, Dowrick C, Gilbody S, et al. Usefulness of PHQ-9 in primary care to determine meaningful symptoms of low mood: a qualitative study. Br J Gen Pract. 2016;66:e78–84.
Mehdizadeh M, Fereshtehnejad SM, Goudarzi S, et al. Validity and reliability of short-form McGill pain Questionnaire-2 (SF-MPQ-2) in Iranian people with Parkinson's disease. Parkinsons Dis. 2020;2020:2793945.
Taghizadeh G, et al. King's Parkinson's disease pain scale cut-off points for detection of pain severity levels: a reliability and validity study. Neurosci Lett. 2021;745:135620.
Sheehan DV, Lecrubier Y, Sheehan KH, et al. The Mini-international neuropsychiatric interview (M.I.N.I.): the development and validation of a structured diagnostic psychiatric interview for DSM-IV and ICD-10. J Clin Psychiatry. 1998;20:22–57.
Cleary P, Guy W. Factor analysis of the Hamilton depression scale. Drugs Exp Clin Res. 1977;1:115–20.
Wang J, Wu X, Lai W, et al. Prevalence of depression and depressive symptoms among outpatients: a systematic review and meta-analysis. BMJ Open. 2017;7:e017173.
Feinstein A, Magalhaes S, Richard JF, Audet B, Moore C. The link between multiple sclerosis and depression. Nat Rev Neurol. 2014;10:507–17.
Cui R. Editorial: a systematic review of depression. Curr Neuropharmacol. 2015;13:480.
Yang H, Hong W, Chen L, Tao Y, Peng Z, Zhou H. Analysis of risk factors for depression in Alzheimer's disease patients. Int J Neurosci. 2020;130:1136–41.
Lekoubou A, Bishu KG, Ovbiagele B. Costs and cost-drivers of a diagnosis of depression among adults with epilepsy in the United States. Epilepsy Behav. 2019;98(Pt A):96–100.
Cameron IM, Crawford JR, Lawton K, Reid IC. Psychometric comparison of PHQ-9 and HADS for measuring depression severity in primary care. Br J Gen Pract. 2008;58:32–6.
Liu SI, Yeh ZT, Huang HC, et al. Validation of patient health questionnaire for depression screening among primary care patients in Taiwan. Compr Psychiatry. 2011;52:96–101.
Chilcot J, Rayner L, Lee W, et al. The factor structure of the PHQ-9 in palliative care. J Psychosom Res. 2013;75:60–4.
Kroenke K, Spitzer RL, Williams JB, Löwe B. The patient health questionnaire somatic, anxiety, and depressive symptom scales: a systematic review. Gen Hosp Psychiatry. 2010;32:345–59.
Gilbody S, Richards D, Brealey S, Hewitt C. Screening for depression in medical settings with the patient health questionnaire (PHQ): a diagnostic meta-analysis. J Gen Intern Med. 2007;22:1596–602.
Sun Y, Fu Z, Bo Q, Mao Z, Ma X, Wang C. The reliability and validity of PHQ-9 in patients with major depressive disorder in psychiatric hospital. BMC Psychiatry. 2020;20:474.
Ye X, Shu HL, Feng X, et al. Reliability and validity of the Chinese version of the patient health Questionnaire-9 (C-PHQ-9) in patients with psoriasis: a cross-sectional study. BMJ Open. 2020;10:e033211.
Arrieta J, Aguerrebere M, Raviola G, et al. Validity and utility of the patient health questionnaire (PHQ)-2 and PHQ-9 for screening and diagnosis of depression in rural Chiapas, Mexico: a cross-sectional study. J Clin Psychol. 2017;73:1076–90.
Titov N, Dear BF, McMillan D, Anderson T, Zou J, Sunderland M. Psychometric comparison of the PHQ-9 and BDI-II for measuring response during treatment of depression. Cogn Behav Ther. 2011;40:126–36.
Dajpratham P, Pukrittayakamee P, Atsariyasing W, Wannarit K, Boonhong J, Pongpirul K. The validity and reliability of the PHQ-9 in screening for post-stroke depression. BMC Psychiatry. 2020;20:291.
Azah MN, Shah ME, Shaaban J, Bahri I, Rushidi WM, Yaacob MJ. Validation of the Malay version brief patient health questionnaire (PHQ-9) among adult attending family medicine clinics. Inter Med J. 2005;12:259–63.
Adewuya AO, Ola BA, Afolabi OO. Validity of the patient health questionnaire (PHQ-9) as a screening tool for depression amongst Nigerian university students. J Affect Disord. 2006;96:89–93.
Woldetensay YK, Belachew T, Tesfaye M, et al. Validation of the patient health questionnaire (PHQ-9) as a screening tool for depression in pregnant women: Afaan Oromo version. PLoS One. 2018;13:e0191782.
Hanlon C, Medhin G, Selamu M, et al. Validity of brief screening questionnaires to detect depression in primary care in Ethiopia. J Affect Disord. 2015;186:32–9.
Monahan PO, Shacham E, Reece M, et al. Validity/reliability of PHQ-9 and PHQ-2 depression scales among adults living with HIV/AIDS in western Kenya. J Gen Intern Med. 2009;24:189–97.
Omoro SA, Fann JR, Weymuller EA, Macharia IM, Yueh B. Swahili translation and validation of the patient health Questionnaire-9 depression scale in the Kenyan head and neck cancer patient population. Int J Psychiatry Med. 2006;36:367–81.
Udedi M, Muula AS, Stewart RC, Pence BW. The validity of the patient health Questionnaire-9 to screen for depression in patients with type-2 diabetes mellitus in non-communicable diseases clinics in Malawi. BMC Psychiatry. 2019;19:81.
We would particularly like to thank all patients who participated in the study as well as medical staff and research assistants for their contributions to the study. We thank Dr. Huang Xiao for his contribution to this study.
This work was supported by the National Science and technology support plan of the Ministry of science and technology (2009BA77B00).
Ethics approval and consent to participate
All methods were performed in accordance with the relevant guidelines and regulations. Research involving human participants or human data have been performed in accordance with the Declaration of Helsinki and this study was approved by the Ethics Committee of Peking University Six Hospital. All patients have understood the content and purpose of the study, agreed to participate, and completed the informed consent form.
Consent for publication
All authors read and approved the present manuscript and gave their consent for publication.
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Sun, Y., Kong, Z., Song, Y. et al. The validity and reliability of the PHQ-9 on screening of depression in neurology: a cross sectional study. BMC Psychiatry 22, 98 (2022). https://doi.org/10.1186/s12888-021-03661-w