Psychometric evaluation of the Malay version of the Montgomery-Åsberg Depression Rating Scale (MADRS-BM)
BMC Psychiatry volume 15, Article number: 200 (2015)
This study examines the psychometric properties of the Malay version of the Montgomery-Åsberg Depression Rating Scale (MADRS-BM).
A total of 150 participants with (n = 50) and without depression (n = 100) completed the self-rated version of the Montgomery-Åsberg Depression Rating Scale (MADRS-S), the MADRS-BM, and the Malay versions of the Beck Depression Inventory-II (BDI-II-M), the General Health Questionnaire-12 (GHQ-12), and the Snaith-Hamilton Pleasure Scale (SHAPS-M).
With respect to the dimensionality of the MADRS-BM, we obtained a one-factor solution. With respect to reliability, internal consistency was satisfactory, parallel form reliability was excellent, and one-week test-retest reliability was good. With respect to validity, positive correlations between the MADRS-BM and both the BDI-II-M and the GHQ-12, and a negative correlation between the MADRS-BM and the SHAPS-M, provide initial evidence of the MADRS-BM’s concurrent validity. After adjusting for age, gender, ethnicity, educational level, and marital status, individuals with depression reported significantly higher MADRS-BM scores than did individuals without depression, which provides additional evidence for concurrent validity. A cut-off score of 4 distinguished individuals with depression from individuals without depression with a sensitivity of 78 % and a specificity of 86 %.
The MADRS-BM demonstrated promising psychometric properties in terms of dimensionality, reliability, and validity that generally justify its use in routine clinical practice in Malaysia.
To study treatment efficacy, researchers often rely on clinician-rated instruments. Clinician-rated instruments like the Montgomery-Åsberg Depression Rating Scale (MADRS) have been widely used to assess depression.
The MADRS is a popular scale because of its high inter-rater reliability and high sensitivity to change in treatment effects. Because of these features, the MADRS has been widely used in studies of mood disorders [3–5]. However, the MADRS has recently received increased scrutiny owing to the rising rate of unsuccessful clinical trials. As reported in those trials, poor inter-rater reliability and rater bias are two common shortcomings of clinician-rated scales like the MADRS. Because of these shortcomings, clinical assessment of depression severity is a subject of debate, and the robustness of clinical findings is also questionable. To address these shortcomings, the MADRS-S, a 9-item self-report measure of depression, was developed. Participants rate items on a 4-point Likert scale ranging from 0 (no depressive symptoms) to 3 (worst depressive symptoms). Possible scores range from 0 to 27, with higher scores indicating greater symptom severity. The MADRS-S has been found to have a high degree of concordance with the clinician-rated MADRS and demonstrates adequate reliability (alpha = 0.84; intraclass correlation coefficient, ICC = 0.78).
In Malaysia, there have been a few attempts to validate depression scales, such as the Malay versions of the Beck Depression Inventory (BDI), the Beck Depression Inventory, Second Edition (BDI-II), and the Depression Anxiety and Stress Scales (DASS). The Malay version of the BDI has been validated in a depression sample, which resulted in two major revisions: the authors removed four items that had low sensitivity in identifying typical depressive symptoms. Therefore, identification of depressive symptoms in psychiatric samples using the Malay version of the BDI may be limited. The Malay versions of the BDI-II and DASS have only been validated in specific samples (e.g., men with urological problems, postpartum women, and infertile couples [10, 11]).
The Malay versions of the BDI, BDI-II, and DASS are multidimensional scales. Specific items from these multidimensional scales could not yield a theoretically sound composite score, reducing their sensitivity in detecting changes in depression severity. Unlike the aforementioned scales, the MADRS-S is characterized by a single domain and has good sensitivity in detecting changes in depression and in tracing differential effects of a drug in placebo and treatment groups. To the best of our knowledge, the Malay version of the MADRS-S has not yet been validated. Therefore, the purpose of this study was to examine the psychometric properties of the Malay version of the Montgomery-Åsberg Depression Rating Scale (MADRS-BM).
Stage 1: Early development of the MADRS-BM
We obtained permission from the original author of the MADRS, Stuart M. Montgomery, to conduct this study. A copy of the permission letter was sent to the editor of this journal. The scale was translated from English to Malay in parallel by two bilingual clinical psychologists, and a bilingual language expert performed the back-translation. Discrepancies between the original version and the back-translation were resolved through discussion, and adjustments were made where necessary. In Stage 1, we finalized the initial version of the MADRS-BM with an expert panel of psychiatrists and family physicians.
Stage 2: Refinement of the MADRS-BM
We pilot-tested the initial version of the MADRS-BM with 10 native Malay-speaking nurses to identify any flaws in wording. We noted any words that were considered unsuitable or inappropriate. The scale was also reviewed by a psychiatric consultant with extensive experience in clinical research to ensure satisfactory face, semantic, criterion, and conceptual equivalence. In Stage 2, we refined the MADRS-BM.
Stage 3: Evaluation of the MADRS-BM
Participants and procedure
The study was conducted from September to December 2013 at the Psychiatric Outpatient Clinic, University Malaya Medical Centre. The study protocol was approved by the Medical Ethics Committee (MEC) of the University Malaya Medical Centre. For the purposes of the study, we recruited individuals with and without depression. The criteria for selecting individuals with depression were: (a) a diagnosis of major depressive disorder (the first author, a trained clinical psychiatrist, confirmed the diagnoses using the Diagnostic and Statistical Manual of Mental Disorders, 4th Edition, Text Revision (DSM-IV-TR)); (b) no other major psychiatric illnesses or psychoses; (c) ability to understand and read Malay or English; (d) age 18 or above; and (e) consent to participate in this study. Individuals without depression were medical workers from the University Malaya Medical Centre. Their participation was based on the criteria indicated above, with the exception of (b). Based on a subject-to-item ratio of 5:1, it is statistically appropriate to include 45 individuals with depression and 90 individuals without depression, given that the MADRS-BM has nine items. However, to allow for attrition, we decided to recruit 50 individuals with depression and 100 individuals without depression. For data collection, we identified eligible subjects and explained the research procedure to them. After obtaining their written consent, we distributed a self-administered questionnaire. To obtain the test-retest reliability of the MADRS-BM, we invited the subjects to complete the scale again after one week.
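The sample-size reasoning above can be sketched as simple arithmetic. The function name below is hypothetical, and the sketch covers only the 5:1 rule for the nine-item scale; the figure of 90 for the comparison group is taken from the text as stated and is not derived here.

```python
def minimum_sample_size(n_items: int, subjects_per_item: int = 5) -> int:
    """Smallest sample size under a subjects-per-item rule of thumb."""
    if n_items <= 0 or subjects_per_item <= 0:
        raise ValueError("both arguments must be positive")
    return n_items * subjects_per_item


# Nine MADRS-BM items at a 5:1 subject-to-item ratio.
n_depressed_minimum = minimum_sample_size(9)
print(n_depressed_minimum)  # 45
```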
Participants were invited to provide their socio-demographic information such as age, gender, ethnic group, marital status, educational level, religion, and employment status.
The Malay version of the Beck Depression Inventory-Second Edition (BDI-II-M)
The BDI-II-M is a 21-item self-report measure of depression based on a 2-week time period. Participants rated items on a 4-point Likert scale ranging from 0 (no depressive symptoms) to 3 (worst depressive symptoms). Higher scores indicate greater depression. In a previous study, the scale demonstrated high internal consistency (alpha = 0.89) and split-half reliability (unequal length Spearman-Brown = 0.84).
The Malay version of the Snaith-Hamilton Pleasure Scale (SHAPS-M)
The SHAPS-M is a 14-item self-report measure of hedonic experience encompassing interest/pastimes, social interaction, sensory experience, and food/drink. Participants rated items on a 4-point Likert scale ranging from 1 (definitely disagree) to 4 (definitely agree). Lower scores indicate greater hedonic experience. In a previous study, the scale exhibited excellent internal consistency (alpha = 0.96), concurrent validity, and parallel form reliability (ICC = 0.65).
The Malay version of the General Health Questionnaire-12 (GHQ-12)
The Malay version of the GHQ-12 is a 12-item self-report measure of current mental health. Participants rated items on a 4-point Likert scale ranging from 0 (always) to 3 (never) for positive items and from 3 (always) to 0 (never) for negative items. Higher scores indicate greater symptom severity. In a previous study, the scale showed good internal consistency (alpha = 0.85).
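The reverse scoring described above can be illustrated with a minimal sketch. It assumes each response is recorded as its position on the scale, from 0 ("always") to 3 ("never"); the set of positively worded item numbers is hypothetical, since the text does not list them.

```python
MAX_SCORE = 3  # each item is scored 0..3


def score_ghq12(responses, positive_items):
    """Total GHQ-12 score with reverse scoring of negative items.

    responses: dict mapping item number -> response position,
               0 ("always") .. 3 ("never").
    positive_items: set of item numbers that are positively worded
                    (hypothetical here; not given in the text).
    """
    if len(responses) != 12:
        raise ValueError("expected 12 item responses")
    total = 0
    for item, position in responses.items():
        if position not in range(MAX_SCORE + 1):
            raise ValueError(f"item {item}: response must be 0..3")
        # Positive items: always = 0 .. never = 3.
        # Negative items: always = 3 .. never = 0.
        total += position if item in positive_items else MAX_SCORE - position
    return total


HYPOTHETICAL_POSITIVE = {1, 3, 4, 7, 8, 12}  # illustrative only
always_everywhere = {item: 0 for item in range(1, 13)}
print(score_ghq12(always_everywhere, HYPOTHETICAL_POSITIVE))  # 18
```

In this illustration, answering "always" to every item yields 0 on each of the six positive items and 3 on each of the six negative items.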
The self-rated version of the Montgomery-Åsberg Depression Rating Scale (MADRS-S)
The English version of the MADRS-S is a 9-item self-report measure of depression. Participants rated items on a 4-point Likert scale ranging from 0 (no depressive symptoms) to 3 (worst depressive symptoms). Higher scores indicate greater symptom severity. In a previous study, the scale showed good parallel form reliability (ICC = 0.78) and adequate reliability (alpha = 0.84).
The Malay version of the Montgomery-Åsberg Depression Rating Scale (MADRS-BM)
The MADRS-BM is a 9-item self-report measure of depression. The MADRS-BM and the MADRS-S are identical in terms of scoring and interpretation, as described above.
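As an illustration of the scoring shared by the MADRS-S and the MADRS-BM, the following minimal sketch sums nine item ratings of 0 to 3 into a total of 0 to 27. The function and constant names are hypothetical.

```python
N_ITEMS = 9
MIN_ITEM, MAX_ITEM = 0, 3


def madrs_total(item_scores):
    """Sum nine item ratings (each 0..3) into a total score (0..27)."""
    if len(item_scores) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item scores")
    for score in item_scores:
        if not MIN_ITEM <= score <= MAX_ITEM:
            raise ValueError("each item must be rated from 0 to 3")
    return sum(item_scores)


print(madrs_total([3] * 9))  # 27, the maximum possible total
```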
Data analyses were performed using the Statistical Package for the Social Sciences version 20.0 (SPSS, Chicago, IL, USA). Baseline characteristics of participants were summarized using descriptive statistics. To establish the dimensionality of the MADRS-BM, we performed principal component analysis. We used Cronbach’s alpha as an indication of internal consistency. We also assessed the homogeneity of the scale by calculating corrected item-total correlations and the alpha value if an item was deleted. To examine the parallel form reliability between the MADRS-BM and the MADRS-S, and the one-week test-retest reliability of the MADRS-BM, we calculated ICCs. To establish concurrent validity, we examined correlations between the MADRS-BM and the other measures (BDI-II-M, GHQ-12, and SHAPS-M) using Spearman’s test. To examine whether individuals with and without depression differed significantly in MADRS-BM scores, we performed analysis of covariance (ANCOVA), controlling for age, gender, ethnicity, marital status, and educational level. The optimal MADRS-BM cut-off score for individuals with depression was determined from the coordinate points of the receiver operating characteristic (ROC) curve; we then obtained the corresponding sensitivity and specificity.
Table 1 shows demographic information across participants with and without depression. We recruited 50 participants with depression (50 % male, 50 % female) and 100 participants without depression (28 % male, 72 % female).
Dimensionality of the MADRS-BM
Bartlett’s test of sphericity was significant (p < .01) and the Kaiser-Meyer-Olkin measure of sampling adequacy for the MADRS-BM was 0.93, indicating that the sampling adequacy was meritorious. A single factor was extracted using the principal component approach (eigenvalue > 1.00), which accounted for 61.3 % of the total variance. Likewise, the scree plot displayed a single predominant factor. Taken together, the MADRS-BM contained only a single construct measuring individuals’ psychological state.
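The link between the eigenvalue criterion and the proportion of variance explained can be sketched as follows. The sketch assumes the analysis was run on the correlation matrix of the nine items, so that the eigenvalues sum to 9; under that assumption, 61.3 % of the variance corresponds to a first eigenvalue of about 5.5. Only that first value is derived from the reported percentage; the remaining eigenvalues are hypothetical placeholders.

```python
def explained_variance(eigenvalues):
    """Proportion of total variance attributed to each component."""
    total = sum(eigenvalues)
    return [ev / total for ev in eigenvalues]


def kaiser_retained(eigenvalues, threshold=1.0):
    """Components kept under the eigenvalue > threshold rule."""
    return [ev for ev in eigenvalues if ev > threshold]


N_ITEMS = 9
first = 0.613 * N_ITEMS  # derived from the reported 61.3 %, assuming a correlation matrix
# Hypothetical: the remaining variance is spread evenly over the other 8 components.
hypothetical_eigenvalues = [first] + [(N_ITEMS - first) / 8] * 8

print(len(kaiser_retained(hypothetical_eigenvalues)))
print(round(explained_variance(hypothetical_eigenvalues)[0], 3))
```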
The MADRS-BM exhibited good internal consistency (alpha = 0.78). All the items had corrected item-total correlations that were 0.7 or above. Removal of items, if any, would not increase the alpha value (see Table 2). The parallel form reliability between the MADRS-S and the MADRS-BM was excellent (ICC = 0.98, p < .01). The scale demonstrated good one-week test-retest reliability (ICC = 0.88, p < .01).
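For readers unfamiliar with the statistic, Cronbach’s alpha can be computed from item-level data as in the minimal sketch below. It uses the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores), and is not the SPSS procedure used in this study; the data shown are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    if len(rows) < 2:
        raise ValueError("need at least two respondents")
    k = len(rows[0])
    if k < 2:
        raise ValueError("need at least two items")
    items = list(zip(*rows))  # transpose: one tuple of scores per item
    item_variance_sum = sum(variance(column) for column in items)
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical data: four respondents, three perfectly consistent items.
hypothetical_rows = [[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3]]
print(round(cronbach_alpha(hypothetical_rows), 6))  # 1.0
```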
The MADRS-BM was significantly and positively correlated with the BDI-II-M (p < .01) and the GHQ-12 (p < .01) scores, and significantly and negatively correlated with the SHAPS-M (p < .01). Therefore, concurrent validity of the MADRS-BM was established (see Table 3).
After adjusting for age, gender, ethnicity, educational level, and marital status, individuals with depression (M = 7.97, SD = 5.70) reported significantly higher MADRS-BM scores than did individuals without depression (M = 1.51, SD = 1.39) (Table 4). These findings provide additional evidence for concurrent validity of the MADRS-BM.
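As a reminder of how Spearman’s coefficient is obtained, the sketch below ranks both variables (averaging ranks for ties) and computes Pearson’s correlation on the ranks. It is an illustrative implementation rather than the SPSS routine used here, and the example values are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical scores on two scales for five participants.
print(round(spearman_rho([1, 2, 3, 4, 5], [5, 6, 7, 8, 7]), 3))
```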
The area under the receiver operating characteristic curve (i.e., the AUC) was 0.91 (95 % CI = 0.86-0.96). The optimal cut-off score to distinguish individuals with depression from individuals without depression was ≥ 4, with a sensitivity of 78 % and a specificity of 86 %.
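To make the meaning of sensitivity and specificity at a cut-off concrete, the following minimal sketch classifies a score of at least the cut-off as screen-positive and tabulates the result against diagnostic status. The scores and labels are hypothetical and do not reproduce the study data; the function names are illustrative.

```python
def sensitivity_specificity(scores, has_depression, cutoff):
    """Sensitivity and specificity when score >= cutoff counts as positive."""
    tp = fn = tn = fp = 0
    for score, depressed in zip(scores, has_depression):
        positive = score >= cutoff
        if depressed and positive:
            tp += 1
        elif depressed:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def roc_coordinates(scores, has_depression):
    """(cutoff, sensitivity, specificity) for every observed score."""
    return [
        (cutoff, *sensitivity_specificity(scores, has_depression, cutoff))
        for cutoff in sorted(set(scores))
    ]


# Hypothetical data: four participants with depression, five without.
hypothetical_scores = [5, 4, 3, 8, 1, 2, 4, 0, 3]
hypothetical_labels = [True] * 4 + [False] * 5

print(sensitivity_specificity(hypothetical_scores, hypothetical_labels, cutoff=4))
# (0.75, 0.8)
```

Listing the coordinates for every candidate cut-off, as `roc_coordinates` does, mirrors the table of coordinate points from which an optimal cut-off is chosen.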
Our current findings show that the MADRS-BM has good internal consistency with an alpha value of 0.70. This result is comparable to the properties of the clinician-rated MADRS (alpha = 0.70). Also comparable to the original version of the MADRS, the MADRS-BM demonstrated good parallel form reliability (ICC = 0.98) and one-week test-retest reliability (ICC = 0.88). The present findings suggest that the MADRS-BM is at least equivalent to, if not better than, the MADRS as an assessment tool for depression. In terms of dimensionality, our findings revealed a single factor that accounted for a large proportion of the variance in the MADRS-BM. In line with previous studies, its factor structure was similar to that of the MADRS [7, 9].
We also examined the concurrent validity of the MADRS-BM by linking the MADRS-BM with the BDI-II-M, GHQ-12, and SHAPS-M. Positive correlations between the MADRS-BM and both the BDI-II-M and the GHQ-12, and a negative correlation between the MADRS-BM and the SHAPS-M, provide initial evidence of the MADRS-BM’s concurrent validity. Additional evidence for concurrent validity was also found: after adjusting for socio-demographic characteristics, individuals with depression reported significantly higher MADRS-BM scores than individuals without depression.
In this study, the cut-off score for the MADRS-BM was 4, which is lower than the score of 5 recommended for the original MADRS. One possible explanation is that the current version of the MADRS-BM is a self-rated scale, and participants tend to underrate or underestimate their symptoms. Even though the cut-off score was lower than that of the MADRS, the MADRS-BM’s sensitivity was greater than that of the MADRS.
A few limitations of this study warrant consideration. First, given the cross-sectional nature of this study, we were unable to examine causal factors of depression. Likewise, we were unable to assess the predictive validity of the MADRS-BM. Second, our sample was recruited from an outpatient clinic in a tertiary hospital using convenience sampling; thus, the generalizability of the findings may be limited. Lastly, some clinical features, such as the severity of depression and the types of antidepressants used by the patients, were not documented in the current study. Such clinical features could affect the MADRS-BM scores reported by participants.
In spite of these limitations, the MADRS-BM demonstrated promising psychometric properties in terms of dimensionality, reliability, and validity that generally justify its use in routine clinical practice in Malaysia. To further establish its psychometric properties, future diagnostic studies using the Standards for Reporting of Diagnostic Accuracy (STARD) criteria are recommended.
This is a requirement for online studies made by the local ethics committee.
1. Depression Guideline Panel. Depression in Primary Care, vol 1: Detection and Diagnosis. Clinical Practice Guideline, No 5. AHCPR Publication No. 93–0550. Rockville, Md, US Department of Health and Human Services: Public Health Service, Agency for Health Care Policy and Research; 1993.
2. Montgomery SA, Asberg M. A new depression scale designed to be sensitive to change. Brit J Psychiat. 1979;134(4):382–9.
3. Gerstenberg G, Aoshima T, Fukasawa T, Yoshida K, Takahashi H, Higuchi H, et al. Relationship between clinical effects of fluvoxamine and the steady-state plasma concentrations of fluvoxamine and its major metabolite fluvoxamino acid in Japanese depressed patients. Psychopharmacology. 2003;167(4):443–8.
4. Mihara K, Otani K, Tybring G, Dahl M-L, Bertilsson L, Kaneko S. The CYP2D6 genotype and plasma concentrations of mianserin enantiomers in relation to therapeutic response to mianserin in depressed Japanese patients. J Clin Psychopharmacol. 1997;17(6):467–71.
5. Mihara K, Yasui-Furukori N, Kondo T, Ishida M, Ono S, Ohkubo T, et al. Relationship between plasma concentrations of trazodone and its active metabolite, m-chlorophenylpiperazine, and its clinical effect in depressed patients. Ther Drug Monit. 2002;24(4):563–6.
6. Mundt JC, Katzelnick DJ, Kennedy SH, Eisfeld BS, Bouffard BB, Greist JH. Validation of an IVRS version of the MADRS. J Psychiatr Res. 2006;40(3):243–6.
7. Bondolfi G, Jermann F, Rouget BW, Gex-Fabry M, McQuillan A, Dupont-Willemin A, et al. Self- and clinician-rated Montgomery–Åsberg Depression Rating Scale: Evaluation in clinical practice. J Affect Disord. 2010;121(3):268–72.
8. Svanborg P, Åsberg M. A new self-rating scale for depression and anxiety states based on the Comprehensive Psychopathological Rating Scale. Acta Psychiatr Scand. 1994;89(1):21–8.
9. Fantino B, Moore N. The self-reported Montgomery-Åsberg depression rating scale is a useful evaluative tool in major depressive disorder. BMC Psychiatry. 2009;9(1):26.
10. Quek KF, Low WY, Razack AH, Loh CS. Reliability and Validity of the Malay version of Beck Depression Inventory (BDI) among urological patients. MJP. 2001;9(1):29–34.
11. Mahmud WMRW, Awang A, Herman I, Mohamed MN. Analysis of the psychometric properties of the Malay version of Beck Depression Inventory II (BDI-II) among postpartum women in Kedah, North West of Peninsular Malaysia. MJMS. 2004;11(2):19.
12. Musa R, Fadzil MA, Zain Z. Translation, validation and psychometric properties of Bahasa Malaysia version of the Depression Anxiety and Stress Scales (DASS). ASEAN Journal of Psychiatry. 2007;8(2):82–9.
13. Cusin C, Yang H, Yeung A, Fava M. Rating scales for depression. Handbook of clinical rating scales and assessment in psychiatry and mental health. Humana Press. 2010;7–35.
14. American Psychiatric Association. Diagnostic and statistical manual of mental disorders: DSM-IV-TR®. Washington, DC: American Psychiatric Publishing; 2000.
15. Gorsuch RL. Factor analysis. 2nd ed. Hillsdale, New Jersey: Erlbaum; 1983.
16. Yee A, Loh HS, Ng CG. Factorial validity and reliability of the Simplified-Chinese Version Of Snaith-Hamilton Pleasure Scale: A study among depressed patients at an out-patient clinic in Malaysia. ASEAN J Psychiatry. 2013;15(1):66–71.
17. Ng CG, Chin SC, Yee AHA, Loh HS, Sulaiman AH, Wong SSK, et al. Validation of Malay Version of Snaith-Hamilton Pleasure Scale: Comparison between Depressed Patients and Healthy Subjects at an Out-Patient Clinic in Malaysia. MJMS. 2014;21(3):62–70.
18. Yusoff MSB, Rahim AFA, Yaacob MJ. The sensitivity, specificity and reliability of the Malay version 12-items General Health Questionnaire (GHQ-12) in detecting distressed medical students. ASEAN J Psychiatry. 2010;11(1):36–43.
19. Kaiser HF. An index of factorial simplicity. Psychometrika. 1974;39(1):31–6.
This study was carried out at the University of Malaya Medical Centre, Kuala Lumpur, Malaysia. We would like to extend our appreciation to Stuart M. Montgomery for granting permission to translate the MADRS-S and to publish the MADRS-BM. We would also like to thank Danial Aziz Bahaman, Helenna Hashim, and Ernie Azwa Yusop for conducting the forward and backward translations of the MADRS-S.
The authors declare that they have no competing interests.
AY conceived the study and developed the study material. ARMY, AY, and HMH conducted the forward and backward translations of the Montgomery–Åsberg Depression Rating Scale. AY and NCG carried out data collection. AY, HSL, KAT and NCG analysed the data, and AY, HSL, NCG, and HMH drafted the manuscript. All authors read and approved the final manuscript.