Detection of malingering: psychometric evaluation of the Chinese version of the structured interview of reported symptoms-2

Background Malingering detection has emerged as an important issue in clinical and forensic settings. The Structured Interview of Reported Symptoms-2 (SIRS-2) was designed to assess feigned symptoms in both clinical and non-clinical subjects. The aim of this study was to examine the reliability and validity of the Chinese version of this scale. Methods Two studies were conducted to evaluate the reliability and validity of the Chinese version of the SIRS-2. In study one, which used a simulation design, the subjects were (a) 40 students asked to simulate symptoms of mental illness, (b) 40 general psychiatric inpatients, and (c) 40 students asked to answer questions honestly. Scale scores for feigned symptoms were compared among the three groups to assess the discriminant validity of the Chinese version of the SIRS-2. The Minnesota Multiphasic Personality Inventory-2 (MMPI-2) was administered to the 80 undergraduate students. In study two, which used a known-groups comparison design, scale scores for feigned symptoms on the Chinese version of the SIRS-2 were compared between 20 suspected malingerers and 80 psychiatric outpatients from two forensic centers. Results The Chinese version of the SIRS-2 demonstrated satisfactory internal consistency in both studies. In study one, criterion validity was supported by significant positive correlations with the MMPI-2 Infrequency scale (r = 0.282 ~ 0.481) and by significant negative correlations with the MMPI-2 Lie (r = -0.255 ~ -0.519) and Correction (r = -0.205 ~ -0.391) scales. Scores on 10 of the 13 subscales of the Chinese version of the SIRS-2 were significantly higher for simulators than for honest students and general psychiatric patients. In study two, the mean scores of suspected malingerers on the 13 subscales of the Chinese version were significantly higher than those of psychiatric outpatients.
For discriminant validity, the scale yielded a large effect size (d = 1.80) for the comparisons between participant groups in studies one and two. Moreover, the sensitivity (proportion of malingerers accurately identified by the measure) and specificity (proportion of people accurately classified as responding honestly) of the Chinese version of the SIRS-2 in the detection of malingering were acceptable in both studies. Conclusions The Chinese version of the SIRS-2 has good psychometric properties and is a valid and reliable tool for the detection of malingering in Chinese populations.


Background
According to the DSM-IV-TR, malingering denotes "the intentional production of false or grossly exaggerated physical or psychological symptoms, motivated by external incentives such as avoiding military duty, avoiding work, obtaining financial compensation, evading criminal prosecution, or obtaining drugs". Surveys of forensic and non-forensic referrals yielded mean malingering rates of about 15% and about 7%, respectively [1]. Chiang et al. [2] found that 9.1% of a sample of draftees seeking psychiatric re-evaluation in Taiwan were suspected malingerers. Malingering creates social compensation problems and wastes medical resources. However, detecting malingering is not easy. The DSM-IV-TR includes some screening indices, but these have not been rigorously tested [3]. Available data suggest that these indices may produce unacceptably high false-positive rates, in the neighborhood of 80% [4]. Several psychological tests have been developed in clinical settings, such as the MMPI-2, the Personality Assessment Inventory (PAI), and the Millon Clinical Multiaxial Inventory-III (MCMI-III) [5,6]. However, these measures were not specifically designed to detect malingering.
In China, there are very few studies addressing the issue of malingering and its detection. For example, Liu [7] described the assessment of malingering in clinical settings, while Yang [8] reviewed the psychopathology and diagnosis of malingering. However, very few studies have examined malingering with a special and standardized measure in Chinese populations.
The Structured Interview of Reported Symptoms-2 (SIRS-2) is one of the most well-known psychological instruments designed to assess malingering [9]. Unlike the MMPI, the SIRS was developed specifically to assess whether an examinee is feigning psychological symptoms. The SIRS-2 is a 172-item, interviewer-administered rating scale that relies on empirically based strategies to assess malingering [9]. The SIRS-2 has been validated for use in clinical and non-clinical samples [9]. It is widely used for the identification of malingering in forensic psychiatry [10], and it has been validated for the detection of feigned specific disorders or cognitive deficits among psychiatric populations and adolescent offenders [11][12][13]. However, in some cases the SIRS-2 may misclassify the examinee, which limits its applicability. For example, traumatized inpatients with a broad array of presenting symptoms, especially those who report childhood trauma and dissociative symptoms, have been misclassified [11].
The SIRS-2 [14] has been found to have moderately high internal consistency, with Cronbach's alpha coefficients ranging from .77 to .92 for the primary scales and from .75 to .82 for the supplementary scales.
The criterion validity, predictive validity and test-retest reliability of the SIRS-2 have been established as well [15]. Moreover, the SIRS-2 is one of the few malingering assessment measures with demonstrated discriminant validity for malingering in clinical and non-clinical samples. Specifically, malingerer samples were found to score higher than honest samples in both simulation and known-groups comparisons, with large effect sizes discriminating suspected malingerers from controls (mean d = 1.74) [16].
The literature indicates that the SIRS-2 is one of the best instruments for the detection of malingering to date. However, there is no reliable and valid Chinese instrument for malingering detection. Our objective in this study was to examine the psychometric properties of the Chinese version of the SIRS-2 using a simulation design (study one) and a known-groups comparison design (study two) in China. There are two reasons for selecting these two designs: (a) a simulation design is ideal for controlling many elements of the experimental design and provides the best basis for internal validity, but it poses threats to external validity, as the use of simulators might decrease the generalizability of the results [17,18]; (b) a known-groups comparison design is stronger for external validity, given its emphasis on real-world applications. First, we examined the internal consistency of the Chinese SIRS-2 subscales designed to detect malingering by calculating their Cronbach's alpha coefficients. Second, we evaluated the discriminant validity of each subscale using Analysis of Variance and effect sizes. Finally, we sought to determine which measures were most effective (i.e., in terms of sensitivity and negative predictive power, NPP) for the evaluation of malingering.

Study one using simulation design
Eighty undergraduate students were recruited through advertisements placed in three University campuses of Hunan Province in Changsha City. Subjects were required to have no language or hearing impediments and no psychiatric history. All subjects were 18 years of age or more.
In addition, 40 consecutive inpatients with mental disorders were recruited from July 1 to December 30, 2011 at the Second Xiangya Hospital of Central South University in Changsha if they met the following inclusion criteria: 1) diagnosed with a mental disorder by two psychiatrists according to the Diagnostic and Statistical Manual of Mental Disorders, fourth edition (DSM-IV); 2) able to understand the scale content; 3) aged 18 years or older. Comprehension was checked during the structured interview with the same procedure for every participant: the interviewer read out the question and asked the participant to restate its meaning; if the restatement was correct, the participant answered the question, and if not, the interviewer explained the meaning of the question. Diagnoses for the 40 psychiatric patients were mostly Axis I disorders: schizophrenia (55%), mood disorders (30%), substance dependence (2.5%), and stress-related disorders (2.5%).

Study two using known-groups comparison
One hundred subjects in a forensic setting participated in the known-groups comparison; they were recruited from two forensic psychiatric centers, at the Second Xiangya Hospital of Central South University and the Rongjun Hospital. Subjects were screened to ensure that they were over 18 years of age and could understand the scale content.

Chinese version of SIRS-2
The Chinese version of the SIRS-2 was provided by Professor Tam, Wai-Cheong Carl, one of the investigators in this study, who had translated the SIRS-2 following a rigorous translation method. The 172-item Chinese version of the SIRS-2 is a structured interview designed to detect various exaggerated response styles. Each scale provides four classifications: honest, indeterminate, probable faking, and definite faking. Item scores on the Chinese version of the SIRS-2 range from 0 to 2 (0 = no, 1 = yes, 2 = unbearable), with higher scores suggesting feigned symptoms. The Chinese version of the SIRS-2 was scored using the criteria described by Rogers (i.e., at least one subscale in the definite malingering range and/or three subscales in the probable range) [19].
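The decision rule quoted above can be sketched in a few lines of Python. This is only an illustration of the stated criterion (one or more scales in the definite range, or three or more in the probable range). The score cutoffs that place a scale in a given range are not reported in the text, so the function takes already-classified scales as input; the function name, label strings, and example profile are hypothetical.

```python
from collections import Counter

# Labels follow the four classifications named in the text.
HONEST, INDETERMINATE, PROBABLE, DEFINITE = (
    "honest", "indeterminate", "probable", "definite",
)

def classify_feigning(scale_ranges):
    """Return True when a profile meets the rule quoted in the text:
    at least one scale in the definite range and/or at least three
    scales in the probable range.

    `scale_ranges` maps a scale abbreviation to one of the four labels.
    """
    counts = Counter(scale_ranges.values())
    return counts[DEFINITE] >= 1 or counts[PROBABLE] >= 3

# Hypothetical profile, invented for illustration only (not study data).
example_profile = {
    "RS": PROBABLE,
    "BL": PROBABLE,
    "SU": PROBABLE,
    "SEL": INDETERMINATE,
    "SEV": HONEST,
}
print(classify_feigning(example_profile))
```

Keeping the rule separate from the score-to-range mapping means the same function would apply whatever cutoffs the Manual specifies.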

MMPI-2 Chinese version
The MMPI-2 has been translated into Chinese and published by the Chinese University of Hong Kong with Hong Kong and China norms [20]. The MMPI-2 Chinese version has demonstrated excellent psychometric properties [20,21] and has been used in malingering detection in a sample of university students in Taiwan [22]. The F (Infrequency), L (Lie) and K (Correction) subscales of the MMPI-2 are regarded as validity scales for the systematic and empirical assessment of response styles [23]. The F subscale performed well in the prediction of malingering. The F subscale is associated with most fake-bad indicators, particularly those assessing exaggerated psychopathology. A lower score on the K subscale was associated with exaggerated symptoms of psychosis. Other indexes and scales, including the PAI and MCMI-III, were developed over the long life of the MMPI to assess malingering. However, there are no Chinese versions of these scales, so we decided to use the MMPI-2 Chinese version in this study.

Procedure
In study one, undergraduate students were randomly assigned to two different subgroups: honest students group (HS group, n = 40) and simulator students group (SS group, n = 40). HS group as well as the honest inpatients (HP group, n = 40) were asked to respond honestly in the interview. SS group subjects were asked to feign mental illness in the interview and they were given information about the common symptoms of mental disorders.
In study two, forensic experts interviewed 100 forensic subjects and reviewed the information from police as well as the history of the subjects. One hundred forensic subjects were divided into two groups according to the assessment of the forensic experts: forensic malingering group (FM group, n = 20), and forensic honest group (FH group, n = 80). One hundred forensic subjects in study two were asked to respond honestly in the interview. The Chinese Version of SIRS-2 was administered to all subjects in study one and two. At the same time the Chinese Version of MMPI-2 was administered to each participant from universities (HS + SS groups). All interviewers were trained in the use of the SIRS-2 according to the recommendations in the SIRS Manual [24]. All participants provided written informed consent to participate in this research. The study protocol was approved by the Second Xiangya Hospital medical ethics review committee.

Data analysis
All statistical analyses were carried out using SPSS 17.0 Software for Windows. Correlation analyses were used to explore the correlations between the SIRS-2 Chinese version and the MMPI-2 Chinese version in order to examine the convergent and divergent validity of the detection strategies, especially with the MMPI-2 indicators of fake-bad, defensiveness, and response consistency. For the discriminability of the scale, we used Analysis of Variance to examine the differences among honest students, simulator students and honest inpatients on the SIRS-2 in study one, and independent-samples t-tests to examine the differences between the forensic malingering group and the forensic honest group on the SIRS-2 in study two. The magnitude of differences between individual groups was characterized by Cohen's d effect size estimates. Classification analysis of the different groups was used to explore the validity (predictive accuracy) of the SIRS-2 Chinese version, for example, sensitivity (i.e., proportion of malingerers accurately identified by the measure) and specificity (i.e., proportion of people accurately classified as responding honestly). The significance level was set at p < .05. Cronbach's alpha coefficients were calculated for eleven subscales (six primary subscales, four supplementary subscales and one classification subscale) of the SIRS-2 Chinese version; the Selectivity of Symptoms (SEL) and Severity of Symptoms (SEV) subscales combine responses from the Blatant Symptoms (BL) and Subtle Symptoms (SU) subscales and were therefore not analyzed.
Table 1 Cronbach's alphas for the SIRS-2 (CV) subscales

Reliability
To assess the internal consistency of the Chinese version of the SIRS-2, Cronbach's alpha coefficients were calculated for the 220 participants (Table 1). The Cronbach's alpha coefficients ranged from 0.66 to 0.92. Except for the Direct Appraisal of Honesty subscale, the Cronbach's alpha coefficients of the various subscales of the Chinese version of the SIRS-2 were comparable with those of the original SIRS-2 cited in the Manual.
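For readers unfamiliar with the statistic, the following is a minimal sketch of how Cronbach's alpha is commonly computed from item-level scores, using the standard formula alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The study itself used SPSS; the function name and the response matrix below are made up for illustration and are not study data.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(item_scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    # Sample variance of the respondents' total scores.
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Made-up responses on the 0-2 item scale described in the text.
responses = [
    [0, 0, 1],
    [1, 1, 1],
    [2, 1, 2],
    [0, 1, 0],
    [2, 2, 2],
]
print(round(cronbach_alpha(responses), 2))
```

The sketch assumes complete data and at least two items; a subscale whose total scores have zero variance would raise a division error.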

Criterion validity
The criterion validity of malingering as measured by the Chinese version of the SIRS-2 was tested in the 80 undergraduates (HS group and SS group) against their scores on the MMPI-2. For convergent validity, scores on the primary scales of the Chinese version of the SIRS-2 were positively correlated with the malingering indicators of the MMPI-2. Table 2 shows the correlations of the Primary Scales of the SIRS-2 Chinese version with the MMPI-2 validity scales.
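The correlations referred to here are assumed to be Pearson product-moment coefficients; the text does not name the coefficient. The sketch below shows that computation on invented scores, with a positive relation to an F-like scale and a negative relation to a K-like scale. All variable names and values are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products of deviations from the means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)

# Invented scores for five hypothetical respondents (not study data).
sirs_scale = [2, 5, 9, 4, 12]
f_scale = [50, 58, 75, 55, 90]   # expected to rise with feigning
k_scale = [60, 52, 40, 55, 38]   # expected to fall with feigning

print(pearson_r(sirs_scale, f_scale) > 0)
print(pearson_r(sirs_scale, k_scale) < 0)
```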

Discriminant validity
In study one, using the simulation design, there were significant differences (e.g., RS, F(118) = 191.91, p < 0.05) among the three groups (HS, SS and HP) in the scores on the eight primary subscales, but not on three subscales (DA, Direct Appraisal of Honesty; DS, Defensive Symptoms; OS, Overly Specified Symptoms). For the total scores on the Chinese version of the SIRS-2, the SS group had the highest scores, followed by the HP group, and the HS group had the lowest scores (Table 3). In study two, using the known-groups comparison, the FH group scored significantly lower than the FM group on the Chinese version of the SIRS-2 subscales (e.g., RS, t(98) = −7.81, p < 0.05) (Table 4). The effect sizes of the subscales for the SS versus HS, SS versus HP, and FM versus FH comparisons are shown in Table 5. As the table indicates, all the Cohen's d values reflected significant differences between the group means (p < 0.05). The effect sizes of the Primary Scales for feigning (SS group and FM group) versus honest (HS group, HP group and FH group) samples of the original SIRS-2 cited in the Manual are also listed in the table for easy comparison. The mean effect sizes for the three group comparisons ranged from 1.79 to 1.80, which were comparable to the mean effect size of 2.08 for the validation samples cited in the Manual.
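As a reminder of what the effect sizes above express, here is a minimal sketch of Cohen's d for two independent groups. It assumes the common pooled-standard-deviation form; the text does not state which variant was used. The function name and the two score lists are invented for illustration.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Cohen's d using the pooled standard deviation (an assumed variant)."""
    n_a, n_b = len(group_a), len(group_b)
    # Pool the two sample variances, weighting by degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)

# Invented subscale scores (not study data).
feigning_scores = [8, 10, 12, 9, 11]
honest_scores = [3, 5, 4, 6, 2]
print(round(cohens_d(feigning_scores, honest_scores), 2))
```

A positive d here means the first group scored higher; by the usual convention, values of about 0.8 or more are described as large.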

Discriminant analyses of different groups
In study one, the combined HP and HS groups (honest group) and the SS group (feigning group) were used as the two groups to calculate the sensitivity and specificity of the Chinese version of the SIRS-2. An overall hit rate for detecting malingering of approximately 20% was obtained when classifying the HS, HP and SS groups in the simulation design. However, the rate increased to approximately 85% when classifying the FM and FH groups in the known-groups comparison.
The classification results for the simulation design sample are shown in Table 6a. The sensitivity and specificity were 0.20 and 1.00 respectively, while the positive predictive power and negative predictive power were 1.00 and 0.58 respectively. In study two, the FM group and FH group were used as the two groups to calculate the sensitivity and specificity of the Chinese version of the SIRS-2.
The classification results for the known-groups design sample are shown in Table 6b. The sensitivity and specificity were 0.85 and 1.00 respectively, while the positive predictive power and negative predictive power were 1.00 and 0.72 respectively. The corresponding sensitivity and specificity of the SIRS-2 for the clinical sample described in the Manual were 0.80 and 0.975 respectively, while both the positive predictive power and negative predictive power were 0.91.
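The four classification indices reported above all derive from a 2 × 2 table of true and false positives and negatives. The sketch below shows the standard definitions; the counts are hypothetical and deliberately do not correspond to Table 6, and the function and key names are illustrative.

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard indices from a 2x2 classification table.

    tp: feigners classified as feigning    fn: feigners classified as honest
    tn: honest classified as honest        fp: honest classified as feigning
    """
    return {
        "sensitivity": tp / (tp + fn),   # share of feigners detected
        "specificity": tn / (tn + fp),   # share of honest responders cleared
        "ppp": tp / (tp + fp),           # positive predictive power
        "npp": tn / (tn + fn),           # negative predictive power
        "hit_rate": (tp + tn) / (tp + fn + tn + fp),
    }

# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = classification_metrics(tp=15, fn=5, tn=70, fp=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity depend only on performance within each true group, whereas the predictive powers also depend on the proportion of feigners in the sample, so they can differ across samples with different base rates. The sketch assumes every denominator is non-zero.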

Discussion
The results of this study showed good reliability and validity of the Chinese version of the SIRS-2. In both studies, we evaluated the internal consistency of the Chinese version of the SIRS-2 using Cronbach's alpha coefficients; the results converged with those of Rogers' research in suggesting that the primary scales had good internal consistency [25]. The SIRS-2 appears to be a promising screening instrument for malingering. However, the SIRS-2 is not intended for the definitive determination of malingering or feigned mental disorders; that is, a full evaluation of feigning together with other clinical data is needed. Results from study one indicated a high level of discrimination for the primary scales between the SS group and both the HS and HP groups, with the exception of the DA, DS, and OS subscales. One expected finding was that the total SIRS scores of the feigners (SS group and FM group) were higher than those of the honest responders (HP group and HS group). The purpose of the SIRS-2 primary scales was to distinguish between feigned and genuine presentations of mental disorders [26]. Results of the current study generally supported this conclusion, with four subscales (BL, SU, SEL, and SEV) based on detection strategies in the amplified response category proving more effective at identifying malingerers than subscales that relied on the inconsistent response category. Overall, the effect sizes in our studies were moderate. In our study, sensitivity rates were most impressive in the forensic sample. The sensitivity rate in detecting malingering among this sample of forensic subjects was higher than the sensitivity rate described by Rogers et al. [27] (80% for the Rogers et al. study vs. 85% for our study). The sensitivity rate for detecting malingering among the university students was substantially lower than for the forensic sample (60% vs. 85%).
One possibility for the decreased sensitivity among university students asked to feign symptoms is that they might not understand the symptoms of mental illness very well. Research shows that financial compensation does affect patients' performance in clinical contexts [28,29]. Thus, it is likely the absence of significant financial incentives for the participants in the present study influenced their performance.
Since the MMPI-2 is the most popular and well-researched objective clinical assessment instrument [30], the construct validity of the Chinese version of the SIRS-2 was examined using the MMPI-2. The SIRS primary scales correlated positively with the MMPI-2 feigning scales and inversely with its defensiveness scales. The eight primary scales of the Chinese version of the SIRS-2, which were designed to identify feigned mental disorders, performed well in this study, often showing significant positive correlations with the F subscale and significant negative correlations with the K and L subscales of the MMPI-2. The correlations between the MMPI validity scales and the malingering test are statistically significant but too low to claim that there is enough covariance between them for the two to be considered measures of the same criterion. This finding alone is not enough to identify two different measures as meaningfully convergent on the same variable.

Conclusion
The present study found that the Chinese version of the SIRS-2 had good reliability and validity in detecting malingering and would be a suitable tool to use in China.