Operationalization of diagnostic criteria of DSM-5 somatic symptom disorders

Background The aim of this study was to test the operationalization of DSM-5 somatic symptom disorder (SSD) psychological criteria among Chinese general hospital outpatients. Methods This multicenter, cross-sectional study enrolled 491 patients from 10 general hospital outpatient departments. The structured clinical “interview about cognitive, affective, and behavioral features associated with somatic complaints” was used to operationalize the SSD criteria B. For comparison, DSM-IV somatoform disorders were assessed with the Mini International Neuropsychiatric Interview plus. Cohen’s к scores were given to illustrate the agreement of the diagnoses. Results A three-structure model of the interview, within which items were classified as respectively assessing the cognitive (B1), affective (B2), and behavioral (B3) features, was examined. According to percentages of screening-positive persons and the receiver operator characteristic (ROC) analysis, a cut-off point of 2 was recommended for each subscale of the interview. With the operationalization, the frequency of DSM-5 SSD was estimated as 36.5% in our sample, and that of DSM-IV somatoform disorders was 8.2%. The agreement between them was small (Cohen’s к = 0.152). Comparisons of sociodemographic features of SSD patients with different severity levels (mild, moderate, severe) showed that mild SSD patients were better-off in terms of financial and employment status, and that the severity subtypes were congruent with the level of depression, anxiety, quality of life impairment, and the frequency of doctor visits. Conclusions The operationalization of the diagnosis and severity specifications of SSD was valid, but the diagnostic agreement between DSM-5 SSD and DSM-IV somatoform disorders was small. The interpretation the SSD criteria should be made cautiously, so that the diagnosis would not became over-inclusive. Electronic supplementary material The online version of this article (10.1186/s12888-017-1526-5) contains supplementary material, which is available to authorized users.


Background
The diagnostic category of somatoform disorders (SD) in the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) [1] has been revised and replaced with somatic symptom disorder (SSD) in DSM-5 [2]. Besides the requirement of persistent one or more distressing somatic symptoms, the diagnostic focus has shifted from whether symptoms were medically unexplained to positive psycho-behavioral criteria, including disproportionate thoughts, feelings and behaviors related to somatic symptoms or health concerns [2]. According to the DSM-5, "the prevalence of SSD in the general adult population may be approximately 5%-7%." Concerns were raised that, if handled improperly, a vast group of people might be mislabeled with mental disorders [3]. In addition, for decades, Chinese people have been believed to be more likely to express somatic symptoms than their Western counterparts [4,5]. Past studies have confirmed that distressing somatic symptoms were common in Chinese general hospital outpatients [6,7]. However, it is unknown to what extent the new SSD concept, which focuses more on psychobehavioral characteristics, can be applied to Chinese hospital outpatients.
Nevertheless, instruments to establish the diagnosis of SSD were still lacking, especially regarding the assessment of the psycho-behavioral symptoms. Up to now, the Chinese version of the Structured Clinical Interview for DSM-Disorders, fifth edition (SCID-5) still needs to be validated. Among the known measurements, the Whiteley Index (WI-14 or WI-7) was previously employed to operationalize SSD [8,9]. However, the WI measures health anxiety, reflecting only the affective features specified in the SSD criteria. Two new selfreported questionnaires were also developed and validated to assess the psycho-behavioral criteria, including the SSD-12 [10,11] and the Somatic Symptoms Experiences Questionnaire [12]. However, in terms of establishing a diagnosis, self-rated questionnaires are believed to be less reliable than clinical interviews, in which subjects have the opportunity to ask the meanings of unfamiliar words [13].
Therefore, this is the first study using a structured clinical interview, the "interview about cognitive, affective, and behavioral features associated with somatic complaints" (ICAB), to investigate different operationalization of the three dimensions of the SSD criteria B, "disproportionate and persistent thoughts about", "persistently high level of anxiety about", and "excessive time and energy devoted to" their symptoms [2]. Since these items are general and ambiguous and might only represent a part of the numerous psycho-behavioral features of somatizing patients reviewed in the past, the interview intended not only to capture and operationalize the three dimensions specified in DSM-5 (such as by assessing rumination and catastrophizing thoughts, illness worries, frequent bodily self-observation, and health care utilization), but also to broaden the diagnostic basis by including some more specific psycho-behavioral characteristics of somatizing patients (such as somatic illness beliefs, feeling of injustice, desperation because of symptoms, and negative self-concept of bodily weakness) [8,14]. In a current cohort study, the ICAB has shown relevance and predictive value for somatoform symptoms [15].
In addition, the diagnostic agreement between the DSM-5 SSD and the DSM-IV SD was small in previous studies [8,9]. When the clinical interview ICAB was adopted to assess SSD in Chinese, its agreement with SD remained unclear. Furthermore, with limited instruments, few studies have been conducted to operationalize the SSD severity levels [9]. Thus, using a combination of assessment instruments, we aimed to test the following research questions among a sample of Chinese general hospital outpatients: 1) to operationalize the diagnostic criteria and severity specifications of DSM-5 SSD; and 2) to compare the frequencies and agreement of DSM-5 SSD and DSM-IV somatoform disorder.

Study design and setting
This is a secondary analysis of data collected within a multicenter cross-sectional study between February 1, 2011, and October 30, 2012 [16], which was conducted in 10 outpatient clinics of tertiary hospitals in Beijing, Shanghai, Chengdu, and Kunming (located at the north, southeast, and southwest of China). Among them, the neurology and gastroenterology departments were chosen to represent the modern biomedical settings, the Traditional Chinese Medicine (TCM) departments were selected to represent the traditional medical settings, and the psychological medicine departments were chosen to represent the psychosomatic medical settings. Patients from the above three medical settings were supposed to be evenly recruited. On randomly assigned days, outpatients were consecutively informed about the study and invited to participate.
All participants were assessed by the somatic symptom scale of the Patient Health Questionnaire (PHQ-15), thereby separated into two subgroups-with or without multiple somatic symptoms-at the cut-off point of 10 [17]. Recruitment continued until equal number of patients was enrolled in two subgroups from each medical setting (n = 25).

Subjects
The inclusion criteria of the study were as follows: age 18 years or above, seeking treatment voluntarily for their own problems, and being able to read and sign the informed consent form. The exclusion criteria included language barriers, limited writing skills, cognitive impairment/organic brain disorder/dementia, psychosis, and acute suicidal tendency. All patients were registered, including those who denied participation with reasons (such as lack of time, lack of interest in the study, lack of trust, etc.). Both research assistants (medical students) and clinical doctors ensured that the above criteria were fulfilled.

Assessment instruments
Somatic symptom severity was measured with the PHQ-15. This instrument includes 15 prevalent somatic symptoms in primary care [18]. Studies in both Western and Chinese populations have demonstrated the satisfactory reliability and validity of the PHQ-15 [6,17,19,20]. An optimal cutoff point of 10 was recommended to screen patients with somatoform disorders [17]. An additional question was asked about the symptom duration. The frequency of doctor visits in the past 12 months was also assessed.
The structured clinical interview ICAB was designed to assess the psycho-behavioral criteria associated with somatic symptoms. The development of this interview was introduced in Klaus's cohort study [15], which has also demonstrated good relevance and predictive validity in the context of somatoform symptomatology. Eighteen items of with a binary response format (present/not present) were selected from the pool of 28 items that distinguished individuals with different levels of somatic symptom severity and health care utilization/impairment (see Additional file 1: Table S1) [14,21]. Even though the interview and its nine-factor structure have been proved to be reliable and valid in patients with somatoform disorders, this should be the first time to investigating different operationalization of three dimensions of the SSD criteria B. No reference standard results were available to the participants or assessors when they completed the questionnaires.
The 7-item Whiteley Index (WI-7) [22] was used to evaluate illness anxiety. The Chinese WI-7 has proven to have satisfactory reliability and validity in a general Hong Kong population [23]. A cut-off score of 3 was recommended for screening hypochondriasis [24].
The 9-item depression scale of the Patient Health Questionnaire (PHQ-9) and the 7-item anxiety scale (GAD-7) were used to measure the severity of depression and generalized anxiety, respectively. Both of them have demonstrated good reliability and validity in screening for depressive and anxiety disorders in Chinese general hospital outpatients [25,26].
The 12-item short form health survey (SF-12) captures reliable and valid information on health-related quality of life (QoL) in the Chinese population [27], resulting in a physical (PCS) and a mental composite score (MCS).
The Mini International Neuropsychiatric Interview Plus (MINI Plus, version 5.0.0) is a brief structured interview for the diagnosis of major axis I psychiatric disorders according to the DSM-IV diagnostic criteria [28]. The Chinese version of the MINI has been shown to have good reliability and validity [29]. In our study, modules listed in Table 1 were adopted to establish the diagnosis of somatoform disorders. All participants were invited to complete the MINI plus, which was carried out by trained research assistants.

Operationalization of the DSM-5 SSD concept
The assessment of SSD was operationalized as followed: For criterion A, at least one physical symptom in the PHQ-15 had to be rated as "very bothering." For criterion C, symptoms had to last for more than 6 months to be rated as chronic.
For criteria B, the 18 items in the ICAB were classified as assessing the cognitive (B1), affective (B2), and behavioral (B3) subscales, as proposed by the theoretical conceptualization of DSM-5 SSD (see Additional file 1: Table S1). The cognitive subscale contains seven items to reflect disproportionate thoughts about the seriousness of somatic symptoms, such as "think about bodily complaints most of the time", "hard to thank about other things", "expect serious consequences", etc. The affective subscale measures health anxiety with five items, such as "frequently worry about physical complaints, possible causes and consequences", "worry a lot about health and possible illnesses" and so on. The behavioral subscale includes "frequent check bodily sensation", "feeling of vulnerable or weak so as to avoid certain activities", and "visit doctors as quickly as possible" to enrich the criteria of excessive time and energy devoted to somatic symptoms. The optimal cut-off points for each subscale were established to identify those with positive psycho-behavioral criteria. In addition, in order to compare the results with previous exploratory work [8,30], the total WI-7 score was also employed with a cut-off point of 3 [24].
For the specification of the SSD severity, the mild type required that only one of the SSD B criteria can be fulfilled, the moderate type required two or more of the SSD B criteria, and the severe type required two or more of the SSD B criteria plus "multiple somatic complaints;" the latter were operationalized as a PHQ-15 ≥ 10 in our study.

Statistical procedures
Categorical variables were described as absolute and relative frequencies and evaluated by chi-square difference tests. Continuous data were presented as the means and standard deviations, and they were compared by t-test for two independent groups and by one-way analysis of variance (ANOVA) for three or more independent groups. The Bonferroni method was adopted for multiple comparisons. Cohen's кscores were given to illustrate the agreement of different diagnoses. Since 12 of the 491 (2.4%) participants had missing values, they were replaced with the mean value of the remaining items. A p-value of less than 0.05 (2tailed) was considered significant.
Since equal numbers of patients with or without multiple somatic symptoms were recruited according to our study design, the proportion of SSD patients in the whole sample could not reflect their prevalence. As Schaefert et al.'s study found that 28.1% (79/281) of Chinese general hospital outpatients had a high somatic symptom severity (PHQ-15 ≥ 10) [6], the standardized rate of SSD in our study was calculated accordingly. For example, when 133/238 (55.9%) SOM+ patients and 51/ 253 (20.2%) SOM-patients in our sample fulfilled certain criteria, the prevalence would be estimated as 133/238*28.1% + 51/253*(1-28.1%) = 30.2%.
Cronbach's α was used to estimate the internal consistency of the ICAB and its subscales. Confirmative factor analysis (CFA) was carried out to test its hypothesized factorial structure using the robust weighted least squares estimation with mean and variance adjustment (WLSMV) method. Fit indices based on the scaled chi-square statistic, such as the root mean square error of approximation (RMSEA) and comparative fit index (CFI), were used to evaluate the model fit. A value of 0.05 or less for RMSEA was considered to be very good, while 0.05-0.08 was acceptable and an RMSEA of up to 0.10 was mediocre (Browne and Cudeck, 1992). A value of 0.95 or greater for CFI was considered to be adequate (Bentler, 1990). Criterion validity was examined using the Spearman's correlation between the ICAB total and subscale scores and total scores of the PHQ-15, PHQ-9, GAD-7, and WI-7. To operationalize the diagnostic criteria of SSD, the optimal cut-off points of the ICAB should be determined. Due to the lack of validated clinical interview of the SSD, we explored the potential cutoff points by both the percentages of screening-positive persons and the receiver operator characteristic (ROC) analysis with the quality of life serving as the reference standard.
Statistical analyses were performed with IBM SPSS Statistics 20.0 and Mplus version 7.0.

Study sample and sociodemographic characteristics of participants
The detailed enrollment procedure has already been published [16]. A total of 799 patients were contacted, and 491 (61.4%) of them were included in the study. Two hundered thirty eight participants were classified as patients with multiple somatic symptoms (SOM+, PHQ-15 ≥ 10), with a mean age of 44.3 (±15.9) years and 73.6% being women. The comparison group (SOM-, PHQ-15 < 10) included significantly fewer women (57.9% vs. 73.6%, p < 0.001). There was no significant difference between SOM+ and SOM-participants in terms of other sociodemographic characteristics.
Reliability and validity of the interview concerning cognitive, affective, and behavioral features (ICAB) The ICAB has shown high reliability in this sample (Cronbach's α = 0.90). The validity of the ICAB was assessed with the structure validity, criterion validity and known-group validity. Firstly, its three-factor structure was proved to be acceptable by the confirmative factor analysis (CFI = 0.962, RMSEA = 0.066, 90% confidence interval = 0.059-0.074). Secondly, the sum scores of the WI-7 served as an external validator, and showed moderate correlation with subscales of the ICAB (r = 0.42-0.61, p < .001). Finally, comparisons showed that patients with multiple somatic symptoms scored significantly higher than those without, both on the item level and subscale level of the ICAB (see Additional file 1: Table S1), indicating that the ICAB was valid to differentiate samples with positive psycho-behavioral characteristics.
Due to the lacking of golden standard, we explored the optimal cut-off points of the ICAB by the following two methods. First, we estimated the percentages of positivescreening participants at each cut-off point within each subscale. As shown in Table 2, if only one positive item was required, then the percentage of general hospital outpatients who fulfilled with the SSD criteria B would be as high as 91.4%. Those corresponding percentages decreased when the cut-off points for each subscale increased from 1 to 4.
Then the ROC analyses with the quality of life serving as the reference standard were conducted (see Table 3, Figs. 1 and 2). The best diagnostic performances for PCS and MCS were both achieved at the cut-off points of 2 for the B1 and B2 subscales. For the B3 subscale, the optimal cut-off was 2 in predicting MCS and was 3 in predicting PCS.
Given all the above results, we recommend a cut-off point of 2 for three subscales, which means at least two items within either subscale should be positive to meet the SSD criteria B.

Frequencies and agreement of different operationalization of the DSM-5 SSD criteria
For criterion A, 347/491 (70.7%) participants in our sample rated at least one physical symptom in the PHQ-15 as "very bothering". Concerning criterion C, 338/491 (68.8%) participants reported that their complaints had lasted for more than 6 months. The standardized rates of SSD (fulfilling criteria A, B, and C) were estimated as 36.5% and 30.2% respectively, when the B criteria were operationalized with the ICAB and the WI-7 (Table 4). The agreement between the WI-7 and ICAB in diagnosing SSD was high (Cohen's к = 0.769).

Agreement between DSM-IV somatoform disorders and DSM-5 SSD
According to the MINI interview, the standardized rate of somatoform disorders was estimated as 8.2%. The most common subtypes were hypochondriasis (3.3%), pain disorder (3.0%), somatization disorder (1.7%), and body dysmorphic disorder (0.7%). The Cohen's Kappa between the diagnoses of DSM-IV somatoform disorders and DSM-5 SSD was only 0.152, indicating that the agreement between those two diagnostic concepts was small. However, as shown in Fig. 3, the standardized rate of somatoform disorders was only about one quarter of SSD, which might explain such small agreement between them. To be specific, 73.3% patients with somatoform disorders also met the SSD criteria, while only 19.0% SSD patients were diagnosed with somatoform disorders.

Operationalization and validity of the DSM-5 SSD severity specification
According to the number of positive psycho-behavioral features and the severity of somatic symptoms, the standardized rates of the mild, moderate, and severe SSD subtypes were estimated as 5.9%, 16.7%, and 13.8%, respectively (see Table 4).
To test the validity of the SSD severity specifications, the sociodemographic and clinical features of non-SSD and SSD patients with different severity levels were compared. As shown in Table 5, it seemed that mild SSD patients had higher monthly family income, and higher percentages of mild and moderate SSD patients were employed, compared with the severe type and the non-SSD general hospital outpatients. There is no significant difference among them in terms of other sociodemographic features.
Non-SSD and SSD patients with different severity levels differed significantly regarding all clinical characteristics (see Table 5). Among them, the severe SSD patients consistently had the severest somatic, depressive, general anxiety, and health anxiety symptoms, the most impaired physical and mental QoL, and the most frequent doctor visits. Compared with the non-SSD group, patients with mild SSD had more somatic symptoms, but not more psychological, functional or behavioral problems. The level of anxiety, the QoL impairment, and the number of doctor visits of patients with moderate SSD was also "moderate," that is, significantly higher than the corresponding measures in the mild type, but lower than those of the severe type. This supports that the operationalization of SSD severity was valid.

Discussion
To the best of our knowledge, this is the first study to operationalize the DSM-5 SSD criteria with structured interview among general hospital outpatients. Classified into the cognitive, affective, and behavioral subscales, the ICAB demonstrated high reliability and good structural validity in our clinical sample. Unlike our theoretically derived three-factor structure, a nine-factor structure was generated on the basis of exploratory factor analysis in Klaus's study [15]. Since the three-factor structure was also valid and more compatible with the operationalization of SSD in our study, we adopted it in the subsequent analysis.
Since the diagnosis of SSD required only one of three psycho-behavioral criteria to be fulfilled, clinicians need to be particularly cautious with the interpretation of the descriptive adjectives of "disproportionate", "high level" and "excessive". As our results indicated, if only one positive item was required in the ICAB, as high as 91.4% of general hospital outpatients would be classified as met the SSD psycho-behavioral criteria. For clinicians, especially primary care practitioners, it might be difficult to judge to which degree should be regarded as pathological. Thus, a vast majority of patients could be under the risk of being mislabeled as with mental disorders, as some experts concerned [3]. Therefore, we strongly recommend using other measuring instruments, like the ICAB interview, to help to quantify the extent of abnormity. On one hand, based on our estimated percentages of positive-screening participants and the ROC analysis, at least two positive items within either subscale of the ICAB were required to fulfill the diagnostic criteria, which can help to mitigate the risk of mislabeling. On the other hand, our work enriched the diagnosis by including more cognitive, affective, and behavioral features confirmed by previously studies with somatization patients, so as to better represent the chronic and disabling somatoform symptomatology and to enhance diagnostic validity [16,21].
Operationalized with the above standards of somatic symptoms, psycho-behavioral characteristics, and the duration, the proportion of SSD was estimated as 36.5% in Chinese general hospital outpatients. The prevalence of SSD was estimated as 51.8% among a sample of German psychosomatic inpatients [8], and as 47% in another sample of fibromyalgia syndrome patients [30]. And in a UK general population, 5.7% of participants reported both high somatic symptom burden and illness anxiety [31]. All three studies operationalized the psycho-behavioral criteria with the WI-14, which seemed to be a convenient   and appropriate self-rated questionnaire, but could not measure disproportionate thoughts or behaviors related to somatic symptoms. Therefore, to obtain a clear picture of the prevalence of this new diagnostic concept, further studies with reliable diagnostic tools need to be conducted in different populations and countries. Nevertheless, those exploratory results warned us that the SSD diagnosis, if handled improperly, could become over-inclusive. As far as we know, our study was also the first to examine the SSD severity specifications. Contrary to our expectation, the proportion of patients with mild SSD was the lowest. The reason for this might be that since the three features specified in the SSD B criteria were highly correlated, it was unlikely for a patient to have only one of them. Furthermore, the distribution might be different between outpatients and the general population, with subjects with mild symptom severity not consulting medical doctors. Further research in other settings is needed to clarify this.
Comparisons of sociodemographic features among SSD patients with different severity levels showed that mild SSD patients were better-off in terms of financial and employment status. The potential explanation for this phenomenon could be that their social function was less impaired. This reminded us the degree of social function impairment could be worthy of taken into account for the severity specifications of SSD. In addition, no gender difference was discovered between those with or without SSD. Nevertheless, past studies seemed to support a strong correlation between female gender and the high somatic symptoms severity [6,16]. Such discrepancy could be caused by the different diagnostic criteria of DSM-5, or our operationalization methods. Further studies should be conducted in different populations and with different measurements to clarify the sociodemographic characteristics of patients with SSD.
Comparisons also revealed that the SSD severity subtypes were congruent with the level of depression, anxiety, quality of life impairment, and the frequency of doctor visits, which provided evidence for the validity of the operationalization of the SSD severity specifications. Similar to our results, a study from the Netherlands found that compared to patients with mild SSD, patients with moderate SSD suffered from lower physical functioning and higher levels of depression, while the levels of symptom severity and mental functioning were similar in both groups [9]. However, severe SSD was not defined in this study. Although no severity subtypes specified, a study by Voigt et al. also found that SSD was considerably associated with patients' physical and mental function [32]. The possibility of identifying patients with different severity levels implies the importance of developing and evaluating a severity-stepped model of care.
Last but not least, similar as previous results [8,9], our study found that the diagnostic agreement between the DSM-5 SSD and the DSM-IV somatoform disorders was small. In addition, we went a step further to clarify that the big difference between their estimated prevalence rates should be taken into consideration. Actually, most patients with somatoform disorders also met the SSD criteria, but only about one fifth SSD patients could be diagnosed with somatoform disorders. This was compatible with the enlargement of the diagnostic concept discussed above.
Our study has several limitations. 1) First, since the SCID-5 was still unavailable when our study was conducted, the diagnostic performance and the optimal cut-off point analyses of the ICAB had to be based on indirect indicators instead of the gold standard. 2) Second, equal numbers of patients with and without multiple somatic symptoms were recruited according to the  Fig. 3 The agreement between DSM-IV somatoform disorders and DSM-5 SSD study design instead of consecutively as a whole sample. Therefore, the percentages of SSD and the distribution of its subtypes could not reflect their real prevalence in Chinese general hospital outpatients. Nevertheless, the standardized rates were calculated based on the proportion of patients with multiple somatic symptoms in Chinese general hospital outpatients reported by our previous research. 3) Third, within the MINI Plus, there is no specific module for undifferentiated somatoform disorders; therefore, the prevalence of SD might have been underestimated, which could also influence the measurement of its agreement with SSD. 4) Finally, the results from our cross-sectional sample of Chinese general hospital outpatients from three typical medical settings (biomedicine, Traditional Chinese PHQ-9 total score (M ± SD) 7.5 ± 6.3 1 8.7 ± 4.7 1 8.7 ± 6.8 1 13.0 ± 6.4 3 <.001 GAD-7 total score (M ± SD) 5.2 ± 5.2 1 4.0 ± 3.8 1 6.5 ± 5.5 2 9.5 ± 6.0 3 <.001 WI-7 total score (M ± SD) 3.2 ± 2.0 1 3. Medicine, and psychosomatic medicine) may not be generalizable to patients from other populations.

Conclusions
We hope our work will enrich the understanding of the conceptualization of SSD and its operationalization for clinical practice and research. Our data showed that the operationalization of the diagnosis and severity specifications of SSD was valid, but the diagnostic agreement between DSM-5 SSD and DSM-IV somatoform disorders was small. Besides the shift of diagnostic focus, the much higher estimated prevalence rate of SSD should be taken into consideration. Our results suggested that the interpretation the SSD criteria should be made cautiously, so that the diagnosis would not became overinclusive. Further research is necessary to examine the validity of the ICAB with the DSM-5 version of the SCID, explore the distribution of SSD in different populations, and finally develop and evaluate optimized management strategies for SSD of different severities.

Additional file
Additional file 1: Table S1. Item loadings and the item and subscale level of the interview. The table presented