Vocal emotion perception in schizophrenia and its diagnostic significance



Cognitive and emotional impairment are among the core features of schizophrenia; assessment of vocal emotion recognition may facilitate the detection of schizophrenia. We explored the differences between cognitive and social aspects of emotion using vocal emotion recognition and detailed clinical characterization.


Clinical symptoms and social and cognitive functioning were assessed by trained clinical psychiatrists. A vocal emotion perception test, including an assessment of emotion recognition and emotional intensity, was conducted. One-hundred-six patients with schizophrenia (SCZ) and 230 healthy controls (HCs) were recruited.


Regarding emotion recognition, scores for all emotion categories were significantly lower in the SCZ group than in the HC group. Regarding emotional intensity, scores for anger, calmness, sadness, and surprise were significantly lower in the SCZ group. Vocal recognition patterns showed a trend of unification and simplification in the SCZ group. A direct correlation was confirmed between vocal recognition impairment and cognition. In diagnostic tests, only the total score of vocal emotion recognition was a reliable index for the presence of schizophrenia.


This study shows that patients with schizophrenia are characterized by impaired vocal emotion perception. Furthermore, explicit and implicit vocal emotion perception processing in individuals with schizophrenia are viewed as distinct entities. This study provides a voice recognition tool to facilitate and improve the diagnosis of schizophrenia.


Schizophrenia is a chronic psychiatric disorder that strongly interferes with major areas of life, including education, work, and daily living. More than 20 million people worldwide suffer from schizophrenia, with a higher rate in men compared to women [1]. Patients with schizophrenia are characterized by serious impairments in social cognition [2, 3], including distortion of emotion and language and social isolation, also resulting in difficulties communicating [4]. Emotional cognitive impairment is another key feature of schizophrenia [5]. To date, many studies have found that patients with schizophrenia show reduced emotional expression, impaired vocal emotion recognition, and impaired understanding of emotional expressions [6]. Vocal emotion recognition plays a crucial role in social communication and is a key element in the detection of schizophrenia [7, 8].

Studies on patients with schizophrenia have found a significant correlation between vocal emotion recognition, cognition, and clinical factors (such as difficulties in auditory processing, the severity of the disease, and negative symptoms [9]). Research on cognition also suggested that dysfunction in vocal emotion recognition in patients with schizophrenia occurs before the apparent disease onset [10,11,12]. However, findings in this regard have been controversial, due to differences in study design and presented stimuli, sex distribution in the sample as well as language background (language type and structure), and differing positive symptoms of schizophrenia [9, 13, 14]. These findings are of great significance to clinical practice and research, as they might highlight the need to develop new and alternative methods to identify and treat patients with schizophrenia and social functioning deficiencies due to mental disorders in general.

Research on vocal emotion recognition can be extended to other neurocognitive fields such as memory, monitoring, thinking and reasoning, literacy, language production, and problem-solving ability. Therefore, in our study, a new voice recognition method was used to systematically explore differences between cognitive and social aspects of emotions as observed in voice emotion recognition, based on epidemiological data, clinical manifestations, and cognitive function screening. Moreover, to determine their functions and clinical significance, we attempted to clarify the effectiveness of this tool in assessing the ability to recognize vocal emotions.

Materials and methods


Participants were recruited from Beijing Huilongguan Hospital from August 2020 to May 2022. All participants provided written informed consent before undergoing any research procedure. The study protocol was conducted in accordance with the Declaration of Helsinki and was approved by the research ethics and institutional review boards of Beijing Huilongguan Hospital.

Inclusion criteria for the patient group were as follows: (1) the patient met the diagnostic criteria of schizophrenia in the Diagnostic and Statistical Manual of Mental Disorders, 5th Edition (DSM-5); (3) the patient was between 18 and 60 years of age and had completed more than six years of education; (4) patients and their family members voluntarily participated in the study and signed the informed consent form; (5) the patient’s condition was stable and he/she was able to communicate effectively. The exclusion criteria were as follows: (1) intellectual disability or any brain organic disease; (2) severe recession or impulsive excitement and uncooperative behavior; (3) severe depression, anxiety, and substance abuse; and (4) serious physical disease or drug side effects, which made communication impossible. In total, 106 patients with schizophrenia were enrolled.

The criteria for the healthy control group (230 participants) were as follows: (1) 18–60 years of age; (2) education level of junior high school or above; (3) fluency in Mandarin; (4) clear articulation, and no articulation disorder; (5) no family history of mental illness; (6) healthy mental state, no evidence of anxiety and depression (as attested by a psychiatric interview); (7) the scores of the Self-rating Anxiety Scale (SAS; Dunstan and Scott [15]) and Self-rating Depression Scale (SDS; Dunstan et al. [16]) in the normal range (< 30 points); and (8) normal (or normal after correction) hearing.

Neuropsychological and psychopathological assessment

The evaluation of scales was performed by a team of trained psychiatrists using the routine examination method, and there was good consistency among the measures (intraclass correlation coefficient, ICC > 0.8). The methods are detailed further in the Supplementary Methods.

Basic information questionnaire

A self-assessed general information questionnaire was used to collect data on sex, age, years of education, and course of the disease.

Clinical symptom and social function assessment

The Temporal Experience of Pleasure Scale (TEPS; Chan et al. [17]), Personal and Social Performance Scale (PSP [18]), Positive and Negative Syndrome Scale (PANSS; Kirkpatrick et al. [19]), and Brief Negative Symptom Scale (BNSS; Kirkpatrick et al. [19]) were used to evaluate clinical symptoms and social functioning in the schizophrenia group.

Cognitive function

The Chinese version of the Consensus Cognitive Battery, originally developed by the National Institute of Mental Health (NIMH) Measurement and Treatment Research to Improve Cognition in Schizophrenia (MATRICS) initiative (MCCB [20]), was used to assess cognitive functioning. It includes seven aspects: speed of processing, working memory, verbal learning and memory, visual learning and memory, reasoning and problem-solving, attention or vigilance, and social cognition. There are ten sub-tests.

Vocal emotion perception evaluation

The vocal emotion recognition task comprised 42 standardized emotional voice recordings, performed by two professional drama actors (one male and one female), covering seven emotions: anger, calmness, disgust, fear, sadness, irony, and surprise. There were six recordings for each emotion, three male and three female. The materials were also produced at high and low emotional intensities. Intensity was rated on a 100-point scale; the higher the score, the greater the emotional intensity. After completing the vocal emotion judgment for one recording, the participants moved on to the next trial, and all trials were presented in a pseudo-randomized order. The evaluation indicators included the emotional category score (the number of trials in which a given vocal emotion category was correctly identified, with a maximum score of six points) and the emotional intensity score (the emotional intensity rated after a given emotional category was correctly identified, with a maximum score of 100 points). See the Supplementary Methods for further details.
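The two evaluation indicators described above can be illustrated with a minimal scoring sketch. It is not the software used in the study: the trial record layout, the function name `score_trials`, and the choice to average intensity ratings over correctly identified trials are assumptions made for illustration. The text specifies only the maximum scores (six points per category, 100 points for intensity).

```python
from collections import defaultdict

# The seven emotion categories named in the task description.
EMOTIONS = ("anger", "calmness", "disgust", "fear", "sadness", "irony", "surprise")


def score_trials(trials):
    """Score one participant.

    trials: iterable of (target_emotion, chosen_emotion, intensity_rating).
    This record layout is hypothetical; intensity_rating is on the 0-100 scale.
    """
    hits = defaultdict(int)       # correctly identified trials per emotion
    ratings = defaultdict(list)   # intensity ratings on those correct trials
    for target, chosen, rating in trials:
        if chosen == target:
            hits[target] += 1
            ratings[target].append(rating)

    # Category score: number of correct trials (0-6 with six recordings each).
    category = {e: hits[e] for e in EMOTIONS}
    # Intensity score: ASSUMED to be the mean rating over correct trials;
    # None when the emotion was never correctly identified.
    intensity = {
        e: (sum(ratings[e]) / len(ratings[e]) if ratings[e] else None)
        for e in EMOTIONS
    }
    total = sum(category.values())  # total recognition score (0-42)
    return category, intensity, total
```

Restricting the intensity score to correctly identified trials mirrors the wording "after correctly identifying a certain emotional category"; a different aggregation (for example a sum, or separate scores for high and low intensity materials) would need the Supplementary Methods to confirm.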


Demographics, clinical characteristics, and cognitive function of the cohort

In total, 106 patients with schizophrenia (SCZ) and 230 healthy controls (HCs) were recruited. We observed no significant difference in basic demographic data between the patient and control groups. With regard to the severity of schizophrenia, the total PANSS score was 59.39 ± 20.54, the BNSS score was 1.86 ± 1.98, and the PSP score was 69.38 ± 13.54. The course of disease was 11.22 ± 8.86 years. The total TEPS and MCCB scores of the patients were significantly lower than those of the healthy controls (p < 0.01). Further details are provided in Table 1. These baseline data suggest that patients and healthy controls were matched in age, sex, and years of education, and that the patients’ TEPS and MCCB results were consistent with previous studies [19, 20], with statistically significant group differences. This indicated that the vocal emotion perception test could be applied in further correlation analyses to explain the disease characteristics.

Table 1 General demographic data of patients with schizophrenia and healthy controls

Comparison of vocal emotion recognition and intensity scores between patients with schizophrenia (SCZ) and healthy control (HC) groups

Vocal emotion recognition and intensity

Patients’ total scores and their scores for all categories of emotions were significantly lower than those of healthy controls (p < 0.001). However, while the patients’ overall scores for recognition of vocal emotion intensity were still lower than those of healthy controls, only the patients’ scores for anger, calmness, sadness, and surprise were significantly lower than those of healthy controls (p < 0.001). Further details are provided in Table 2.

Additionally, we found significant differences in vocal emotion recognition and intensity recognition models between the SCZ and HC groups. The total scores for emotional intensity and recognition were significantly correlated with the scores for the individual categories (anger, calmness, disgust, fear, sadness, irony, and surprise; Pearson’s correlation coefficient r > 0.9, p < 0.001). Differences can be observed in density plots for recognition and intensity (Fig. 1).

Fig. 1 Density plots of emotion recognition and intensity. The density plots indicate different distributions of emotional recognition and intensity scores (HCs, healthy controls; SCZ, schizophrenia)

Moreover, using a correlation coefficient graph (Fig. 2 [21]), we examined whether the pattern of recognition and intensity scores was the same in the SCZ and HC groups. Consistency across recognition scores differed between the groups: the pairwise Pearson’s correlations of the recognition scores were higher in patients (r > 0.8, p < 0.001), whereas the pattern for emotional intensity was similar in both groups [21].

Fig. 2 Correlation between vocal emotion recognition sub-scales. Consistency across recognition and intensity is not the same between patients and controls. Pairwise Pearson correlations for recognition are higher in patients (r > 0.8, p < 0.001), while the pattern is similar for emotional intensity. (HCs, healthy controls; SCZ, schizophrenia; Int., intensity; Recog., recognition. The legend represents the correlation coefficient: red represents positive correlation and blue represents negative correlation; circle size represents significance; black boxes represent cluster analysis (Pearson correlation, r > 0.8).)
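The pairwise correlation analysis summarized in Fig. 2 can be sketched as follows. This is an illustrative reconstruction under assumptions, not the analysis code of the study: the function names, the dictionary layout (one list of per-participant scores per emotion sub-scale), and the idea of running it separately for each group are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant scores")
    return sxy / sqrt(sxx * syy)


def correlation_matrix(scores):
    """Pairwise Pearson correlations between sub-scales.

    scores: dict mapping a sub-scale name (e.g. "anger") to the list of
    scores of all participants in ONE group, in the same participant order.
    Returns a dict keyed by (name_a, name_b).
    """
    names = list(scores)
    return {(a, b): pearson(scores[a], scores[b]) for a in names for b in names}
```

Computing one matrix for the SCZ group and one for the HC group, and then comparing how many off-diagonal coefficients exceed a threshold such as r > 0.8, is one way to express the "unification" of recognition patterns described in the text.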

Table 2 Comparison of emotion recognition and intensity between the SCZ and HC groups

Correlation between vocal emotion recognition and clinical features

In the current study, we did not identify any significant correlation of sex, age, clinical course of the disease, or drug dosage (chlorpromazine equivalent, mg) with vocal emotion recognition or intensity discrimination (p > 0.1). Correlation analyses with the PANSS (including 4 subscales) and BNSS (including 7 subscales) likewise showed no significant associations (p > 0.05).

Correlation between vocal emotion recognition and cognitive function

After multiple regression analysis, we found that the content of vocal emotion recognition was significantly related to cognitive function in patients (ANOVA, F = 3.61, p = 0.002), and, specifically, to working memory (p = 0.015, Table 3); however, vocal emotional intensity was not related to cognitive function (ANOVA, F = 1.51, p = 0.32).

Table 3 Multiple regression analysis between vocal emotion recognition and cognitive function

The vocal recognition content correlated with neither TEPS (ANOVA, F = 0.45, p = 0.77) nor PSP (ANOVA, F = 0.17, p = 0.68). Likewise, vocal emotional intensity correlated with neither TEPS (ANOVA, F = 1.15, p = 0.34) nor PSP (ANOVA, F = 0.76, p = 0.39).

Emotion recognition assisted clinical discrimination results

Based on the phenotypes of patients and controls, we conducted diagnostic tests to determine the sensitivity and specificity of clinical indicators. The results showed that the total score of vocal emotion recognition was a good indicator of the presence of schizophrenia (AUC = 84%, p < 0.01), but the respective intensity score was not an ideal indicator (AUC < 50%). The discriminative performance of the recognition score was lower than that of the MCCB total score (AUC = 92%, p < 0.01) and the TEPS total score (AUC = 88%, p < 0.01). The details are shown in Fig. 3.

Fig. 3 Receiver operating characteristic (ROC) curve analysis. Panel A compares the ROC curves of the cognitive function tool (MCCB), the clinical emotion tool (TEPS), and the vocal recognition tool presented in this study for the diagnosis of schizophrenia; vocal emotion recognition shows a relatively satisfactory area under the curve (AUC: 84%, p < 0.01), whereas the performance of vocal emotion intensity is relatively poor (62%). The analysis of individual emotions is shown in panels B and C: panel B shows the ROC analysis of vocal recognition, and panel C shows the analysis of intensity (AUC).
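To make the diagnostic indices above concrete, the sketch below computes an AUC and a sensitivity/specificity pair from two lists of total recognition scores. It is a minimal illustration under assumptions, not the procedure used in the study: the function names and the cutoff rule are hypothetical, and the AUC is computed through its rank-based (Mann-Whitney) interpretation, assuming that lower scores point towards schizophrenia, as the group comparison suggests.

```python
def auc(patient_scores, control_scores):
    """Area under the ROC curve for a score where LOWER values indicate illness.

    Equals the probability that a randomly chosen control scores higher than
    a randomly chosen patient, with ties counted as one half.
    """
    if not patient_scores or not control_scores:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for p in patient_scores:
        for c in control_scores:
            if c > p:
                wins += 1.0
            elif c == p:
                wins += 0.5
    return wins / (len(patient_scores) * len(control_scores))


def sensitivity_specificity(patient_scores, control_scores, cutoff):
    """Classify a score <= cutoff as 'patient' (hypothetical decision rule).

    Returns (sensitivity, specificity) for that single cutoff.
    """
    true_pos = sum(1 for p in patient_scores if p <= cutoff)
    true_neg = sum(1 for c in control_scores if c > cutoff)
    return true_pos / len(patient_scores), true_neg / len(control_scores)
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which is the scale on which the reported values of 84%, 88%, and 92% should be read.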


Vocal emotion perception is increasingly regarded as an important aspect of dysfunction in language use and social communication of patients with schizophrenia [9, 22]. This study provides a new research method for detecting vocal emotional perception in patients with schizophrenia. It focuses on performance characteristics, influencing factors, and potential neural mechanisms. By investigating specific emotions (e.g., fear, sadness, anger, disgust, surprise) and different forms of stimulation (emotional recognition and intensity), this study examined Mandarin, which adopts tone and syllable timing, and expands the research on voice recognition in schizophrenia. At the same time, using unified software, we assessed the effectiveness of our method in measuring vocal emotion perception and its diagnostic role in schizophrenia.

Clinically, dysprosodia or dysfunction of vocal emotion perception is a cardinal feature of schizophrenia and may also be present in other mental diseases, such as bipolar disorder [9]. It is important to note that a comparison between healthy people and one specific patient group cannot provide a refined reference for differential diagnosis. Nevertheless, this study explored the use of this technology for the emotional problems of patients with schizophrenia through systematic comparison with cognitive and classic scales.

In line with previous research [11, 23, 24], this study confirmed that compared with healthy participants, patients with schizophrenia had impaired vocal emotion recognition and intensity discrimination. In addition, this study suggests that the overall emotion discrimination ability of schizophrenic patients is impaired, affecting not only the processing of more specific negative emotional subsets (such as anger, sadness, and fear) but also non-negative emotions (such as surprise and calmness). However, among all basic emotions examined, sadness was found to be the most difficult to detect in terms of its content and intensity. This was in line with previous studies [14], and further indicated the consistency of this research method to detect vocal emotion perception.

Critically, our study found that emotion recognition tasks are more difficult for patients than emotional intensity identification tasks. As previously reported [25, 26], a dual-language processing model was identified, suggesting that different neural networks might be involved in the explicit and implicit recognition of emotion perception. The patients in the present study showed a significant impairment in explicit vocal emotion processing, which is the recognition of emotional content. The intensity of vocal emotion identification involves implicit processing, and it appears to be relatively preserved in patients. Our research results support the hypothesis that at the behavioral level, implicit and explicit processing of vocal emotion perception in patients with schizophrenia are separated.

Compared with the control group, correlation analysis revealed that the recognition patterns of different contents showed a trend of unification and simplification in patients with schizophrenia (Fig. 2). This was consistent with reported changes in brain network simplification [13, 27], which were also found to be one of the determinants of social cognitive dysfunction in patients with schizophrenia. In addition, this study identified a direct correlation between vocal recognition impairment and cognition in schizophrenia and confirmed working memory involvement in these processes. Working memory is considered the core component of cognitive impairment in schizophrenia, which is related to employment status and working terms [28, 29]. The clinical relevance of working memory impairment in patients with schizophrenia is largely due to the strong correlation between working memory measurements and other cognitive impairments, such as attention, planning, and memory. Pre-attention and attention processing are important predictors of emotion recognition tasks and are related to sensory processing, especially basic auditory processing of pitch and intensity, which is significantly related to defects in vocal emotion recognition [23, 28, 29]. The present findings supported this cognitive mechanism.

In terms of clinical factors, this study did not find any significant correlation with sex, age, disease course, or medication dosage. This is in line with the existing literature. Similarly, previous studies have found no differences due to intelligence, education, or medication [24, 25, 30], suggesting that the impairment is related to the brain chemistry of the pathological mechanism of schizophrenia itself, specific brain circuits (or information processing pathways), neuroanatomical abnormalities, and environmental factors. With regard to clinical phenotypes, this study did not find a correlation between vocal recognition and the severity of schizophrenia, which was consistent with some studies [31]. Other studies, however, have suggested that the PANSS score is negatively related to language recognition and that specific positive symptoms, such as hallucinations and delusions, usually aggravate the impairment of vocal and emotional recognition in patients with schizophrenia [8, 32, 33]. More recent studies [25, 34] suggested that negative symptoms, such as loss of pleasure, might be closely related to patients’ vocal emotion recognition defects. However, this relationship was found through behavioral and neurological research techniques (including MRI or fMRI), not through clinical scale evaluations. Of note, vocal emotion perception in patients with schizophrenia is a complicated process that is influenced by various complex factors.

This study provides a good voice recognition tool based on a series of studies [22], and this clinical tool has relatively reliable sensitivity and specificity based on our ROC analysis (Fig. 3). Using our software, we examined the difference between voice recognition and intensity recognition, which may help characterize the clinical phenotype of dysfunctional language processing in patients with schizophrenia. However, the specificity of this tool for schizophrenia needs further investigation.

Patients with schizophrenia usually perform poorly when dealing with emotional prosody. Thus far, emotional prosody has been regarded as an important window to detect impairments in schizophrenia, and it has also played a role in improving people’s social communication. Many studies have shown that vocal emotion recognition is an important supplementary clinical index for the early detection of schizophrenia. This study provides a promising potential clinical tool and significant evidence for the relationship between impaired vocal recognition and cognitive function. This points out an important direction for future research on emotional prosody understanding to be extended to other neurocognitive fields, such as memory, monitoring, thinking and reasoning, reading and writing ability, language production, and problem-solving ability.

Data availability

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.


  1. Jiménez-López E, Sánchez-Morla EM, López-Villarreal A, Aparicio AI, Martínez-Vizcaíno V, Vieta E, et al. Neurocognition and functional outcome in patients with psychotic, non-psychotic bipolar I disorder, and schizophrenia: a five-year follow-up. Eur Psychiatry. 2019;56:60–8.

  2. Savla GN, Vella L, Armstrong CC, Penn DL, Twamley EW. Deficits in domains of social cognition in schizophrenia: a meta-analysis of the empirical evidence. Schizophr Bull. 2013;39:979–92.

  3. Wible CG, Preus AP, Hashimoto R. A cognitive neuroscience view of schizophrenic symptoms: abnormal activation of a system for social perception and communication. Brain Imaging Behav. 2009;3:85–110.

  4. Varga E, Endre S, Bugya T, Tényi T, Herold R. Community-based psychosocial treatment has an impact on social processing and functional outcome in schizophrenia. Front Psychiatry. 2018;9:247.

  5. Melle I. Cognition in schizophrenia: a marker of underlying neurodevelopmental problems? World Psychiatry. 2019;18:164–5.

  6. Kucharska-Pietura K, David AS, Masiak M, Phillips ML. Perception of facial and vocal affect by people with schizophrenia in early and late stages of illness. Br J Psychiatry. 2005;187:523–8.

  7. Brazo P, Beaucousin V, Lecardeur L, Razafimandimby A, Dollfus S. Social cognition in schizophrenic patients: the effect of semantic content and emotional prosody in the comprehension of emotional discourse. Front Psychiatry. 2014;5:120.

  8. Müller VI, Kellermann TS, Seligman SC, Turetsky BI, Eickhoff SB. Modulation of affective face processing deficits in Schizophrenia by congruent emotional sounds. Soc Cogn Affect Neurosci. 2014;9:436–44.

  9. Lin Y, Ding H, Zhang Y. Emotional prosody processing in schizophrenic patients: a selective review and meta-analysis. J Clin Med. 2018;7.

  10. Chan CC, Wong R, Wang K, Lee TM. Emotion recognition in chinese people with schizophrenia. Psychiatry Res. 2008;157:67–76.

  11. Corcoran CM, Keilp JG, Kayser J, Klim C, Butler PD, Bruder GE, et al. Emotion recognition deficits as predictors of transition in individuals at clinical high risk for schizophrenia: a neurodevelopmental perspective. Psychol Med. 2015;45:2959–73.

  12. Wildgruber D, Riecker A, Hertrich I, Erb M, Grodd W, Ethofer T, et al. Identification of emotional intonation evaluated by fMRI. NeuroImage. 2005;24:1233–41.

  13. Kantrowitz JT, Hoptman MJ, Leitman DI, Moreno-Ortega M, Lehrfeld JM, Dias E, et al. Neural substrates of auditory emotion recognition deficits in schizophrenia. J Neurosci. 2015;35:14909–21.

  14. Shea TL, Sergejew AA, Burnham D, Jones C, Rossell SL, Copolov DL, et al. Emotional prosodic processing in auditory hallucinations. Schizophr Res. 2007;90:214–20.

  15. Dunstan DA, Scott N. Norms for Zung’s self-rating anxiety scale. BMC Psychiatry. 2020;20:90.

  16. Dunstan DA, Scott N, Todd AK. Screening for anxiety and depression: reassessing the utility of the Zung scales. BMC Psychiatry. 2017;17:329.

  17. Chan RC, Shi YF, Lai MK, Wang YN, Wang Y, Kring AM. The temporal experience of pleasure scale (TEPS): exploration and confirmation of factor structure in a healthy chinese sample. PLoS ONE. 2012;7:e35352.

  18. Nasrallah H, Morosini P, Gagnon DD. Reliability, validity and ability to detect change of the Personal and Social Performance scale in patients with stable schizophrenia. Psychiatry Res. 2008;161:213–24.

  19. Kirkpatrick B, Strauss GP, Nguyen L, Fischer BA, Daniel DG, Cienfuegos A, et al. The brief negative symptom scale: psychometric properties. Schizophr Bull. 2011;37:300–5.

  20. Zhang H, Wang Y, Hu Y, Zhu Y, Zhang T, Wang J, et al. Meta-analysis of cognitive function in chinese first-episode schizophrenia: MATRICS Consensus Cognitive Battery (MCCB) profile of impairment. Gen Psychiatr. 2019;32:e100043.

  21. Chen X, Yang Z, Chen W, Zhao Y, Farmer A, Tran B, et al. A multi-center cross-platform single-cell RNA sequencing reference dataset. Sci Data. 2021;8:39.

  22. Luo H, Zhao Y, Fan F, Fan H, Wang Y, Qu W, et al. A bottom-up model of functional outcome in schizophrenia. Sci Rep. 2021;11:7577.

  23. Pinheiro AP, Rezaii N, Rauber A, Liu T, Nestor PG, McCarley RW, et al. Abnormalities in the processing of emotional prosody from single words in schizophrenia. Schizophr Res. 2014;152:235–41.

  24. Weisgerber A, Vermeulen N, Peretz I, Samson S, Philippot P, Maurage P, et al. Facial, vocal and musical emotion recognition is altered in paranoid schizophrenic patients. Psychiatry Res. 2015;229:188–93.

  25. Roux P, Christophe A, Passerieux C. The emotional paradox: dissociation between explicit and implicit processing of emotional prosody in schizophrenia. Neuropsychologia. 2010;48:3642–9.

  26. Kraus MS, Walker TM, Jarskog LF, Millet RA, Keefe RSE. Basic auditory processing deficits and their association with auditory emotion recognition in schizophrenia. Schizophr Res. 2019;204:155–61.

  27. Razafimandimby A, Hervé PY, Marzloff V, Brazo P, Tzourio-Mazoyer N, Dollfus S. Functional deficit of the medial prefrontal cortex during emotional sentence attribution in schizophrenia. Schizophr Res. 2016;178:86–93.

  28. Iwashiro N, Yahata N, Kawamuro Y, Kasai K, Yamasue H. Aberrant interference of auditory negative words on attention in patients with schizophrenia. PLoS ONE. 2013;8:e83201.

  29. Anticevic A, Corlett PR. Cognition-emotion dysinteraction in schizophrenia. Front Psychol. 2012;3:392.

  30. Leitman DI, Foxe JJ, Butler PD, Saperstein A, Revheim N, Javitt DC. Sensory contributions to impaired prosodic processing in schizophrenia. Biol Psychiatry. 2005;58:56–61.

  31. Amminger GP, Schäfer MR, Papageorgiou K, Klier CM, Schlögelhofer M, Mossaheb N, et al. Emotion recognition in individuals at clinical high-risk for schizophrenia. Schizophr Bull. 2012;38:1030–9.

  32. Pinheiro AP, Farinha-Fernandes A, Roberto MS, Kotz SA. Self-voice perception and its relationship with hallucination predisposition. Cogn Neuropsychiatry. 2019;24:237–55.

  33. Tseng HH, Chen SH, Liu CM, Howes O, Huang YL, Hsieh MH, et al. Facial and prosodic emotion recognition deficits associate with specific clusters of psychotic symptoms in schizophrenia. PLoS ONE. 2013;8:e66571.

  34. Pinheiro AP, Del Re E, Mezin J, Nestor PG, Rauber A, McCarley RW, et al. Sensory-based and higher-order operations contribute to abnormal emotional prosody processing in schizophrenia: an electrophysiological investigation. Psychol Med. 2013;43:603–18.



Not applicable.


This work was supported by the Beijing Municipal Science & Technology Commission (Z191100006619020, Z211100003521016, Z191100006619104), the Scientific Foundation of Beijing Huilongguan Hospital (LY202202), and a Beijing Municipal Science & Technology Commission Grant (D171100007017002).

Author information

Authors and Affiliations



WZ, QZ, HA, YY, NF, and SY collected, analyzed, and interpreted the patient data. WZ and QZ were major contributors in writing the manuscript. MG, ST, and FY organized and supported the study. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Shuping Tan or Fude Yang.

Ethics declarations

Ethics approval and consent to participate

The study protocol was conducted in accordance with the Declaration of Helsinki and was approved by the research ethics and institutional review boards of Beijing Huilongguan Hospital (2018-46). Informed consent for study participation was obtained from all subjects.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1: Supplementary Methods.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Zhao, W., Zhang, Q., An, H. et al. Vocal emotion perception in schizophrenia and its diagnostic significance. BMC Psychiatry 23, 760 (2023).
