An evaluation of variation in published estimates of schizophrenia prevalence from 1990–2013: a systematic literature review

Background There is a lack of consistency in findings across studies on the prevalence of schizophrenia, and no recent systematic review of the literature exists. The purpose of this study is to provide an updated systematic review of population-based prevalence estimates and to understand the factors that could account for this variation in prevalence estimates. Methods MEDLINE, Embase, and PsycInfo databases were searched for observational studies describing schizophrenia prevalence in general populations from 2003–2013 and supplemented by studies from a prior review covering 1990–2002. Studies reporting prevalence estimates from specialized populations such as institutionalized, homeless, or incarcerated persons were excluded. Prevalence estimates were compared both across and within studies by factors that might contribute to variability using descriptive statistics. Results Sixty-five primary studies were included; thirty-one (48 %) were from Europe and 35 (54 %) were conducted in samples of ≥50,000 persons. Among 21 studies reporting 12-month prevalence, the median estimate was 0.33 % with an interquartile range (IQR) of 0.26 %–0.51 %. The median estimate of lifetime prevalence among 29 studies was 0.48 % (IQR: 0.34 %–0.85 %). Prevalence across studies appeared to vary by study design, geographic region, time of assessment, and study quality scores; associations between study sample size and prevalence were not observed. Within studies, age-adjusted estimates were higher than crude estimates by 17 %–138 %, the use of a broader definition of schizophrenia spectrum disorders compared to schizophrenia increased case identification by 18 %–90 %, identification of cases from inpatient-only settings versus any setting decreased prevalence by 60 %, and no consistent trends were noted by differing diagnostic criteria. 
Conclusions This review provides updated information on the epidemiology of schizophrenia in general populations, which is vital information for many stakeholders. Study characteristics appear to play an important role in the variation between estimates. Overall, the evidence is still sparse; for many countries no new studies were identified. Electronic supplementary material The online version of this article (doi:10.1186/s12888-015-0578-7) contains supplementary material, which is available to authorized users.


Background
Schizophrenia is a serious, complex brain disorder, with a reported median incidence of 15.2 per 100,000 persons [1] and a pooled lifetime prevalence of 0.40 % (10 %-90 % quantiles: 0.16-1.21 %), both estimates being based on a review by Saha et al. [2]. No comprehensive review has followed Saha et al.'s systematic search in 2003. Moreover, prior reviews highlight the variability in schizophrenia prevalence estimates [3][4][5][6]. Eaton, for example, noted a 12-fold variation in point prevalence and a 10-fold variation in lifetime prevalence [3], while Goldner et al. observed a 13-fold variation in lifetime prevalence of schizophrenia [6].
Inherent variability between estimates may in part be due to the heterogeneity and complexity of the disease [1]. However, other factors also likely contribute to variation observed between reported prevalence estimates. Study design (e.g. cohort or cross-sectional study) and methods can affect case ascertainment in an epidemiological study [7,8]. Population and health care system differences exist at the national and regional level, which highlights the importance of recording the geographic region of an epidemiological study. The sample size of the overall population can be an indicator of the generalizability of an estimate, and outlier estimates may be reported from very small populations [9]. Factors such as the diagnostic criteria for schizophrenia have changed over time: the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) currently guides physicians to diagnose schizophrenia along a continuum of severity, from the less severe delusional disorder to the more severe schizoaffective disorder [10]. Therefore, the period in which a study was conducted may influence the number of cases identified and the resulting prevalence estimate. Other factors such as study setting (e.g., prisons, hospitals, or the general community) also likely contribute to this variability [1,2,11].
According to McGrath, the variability between estimates requires the use of systematic reviews and, in a second step, pooled estimates [1]. However, pooled estimates mask essential information when variability in estimates is mainly due to factors such as differences in study design or populations, and a better understanding may be gained from looking at these studies without pooling estimates. The objective of this review is two-fold: 1) to provide an updated systematic review of population-based prevalence estimates; and 2) to understand all main factors that could account for variability in published prevalence estimates, including study design, geographic region, sample size, study dates, and study quality.

Methods
This systematic review adheres to current best practices for conducting systematic reviews of the literature [12,13]. The data source was literature published from January 1, 2003 to October 9, 2013, and the methods used to perform this review involved both electronic and manual components. Studies were identified from the literature by searching the MEDLINE (via PubMed), Embase, and PsycINFO databases for the terms "schizophrenia" and "prevalence". Searches were limited to studies with human subjects and published in the English language. Case reports, letters, commentaries, editorials, reviews, clinical trials, and in vitro studies were excluded. These three electronic searches were supplemented by additional targeted electronic searches (using broader schizophrenia/psychosis terms) and by a manual search of the bibliographies of all accepted studies. Search results from the various sources were combined, and the duplicate records were removed. The titles and abstracts of each citation were screened and the full text of each potentially relevant citation was retrieved and reviewed. Studies identified in the systematic literature review conducted by Saha et al. [2] were also screened for inclusion in our review if they were published from 1990-2002.
Population-based observational studies (retrospective or prospective) reporting on the prevalence of schizophrenia in the general population were selected for this review. To minimize variation caused by study setting, studies performed in high-risk or other sub-populations (e.g., institutionalized, incarcerated, homeless subjects) were excluded. Studies with fewer than 200 screened people were also excluded to minimize outlier estimates resulting from small sample sizes.
Both descriptive and quantitative study- and patient-level data from accepted studies were extracted into a data extraction form by a single investigator and then reviewed against the original study by a second investigator. Quantitative data included prevalence estimates, which were extracted as reported in each study and then standardized to percentages to facilitate comparisons between studies. Study country was classified by region as presented in Additional file 1: Table S1. The level of evidence score (see Additional file 1: Table S2) was adapted from the review by Saha and colleagues [2]. The maximum score was 15 points per study, and a higher score indicated a greater level of evidence.
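The standardization step can be illustrated with a minimal sketch. The function name `to_percent` and the example figures are hypothetical and are not taken from any included study; the sketch only assumes that a study reports a number of cases per some population denominator.

```python
def to_percent(value, per):
    """Convert a prevalence reported as `value` cases per `per` persons to a percentage."""
    if per <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * value / per


# Hypothetical reported figures, for illustration only.
per_thousand = to_percent(4.6, 1_000)        # a study reporting 4.6 per 1,000
per_hundred_k = to_percent(330, 100_000)     # a study reporting 330 per 100,000
print(round(per_thousand, 2), round(per_hundred_k, 2))
```

Expressing every estimate on the same percentage scale is what allows medians and interquartile ranges to be computed across studies that originally used different denominators.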
Occasionally studies presented multiple prevalence estimates per period (e.g. 12 months, lifetime); for example, a study may have presented three estimates of lifetime prevalence that were calculated using three different sets of diagnostic criteria. In cases such as these, only one estimate was selected per period using the following pre-specified criteria, which were developed to minimize variability between estimates for comparative purposes. The criteria involved selecting 1) crude estimates preferentially, with adjusted estimates only selected if no crude estimates were available; 2) the most recent estimate; 3) an estimate from the broadest catchment area; 4) the broadest case ascertainment method (e.g., cases identified from inpatient, outpatient, and emergency room visits, rather than just one setting); 5) the most recent diagnostic criteria; and 6) estimates based on a narrow definition of schizophrenia, when estimates derived from more expansive definitions were presented. Therefore, studies contributed a maximum of one prevalence estimate per period for the purposes of these analyses, and estimates from different time periods were not compared.
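One way to picture the selection rule is as a sort key over candidate estimates. The sketch below assumes that the six criteria are applied in the listed order as strict tie-breakers; the text does not state how conflicts between criteria were resolved, so that ordering is an assumption. The `Estimate` class, its field names, and the numeric ranks are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    prevalence_pct: float
    adjusted: bool            # criterion 1: crude (False) preferred
    year: int                 # criterion 2: most recent preferred
    catchment_rank: int       # criterion 3: larger value = broader catchment area
    ascertainment_rank: int   # criterion 4: larger value = broader case ascertainment
    criteria_year: int        # criterion 5: year of the diagnostic criteria used
    spectrum: bool            # criterion 6: narrow definition (False) preferred


def select_estimate(candidates):
    """Pick one estimate per period, treating the criteria as ordered tie-breakers (assumed)."""
    def key(e):
        # Tuples compare element by element; False sorts before True.
        return (e.adjusted, -e.year, -e.catchment_rank,
                -e.ascertainment_rank, -e.criteria_year, e.spectrum)
    return min(candidates, key=key)


# Hypothetical candidates from a single study and a single prevalence period.
candidates = [
    Estimate(0.40, False, 2000, 1, 1, 1994, False),
    Estimate(0.55, True, 2005, 1, 1, 1994, False),
    Estimate(0.70, False, 2000, 1, 1, 1994, True),
]
print(select_estimate(candidates).prevalence_pct)
```

In this illustration the crude, narrowly defined estimate is chosen: the adjusted estimate loses on criterion 1 despite being more recent, and the spectrum-based estimate loses on criterion 6.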
Descriptive statistics, including means, standard deviations, medians, ranges, and interquartile ranges (IQRs) were used to summarize prevalence estimates and other continuous variables. Categorical variables such as study characteristics were summarized using counts and proportions. Sub-group analyses of factors including study design, geographic region, sample size, study dates, and quality score were conducted separately for 12-month and lifetime prevalence estimates. To compare estimates from the same prevalence periods, this review emphasizes 12-month and lifetime prevalence estimates (the most commonly reported periods). However, point prevalence and estimates from other periods are also briefly summarized for comprehensiveness. Other factors were also assessed within studies when possible, including differences between prevalence periods, various methods of case identification, and temporal trends.
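The summary statistics described above can be sketched with Python's standard library. The helper name `summarize` and the input values are hypothetical, and the choice of the "inclusive" quartile method is an assumption, since the quartile convention used in the review is not stated; other conventions can give slightly different IQR bounds on small samples.

```python
import statistics


def summarize(values):
    """Return the median, IQR bounds, and range of a list of prevalence estimates (in %)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "median": statistics.median(values),
        "iqr": (q1, q3),
        "range": (min(values), max(values)),
    }


# Hypothetical prevalence estimates in percent, for illustration only.
estimates = [0.1, 0.2, 0.3, 0.4, 0.5]
print(summarize(estimates))
```

The median and IQR are insensitive to a single extreme estimate, which is one reason to prefer them over a mean when a few studies report unusually high prevalence.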

Study selection
A total of 1185 unique citations were identified from MEDLINE, Embase, and PsycINFO in the systematic review (Fig. 1). At the abstract screening level, 1100 citations were excluded for the following reasons: prevalence of schizophrenia not reported (n = 802), study type (n = 238), and not a sample from the general population (n = 60). Eighty-five full-text articles were retrieved, plus two articles identified from the targeted searches, and nine identified from manual bibliography checks. Thirty-seven primary studies and 13 related publications (e.g. a different prevalence study published by the same investigators in a particular catchment area and overlapping time period) were included from the 2003-2013 systematic literature review.
Of the 142 articles identified in the Saha review, [2] 56 were published from 1990-2002 and retrieved for further screening. The year 1990 was used as a cut-off to limit the search to more current studies, and this date was selected after verifying that no major studies were excluded prior to that date. Twenty-eight primary articles and 3 related publications from Saha et al. met the inclusion criteria of this review (with the main differences being a restriction to observational studies on the general population published in English), as presented in Fig. 1.

Study characteristics
Among the studies included in this review, 29 were from Europe, 13 were from Asia, 10 were from North America, eight were from Africa, four were from Oceania, and one was a multinational study reporting country-specific estimates for 52 countries from all regions (Table 1). Over half of the studies (35 or 53.8 %) were conducted with sample sizes of 50,000 people or greater. Study design was evenly split between cross-sectional studies (50.8 %) and cohort studies (49.2 %). The cohort studies primarily utilized healthcare databases or case registers (n = 25), though they also included five birth cohort studies [16,24,38,67,70] and two follow-up studies of previously defined cohorts [72,73]. Although publication dates ranged from 1990-2013, over half of the studies (55.4 %) described samples recruited prior to 1999. Forty studies (61.5 %) reported prevalence among diagnosed populations, and the age of patients sampled in each study varied widely, from narrow ranges such as 15-38 years [24] to no age restrictions. The mean quality score and corresponding standard deviation were 8.8 ± 2.6 across all studies, ranging by region from 7.7 among studies conducted in North America to 10.1 among studies conducted in Africa.

Lifetime prevalence
Thirty studies reported lifetime prevalence estimates (Table 4). The overall median lifetime prevalence estimate across the studies included in this review was 0.48 % (range: 0.06 %-5.00 %; IQR: 0.34 %-0.85 %). Among the studied regions, the lowest reported median lifetime prevalence
One study, published by Nuevo and colleagues, presented the results of the World Health Organization's 2003 World Health Survey (WHS) and detailed estimates of schizophrenia prevalence across 52 countries [56]. Household respondents aged 18+ completed a standardized questionnaire that collected data on demographics, self-reported diagnoses, and treatment of schizophrenia and psychotic symptoms, and the results were considered to be nationally representative. In this study, lifetime prevalence estimates varied widely, from 0.07 % in Vietnam to 5.10 % in Swaziland. The combined prevalence across all countries categorized in the upper or middle-upper economic strata per the World Bank was 1.00 % (15 countries); the combined lifetime prevalence of countries in lower or lower-middle economic strata (37 countries) was 1.38 %. The total lifetime prevalence of schizophrenia reported across

Point prevalence
Fourteen studies reported the point prevalence of schizophrenia (Additional file 1: Table S3). The median estimate of point prevalence across these studies was 0.32 % (IQR: 0.18 %-0.41 %). The minimum and maximum point prevalence estimates were both among isolated island populations [33,52].

Period prevalence other than 12 months or lifetime
Thirteen studies reported prevalence for periods other than 12 months or lifetime (Additional file 1: Table S3). The periods represented ranged from one month to 19 years, with seven studies representing periods greater than one year and six representing periods less than one year (including two with the period not reported). As expected, the median estimate for periods greater than one year (0.39 %; IQR: 0.26 %-0.57 %) fell between those for 12-month and lifetime prevalence, and was higher than the median for periods less than one year (0.20 %; IQR: 0.18 %-0.28 %).

Within-study estimates
Some trends, such as differences in prevalence methods and case identification, or changes over time, may be better understood by examining differences across estimates within the same study.

Prevalence periods and methods
Eight studies compared lifetime estimates to point prevalence or short period prevalence (i.e., ≤12 months) [17,20,29,31,42,61,71,73]. When the prevalence window was expanded from point or 12 months to an individual's lifetime, the relative increase in prevalence ranged broadly across studies, from 0 % to 271 % (excluding one study where the six-month prevalence was 0 compared to a 0.6 % lifetime prevalence). Of those seven studies with calculable increases in prevalence, six (85.7 %) reported increases greater than 33 %, and three (42.9 %) observed the prevalence at least double in value. Studies also differed on whether they reported crude or adjusted prevalence, and many reported both crude and adjusted estimates. Among nine studies that reported both crude and age-adjusted prevalence, the age-adjusted estimates were always higher, with relative differences ranging from 17 % to 138 % [19,21,39,40,47,55,72,77,78]. The median change was +43 %, and the increase was greater than 66 % for all three lifetime prevalence estimates.
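The relative increase between a short-period and a lifetime estimate from the same study can be sketched as follows. The function name `relative_increase_pct` is hypothetical, and the first pair of values is illustrative rather than taken from an included study; the second pair mirrors the case described above in which the short-period prevalence was 0, where a relative increase cannot be calculated.

```python
def relative_increase_pct(short_period, lifetime):
    """Relative increase (%) from a short-period estimate to a lifetime estimate.

    Returns None when the short-period prevalence is 0, because the ratio is undefined.
    """
    if short_period == 0:
        return None
    return 100.0 * (lifetime - short_period) / short_period


# Hypothetical study: the lifetime estimate is double the 12-month estimate.
print(relative_increase_pct(0.5, 1.0))

# Six-month prevalence of 0 versus 0.6 % lifetime prevalence: not calculable.
print(relative_increase_pct(0.0, 0.6))
```

The same calculation applies to the comparison of crude and age-adjusted estimates, with the crude estimate as the baseline.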

Case identification
The use of a broader definition of "schizophrenia spectrum disorders" (including schizophreniform and schizoaffective disorders, versus narrowly defined schizophrenia) increased case identification by 18 %-90 % among six studies, with four studies having increases of 70 % or more [16,19,24,37,40,68,69]. Although studies that only included inpatients were excluded, two Canadian studies compared case identification algorithms that required hospitalizations for schizophrenia to those that included any physician visits; in both studies, inpatient-only lifetime prevalence was approximately 60 % lower than the overall treated lifetime prevalence [71,74]. The diagnostic classification systems used to identify schizophrenia have evolved over time. Three studies compared multiple classification systems applied to the same populations. In a Swedish study, Lindstrom et al. observed little difference across estimates of schizophrenia prevalence defined by DSM versions III (0.40 %), III-R (0.42 %), and IV (0.43 %), and found that International Classification of Diseases, 10th Revision (ICD-10) criteria were slightly more inclusive (0.47 %) than DSM criteria [49]. Similarly, McCreadie et al. reported a higher prevalence in the UK for International Classification of Diseases, 9th Revision (ICD-9) schizophrenia (0.33 %) than ICD-10 (0.30 %), which was higher than DSM-III-R prevalence (0.26 %) [50]. Barrett et al., however, found that applying ICD-10 criteria to their Malaysian sample identified fewer patients than DSM-IV criteria, and that Research Diagnostic Criteria for schizophrenia was the most inclusive [19].

Temporal trends
Only one study reported trends in schizophrenia prevalence over time using any data from within the last 15 years [34]. Two other studies, from Japan [54] and Canada [74], reported time trends starting in the mid-1980s and spanning a decade, and both suggested an increased schizophrenia prevalence until a peak at the beginning of the 1990s followed by a decline in the mid-1990s.

Discussion
To our knowledge, this study is the first systematic review of the prevalence of schizophrenia among general populations published since 2005. Overall, the median 12-month prevalence of schizophrenia was 0.33 %  populations, potentially due to genetics, geography, socioeconomic differences, different perceptions and levels of awareness, or other factors. This might also be an explanation for the very high variation of schizophrenia prevalence estimates across 52 countries in the WHS. Prevalence estimates were higher for studies with low quality scores, which may indicate that the true prevalence of schizophrenia is lower than estimates reported in lower quality studies. Cohort studies yielded higher prevalence estimates compared to cross-sectional studies. Associations between sample size and prevalence were not observed in the present study, presumably as low sample sizes were excluded and only population-based studies were included. However, the sample size of a study and screening procedures would greatly contribute to the likelihood of identifying cases in a catchment area, particularly with a disease with a relatively low prevalence, such as schizophrenia. Only minor differences in prevalence estimates of schizophrenia calculated using different diagnostic criteria (e.g. ICD-9 vs. ICD-10) were observed in this study. A number of studies showed a 70 % or greater increase, however, when a broader case definition of "schizophrenia spectrum disorders" including schizophreniform and schizoaffective disorders was applied, compared to a narrow case definition of schizophrenia alone. Other studies showed differences by study setting, as inpatient-only lifetime prevalence was approximately 60 % lower than overall (inpatient and outpatient) lifetime prevalence. This evidence suggests that a focus on a sub-group of studies that meet a number of criteria (e.g. study quality and recency, cohort design) may provide a better reflection of the true prevalence of schizophrenia as compared to median or pooled estimates that include older, lower quality studies and apparent outliers such as the prevalence estimates resulting from the WHS.
McGrath highlights challenges with regard to the diagnosis of schizophrenia, with modern diagnostic criteria requiring the exclusion of other general somatic conditions and highly variable compliance with screening protocols designed to identify these disorders [1]. These challenges limit the ability to separate measurement error from true variation in prevalence and increase the variation between estimates [1]. The 2005 review by Saha and colleagues also analyzed factors such as diagnostic criteria, case selection methods, and study quality. It found some differences, but stated that findings were inconclusive [9,95]. Similarly, our findings suggest that design factors contribute to variance in prevalence estimates.
Whereas other reviews may include prevalence estimates from varied populations such as homeless and incarcerated persons, our methods indicate that a thoughtful selection of studies can minimize the variability of some characteristics that typically affect prevalence estimates, improving our understanding of the burden of this disease in the general population. The median lifetime prevalence estimate reported in this review (0.48 %) is similar to, but slightly greater than, the overall prevalence previously reported by Saha and colleagues (0.40 %) [2]. This difference appears to be due to higher estimates among studies published after the search dates of Saha's review: the median lifetime prevalence among articles published in 2003 or later was 0.51 % (Table 5). Although this study included 28 primary studies from the review by Saha et al., more restrictive selection criteria were applied in this review to compare relatively recent estimates from general populations. Since diagnostic criteria, treatment guidelines, and knowledge about a disease change over time, the restriction to studies published in 1990 or later helped to minimize the impact of these variables on the ascertainment of schizophrenia prevalence. Estimates from this review and the prior review by Saha et al. [2] are less than half the overall estimate reported from the 2003 WHS [56]. As very few studies included in this review reported a prevalence of schizophrenia >1 %, it is possible that the unique study questionnaire used by the WHS, in which respondents self-report previous diagnoses of schizophrenia, led to the differences seen here. Another possible explanation for the higher prevalence estimates reported by the WHS is its use of lay interviewers, who may classify disease differently than psychiatrists, even after the use of standardized reporting forms [96,97].
In the 2003 WHS, the five lowest prevalence estimates (ranging from 0.07 %-0.27 %) were from Asia and Europe, while the five highest prevalence estimates (ranging from 2.72 %-5.70 %) were all from Africa. It is possible that this reflects differences in the awareness of the disease and case ascertainment methods used across various regions. Interestingly, four of the five other studies in this review that reported lifetime prevalence greater than 1 % were from Canada [71] or Finland [16,38,48] (the fifth study was from South Africa), [62] which supports evidence that schizophrenia prevalence may be higher in geographic areas at higher latitudes [98][99][100][101].
The choice of study design does play an important role in identifying those in the general community who have not yet been diagnosed with a mental health disorder such as schizophrenia. Birth cohorts, as well as cross-sectional surveys in which mental health professionals interview community members for symptoms indicative of schizophrenia, are time-consuming and expensive to conduct. Alternatively, surveys in which respondents self-report diagnoses are relatively inexpensive, but this method may introduce bias and miss a clinically significant number of undiagnosed cases. Furthermore, the age range of the study samples included in this review varied greatly, which limited our ability to use age range as a variable for sub-analyses. However, since clinicians have realized that the onset of schizophrenia symptoms may occur after 45 years of age, studies that restrict the age range of patients potentially underestimate the prevalence of schizophrenia observed in that population.
There is no consensus about how best to summarize observational studies, with relatively little discussion on the strengths and weaknesses of different approaches [9]. Published systematic reviews on prevalence typically choose different approaches without discussing the rationales of using one method over another [102][103][104][105][106]. We opted to present median values in this study rather than performing a meta-analysis to generate pooled values; as Saha et al. stated, "the decision to combine data from randomized controlled trials or risk factor epidemiological studies are of less relevance to prevalence estimates, where estimates based on very large populations should not necessarily carry more weight than estimates based on small populations" [9]. Thus, the variation inherent in the prevalence estimates that have been extracted becomes lost when pooling across studies conducted with different methods, populations, and other variables. Moreover, we performed sub-group analyses instead of a meta-regression analysis as we wanted to compare the difference of prevalence estimates between subgroups, rather than the size of the effect of factors on the prevalence estimates.

Limitations
The scope of this review was restricted only to general populations, rather than including focused populations such as patients who have been institutionalized or incarcerated, homeless persons, and migrants. Special populations such as these do have a higher reported prevalence of schizophrenia, but they should be described separately, so as not to overestimate the prevalence in the general population. However, such populations should certainly also be considered by policy makers and healthcare providers to understand the full burden of this disease. Other factors such as  [69,70,95,107] and were not assessed in the current study, but these and other unmeasured variables may also be associated with the epidemiology of schizophrenia. Another limitation is the exclusion of non-English literature. However, cross-checking English abstracts of excluded studies showed that few studies (including no major studies) were missed given the language restriction, which appears to reflect that studies today are commonly published in English. A number of data gaps became evident in the course of conducting this review. Accurate estimations necessitate the study of sufficiently large populations given the relatively low number of prevalent cases. Several large, heavily populated countries (such as Brazil, France, Germany, Japan, and Russia) had either one or no published studies on the prevalence of schizophrenia among general populations, while estimates from many other countries were >10 years old and in need of updating. In fact, the only schizophrenia prevalence estimates from Central or South America were the country-specific estimates presented in the 2003 WHS study. The most accurate way to assess schizophrenia prevalence would involve full clinician interviews with the entirety of a population. However, since that is not a feasible method for large populations, a more cost-effective approach could involve screening patients within a nationally-representative survey or registry, and then conducting clinical interviews/examinations to confirm cases; similar methods were employed by Perala and colleagues in Finland [58].

Conclusions
This updated review provides important evidence on the epidemiology of schizophrenia in general populations, which is vital information for healthcare planning. These data indicate that approximately one in 200 individuals will be diagnosed with schizophrenia at some point during their lifetime. Prevalence estimates across studies varied by study design, geographic region, time of assessment, and quality scores. As investigator-dependent factors likely lead to variations in published estimates, the present review used a thoughtful selection process of estimates for comparative purposes and also examined differences between sub-groups.
Although the size of these variations suggests that study characteristics can influence prevalence estimates, this does not preclude the potential influence of other factors which were not assessed in this study, such as environmental factors. These findings also suggest that a focus on studies that meet a number of criteria (e.g., study quality, recency, and cohort design) may provide a better reflection of the true prevalence of schizophrenia as compared to pooled estimates across very heterogeneous studies.
Finally, there is a scarcity of data from many countries, and additional well-designed epidemiological studies performed in these locations will help to improve our understanding of the global prevalence of this disease.

Additional file
Additional file 1: Table S1. Regional Classifications of Countries with Prevalence Estimates. Table S2. Evidence Classification. Table S3

Competing interests
JS, AW, and PR are employees of Evidera, which provides consulting and other research services to pharmaceutical, device, government, and non-government organizations. In this salaried position, they work with a variety of companies and organizations and are precluded from receiving payment or honoraria directly from these organizations for services rendered. JC was employed by Evidera during the conduct of this study. RW was employed by F. Hoffmann-La Roche Ltd during the conduct of this study.

Authors' contributions
JS participated in the study design, data analysis and interpretation, and drafted the manuscript. AW participated in the study design, data analysis, and interpretation of the data. PR and JC participated in the study design, conducted the literature search and data analysis, and participated in the interpretation of the data. RW conceived of the study, and participated in the study design and interpretation of the data. All authors read, provided critical revisions, and approved the final manuscript.