Characteristics and quality of clinical practice guidelines for depression in adults: a scoping review

Background Clinical Practice Guidelines (CPGs) should follow an adequate methodology using an evidence-based approach in order to provide reliable recommendations. However, little is known regarding the quality of CPGs for Depression, which precludes their adequate use by stakeholders and mental health professionals. Thus, the aim of this study was to conduct a scoping review to describe the characteristics and quality of CPGs for Depression in adults. Methods We searched for CPGs for Depression in adults in eighteen databases. We included those that were published in English or Spanish between January 2014 and May 2018 and were based on systematic reviews of the evidence. Two independent authors extracted the characteristics, type and number of recommendations, and quality (using the Appraisal of Guidelines for Research and Evaluation-II [AGREE-II]) of each included CPG. Results We included eleven CPGs, of which 9/11 did not include the participation of patients in the development of the CPG, 4/11 CPGs had a score ≥ 70% in the overall evaluation of AGREE-II, and 3/11 CPGs had a score ≥ 70% in its third domain (rigor of development). In addition, only 5/11 CPGs shared their search strategy, while only 4/11 listed the selected studies they used to reach recommendations, and 7/11 CPGs did not clearly state which methodology they used to translate evidence into a recommendation. Conclusions Most of the evaluated CPGs did not take into account patients' viewpoints, achieved a low score in the rigor of development domain, and did not clearly state the process used to reach the recommendations. Stakeholders, CPG developers, and CPG users should take this into account when choosing CPGs, and when interpreting and putting into practice their recommendations. Electronic supplementary material The online version of this article (10.1186/s12888-019-2057-z) contains supplementary material, which is available to authorized users.


Background
Depression is recognized as an important public health issue. By the year 2015 it affected around 4.4% of the population [1], and by the year 2016 it was responsible for approximately 6.75% of years lived with disability in adults worldwide [2]. Different actions are needed to improve the care of people suffering from depression, such as the development and implementation of adequate clinical practice guidelines (CPGs).
CPGs are classically defined as a set of "systematically developed statements to assist practitioner and patient decisions about appropriate health care for specific clinical circumstances" [3]. CPGs can help to close gaps between evidence and policy by issuing recommendations in favor of effective interventions and against futile interventions [4]. To do so, CPGs need to establish reliable recommendations through a clear methodology [5]. However, the methodology used for CPG development is usually poorly defined and varies widely in content and quality between and within developing institutions [6], which may lead to inconsistency between recommendations. Accordingly, a study that compared recommendations from 23 CPGs published between 2007 and 2017 found high inconsistency in recommendations for second- and third-line pharmacological treatment of depression [7].
It is of great importance that stakeholders and mental health professionals are aware of the characteristics and quality of currently used CPGs for depression. The quality of the CPGs should be taken into account when interpreting and putting into practice the recommendations issued in these CPGs. However, to our knowledge, only one study has assessed the quality of CPGs for depression and anxiety in children and youth [8], and we have not found studies that have described the quality of CPGs for depression in adults. Thus, the aim of this study was to describe the characteristics and quality of CPGs for depression in adults.

Methods
We performed a scoping review of CPGs for depression in adults published in the last 5 years and evaluated characteristics regarding scope, methods used to reach recommendations, and grading of the strength of recommendations. We also assessed the quality of each CPG. The PRISMA guidelines for scoping reviews (PRISMA-ScR) were used to ensure adequate reporting and the replicability of the study [9].
A scoping review is "a form of knowledge synthesis, which incorporate a range of study designs to comprehensively summarize and synthesize evidence with the aim of informing practice, programs, and policy and providing direction to future research priorities" [10]. It is similar to a systematic review but differs mainly in its objective. While a systematic review aims to find an answer to a well-defined question, a scoping review can be used to identify, map and discuss certain characteristics in papers or studies [11]. Given that our aim was to identify CPGs and their characteristics, we decided to use the latter.

Eligibility criteria
We included all CPGs, defined as documents that aimed to state recommendations, that fulfilled the following criteria: assessed screening, diagnosis or management of depression in adults; were published or totally/partially updated in the last 5 years (January 2014-May 2018); had full text available in English or Spanish; and used systematic reviews of the evidence to guide their recommendations. We included only CPGs based on systematic reviews because the current CPG definition states that they should be informed by a systematic review of the evidence [12].
We excluded those CPGs that assessed specific types of depression such as bipolar or psychotic depression, or specific types of populations such as depression in patients with cancer or in older people after a stroke.

Search strategy
We performed a comprehensive search in eighteen databases. Our search strategy included terms related to depression and guidelines/practice guidelines. Searches were performed by two independent researchers (JHZT and DVZ), and the last update was run in June 2018 (see Additional file 1).

Study selection
Two independent researchers (JHZT and DVZ) evaluated whether the CPGs met the eligibility criteria for inclusion. Discrepancies were resolved by consensus after discussion among all the authors.

Data extraction
The following characteristics were extracted from the CPGs: authors; year of publication; country; involvement of patients or their representatives in the CPG development process; methodology used to reach recommendations; methodology used for grading the strength of recommendations; usage of minimally important difference (MID) when evaluating the effect of interventions; and the number of recommendations and good clinical practice (GCP).
We defined a recommendation as "all the statements in favor or against an intervention based on systematic reviews of the evidence, which typically include a formal assessment of the benefits and drawbacks of available treatment options" [13]. All the statements that synthesize opinions from an organized group of experts (expert consensus) and aim to describe "customary and expected care to be offered to patients" in situations where little to no evidence is available were considered as GCP [13].
We defined MID as a measure of the "smallest change in patient-reported outcomes of interest that patients perceive as important" [14].

Quality appraisal
To assess the quality of CPGs we used the Appraisal of Guidelines for Research and Evaluation II (AGREE-II), which has 23 items distributed in six domains (scope and purpose, stakeholder involvement, rigor of development, clarity and presentation, applicability, and editorial independence). Each guideline was rated by two researchers. When the two ratings for an item differed by two or more points, the item was discussed to reach a consensus. Otherwise, we used the mean of the two raters for each item. Lastly, we followed the AGREE-II Instrument guideline to calculate the scores for each domain [15].
We considered a CPG to have adequate quality when its total score was ≥ 70%, and we used the same cutoff for each of the domains of the AGREE-II Instrument. This cutoff point was taken from a previous study that evaluated the quality of depression CPGs in children [8]. Likewise, we considered that when a CPG had a score ≥ 70% in the third domain (rigor of development) of the AGREE-II Instrument, the CPG had an adequate rigor of development.
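The scoring rules described above can be illustrated with a minimal Python sketch. It assumes the standard AGREE-II scaling, in which a domain score is (obtained score − minimum possible score) / (maximum possible score − minimum possible score) × 100, with items rated from 1 to 7. Using the per-item mean of the raters is algebraically equivalent to summing over raters as in the AGREE-II manual. The function names and the example ratings are hypothetical and are not taken from the study data.

```python
MIN_RATING, MAX_RATING = 1, 7   # AGREE-II items are rated on a 7-point scale
ADEQUATE_CUTOFF = 70.0          # percent; the cutoff adopted in this study


def reconcile_item(rating_a, rating_b, consensus=None):
    """Combine two raters' scores for one item.

    If the ratings differ by two or more points, a consensus rating
    (agreed after discussion) must be supplied; otherwise the mean of
    the two ratings is used.
    """
    if abs(rating_a - rating_b) >= 2:
        if consensus is None:
            raise ValueError("ratings differ by >= 2 points; consensus rating required")
        return float(consensus)
    return (rating_a + rating_b) / 2


def scaled_domain_score(item_scores):
    """Scale the item scores of one domain to a 0-100% domain score."""
    n_items = len(item_scores)
    obtained = sum(item_scores)
    minimum = MIN_RATING * n_items
    maximum = MAX_RATING * n_items
    return (obtained - minimum) / (maximum - minimum) * 100


def is_adequate(score_percent):
    """Apply the >= 70% cutoff to a domain or overall score."""
    return score_percent >= ADEQUATE_CUTOFF


# Hypothetical ratings for the eight items of domain 3 (rigor of development).
rater_a = [5, 6, 4, 5, 6, 5, 4, 5]
rater_b = [6, 6, 5, 5, 4, 5, 4, 6]
consensus = {4: 5}  # item index 4 differed by two points; hypothetical agreed value

items = [
    reconcile_item(a, b, consensus.get(i))
    for i, (a, b) in enumerate(zip(rater_a, rater_b))
]
domain_3 = scaled_domain_score(items)
print(f"Domain 3 score: {domain_3:.1f}%, adequate: {is_adequate(domain_3)}")
```

In this made-up example the guideline would fall just below the cutoff for rigor of development, which shows how sensitive the adequate/inadequate classification can be to a few item ratings.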
Of the included guidelines, 2/11 included patients in the process of development of the CPG (one as part of the guideline development group [NICE], and one during the external validation [GuiaSalud]). Regarding how the development group reached the recommendations, 3/11 did not clearly state how recommendations were reached (Korea, RANZCP, USTF), 4/11 used expert consensus but did not specify the criteria evaluated (ACP, APA, VADoD, BAP), and 4/11 used a well-specified methodology (either: Grades of Recommendation, Assessment, Development, and Evaluation [GRADE], Scottish Intercollegiate Guidelines Network [SIGN], or Canadian Network for Mood and Anxiety Treatments [CANMAT]). All included guidelines specified the system they used for grading the strength of recommendations (Table 1).
The number of recommendations stated by each CPG ranged from one to 199. Three CPGs focused on one topic: the Acupuncture CPG [19] that aimed to give recommendations regarding acupuncture treatment, the ACP guideline [22] that aimed to determine the usage of pharmacological versus nonpharmacological treatment, and the US-Taskforce guideline [25] that aimed to state how the screening of depression should be performed. The other eight CPGs addressed multiple topics: one on non-pharmacological treatment (Korea), two on treatment (APA, RANZCP) [23,24] and five on diagnosis and treatment (CANMAT, NICE, GuiaSalud, VADoD, BAP). Of note, three CPGs issued consensus statements (either GCPs or consensus-based recommendations).

Main findings
This study explores the characteristics, scope, and quality of CPGs for depression in adults that based their recommendations on systematic reviews and were published between January 2014 and May 2018. We included eleven CPGs from seven countries on four continents, of which two reported patient involvement in the design or validation of the CPGs, six provided recommendations on screening, five on diagnosis, eight on pharmacological treatment, nine on psychological treatment, and nine on other non-pharmacological treatments. Regarding the quality assessment, 4/11 CPGs reached a score ≥ 70% in the overall assessment of the AGREE-II instrument, and 3/11 CPGs reached a score ≥ 70% in the rigor of development domain. In addition, only 5/11 CPGs shared their search strategy, while only 4/11 listed the selected studies they used to reach recommendations, and 7/11 CPGs did not clearly state which methodology they used to translate evidence into a recommendation.

Patient involvement
The involvement of patients or their representatives in the development of CPGs is considered important because it is expected to complement scientific evidence and lead to more acceptable and implementable recommendations [29]. Thus, many guideline development groups recommend their inclusion in every step of CPG development, including the definition of the scope and objectives, the definition of the review questions, the development of recommendations (sharing their preferences regarding the assessed interventions), and the review of the final version of the CPG [30].
However, we found that only 2/11 CPGs reported that patients had participated in the design or validation of the CPGs. This low patient involvement is similar to that found in other studies. One study that evaluated 62 Dutch guidelines assessed patients' participation in the development process through three items (patients' participation, identification of the patient's input in the CPG, and the emphasis of patients' participation in the individual patient level), and found that only 1/62 CPGs satisfactorily fulfilled these items [31]. The CPG that fulfilled these criteria was the Dutch guideline for depression [32]. Moreover, a study that evaluated patient involvement in guidelines in 101 organizations that publish CPGs in G-I-N North America and National Guideline Clearinghouse found that only 8% of them require patient or public involvement in guideline development groups, while 15% sometimes require it or describe it as optional [33].
Our results indicate that few of the CPGs achieved adequate methodological quality, which could lead to recommendations that are not based on the best available evidence. This situation could be due to the fact that developing a high-quality CPG demands substantial financial resources, time, highly specialized personnel, and health system support [39][40][41]. In addition, some CPGs may have had adequate rigor of development but attained a low score in the AGREE-II instrument because the development process was not adequately reported [42]. To avoid this, guideline development groups could apply AGREE-II or another instrument to verify the adequate reporting of their CPGs.
To state a recommendation, two basic steps are needed: the selection of evidence and the methodology used to translate evidence into a recommendation [5]. We evaluated some characteristics in order to understand how these steps were performed.
Regarding the selection of evidence, only 5/11 CPGs shared their search strategy, while only 4/11 listed the selected studies they used to reach recommendations.
Not sharing this information prevents readers from adequately evaluating whether there was any bias in the selection of evidence used to guide the recommendations, and prevents the replication and corroboration of the searches performed.
Regarding the methodology used to translate evidence into a recommendation, 7/11 CPGs did not clearly state the methodology used. A clearly defined methodology is necessary to understand what criteria were used and how the developing group judged each criterion to reach a recommendation. This allows users to understand how subjectivity and possible competing interests of the guideline development group may have influenced its recommendations, and helps them decide whether the recommendations can or should be implemented in their own settings [43]. Inconsistent recommendations are not rare, as shown by a systematic review that assessed the recommendations stated in CPGs for depression treatment, which found inconsistencies in the recommendations for second- and third-line pharmacological treatment [7]. For CPGs with inadequate methodology, it is necessary to evaluate the suitability of their use and to be careful when considering the implementation of their recommendations.

Limitations and strengths
Our study is not free from limitations. We only collected guidelines published in English or Spanish, so our findings may not be representative of CPGs published in other languages. CPG quality was assessed using the AGREE-II instrument, which relies on how the CPGs are reported, so guidelines with inadequate reporting could be classified as deficient regardless of their actual quality. Lastly, there are no validated cut-off points for the AGREE-II instrument, so the discrimination between CPGs with adequate and inadequate quality could be inaccurate.
However, to our knowledge, this is the first study that has evaluated the characteristics and quality of CPGs for depression in adults. This evaluation has some important strengths: we used a systematic search strategy involving eighteen databases to find available CPGs for depression in adults, we used the AGREE-II instrument that provides a standard methodology to critically appraise the quality of CPGs, and we performed independent appraisals by two researchers.

Conclusions
We found eleven CPGs for depression in adults that used systematic reviews to guide their recommendations. Only two CPGs reported patient involvement. Regarding the quality of these CPGs, only 4/11 CPGs reached a score ≥ 70% in the overall assessment of the AGREE-II instrument, and 3/11 CPGs reached a score ≥ 70% in the rigor of development domain. In addition, only 5/11