We followed the SPIRIT statement for reporting trial protocols [19]. This is the first published version of the protocol.
Study design
This is a multicenter, prospective, controlled observational trial with a 12-month follow-up comparing patients who receive antidepressant drugs with those who do not. The study was approved by the Clinical Research Ethics Committee of the Sant Joan de Déu Foundation (CEIC Fundació SJD; Reference Number: EPA-24-12) and the Clinical Research Ethics Committee of the Jordi Gol i Gurina Foundation (CEIC IDIAP; Reference Number: 5013 – 002).
Study setting and GP enrolment
GP enrolment was conducted six months before patient recruitment. GPs from the province of Barcelona were invited to participate. The University Institute in Primary Care Research Jordi Gol (IDIAP Jordi Gol), which gives technical support to all professionals working in Primary Care in the Catalan Public Health System, circulated the invitation to all the GPs in the province. In addition, the research team contacted the Primary Care Centers with which they had worked in previous research studies to invite them to participate.
The study is conducted in 12 primary care centers in the province of Barcelona (Spain). The participating centers have between six and 17 primary care teams (each of them consisting of a GP and a nurse) and attend to a population of 250,000 to 350,000 inhabitants. Sixty-eight GPs participated in the recruitment of patients for the study.
Prior to the study, GPs received three hours of training on the study protocol, diagnostic criteria for depression, and national guidelines for the treatment (pharmacological and non-pharmacological) of MDD in primary care, divided into two 1.5-hour sessions. Session 1: Diagnosis and non-pharmacological treatment for MDD (active monitoring, sleep hygiene, counseling, frequency of follow-up visits, health education and low intensity psychological therapies); and Session 2: pharmacological treatment of MDD. During the study, a monthly newsletter is sent to the participating GPs to remind them of the topics presented in the training seminars and to inform them about the study progress.
At the beginning of the study, the GPs completed a questionnaire collecting the following variables: sociodemographic characteristics, job characteristics, training, attitude towards depression, interest in mental health and participation in communication groups [20].
Eligibility criteria and recruitment
Eligible patients are adults (≥18 years old) who receive a diagnosis of a new episode of MDD. The following patients are excluded: those who have taken an antidepressant medication in the previous 60 days; those presenting psychotic or bipolar disorders or on antipsychotics, lithium or antiepileptics in the previous six months; those with a history of drug abuse or dependency; those with cognitive impairment that prevents an assessment interview; and those who refuse to give signed consent to participate.
GPs recruit patients for the study from their daily list of patients attending the practice until they reach five patients for each group (active monitoring or pharmacological treatment). Maximum recruiting time is 12 months. GPs inform patients who meet the inclusion criteria of the study’s aim and procedures during the medical visit, at which written informed consent is also obtained. GPs then refer patients for their first assessment appointment.
Interventions
This is a naturalistic study. GPs use their professional clinical judgment to recommend a treatment option to the patient. GPs can recommend a non-pharmacological intervention (Active Monitoring Group) or a pharmacological treatment with antidepressants (Medication Group) following their own clinical criteria and experience.
The patients in the Active Monitoring Group receive the usual treatment that the GPs perform when applying active monitoring without a pharmacological treatment. According to the Catalan guideline [13], which has been presented to all the GPs, active monitoring requires a first follow-up visit within the following 15 days. Afterwards, it recommends from six to eight follow-up visits over 10–12 weeks, where the GPs can consider low intensity psychosocial therapies such as counseling, problem-solving techniques or on-line CBT. Also, it recommends structured and supervised exercise programs of moderate intensity. As part of the stepped care model, in case the patient’s condition does not improve, the GP can intensify the treatment and initiate antidepressants.
Adherence to active monitoring is checked through patient interviews (patients are asked about the number of follow-up visits with the GP and the recommendations for dealing with depression that they received from their GP). Also, at the end of the study, the GPs will be asked to describe the actions that were taken with patients in the Active Monitoring Group.
The patients in the Medication Group receive the antidepressants usually prescribed in Spanish primary care at doses usually recommended according to their symptoms and characteristics. The national guidelines recommend initiating a pharmacological treatment with SSRIs (particularly with citalopram, sertraline, paroxetine or fluoxetine) in accordance with the Catalan Health Service’s recommendations following cost-effectiveness criteria [13].
Adherence to antidepressants is monitored through two methods: pharmacy records (computerized pharmacy records that register information about medication including active principle, dose and units supplied in the patient’s clinical history at the time of purchase), and patients’ self-reported adherence (with the 4-item scale developed by Morisky and colleagues [21]).
At the moment of patient inclusion, GPs complete a form that includes the following information: patient allocation (active monitoring or antidepressants) and reasons for the allocation and type and dose of antidepressant prescribed, if any. Withdrawal or changes in treatment and the reasons (initiation of antidepressants or changes in active principle) will be registered in another form. This will allow the evolution of treatment in naturalistic conditions to be monitored.
Outcomes and participant timeline
The primary outcome of the study is cost-effectiveness, measured as the incremental cost per unit reduction in the severity of depression and as the incremental cost per quality-adjusted life year (QALY) gained. Figure 1 shows all the measures administered at each assessment visit as well as the time schedule of patients.
Costs are collected from a societal perspective. Use of health care resources and lost productivity are assessed using the Client Service Receipt Inventory (CSRI) [22] with a recall period of 12 months at baseline and six months at point two and close-out. We collect information on productivity losses, health tests, hospital care (emergency visits and stays), secondary care (visits to psychologists, psychiatrists and other specialists), primary care (visits to GP and nurse), medication use and social care services (visits to social worker). Information on the use of psychotropic medicines (active principle, dose and units supplied) is also collected from computerized pharmacy records.
The unit costs of public healthcare services are obtained from the Official Bulletin of the Catalan Government. Costs of privately funded services are obtained from published tariffs. The mean price per milligram of active principle is calculated using the prices of the generic versions of all the presentations as reported in the Spanish Vademecum. Productivity losses will be calculated based on the human capital approach using information on the minimum and average daily wage in Spain (INE) [23].
Changes in the severity of depression are assessed using the Patient Health Questionnaire 9-item depression module (PHQ-9) [24-26]. The PHQ-9 is a nine-item scale with items scored from 0 (not at all) to 3 (nearly every day) on nine symptoms of depression. Summed scores range from 0 (no depressive symptoms) to 27 (all symptoms occurring daily). Summed scores of 20 to 27 correspond to severe symptoms; 15 to 19 to moderately severe; 10 to 14 to moderate symptoms; 5 to 9 to mild symptoms; and 0 to 4 to minimal symptoms.
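The PHQ-9 scoring rule described above can be sketched in a few lines. This is only an illustration of the summed score and severity bands given in the text; the function names and the example responses are hypothetical and are not part of the study materials.

```python
def phq9_total(responses):
    """Sum the nine PHQ-9 item responses, each scored 0 to 3."""
    if len(responses) != 9 or any(r not in (0, 1, 2, 3) for r in responses):
        raise ValueError("PHQ-9 needs nine responses, each scored 0, 1, 2 or 3")
    return sum(responses)


def phq9_severity(total):
    """Map a summed score (0 to 27) to the severity band described in the text."""
    if not 0 <= total <= 27:
        raise ValueError("PHQ-9 total must lie between 0 and 27")
    if total >= 20:
        return "severe"
    if total >= 15:
        return "moderately severe"
    if total >= 10:
        return "moderate"
    if total >= 5:
        return "mild"
    return "minimal"


# Hypothetical responses for one patient (illustration only).
example = [2, 1, 3, 2, 1, 0, 2, 1, 0]
total = phq9_total(example)
print(total, phq9_severity(total))
```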
The Spanish version of the EuroQol-5D (EQ-5D) is used to measure health-related quality of life [27-29]. The EQ-5D records self-reported problems in five domains (mobility, self-care, usual activities, pain/discomfort and anxiety/depression) divided into three levels of severity (no problems, some problems, and extreme problems), thus generating 243 possible health states [30]. Each state corresponds to a single index value referred to as the tariff. Value 1.000 is the best health state and value 0.000 corresponds to being dead. The second part records the subject’s self-assessed health on a Visual Analogue Scale (VAS) on which the best and worst imaginable health states score 100 and 0, respectively. QALYs are calculated by multiplying the utility by the amount of time a patient spent in a particular health state. Linear interpolation is used for transitions between health states.
Clinical diagnosis according to DSM-IV diagnostic criteria was confirmed using the research version of the Structured Clinical Interview for DSM-IV Axis I Disorders (SCID-I) [31]. The modules of mood and anxiety disorders were used. A low concordance between diagnosis of MDD by GPs and SCID-I criteria has been described [9], so it was considered important to check the diagnosis with the SCID-I. However, because the study is naturalistic, the diagnosis according to DSM-IV criteria was informative only and was not used as an inclusion criterion. GPs were blind to the DSM-IV diagnosis and patient inclusion was performed according to their usual practice.
Disability was assessed using the 12-item interviewer-administered version of the World Health Organization Disability Assessment Schedule (12-item WHO-DAS 2.0). Respondents have to indicate the level of difficulty experienced taking into consideration how they usually do the activity, including the use of any assistive devices and/or the help of a person. In each item, individuals have to estimate the level of disability during the previous month using a 5-point scale (none = 1, mild = 2, moderate = 3, severe = 4, extreme/cannot do = 5). The total score is calculated with a syntax provided by the WHO and can range from 0 to 100, with higher scores reflecting greater disability. In the ERASMAP study, the Spanish version demonstrated adequate psychometric properties (Cronbach’s α = 0.89) and evidence of unidimensionality [32-34].
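As an illustration of the QALY calculation, the sketch below applies linear interpolation between assessments, which amounts to the trapezoidal area under the utility curve. The assessment times (baseline, 6 and 12 months) and the utility values are assumptions made for the example, and the function name is hypothetical.

```python
def qalys(times_in_years, utilities):
    """Area under the utility curve with linear interpolation between visits."""
    if len(times_in_years) != len(utilities) or len(utilities) < 2:
        raise ValueError("need matching times and utilities for at least two visits")
    total = 0.0
    visits = list(zip(times_in_years, utilities))
    for (t0, u0), (t1, u1) in zip(visits, visits[1:]):
        if t1 <= t0:
            raise ValueError("assessment times must be strictly increasing")
        # Mean utility over the interval multiplied by the interval length.
        total += (t1 - t0) * (u0 + u1) / 2.0
    return total


# Hypothetical EQ-5D tariffs at baseline, 6 months and 12 months.
patient_qalys = qalys([0.0, 0.5, 1.0], [0.5, 0.7, 0.8])
print(round(patient_qalys, 3))
```

A patient who stayed at a utility of 1.000 for the whole year would accrue exactly one QALY under this rule.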
The Beck Anxiety Inventory (BAI) is a twenty-one item self-report inventory that evaluates the severity of anxiety. Each question has four possible answer choices that go from 0 (not at all) to 3 (severely) with total scores ranging from 0 (minimal level of anxiety) to 63 (severe anxiety) [35,36].
Chronic physical conditions were assessed using a “yes” or “no” check-list.
Medication side effects: evident side effects are assessed using a brief check-list covering the most common side effects of antidepressants. For each side effect, the intensity, frequency and causal relation with antidepressant drugs are assessed.
The Beliefs about Medicine Questionnaire (BMQ) assesses the cognitive representations of medication [37,38]. It has two sections, the BMQ-General and the BMQ-Specific. The BMQ-General evaluates general medication beliefs and comprises two 4-item factors: General-Harm (medicines are harmful, addictive or poisonous) and General-Overuse (medicines are overused or excessively trusted by doctors). The BMQ-Specific evaluates representations of specific medication prescribed for the patient, in this case antidepressants. This part was only administered to patients in the antidepressant group. The BMQ-Specific comprises two 5-item factors: Specific-Necessity (the need for antidepressants) and Specific-Concerns (dangers of using antidepressants).
Sociodemographic characteristics are evaluated at the beginning of the study: age, gender, marital status, education and working status.
Statistical analysis
The analysis will follow the intention-to-treat principle: all patients will be analysed in the group to which they were allocated, regardless of the treatment they finally received.
Missing data
Missing data patterns will be evaluated to assess whether it is plausible that data are missing at random [39]. To minimize the bias that arises when data are not missing completely at random, missing values will be imputed using multiple imputation by chained equations. The imputation model will include relevant socio-demographic and prognostic variables associated with drop-out, the outcome variables, and the variables to be included in the final cost-effectiveness models [40].
Propensity score calculation
The allocation of patients is done according to the GP’s decisions. Bias is therefore expected, as the groups may not be comparable. To minimize the bias resulting from the lack of randomization, we will use a propensity score. First, it will be evaluated whether the probability of receiving active monitoring or antidepressants is affected by GP factors (socio-demographic characteristics, job characteristics, training, attitude towards depression, interest in mental health and participation in communication groups) and/or patient factors (socio-demographic characteristics, baseline severity of depression, presence of major depression according to SCID-I criteria, severity of anxiety, comorbid conditions and beliefs about medicines). Second, a logistic regression model will be used to calculate a propensity score. The dependent variable of this model will be the group (active monitoring or antidepressants) and the independent variables will be those that are associated with a higher or lower probability of receiving active monitoring or antidepressants.
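The second step can be illustrated with a minimal sketch of a propensity score obtained from a logistic regression. Everything specific here is an assumption for illustration: a single covariate (baseline PHQ-9), made-up data, hypothetical function names, and a plain gradient-ascent fit written with the standard library in place of a statistical package. The actual model would include the GP and patient factors selected in the first step.

```python
import math
from statistics import mean, pstdev


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def standardize(column):
    """Centre and scale one covariate so that gradient ascent converges quickly."""
    m, s = mean(column), pstdev(column)
    return [(v - m) / s for v in column]


def fit_logistic(X, y, lr=0.5, n_iter=2000):
    """Maximise the logistic log-likelihood by batch gradient ascent.

    Returns [intercept, b1, ..., bp].
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            row = [1.0] + list(xi)
            err = yi - sigmoid(sum(b * v for b, v in zip(beta, row)))
            for j, v in enumerate(row):
                grad[j] += err * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def propensity_scores(X, beta):
    """Predicted probability of receiving antidepressants for each patient."""
    return [sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], xi))) for xi in X]


# Hypothetical data: baseline PHQ-9 and allocation (1 = antidepressants,
# 0 = active monitoring).
phq9 = [8, 10, 12, 14, 16, 18, 20, 22]
group = [0, 0, 1, 0, 1, 0, 1, 1]
X = [[z] for z in standardize(phq9)]
beta = fit_logistic(X, group)
scores = propensity_scores(X, beta)
print([round(s, 2) for s in scores])
```

The resulting score can then enter the outcome models as an adjustment covariate.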
Incremental cost-effectiveness ratios calculation
Incremental effects and costs between groups will be modelled using generalized linear models. First, different distribution families and link functions will be tested, and the Akaike and Bayesian information criteria (AIC and BIC) will be used to select the model that best fits the distribution of the effects (QALYs and severity of depression) and costs. Second, to select adjustment variables, socio-demographic and baseline clinical variables considered to be relevant will be tested in the models using likelihood ratio tests. All the models will be adjusted for gender, age and the propensity score. Differences in costs and effects will be calculated using the final models in each of the imputed databases and combined as per Rubin’s rules [40]. Incremental cost-effectiveness ratios will be calculated by dividing the difference in costs between groups by the difference in effects.
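The pooling and ratio steps can be sketched as follows. The sketch assumes that each imputed database has already produced an estimated difference and its variance from the fitted model; the numbers and function names are hypothetical and serve only to show how Rubin’s rules and the ratio are applied.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine per-imputation estimates with Rubin's rules.

    Returns the pooled estimate and its total variance.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching estimates and variances from >= 2 imputations")
    pooled = mean(estimates)
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (m - 1 denominator)
    total = within + (1.0 + 1.0 / m) * between
    return pooled, total


def icer(delta_cost, delta_effect):
    """Incremental cost-effectiveness ratio: cost difference over effect difference."""
    if delta_effect == 0:
        raise ZeroDivisionError("the ICER is undefined when the effect difference is zero")
    return delta_cost / delta_effect


# Hypothetical between-group differences from three imputed databases.
cost_diffs, cost_vars = [120.0, 100.0, 110.0], [400.0, 420.0, 380.0]
qaly_diffs, qaly_vars = [0.020, 0.030, 0.025], [1e-4, 1e-4, 1e-4]

d_cost, v_cost = pool_rubin(cost_diffs, cost_vars)
d_qaly, v_qaly = pool_rubin(qaly_diffs, qaly_vars)
print(d_cost, d_qaly, icer(d_cost, d_qaly))
```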
Cost-effectiveness planes and cost-effectiveness acceptability curves
To deal with the uncertainty in the sampling distribution of the incremental cost-effectiveness ratio, resampling techniques with bootstrapping will be used. Replications will be done in each of the imputed databases and then combined. A minimum of 5,000 replications will be generated. Bias-corrected and accelerated (BCa) confidence intervals will be estimated on each of the imputed databases and then averaged [41]. Bootstrapped pairs of cost and effect differences will be plotted on cost-effectiveness planes and used to construct the cost-effectiveness acceptability curves.
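A simplified sketch of the resampling step is shown below. It resamples patients within each arm, uses unadjusted mean differences in place of the adjusted models, and builds the acceptability curve as the share of replications with a positive net monetary benefit at each willingness-to-pay threshold. The data, thresholds and function names are hypothetical, and BCa intervals and the combination across imputed databases are not implemented.

```python
import random


def bootstrap_pairs(costs_ref, effects_ref, costs_new, effects_new,
                    n_boot=5000, seed=0):
    """Resample patients within each arm; return (cost diff, effect diff) pairs."""
    rng = random.Random(seed)
    n_ref, n_new = len(costs_ref), len(costs_new)
    pairs = []
    for _ in range(n_boot):
        # The same index is used for a patient's cost and effect, which keeps
        # the correlation between the two outcomes.
        ref = [rng.randrange(n_ref) for _ in range(n_ref)]
        new = [rng.randrange(n_new) for _ in range(n_new)]
        d_cost = (sum(costs_new[i] for i in new) / n_new
                  - sum(costs_ref[i] for i in ref) / n_ref)
        d_effect = (sum(effects_new[i] for i in new) / n_new
                    - sum(effects_ref[i] for i in ref) / n_ref)
        pairs.append((d_cost, d_effect))
    return pairs


def ceac(pairs, wtp_values):
    """Probability that net monetary benefit (wtp * dE - dC) is positive."""
    return [sum(1 for dc, de in pairs if wtp * de - dc > 0) / len(pairs)
            for wtp in wtp_values]


# Hypothetical per-patient costs (euros) and QALYs in two small arms.
costs_am, qalys_am = [900.0, 1200.0, 700.0, 1500.0], [0.62, 0.70, 0.58, 0.66]
costs_ad, qalys_ad = [1100.0, 1000.0, 1600.0, 1300.0], [0.68, 0.64, 0.73, 0.69]

pairs = bootstrap_pairs(costs_am, qalys_am, costs_ad, qalys_ad, n_boot=1000)
curve = ceac(pairs, [0, 10000, 20000, 30000])
print(curve)
```

The pairs returned by `bootstrap_pairs` are the points that would be plotted on the cost-effectiveness plane.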
Sensitivity analyses
At minimum, the following sensitivity analyses will be performed: 1) an analysis from the healthcare perspective; 2) a per-protocol analysis; 3) an analysis using the average wage instead of the minimum wage to value productivity losses; 4) an analysis not adjusted for the propensity score; 5) an analysis including only the patients who fulfill DSM-IV criteria for MDD.
Sample size
In cost-effectiveness studies, sample size calculations have been criticized and their usefulness has been questioned [42]. The calculation is influenced by many parameters, some of them related to costs, which must be specified a priori. However, knowledge about costs and their deviations is scarce, so assumptions are needed that affect the calculated sample size. Moreover, the calculation requires deciding in advance the maximum acceptable incremental cost-effectiveness ratio, or maximum willingness to pay. Again, where to set the cutoff points for these parameters is a complicated decision. Finally, it should be borne in mind that the results of the economic evaluation will be presented as cost-effectiveness acceptability curves. These curves are not based on statistical inference, so the meaning of sample size calculations is again open to question.
On the other hand, dealing with the uncertainty in the sampling distribution of the incremental cost-effectiveness ratio requires resampling techniques, which in turn require a minimum sample size. Previous experience with studies of this kind under naturalistic conditions indicates that a total of 150 patients per arm will suffice [41,43].