EFFORT-D: results of a randomised controlled trial testing the EFFect of running therapy on depression

Abstract Results of a randomised controlled trial testing the EFFect Of Running Therapy on Depression. Background This randomised controlled trial explored the anti-depressive and health effects of add-on exercise (running therapy or Nordic walking) in patients with Major Depressive Disorder (MDD). Methods Patients were recruited at three specialised mental health care institutions. In the intervention group exercise was planned two times a week during 6 months, the control group received care as usual. Observer-blinded measurements included Hamilton-17 depression scores and several health and fitness parameters. Submaximal bicycle-tests were performed at inclusion, 3, 6 and 12 months. The effects of exercise were assessed by effect size, intention-to-treat and analysis per protocol using General Linear Models (GLM) with time x group interactions. Results In total, 183 patients were assessed for eligibility and 135 were excluded (40% of the potential participants declined to participate mainly due to a lack of time and motivation). Together with a drop-out of 55% at 6 months, this reduced the power of the study severely. As a result, statistical analysis was performed only on the first 3 months of the study. Data were ultimately analysed from 46 patients, of which 24 were in the intervention group. Significantly more women were in the intervention group, and depression and fitness were higher in the control group. Participants showed 2–3 points less depression on average after 3 months. However, the GLM showed no effect on depression (Cohen’s d < 0.2, F = .13, p = .73) in both the intention-to-treat and per protocol analyses. However, large effect sizes (Cohen’s d > 0.8) were found for aerobic capacity (VO2max∙.kg− 1, F = 7.1, p = .02*), maximal external output (Wmax∙.kg− 1, F = 6.1, p = .03*), and Body Mass Index (F = 5, p = .04*), in favour of the intervention group. Conclusions In this selective and relative small clinical population with MDD, an anti-depressive effect of the exercise intervention could not be measured and is also unlikely due to the very low effect size. An integrated lifestyle intervention will probably be more effective than a single add-on exercise intervention. However, significantly increased fitness levels may contribute to the alleviation of current cardio-metabolic risk factors or prevention of these in the future. Trial registration: Netherlands Trial Register (NTR): NTR1894 on July 2nd 2009.

treatment for MDD. Language barriers, costs and availability of therapists could be a barriers to CBT and IPT. Regarding medication, only 30% of the patients reach full remission within 12 weeks using these first-line antidepressant treatments [10]. Medication can also have undesirable side-effects, such as increased cardiovascular risk, and is often resisted by patients, leading to poor adherence and suboptimal response.
As an alternative or add-on, most guidelines on the treatment of depression [11][12][13][14] recommend physical activity (PA), such as walking, cycling and exercise (a planned, structured, repetitive and purposive physical activity [15]). The effect of PA on depression has been tested extensively. However, in most randomised controlled trials (RCTs) studying the effect of exercise on depression [16][17][18], the included patients had mild to moderate depression. This is partly because most of these studies relied on self-selected volunteers and/or patients who were included based on a self-rating depression scale (Becks Depression Inventory [19]). Only some of the participants were diagnosed by a trained clinician with an unipolar depression or MDD according to the Diagnostic and Statistical Manual of Mental Disorders (DSM) [20] or the International Classification of Diseases (ICD) [21]. Three such RCTs studied adult inpatients with MDD (total n = 113), including one study that was published in 1985 [22].
Due to ethical considerations when treating patients with MDD, in current clinical practice an add-on approach is preferable to stand-alone exercise treatment. Depression in these patients is often severe and protracted, and they are already using antidepressants. Therefore, when testing the effect of PA in specialised mental health care settings, an add-on designi.e. adding PA to ongoing treatmentsreflects real-world clinical practice better than testing exercise as a monotherapy, which may be appropriate in patients with milder depressive illness. To our knowledge, few recent RCTs [23] have tested the effect of adding PA to ongoing care in clinically diagnosed MDD patients in regular specialised mental health care. To this end, we designed the EFFORT-D study: EFFect Of Running Therapy on Depression. The intervention consisted of running therapy (RT) or Nordic walking (NW) during 6 months and a postintervention follow-up at 12 months. The study design of EFFORT-D has been published previously [24]. The main objective of the study was to assess the effectiveness of RT or NW, in addition to usual care, on MDD in adult inpatients and outpatients. We hypothesised that adding exercise therapy to usual care would result in a larger reduction in depressive symptoms (as measured with the Hamilton Rating Scale of Depression (HAM-D17, see method section), after a 6 month treatment programme as well as after another 6 months follow-up, compared to usual care without exercise therapy. Secondary objectives were to assess the effectiveness of RT or NW on the following outcome measures: 1) Fitness parameters: maximal oxygen uptake (VO 2 max•kg − 1 ), maximal external power output (Wmax•kg − 1 ), heartbeat at rest, grip strength; 2) Metabolic risk factors: body composition (Body Mass Index, fat-free mass, muscle mass, visceral fat, waist circumference), blood pressure, fasting blood glucose, blood fats; 3) Co-morbid symptoms of anxiety and pain; 4) Quality of life; 5) Cost-effectiveness.

Design
We conducted an observer-blinded RCT with adult patients with MDD treated in either an inpatient department or in an outpatient clinic or day hospital.

Setting
The study was carried out between December 2012 and January 2015 in three specialised mental health care institutions in the Netherlands. The setting included patients from two inpatient emergency wards and three specialised outpatient clinics for affective disorders, ensuring the inclusion of patients with more severe, protracted disorders. As a result of unforeseen conditions, which were described previously in a process evaluation study [25], 6 months after the start of the study, one of the specialised mental health care institutions was no longer available as an inclusion location. The two remaining institutions consisted of inpatient units and day hospitals with multidisciplinary teams with psychiatrists, a specialised mental health care general practitioner (GP), a neurologist on demand, psychotherapists, sociotherapists, occupational and physical therapists, a social worker and nurses. Patients were referred by outpatient mental health care services (emergency or otherwise). The patients assigned to two locked emergency wards continued their stay after stabilisation and reduction of suicide risk in an open crisis ward, and often continued their treatment in a day hospital followed by the outpatient programme (psychotherapy, medication and outreaching social nurse).
Patients of the two outpatient clinics were referred by GPs, and were assessed, diagnosed and treated by a psychiatrist, psychotherapist and/or a social psychiatric nurse.

Participants
Eligible participants were patients between 18 and 65 years old with MDD, bipolar depression or seasonal depression not responding to bright light therapy. Patients with severe and acute symptoms of MDD who were referred to an emergency ward were eligible for inclusion and, when stabilised and capable of giving informed consent, were visited by the research assistant. All participants were diagnosed by a clinician before referral to the study and fulfilled the criteria for MDD according to the Diagnostic and Statistical Manual of Mental Disorders, DSM-IV-TR [20]. In case of a HAM-D17 score [26] ≥ 14, a cut-off score for mild depression, the candidate was invited to participate after the protocol and aims of the study were explained. Exclusion criteria were a depressive disorder as part of a psychotic disorder, schizophrenia, schizoaffective disorder, or obsessive-compulsive disorder or anxiety disorder as primary diagnosis. Also excluded were patients with significant cardiovascular disease or other medical conditions as contra-indication for exercise therapy, walking and/or running such as joint and hip pathology, alcohol/drugs dependence as a primary diagnosis, pregnancy, high suicide risk with treatment on a closed ward, or already being physically active on a regular basis (2 or more times a week at a high-intensity level).

Trial staff
The assessment of the HAM-D17 at baseline was performed by the research-assistant prior to randomisation. At the other measurement points, the HAM-D17 was measured by one of the three blinded outcome assessors (psychologists). They were all trained before the start of the study by an expert in HAM-D17 testing to reach consensus. As a control measure during the study, HAM-D17 interrater sessions were performed monthly on a patient with MDD (nonparticipant in the study), and no more than 2 points difference on the HAM-D17 scale between outcome assessors was accepted. If this criterion was not met, additional training was scheduled.
The inclusion procedure was performed by the research assistant, who was trained for this beforehand. The procedure consisted of an eligibility check followed by an appointment for randomisation, online questionnaires, biometrics, blood pressure measurement and a submaximal Åstrand bicycle test. During the first assessment, a lab appointment for blood sample collection was scheduled.
The intervention trainers were a certificated physiotherapist and running therapists who were very experienced in RT and NW as well as in working with patients with severe mental disorders.

Recruitment
Within the three hospital regions, attention was drawn to the study with flyers for patients and professionals, announcements on the website, and bilateral visits of the principal researcher and research assistant to the heads and therapists of affective disorder programmes and intake-teams of the outpatient departments. The included participants received one gift-voucher worth 25 euros as an incentive for future participation in training and measurement activities.

Randomisation and masking
Randomisation took place at each location separately. As a result, each location had an equal distribution of participants in the intervention and control group. The SPSS random generator, SPSS version 14.0 [27] was used to allocate patients. Participants were randomly assigned to exercise twice a week for 6 months (the participant could choose NW if RT was not possible, for instance if they had some joint problems), in addition to their treatment as usual or standard treatment. Both groups were followed for 1 year with measurements at inclusion (T0), 3 months (T3), 6 months (T6) and after 12 months (T12). Figure 1 shows the flow diagram of the study.
At the end of the inclusion procedure, if informed consent was given, 10 closed opaque envelopes with allocation numbers were presented to the participants by the research assistant. The participant could choose an envelope, after which the research assistant told the participant in which arm of the study he or she was included, and the participant was enrolled into the study by the research-assistant. Due to the type of intervention, the participants, research assistant and trainers could not be blinded. Evaluators of the main outcome measure (HAM-D17) were blinded for group allocation. All biometric measurements and the Åstrand bicycle test were performed by the research assistant. The questionnaires (described in the measurements subsection) were self-assessed online by digital code and filled in by the participants.

Intervention
In total, around 40 exercise sessions were scheduled during 6 months. We expected (based on clinical experience) a long remission-to-recovery period in our patient population, from at least 2-3 months. We decided to take a rather long exercise duration of 6 months, because we were also interested in the maintenance of a possible exercise effect which was not described yet in the available literature. These add-on exercise sessions took place twice a week: once a week a supervised group session was offered and once a week the patient exercised individually, with clear instructions on beforehand and an evaluation with selfreport at the beginning of the next supervised session. Each supervised session, in which the trainers worked according to a standardised protocol, lasted 1 hour, including 30 min of running therapy (RT) or Nordic walking (NW). The remaining time was spent on warming-up and coolingdown. Each subject followed an individualised intervention protocol with gradually increasing training intensity. The goal was to achieve a 30 min period of continuous running or Nordic walking in the last sessions (30 min of continuous aerobic exercise twice a week at at least 60% of their estimated maximum heart rate). Given the importance of easy access to the training sites, care was taken that both the timing (flexibility of the trainers) and the location (near to the patients and easy to access) were convenient for participants. These intensive exercise sessions were not offered to the control group. They were only allowed to exercise at low intensity (walking and recreational sports) and participate in psycho-motor therapy in their treatment programme. The intervention group also had psychomotor therapy in the treatment programme and were not encouraged to do intensive exercise between the training sessions. Furthermore, participants in both arms of the study received treatment as usual consisting of pharmacotherapy, cognitive therapy or interpersonal therapy, in addition to psycho-education and lifestyle advice emphasising an active lifestyle with enough walking per day.

Measurements
All outcome parameters measured during baseline were repeated after three, six and 12 months, except for the blood samples, which were taken only at T0 and T6. [24].

Major depressive disorder
The primary outcome was change in severity of depressive symptoms measured with the HAM-D17. This instrument has been shown to have a good interrater reliability of .92 (Pearson) and internal consistency of .82 (Cronbach's alpha) if used by trained and experienced outcome assessors [28]. It is the most frequently used instrument to measure changes in depressive symptoms in clinical trials, especially in patients with more severe depression [29].

Fitness
A submaximal Åstrand bicycle test was performed on a stationary bicycle ergometer (Examiner, Lode BV, the Netherlands). The heart rate during this test was registered with a heart rate monitor (Polar RS 800, Electro Oy, Finland), and the mean heart rate of the last 2 minutes of the test combined with the submaximal workload was used to estimate the maximal external power output (Wmax) and the maximal oxygen uptake (VO 2 max). The estimated Wmax in Watt and VO 2 max in ml•min − 1 , both related to body weight (Wmax•kg − 1 ; VO 2 max•kg − 1 ), were used as outcome measures.
• The heartbeat at rest was measured by the heart beat monitor.
• Grip-strength was tested according to protocol twice in both hands (highest value was registered) using a hydraulic hand dynamometer (Jamar J00105, Sammons Preston Rolyan, Bolingbrook, USA).

Metabolic parameters
• A bio-impedance scale was used to measure weight and body composition (Omron HBF-510, Omron Healthcare Europe BV, the Netherlands). Height was measured according to protocol (Seca 214, Hamburg, Germany) and BMI was calculated by standard formula. Waist circumference was measured twice with a tape measure (Seca 201, Hamburg, Germany) at the midpoint between the lower border of the ribs and the upper border of the pelvis.
• Systolic and diastolic blood pressure were measured twice at rest using an electronic blood pressure meter (Omron M6 comfort, Omron Healthcare Europe BV, the Netherlands) with an adequate cuff size. The mean value of both measurements was registered. • Blood samples were taken at T0 and T6 to determine fasting blood glucose and blood fat. All blood samples were kept frozen and analysed together in one laboratory session at the end of the study.

Psychological and social measurements
• Co-morbid symptoms of anxiety were measured by the Beck Anxiety Inventory (BAI), a 21-item multiple-choice self-report inventory that measures the severity of generalised anxiety and panic symptoms in adults and adolescents on four levels. The BAI has been shown to be highly internally consistent (Cronbach's alpha = .94) and was scored as sufficient on tests of convergent and discriminant validity [30].
• Co-morbid pain was measured with the Graded Chronic Pain Scale (GCPS), a 7-item scale measuring aspects of pain, physical ability and social interference, resulting in a 5-class hierarchical scale ranging from 0 (no pain problem) to IV (high disability/severely limiting). (Internal consistency in Cronbach's alpha: .71-.74, Pearson's correlation coefficient .45-.58). • Quality of life (QoL) data were collected using the EuroQuol 5 dimension questionnaire (EQ-5D), a standardised instrument that measures five dimensions of health (one item each): mobility, self-care, usual activities, pain/discomfort and anxiety/depression, rated from 1 (no problems) to 3 (many problems). The added value of the EQ-5D is the calculation of an index score ranging from 0 (worst QoL) to 1 (perfect QoL) using the Dutch value set based on time-trade-off weightings of a representative Dutch sample [31]. • Cost-effectiveness: Healthcare use and work productivity were evaluated by the Trimbos/iMTA Questionnaire for Costs associated with Psychiatric Illness (TIC-P), a 29-item list which focuses on establishing costs related to loss of productivity at work and to health care utilisation. A feasibility and validation study showed that the TIC-P is a valid instrument for measuring medical consumption and productivity losses [32].

Other variables
Physical activity • SQUASH: This questionnaire measures physical activity (days, time and effort of PA) in four domains of daily life during a normal week in the past months and enables computation of the intensity of energy expenditure. Intensity is defined as the ratio of work metabolic rate to a standard resting metabolic rate (MET) [33]. The SQUASH has been shown to have a Spearman's correlation coefficient of .58 for total reproducibility and .45 for total relative validity compared with an accelerometer, and is suitable for population studies [34]. This questionnaire was only used retrospectively to determine whether patients were too physically active at T0 and eventually remove them before statistical analysis.
Medication • Type and duration of medication was derived from patient records retrospectively.

Sample size
It was expected that patients in the usual care group (control group) would respond with a mean reduction in HAM-D 17 of six points, because they were all following the standard MDD treatment protocol (this was based on clinical experience). Adding exercise to usual care (intervention group) was expected to result in a decline of at least eight points (2 points more than the control group) on the HAM-D 17 score. To detect this difference, with an α (two-tailed) of 5% and a power (1-β) of 80%, using two equal groups and a standard deviation of 5 points, 100 patients were needed in each group. Taking 30% drop-out into account, 140 patients had to be included in each group [24].

Data-analysis
The statistical analysis was performed using SPSS version 24 [35]. The normality and homogeneity of the data were tested and confirmed by Kolmogorov-Smirnov tests. Independent Chi 2 -tests and Fisher's exact (Fe) tests (categorical variables) and analysis of variance (ANOVA) (continuous variables) were used for comparison of the intervention and control group characteristics at baseline. Student's Ttests with 95% confidence interval were used for independent samples. Intention to treat (ITT) analysis was performed using General Linear Models (GLM) repeated measures with time x group interactions to test the differences in intervention and control group at different times [36]. For the per protocol (PP) analysis, the intervention participants were selected and divided into three groups based on the number of sessions they participated (0-10, 10-20 sessions and > 20 sessions). Participants with more than 10 sessions were included.

Ethical approval
The

Participant flow during the study
In total 183 patients were assessed as eligible to participate within the 2-year time-frame of the EFFORT-D study.
A relatively high number (n = 135) of these eligible participants were excluded for various reasons, see Fig. 1. The largest group of excluded patients (40%) were those who expected that they were not able to organise their daily life in such a way that they could participate in the exercise intervention twice a week. This expectation was based on the following reasons: the programme was too time consuming, they were too tired and they were unable to commit to the study and the exercise intervention. Secondly, 20% of the outpatients referred by a GP were admitted to the affective disorder programs after completing the intake procedure. However, by the time they were rated on the HAM-D17 (on average 10 weeks later), their score was already below 14, which made them ineligible to participate. Other causes for exclusion were medical contra-indication (25%) and already exercising twice a week for at least 30 min (15%). The planned dose of exercise was ultimately administered to 25 participants in the intervention group; the analysis was performed on a group of 24 participants (see Fig. 1). Between allocation to the intervention group (T0) and the first follow-up measurement (T3), 10 of the 25 initial participants (40%) dropped out. The dropout rate increased to 56% at T6 and 80% at T12. In the control group, a similar drop-out pattern was seen: 43% at T3, 65% at T6 and 82% at T12 (Fig. 1). Of the 48 participants in both arms included in the study, two highly physically active participants (METS/week > 750 = extremely high PA) were excluded retrospectively from the analyses based on SQUASH outcomes. Ultimately, 46 remained for further analysis, with 24 participants in the intervention arm.
Due to low attendance at the blood lab (60% at T0 and 83% at T6), only the blood samples at T0 could be analysed.

Participant characteristics at baseline
At baseline, 45% of the included participants were treated inpatients (17% were from the crisis care ward, 28% were from the day hospital) and 55% were outpatients. Table 1 shows the participant characteristics at baseline.
The baseline measurements showed that the intervention group included almost twice as many women than the control group, which has a large influence on parameters such as e.g. fitness (VO 2 max and Wmax) and muscle mass. The control group showed a significantly higher HAM-D17 score. In the total group, the criterion of hypertension (mean systolic blood pressure ≥ 140 mmHg) was met with a borderline mean diastolic blood pressure just below 90 mmHg. The mean BMI indicates overweight, and the mean waist circumferencea measure for central (visceral) adipositaswas above the norm: 95.0 (SD = 14.4) cm in females with ≥88 cm as cut-off score and 102.4 (SD = 13.3) in males with ≥102 cm as cutoff score, based on the criteria of the International Diabetes Federation (IDF) Epidemiology Task Force Consensus Group [38]. On the TIC-P, the control group had significantly more GP visits 6 weeks before inclusion. The majority of the participants were apparently treated along the depression protocol by a GP or mental health professional before inclusion, with on average 72% of the total group receiving antidepressants. Interpretation of the medication use indicated that some patients suffered from MDD with psychotic symptoms (melancholic depression) or/and were already in a following step of the MDD protocol (added lithium treatment). The severity of MDD was also underlined by the high mean rate of work sickness leave and inability to perform household tasks. The average delay between the intake procedure after GP referral, followed by an indication for the affective disorders programme until inclusion into the study for these outpatients, was 10.4 weeks. The participants in the emergency wards had to stabilise before inclusion, after which they were able to give informed consent and were included with an average delay of 3.5 weeks. As expected, due to the severity of their depression, the average HAM-D17 score at inclusion for inpatient or day hospital participants (25.7) was higher than for outpatients (20.5).

Effectiveness of the intervention
Due to the high number of drop-outs, statistical analysis was performed for the primary outcome (HAM-D17), the submaximal Åstrand bicycle test (VO 2 max in ml•min − 1 •kg − 1 and Wmax in W•kg − 1 ), and the biometric values between T0 and T3 only. Intention to treat and per protocol analysis using GLM (interaction time by group) showed no significant additional reduction in depressive symptoms in favour of the intervention group. Depression on the HAM-D17 decreased in both groups on average by 2-3 points. Both fitness parameters of the submaximal Åstrand bicycle test, the BMI and visceral fat (the last one only in PP analysis) showed significant improvements compared to the control group. Although no significant difference between both groups was seen in waist circumference and visceral fat mass (in ITT analysis), these outcome variables showed a large clinical effect (Cohen's d > 0.8) independent of the HAM-D17 scores. Heartbeat at rest and diastolic blood pressure, as a possible expression of autonomic nervous system modulation by exercise, showed a moderately large effect in favour of the intervention group (Table 2). For both the intervention and control group, Figs. 2 and 3 illustrate the difference in Wmax•kg − 1 and VO 2 max•kg − 1 between T3 and T0 in relation to the difference in HAM-D17 between T3 and T0. These figures show that the fitness of the intervention group increased greatly independent of the decrease in depression in both groups.
Analysis of effectiveness on blood fats, fasting glucose, BAI, GCPS, EQ-5D and TIC-P were not performed due to missing data.

Discussion
The EFFORT-D RCT examined the effect of an add-on exercise intervention in patients with MDD treated in specialised mental health care in the Netherlands. To the best of our knowledge, this study is one of the few [16] RCTs with 'real-life' inpatients and outpatients with MDD which investigated the effect of an add-on aerobic exercise intervention on the course of depression. Moreover, few previous studies have used the evaluation of fitness parameters based on a submaximal Åstrand bicycle test [22,23,39], which is more objective and accurate than self-rated fitness [40].
Our main hypothesisthat a significant reduction in depressive symptoms would result from add-on exercise in the intervention group relative to the control groupwas rejected. Both groups did improve at 3 months follow up, but there was no significant difference between the two arms. Given the small numbers of patients we were able to include, the lack of power precludes firm  conclusions about the effectiveness of the intervention, and this RCT therefore can be characterised as a "failed trial". However, the very small effect sizes (Cohen's d was 0.18 in the ITT analysis and 0.10 in the PP analysis) suggest that a larger trial is unlikely to find a clinically relevant effect of add-on exercise on depression. The effects of the intervention on fitness (VO 2 max in ml•min − 1 •kg − 1 and Wmax in W•kg − 1 ) were large, suggesting a considerable increase in fitness in the intervention group. The effects on metabolic risk factors, such as BMI, waist circumference and visceral fat, were large (Cohen's d effect size > 0.8 in favour of the intervention group). This is in line with another recent 6 weeks addon exercise study in MDD patients, in which physical fitness and metabolic syndrome factors improved in the exercise group [23]. A consistent component of significant changes in BMI and waist circumference in the EFFORT-D study is probably caused by a significant decrease of visceral fat. Given the relationship between white adipose tissue as a main player in the neuroimmune interactions between obesity and depression [41], a direct effect of exercise on the supposed chronic low-grade inflammatory state in these patients would be plausible. Unfortunately, lack of blood samples for CRP measurement precluded testing this. Regarding parameters reflecting a possible modulation of the autonomous nervous system, in which an overactive sympathetic activation is supposed to contribute to cardiovascular risk (CVR) factors [7], such as blood pressure and heartbeat at rest showed a moderately large effect size in the intervention arm. A considerable proportion (40%) of the initially eligible depressed patients declined to participate in the study, and analysis of the drop-outs indicated low motivation to participate in this type of exercise intervention. Once included, adherence to the intervention was a main concern for our trial staff. Despite increasing effort in motivating participants and individual adaptations in exercise training, without violating the protocol, this tendency persisted. The fact that the number of drop-outs was equally high in both arms and that no inpatients completed the entire intervention for 3 months might be explained by the severity of MDD in this specific population. Low adherence to the exercise intervention is assumed to be associated with MDD symptoms such as fatigue, loss of energy and initiative [25]. A process evaluation [25] of the EFFORT-D study mainly pointed out that a lack of intrinsic motivation, assumed as part  of the depressive symptomatology was a key factor in failing to continue the exercise. [42,43]. This has been proposed as a potential key factor in the commencement and maintenance of PA in patients with severe mental disorders such as schizophrenia and affective disorders. In similar depressive populations as ours, two recent studiesaddon exercise compared to treatment as usual or two-armed exercise versus stretching RCTsreported the same problems with inclusion and motivation [44,45]. In these studies, 25 to 50% of the eligible participants lacked motivation to participate. In another recent pilot RCT examining the feasibility of an unsupervised, facility-based exercise programme in depressed patients, the most helpful aspect reported was connecting participants to fitness centre resources. Adding physical activity counseling as intervention however, did not increase exercise adherence and had no effect on depression scores [46]. This raises the question of whether this type and frequency of add-on exercise intervention, with also partly unsupervised sessions, is appropriate for patients with more severe MDD. Furthermore, it is very likely that RT and NW are not the preferred types of exercise for all patients in this population. This suggests that exercise cannot be a standard intervention for all patients with MDD, and that more tailored interventions are needed [47]. The severity of MDD in this study population, together with its influence on several domains of daily life (somatic health, work, relations and social functioning), leads to the conclusion that an integrated lifestyle intervention appealing to these domains will probably be more effective than a single add-on exercise intervention. Therefore, more research is needed into potential negative influencers, patient profiling and for individually customised forms of exercise.
As a result of the 3-month exercise period, the fitness of the intervention group improved substantially: 15% increase in the Wmax. kg − 1 and 19% increase in VO 2 max. kg − 1 . In another study in MDD patients, using 1 month of add-on exercise therapy and antidepressants (sertraline), an increase of 8% in cardiorespiratory fitness (VO 2 max) [48] combined with a reduction of sertraline use was found in the exercise group compared with the control group. The fact that less increase in cardiorespiratory fitness was found compared to our study could be explained by the shorter training period of 1 month. In addition, the baseline fitness of our participants was almost 10% lower relative to the fitness of an healthy but untrained Dutch working population [49], which could also play a role. This is because lower initial values leave more room for improvement. Our biometric findings in BMI, body composition (waist circumference and visceral fat) at baseline show a rather high risk of metabolic syndrome. This is in accordance with studies into the metabolic burden and CVR in affective disorders, in which a robust association was found between metabolic syndrome and MDD [2]. Patients with MDD are therefore assumed to be at high risk for cardiovascular morbidity and mortality. In Fig. 2 Intention-to-treat difference scores of the W•kg − 1 plotted against difference scores of the Hamilton Rating Scale of Depression (HAM-D17), and labelled for intervention and control group contrast to earlier research indicating a high prevalence of diabetes Type II in MDD (DM-T2) [50], we found no indication of a high prevalence of diabetes Type II in the participants of the EFFORT-D study at baseline.
A recent meta-analysis showed a modest inverse dependent relationship between symptom severity and CVR, which was stronger in men than in women [51]. In the EFFORT-D study, however, the increase in Wmax•kg − 1 and VO 2 max•kg − 1 and was independent of the HAMD-17 results. Furthermore, a recent population based metaanalysis showed the protective effect of PA on depression [52]. The question remains whether this protective effect also applies to patients who recently remitted from a depressive episode.
In closing, although our primary hypothesis was rejected in this RCT, our results showed a potentially lower cardiovascular risk profile as a result of the intervention, which is a benefit when considering the high cardiovascular risk of depressive patients.

Limitations and strengths
The most important limitation of this RCT is that only about 25% of the intended participants were included, which clearly had a negative impact on the power of the study. The underlying reasons were described in a separate process evaluation paper [25]: a lack of motivation of the patients to participate (or continue to participate) in the study and a dynamic reorganizational surrounding (budgetary cuts and a merger) were the main causal factors. The lack of motivation was assumed to be related to the severity of the patients' depression in combination with (based on the patient records) the psychosocial circumstances which caused and maintained the depressive episode. In addition, excessive loss to follow-up made it impossible to perform the intended statistical analysis of follow-up data at 6 and 12 months. However, the non-significant result of the additional exercise intervention on depression that we found after 3 months of exercise in this type of patients is in accordance with a recent robust meta-analysis in this field, with DSM and ICD diagnosed MDD patients [53].
A further limitation was insufficient data for the analysis of several laboratory parameters and the results of questionnaires on quality of life, anxiety and pain symptoms, and cost-effectiveness. For example, due to the high proportion of no-shows at the blood lab, it was impossible to compare metabolic and inflammatory factors in the blood samples at T0 and T6, as intended in the design of the EFFORT-D study.
One strength of this study is its high-quality design: it was a RCT using the semi-structured Hamilton interview to diagnose the severity of MDD and the Åstrand submaximal bicycle test to objectively measure physical fitness. In addition, the EFFORT-D study is one of the few studies into MDD in 'real-life' patients, offering a useful new clinical perspective of the role of exercise in MDD. The objective measurement of the fitness parameters using bicycle-testing instead of self-rating by patients may also contribute to more scientific evidence for CVR in MDD [54] and prevention in the future.

Clinical implications
This RCT showed that low motivation for exercise and low adherence after starting exercise greatly reduces the number of MDD patients eligible for an add-on intervention. Given the non-significant results on depression scores, the clinician is confronted with the question of whether RT or NW are still considered positive interventions in 'real-life' patients in specialised mental health care. Our data on health parameters, however, showed a significant effect on risk factors for cardiovascular disease, so it seems logical to emphasise these positive health effects in psychoeducation to motivate patients to exercise instead of promising them reduction of depressive symptoms on the short term. This might prevent even more feelings of failure in clinical patients, which are already a burden related to their MDD.

Recommendations
Future studies into exercise for MDD patients should focus on participation from the first moment of contact with the patient. Risk factors for drop-out and low adherence are increasingly known and could be identified in individual patients, thus offering a risk-profile leading to more tailored attention to and guidance of the patient. In addition, more research is needed to identify factors that precipitated non-adherence. Participation might also be improved by using electronic devices (such as pedometers or actometers), social media and visits to the patients' homes (agreed to in advance). Furthermore, with regard to the response to the questionnaires, the online versions were not very appropriate for the MDD patients in the EFFORT-D study, leading to many incomplete questionnaires. Since all measurements performed in the presence of the research assistant or blind outcome assessors were more complete, filling in questionnaires in the presence of a trial staff member is recommended instead of self-rating online versions at home.

Conclusions
The current study underlines the importance of an addon exercise intervention in MDD patients to reduce their cardiovascular risk by enhancing their fitness. For inpatients and day-hospitalised patients with MDD, and those with a potentially more resistant depressive symptoms or with a higher cardiovascular risk, exercise could prevent future health problems. Due to a low statistical power and lack of follow-up at six and 12 months, conclusions about the anti-depressive effect of this add-on exercise intervention were not possible. However, the effect size on depression was so low that even a higher number of participants would not have improved the effect-size to clinical relevance. This is in line with results of previous meta-analyses when the results of the methodologically weaker studies are excluded. It could be more appropriate for clinicians to psycho-educate their MDD patients and to emphasise the useful effects of exercise on cardiovascular risk factors, rather than promising them unlikely positive results on depressive symptoms in the short term.