We analysed 16 comparisons from 14 publications, aiming to produce the largest and highest-quality meta-analysis. This review found that CCBT apparently has a moderate post-treatment effect (SMD −0.48, 95% CI −0.63 to −0.33) on adult depressive symptoms compared with control conditions, almost the same result as in past meta-analyses. Nevertheless, we also found that this result may need to be revised downward in light of practical implementation and research methodology. We examine these two aspects in turn.
Considering the lack of enduring effectiveness, the lack of functional improvement and the high dropout rate, our results inevitably cast doubt on the practicability of current CCBT for depression.
To begin with, the attenuation of the long-term effectiveness of CCBT seems to be a serious issue from a clinical point of view. Although one past reviewer implied this tendency, long-term follow-up had not been clearly examined in meta-analyses, and ours is the first review to meta-analyse long follow-up outcomes. The attenuation is paradoxical given reports that the effect of standard face-to-face CBT on depression does not usually attenuate sharply after the intervention, even without maintenance sessions. For example, this view has been advocated in the Annual Review of Psychology, which notes that the effect of CBT appears to be at least more enduring than that of antidepressants in patients with depression. However, it remains unclear why such a difference should arise between modalities, although van Londen et al. raised this question in the context of bibliotherapy.
In addition to long follow-up outcomes, whether CCBT contributes to functional improvement had not been meta-analysed before our study, even though this outcome is critically important for evaluating cost-utility, which is cited as a distinctive advantage of CCBT. In our analysis, CCBT did not have a significant effect on function. There are a few possible reasons for this. Firstly, current CCBT may simply not be good enough to improve function. Regaining social function, such as returning to work, is commonly recognised as more difficult than merely reducing depressive symptoms. Secondly, we may have to consider the sensitivity of the scales used to measure function. Revicki et al. noted that generic measures are less sensitive to improvement in less severely depressed patients. They suggested that such generic scales are even less likely to change in patients with mild-to-moderate depression than in those with severe depression, often resulting in little change in utility and problematic utility assessment.
The third issue with practical implementation is that more than half of the included studies had high overall dropout rates. Higher dropout is to some extent unavoidable, especially in the treatment of depression, because poor motivation is one of its fundamental symptoms. Indeed, even in the NHS, the dropout rate from CCBT is high, with up to 50% of users who start the programme for depression not completing it, and this needs to be addressed as a serious issue.
Despite the above substantial limitations, CCBT is still used on the premise that it is significantly effective, at least as measured immediately after treatment. However, by addressing methodological issues, our analysis produced findings that raise a more fundamental question: whether CCBT is really effective for adult depression even immediately after treatment.
The first finding is the ambiguous definition of control conditions. Previous systematic reviews of CCBT gave little clarification of the influence of pooling results from studies that used TAU and those that used a waitlist as the control. Unlike research on medications or psychotherapy, RCTs of CCBT effectively did not restrict the use of medication in waitlist groups. We therefore initially regarded this unexplained mixing of groups as a considerable problem, and set up a protocol to separate subjects on waitlists from those undergoing TAU. However, we found that the proportion of patients taking medication at baseline ranged from 0% to 76% in TAU groups and from 37% to 74% in waitlist groups. Given these virtually indistinguishable rates of medication use, we concluded that it was difficult to clearly separate TAU from waitlist data. We therefore made a post-hoc decision to classify TAU and waitlist subjects into the same control group, and added a subgroup analysis to examine the influence of doing so.
In the subgroup analysis, the effects were significantly greater when the control group was a waitlist than when it was TAU. Only one meta-regression reported the same finding as ours, although that analysis was conducted using only four (10.2%) reliable studies with depression-specific CCBT intervention. In general, this type of difference seems rational because TAU is more therapeutically intensive than a waitlist. However, another likely cause is that, in psychotherapy research, an intervention group tends to show a greater effect size relative to a waitlist than relative to an active placebo. For this reason, it has recently been recommended that waitlists not be used in research designs, because they lead to overestimation of the intervention effect. Either way, this issue should be treated more carefully in RCTs in general as well as in RCTs of CCBT.
The second issue is that the high attrition rate may have led to significant bias, even though all included studies used intention-to-treat (ITT) analysis. In practice, Cochran states that attrition rates higher than 20% may affect outcomes even when they are analysed using ITT. In addition, extremely uneven attrition between intervention arms can be an unacceptable source of bias. Only one meta-analysis, by Waller and Gilbody, has dealt with this attrition issue. It found that subjects treated with CCBT dropped out approximately twice as frequently as control subjects, but this finding was not statistically significant.
In relation to the high dropout rate, we noted that the included studies used a variety of imputation techniques to handle attrition within ITT, but no research on CCBT had examined the risk of bias introduced by the choice of imputation. Rickels and Schweizer mentioned that ITT takes account of dropouts, usually by last observation carried forward (LOCF). However, Shao et al. and Unnebrink et al. claim that older imputation methods, such as LOCF, mean imputation and worst observation carry-forward (WOCF), can cause significant differences in results when the attrition rate is higher than 20%. By contrast, modern imputation methods can be considered more appropriate. Moreover, results can differ significantly even among imputation methods, which is a serious issue for research with a high level of attrition. For example, Warmerdam demonstrated that newer imputation led to significantly different results. We therefore investigated the possibility of bias due to the method of imputation. When only trials using modern imputation techniques were included, the effect size decreased from moderate to small. The influence of imputation has not been seriously discussed in psychotherapy research, including research on self-help. Research on CCBT in particular should give more consideration to this because of its high attrition rate relative to other psychotherapies.
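The concern about LOCF can be illustrated with a toy example. The following is a minimal sketch in Python using entirely hypothetical scores (not data from any included trial) and a hypothetical helper named `locf`. It shows only the mechanics of the method: each dropout is frozen at the last recorded score, so the post-treatment mean depends heavily on when participants left the study.

```python
from statistics import mean

def locf(series):
    """Fill missing visits (None) with the last observed value."""
    filled, last = [], None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled

# Hypothetical depression scores at baseline, mid-treatment and post-treatment.
# None marks a visit missed after dropout (illustrative numbers only).
participants = [
    [30, 22, 14],      # completer
    [28, 20, 12],      # completer
    [32, 26, None],    # dropped out before post-treatment
    [29, None, None],  # dropped out right after baseline
]

# Post-treatment scores under LOCF versus a completers-only analysis.
post_locf = [locf(p)[-1] for p in participants]
post_completers = [p[-1] for p in participants if p[-1] is not None]

print("LOCF post-treatment scores:", post_locf)
print("Mean with LOCF:", mean(post_locf))
print("Mean of completers only:", mean(post_completers))
```

With half of this toy sample missing at post-treatment, the LOCF mean and the completers-only mean diverge widely, which is the kind of sensitivity to the handling of missing data described above. Model-based approaches such as multiple imputation are not shown here.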
Thirdly, our study was the first to detect significant publication bias specific to CCBT, which suggests that the usefulness of CCBT needs careful re-evaluation. Indeed, with the trim-and-fill method the SMD shrank from −0.48 [95% CI −0.63 to −0.33] to −0.32 [95% CI −0.49 to −0.16], which still indicates significant effectiveness, at least at post-treatment.
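To put the size of this adjustment in perspective, the two estimates can be compared directly. The sketch below uses only the SMDs and confidence intervals reported above. It assumes the intervals are symmetric normal-approximation intervals, so that the standard error can be recovered as the interval width divided by 2 × 1.96; that assumption, and the helper name `se_from_ci`, are ours.

```python
Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution

def se_from_ci(lower, upper, z=Z_95):
    """Recover the standard error from a symmetric normal-approximation CI."""
    return (upper - lower) / (2 * z)

# Values as reported in the text above.
observed = {"smd": -0.48, "ci": (-0.63, -0.33)}
adjusted = {"smd": -0.32, "ci": (-0.49, -0.16)}

for label, est in (("observed", observed), ("trim-and-fill", adjusted)):
    se = se_from_ci(*est["ci"])
    z_score = est["smd"] / se
    print(f"{label}: SE ~ {se:.3f}, z ~ {z_score:.2f}")

# Proportion of the pooled effect removed by the trim-and-fill adjustment.
relative_reduction = 1 - adjusted["smd"] / observed["smd"]
print(f"relative reduction ~ {relative_reduction:.2f}")
```

Under these assumptions the adjustment removes about one third of the pooled effect, while the adjusted interval still excludes zero.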
Finally, we cannot overlook the remarkable dominance of self-rating scales as the primary endpoints of past CCBT research. In our analysis, self-rating scales were used as the primary outcome in all studies. Although all the adopted scales are academically reliable as screening tools, excessive reliance on self-rating measures could lead to significant bias in the results, because self-report ratings from depressed patients are not necessarily a reliable or definitive estimate of severity, especially during the acute phase before symptomatic improvement.
Our sensitivity analysis also demonstrated that the effect size at post-treatment decreased from moderate to small when studies using the BDI were excluded. This can be explained by a characteristic of the BDI: because scales differ in how they conceptualise depression, BDI scores tend to be influenced by cognitive factors more strongly than scores on other instruments [60, 61]. Indeed, CCBT is more likely to improve BDI scores than other measures, probably because CCBT programmes strategically target cognitive change. Further, it has also been argued that the BDI is inaccurate as a way of appraising treatment outcomes because of overreactivity [62, 63]. The frequent use of the BDI can be theoretically justified in efficacy studies, which aim to maximise the measured effect of an intervention. Even so, in terms of generalisability, we may need to keep in mind the risk of overestimation from self-rating scales, including the BDI, when actually adopting CCBT for clinical use.
Given prior wholly supportive reviews, it seems reasonable to expect that self-help CCBT can be a clinically effective and cost-effective intervention. However, the use of CCBT, even for mild to moderate depression, may be less practical and efficacious than currently believed. This is supported by the poor results of the three available cost-utility analyses of CCBT for depression [23–25]. Nevertheless, it would be too extreme to conclude that CCBT is an inefficacious intervention for adult depression, for a few reasons. Firstly, we could distinguish the indications for which CCBT is appropriate. It has been reported that applying CCBT to patients with a personality suited to it, or to those from a technologically literate generation, may contribute to better outcomes. Also, further development of CCBT in terms of sophistication and attractiveness, accompanying the rapid progress of information technology, might enhance the effectiveness of and adherence to CCBT, for example in the format of a therapeutic computer game.
Our review has a few limitations. Firstly, we should ideally have recalculated the effect size (SMD) of each outcome from the original research data in order to enhance the quality of the review. However, we could not do this because of physical and time constraints. Secondly, we could not include unpublished data or data from ongoing trials, even though we attempted to collect them through several means.
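For readers unfamiliar with how an SMD is derived from trial data, the following minimal sketch computes one from group-level summary statistics. It assumes Hedges' g with the usual small-sample correction; the review does not state which estimator would have been used for a recalculation, and the function name and all input numbers are hypothetical.

```python
from math import sqrt

def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardised mean difference (Hedges' g) and its approximate variance."""
    df = n_t + n_c - 2
    # Pooled standard deviation across the two arms.
    sd_pooled = sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / df)
    d = (mean_t - mean_c) / sd_pooled
    # Small-sample correction factor (common approximation).
    j = 1 - 3 / (4 * df - 1)
    var_d = (n_t + n_c) / (n_t * n_c) + d**2 / (2 * (n_t + n_c))
    return j * d, j**2 * var_d

# Hypothetical post-treatment depression scores; lower scores are better,
# so a negative g favours the intervention arm.
g, var_g = hedges_g(mean_t=14.0, sd_t=8.0, n_t=50,
                    mean_c=18.0, sd_c=8.0, n_c=50)
half_width = 1.96 * sqrt(var_g)
print(f"g = {g:.2f}, 95% CI {g - half_width:.2f} to {g + half_width:.2f}")
```

A recalculation from the original data would apply a computation of this kind uniformly to every comparison, rather than relying on the effect sizes as reported by each trial.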