This is an archived version of the Handbook.

16.7.2  Multiplicity in systematic reviews

Adjustments for multiple tests are not routinely used in systematic reviews, and we do not recommend their use in general. Nevertheless, issues of multiplicity apply just as much to systematic reviews as to other types of research. Review authors should remember that in a Cochrane review the emphasis should generally be on estimating intervention effects rather than testing for them. However, the general problem of multiple comparisons affects interval estimation just as much as hypothesis testing (Chen 2005, Bender 2008).
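The parallel between testing and interval estimation can be illustrated with a small calculation. The sketch below is not part of the Handbook's guidance and is not a recommendation to adjust. It assumes, purely for illustration, that the k analyses are statistically independent and that every null hypothesis is true, which will rarely hold exactly for correlated outcomes in a review. The function names are hypothetical.

```python
def familywise_error_rate(alpha: float, k: int) -> float:
    """Chance of at least one false-positive result among k tests.

    Assumes (illustration only) that the k tests are independent,
    each is run at significance level alpha, and all nulls are true.
    """
    return 1.0 - (1.0 - alpha) ** k


def joint_coverage(level: float, k: int) -> float:
    """Chance that all k confidence intervals cover their true values.

    Assumes (illustration only) that the k intervals are independent
    and each has nominal coverage `level`.
    """
    return level ** k


if __name__ == "__main__":
    # Compare 5% tests with the matching 95% confidence intervals.
    for k in (1, 5, 10, 20):
        fwer = familywise_error_rate(0.05, k)
        cover = joint_coverage(0.95, k)
        print(f"k={k:2d}  P(any false positive)={fwer:.3f}  "
              f"P(all intervals cover)={cover:.3f}")
```

Under these assumptions the two quantities always sum to one, so ten unadjusted 95% intervals all cover their true values only about 60% of the time, which is the same problem as the roughly 40% chance of at least one spurious significant test.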


Some additional problems associated with multiplicity occur in systematic reviews. For instance, when the results of a study are presented, it is not always possible to know how many tests or analyses were done. It is likely that in some studies 'interesting' findings were selected for presentation or publication on the basis of their statistical significance, and other 'uninteresting' findings were omitted, leading to misleading results and spurious conclusions. Such selective reporting is discussed in more detail in Chapter 8 (Section 8.14).


Adequate planning of the statistical testing of hypotheses (including any adjustments for multiple testing) should ideally be done at the design stage. Unfortunately, this can be difficult for systematic reviews, because it might not be known at the outset which outcomes and which effect measures will be available from the included studies. This makes the a priori planning of multiple test procedures for systematic reviews more difficult, or even impossible. Moreover, only some of the multiple comparison procedures developed for single studies can be used in meta-analyses of summary data. More research is required to develop adequate multiple comparison procedures for use in systematic reviews (Bender 2008).


In summary, there is no simple or completely satisfactory solution to the problem of multiple testing and multiple interval estimation in systematic reviews. However, the following general advice can be offered. More detailed advice can be found elsewhere (Bender 2008).