Results should be collected only for the outcomes specified to be of interest in the protocol. Results for other outcomes should not be extracted unless the protocol is modified to add them, and this modification should be reported in the review. However, review authors should be alert to the possibility of important, unexpected findings, particularly serious adverse effects.
Reports of studies often include several results for the same outcome. For example, different measurement scales might be used, results may be presented separately for different subgroups, and outcomes may have been measured at different points in time. Variation in the results can be very large, depending on which data are selected (Gøtzsche 2007). Protocols should therefore be as specific as possible about which outcome measures, time-points and summary statistics (e.g. final values versus change from baseline) are to be collected. Refinements to the protocol may be needed to facilitate decisions on which results should be extracted.
Section 7.7 describes the numbers that will be required in order to perform meta-analysis. The unit of analysis (e.g. participant, cluster, body part, treatment period) should be recorded for each result if it is not obvious (see Chapter 9, Section 9.3). The type of outcome data determines the nature of the numbers that will be sought for each outcome. For example, for a dichotomous (‘yes’ or ‘no’) outcome, the number of participants and the number who experienced the outcome will be sought for each group. It is important to collect the sample size relevant to each result, although this is not always obvious. If a flow diagram is not available in a published report, drawing one as recommended in the CONSORT Statement (Moher 2001; available from www.consort-statement.org) can help to determine the flow of participants through a study.
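As an illustration of the numbers sought for a dichotomous outcome, the following minimal Python sketch shows one possible way to record an extracted result for a single intervention group. The class name, field names and the numbers used are hypothetical and are not prescribed by this Handbook; the sketch simply keeps together the number who experienced the outcome, the sample size relevant to that result, and the unit of analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DichotomousResult:
    """One extracted result for one intervention group (hypothetical layout)."""
    outcome: str
    group: str
    events: int                      # number who experienced the outcome
    total: int                       # sample size relevant to this result
    unit_of_analysis: str = "participant"

    def __post_init__(self) -> None:
        # Basic consistency checks on the extracted numbers.
        if self.total <= 0:
            raise ValueError("total must be positive")
        if not 0 <= self.events <= self.total:
            raise ValueError("events must lie between 0 and total")

    @property
    def risk(self) -> float:
        """Proportion of the group that experienced the outcome."""
        return self.events / self.total


# Made-up numbers, purely for illustration.
treated = DichotomousResult("mortality", "intervention", events=12, total=100)
control = DichotomousResult("mortality", "control", events=20, total=100)
risk_ratio = treated.risk / control.risk
```

Recording the unit of analysis alongside each result, rather than once per study, may make it easier to spot results that are reported per cluster or per body part rather than per participant.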
The numbers required for meta-analysis are not always available, and sometimes other statistics can be collected and converted into the required format. For example, for a continuous outcome, it is usually most convenient to seek the number of participants, the mean and the standard deviation for each intervention group. These are often not available directly, especially the standard deviation, but alternative statistics (such as a standard error, a confidence interval, a test statistic (e.g. from a t-test or F-test) or a P value) enable calculation or estimation of the missing standard deviation. Details are provided in Section 7.7. Further considerations for dealing with missing data are discussed in Chapter 16 (Section 16.1).
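Two of the simpler conversions can be sketched in Python as follows. This is a minimal sketch rather than a definitive implementation: the function names are hypothetical, and it assumes that the standard error or confidence interval refers to the mean of a single intervention group and that the sample is large enough for a normal approximation (for small samples, a value from the t distribution would replace the normal quantile). Section 7.7 remains the reference for the methods themselves.

```python
import math
from statistics import NormalDist


def sd_from_se(se: float, n: int) -> float:
    """Standard deviation from the standard error of a single group mean."""
    if n <= 0:
        raise ValueError("n must be positive")
    return se * math.sqrt(n)


def sd_from_ci(lower: float, upper: float, n: int, level: float = 0.95) -> float:
    """Standard deviation from a confidence interval for a single group mean.

    Assumes a large sample, so the interval is mean +/- z * SE.
    """
    if upper <= lower:
        raise ValueError("upper must be greater than lower")
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% interval
    se = (upper - lower) / (2 * z)
    return sd_from_se(se, n)


# Made-up numbers, purely for illustration.
sd_a = sd_from_se(se=2.0, n=25)
sd_b = sd_from_ci(lower=-1.96, upper=1.96, n=100)
```

With these illustrative inputs, `sd_a` is 10.0 and `sd_b` is approximately 10. Conversions from test statistics or P values involve further steps and are not covered by this sketch.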