Tools for assessing methodological quality or risk of bias in non-randomized studies

Chapter 8 (Section 8.5) describes the ‘Risk of bias’ tool that review authors are expected to use for assessing risk of bias in randomized trials. The tool covers six features: sequence generation, allocation sequence concealment, blinding, incomplete outcome data, selective outcome reporting and ‘other’ potential sources of bias. Each item is assessed by (i) describing what happened in the study and (ii) judging the adequacy of the study with regard to that item. The judgement is formulated by answering a pre-specified question, such that an answer of ‘Yes’ indicates low risk of bias, an answer of ‘No’ indicates high risk of bias, and an answer of ‘Unclear’ indicates unclear or unknown risk of bias. The tool was not developed with non-randomized studies (NRS) in mind, and the six domains are not necessarily appropriate for NRS. However, the general structure of the tool and of its assessments seems useful to follow when creating risk of bias assessments for NRS.
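As an illustration only, the two-part structure described above (a description plus a judgement for each domain) could be recorded in a data extraction script along the following lines. This is a minimal sketch and not part of the ‘Risk of bias’ tool itself: the names `Judgement`, `DomainAssessment`, `STANDARD_DOMAINS` and `missing_domains` are hypothetical, and the only content taken from the text is the list of six domains and the mapping from ‘Yes’, ‘No’ and ‘Unclear’ to a level of risk.

```python
from dataclasses import dataclass
from enum import Enum


class Judgement(Enum):
    """Answer to the pre-specified question for one domain."""
    YES = "Yes"
    NO = "No"
    UNCLEAR = "Unclear"


# Mapping from answer to risk of bias, as stated in the text.
RISK_BY_JUDGEMENT = {
    Judgement.YES: "low risk of bias",
    Judgement.NO: "high risk of bias",
    Judgement.UNCLEAR: "unclear or unknown risk of bias",
}

# The six features covered by the standard tool for randomized trials.
STANDARD_DOMAINS = (
    "sequence generation",
    "allocation sequence concealment",
    "blinding",
    "incomplete outcome data",
    "selective outcome reporting",
    "other potential sources of bias",
)


@dataclass(frozen=True)
class DomainAssessment:
    """One item: (i) a description and (ii) a judgement (hypothetical layout)."""
    domain: str
    description: str      # what happened in the study
    judgement: Judgement  # answer to the pre-specified question

    @property
    def risk_of_bias(self) -> str:
        return RISK_BY_JUDGEMENT[self.judgement]


def missing_domains(assessments, domains=STANDARD_DOMAINS):
    """Return the domains, in order, that have not yet been assessed."""
    assessed = {a.domain for a in assessments}
    return [d for d in domains if d not in assessed]


# Usage with an invented description, for illustration only.
example = DomainAssessment(
    domain="blinding",
    description="Outcome assessors were unaware of group membership.",
    judgement=Judgement.YES,
)
print(example.domain, "->", example.risk_of_bias)
print("Still to assess:", missing_domains([example]))
```

The list of domains is passed as a parameter rather than fixed, because the six standard domains are not necessarily appropriate for NRS and review authors may need to substitute or add domains.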


For experimental and controlled studies, and for prospective cohort studies (see Box 13.1.a and Section 13.2.2), the six domains in the standard ‘Risk of bias’ tool could usefully be assessed, whether allocation is randomized or not. This is the minimum assessment that review authors should carry out; more detail will usually be required. An additional component is to assess the risk of bias due to confounding. The depth of this assessment is likely to depend on the heterogeneity between studies and on whether the review authors propose a quantitative synthesis (see Section 13.6). If studies are heterogeneous and no quantitative synthesis is proposed, a less detailed assessment can still serve to illustrate the heterogeneity and to inform interpretation of the findings of the review.


Many instruments for assessing the methodological quality of non-randomized studies of interventions have been created, and were reviewed systematically by Deeks et al. (Deeks 2003). In their review they located 182 tools, which they reduced to a shortlist of 14, and identified six as potentially useful for systematic reviews as they “force the reviewer to be systematic in their study assessments and attempt to ensure that quality judgements are made in the most objective manner possible”. However, all six required a degree of adjustment because they did not elicit detailed information about how study participants were allocated to groups, which is likely to be critical to the risk of selection bias. Not all of the six tools were suitable for different study designs. In common with some tools for assessing the quality of randomized trials, some did not distinguish between items relating to the quality of the study and items relating to the quality of its reporting. The two most useful tools identified in this review are the Downs and Black instrument and the Newcastle-Ottawa Scale (Downs 1998, Wells 2008).


The Downs and Black instrument has been modified for use in a methodological systematic review (MacLehose 2000). The reviewers found that some of the 29 items were difficult to apply to case-control studies, that the instrument required considerable epidemiological expertise and that it was time-consuming to use. The Newcastle-Ottawa Scale, which has been used in NRSMG workshops to illustrate issues in data extraction from primary NRS, contains only eight items and is simpler to apply (Wells 2008). However, the items may still need to be customized to the review question of interest. Review authors also need to be aware of differences in epidemiological terminology between countries; for example, the Newcastle-Ottawa Scale uses the term ‘selection bias’ to describe what others may call ‘applicability’ or ‘generalizability’.


Acknowledging the importance of distinguishing between ‘what researchers do’ and ‘what researchers report’, review authors may also find it helpful to consider items included in reporting statements for randomized trials (Moher 2001) and observational epidemiological studies (Vandenbroucke 2007) in order to highlight gaps in reporting (and execution) in NRS (Reeves 2004, Reeves 2007).