Systematic Reviews and Meta-Analyses of Crime Prevention Evaluations
Summary and Keywords
Systematic reviews and meta-analyses have become a focal point of evidence-based policy in criminology. Systematic reviews use explicit and transparent processes to identify, retrieve, code, analyze, and report on existing research studies bearing on a question of policy or practice. Meta-analysis can combine the results from the most rigorous evaluations identified in a systematic review to provide policymakers with the best evidence on what works for a variety of interventions relevant to reducing crime and making the justice system fairer and more effective. The steps of a systematic review using meta-analysis include specifying the topic area, developing management procedures, specifying the search strategy, developing eligibility criteria, extracting data from the studies, computing effect sizes, developing an analysis strategy, and interpreting and reporting the results.
In a systematic review using meta-analysis, after identifying and coding eligible studies, the researchers create a measure of effect size for each experimental versus control contrast of interest in the study. Most commonly, reviewers do this by standardizing the difference between scores of the experimental and control groups, placing outcomes that are conceptually similar but measured differently (e.g., re-arrest or reconviction) on the same common scale or metric. Though these are different indices, they do measure a program’s effect on some construct (e.g., criminality). These effect sizes are usually averaged across all similar studies to provide a summary of program impact. The effect sizes also represent the dependent variable in the meta-analysis, and more advanced syntheses explore the role of potential moderating variables, such as sample size or other study characteristics, on effect size.
When done well and with full integrity, a systematic review using meta-analysis can provide the most comprehensive assessment of the available evaluative literature addressing the research question, as well as the most reliable statement about what works. Drawing from a larger body of research increases statistical power by reducing standard error; individual studies often use small sample sizes, which can result in large margins of error. In addition, conducting meta-analysis can be faster and less resource-intensive than replicating experimental studies. Using meta-analysis instead of relying on an individual program evaluation can help ensure that policy is guided by the totality of evidence, drawing upon a solid basis for generalizing outcomes.
Challenges to Understanding What Works
Determining what works in policing, courts, corrections, neighborhood prevention, and other domains for intervention is challenging. To determine what works in terms of criminological interventions (i.e., designed to reduce crime), it is necessary to assemble all the prior evaluation studies bearing on the question. It would be wonderful if all these studies were similar in design, methods, and results. One study would represent all the other studies well, as there would be no variability in findings.
However, evaluation studies use a variety of designs and sometimes yield conflicting results. It is sometimes tempting to pick out the study that seems most influential or important (or that is most recent) and use that to guide decision making. A single experiment certainly can be influential and may provide good answers for decision makers in the jurisdiction in which it was implemented. If widely publicized, the study may spur other researchers to conduct a new wave of theoretical and methodological studies. But it seems sensible that an evidence-based approach to what works in crime and justice should go beyond the selective consideration of one or a few influential studies to consider all the evidence relevant to the question.
There are other challenges to answering policy-relevant questions such as: What works to reduce crime in communities? Which programs are effective in reducing offender recidivism? The studies that address these questions are often scattered across different disciplines, or sometimes disseminated in obscure or inaccessible outlets. And they can be of such questionable quality that interpretation is risky at best. Compounding these challenges, political and special interest groups selectively use evidence to promote a position, and a bad piece of research can be transmitted in seconds around the globe through the Internet.
Systematic Reviews and Meta-Analyses
How then can policy and practice be informed by such a fragmented knowledge base, with evaluative studies that vary so widely in quality? What study, or set of studies (if any at all), ought to be used to influence policy? What methods should be used to appraise and analyze a set of separate studies addressing the same question? Systematic reviews and meta-analyses are one development in the social sciences that addresses this knowledge-cumulation challenge. Indeed, systematic reviews and meta-analyses are also being used to synthesize research on other questions: for example, to examine the major perspectives of important crime theories. Systematic reviews broadly describe all literature reviews that use explicit and transparent processes to identify, retrieve, code, analyze, and report on existing relevant research studies. Although meta-analysis may be used in other contexts (e.g., to analyze data from a multisite trial), in this context, meta-analyses are those systematic reviews that use quantitative methods to statistically summarize the results of separate but similar studies. Such analyses can greatly assist policymakers in identifying effective programs and interventions. All solid systematic reviews and meta-analyses use a transparent, documented process to search for relevant studies that address a question of policy or practice, critically appraise them, and then come to a judgment about what works using explicit, replicable methods. In contrast to traditional literature reviews, such reviews detail each stage of the decision process, including the question that guided the review, the criteria for studies to be included, and the methods used to search for and screen evaluation reports. They also detail how analyses were done and how conclusions were reached.
When done well and with integrity, a systematic review can provide the most comprehensive assessment of the available evaluative literature addressing the research question, as well as the most reliable statement available about what works. When the literature base includes studies that provide enough detail on the findings to use in a quantitative analysis, a meta-analysis can be conducted to examine the strength and consistency of the evidence. Meta-analysis is the statistical synthesis of data from separate but similar (i.e., comparable) studies, leading to a quantitative summary of the pooled results (Chalmers, Hedges, & Cooper, 2002). This chapter focuses exclusively on such quantitative systematic reviews or meta-analyses. First, we provide an overview of meta-analysis, including a brief history, a description of the steps involved, and three examples of meta-analysis in criminal justice. The chapter concludes with a discussion of the benefits and challenges of using meta-analysis in policymaking.
A Brisk History and Overview of Meta-Analysis
In the 1970s, as the traditional literature review was coming under heavy criticism, the modern statistical foundation for quantitative reviewing was being developed (e.g., Glass, McGaw, & Smith, 1981; Hedges & Olkin, 1985). In 1976 Gene Glass coined the term meta-analysis to describe quantitative approaches to reviewing studies. He and Mary Lee Smith deserve much credit for popularizing this approach by applying this technique to research on the effects of psychotherapy (Smith & Glass, 1977) and class size (Glass & Smith, 1978).
Glass (1976) popularized a standardized effect size measure for expressing the difference between experimental and control groups in standard deviation units. Using this numeric effect size as a dependent variable, Smith and Glass (1977) were able to quantitatively synthesize over 400 psychotherapy experiments. They concluded, contrary to some of the notable narrative reviews on the issue (e.g., Eysenck, 1961), that psychotherapy had, on average, a strong, beneficial effect on subjects when compared to control group subjects. Using the standardized effect-size measure—or common metric—moved the emphasis of the review from statistical significance (which can be misleading) to the actual magnitude of effect the experimental treatment achieved. The common metric expresses the difference between the groups in a manner that is independent of statistical significance.
The Smith and Glass (1977) findings led to extensive use of meta-analysis in the fields of psychology and education. Its popularity soon spread to other fields, particularly medicine and business, with the technique receiving national press coverage (e.g., Mann, 1994; Strauss, 1991). Other researchers were simultaneously developing their own statistical approaches to synthesis (e.g., Hedges & Olkin, 1985; Hunter, Schmidt, & Jackson, 1982; Rosenthal, 1991).
Meta-analyses became an important part of the “what works” debate in criminology, especially regarding the evidence of offender treatment effects (also known as “rehabilitation”) (Petrosino, 2005). Although earlier comprehensive systematic reviews had for the most part concluded that treatment of offenders was not effective, the application of meta-analysis led to a convergence of small, positive effects for rehabilitation programs. This body of evidence led to a strong refutation of the “nothing works” belief that had swept through much of the criminological field during the early 1980s (Petrosino, 2005).
In the 21st century meta-analyses are a critical component of the systematic reviews being published by the international Campbell Collaboration (C2), with the criminology-related syntheses being published by the C2’s Crime and Justice Group. To further show the support for meta-analyses in criminology, the most comprehensive evidence-based registry, CrimeSolutions.gov (funded by the U.S. Department of Justice) recently added reviews of criminal justice practices that include meta-analytic procedures.
Steps for Conducting Systematic Review and Meta-Analysis
Most meta-analyses of research on the effects of criminological interventions follow a similar path. After identifying and coding eligible studies, the researchers create a measure of effect size for each experimental-versus-control contrast of interest in the study. Most commonly, reviewers do this by standardizing the difference between scores of the experimental and control groups, placing outcomes that are conceptually similar but measured differently (e.g., re-arrest or reconviction) on the same common scale or metric. Though these are different indices, they do measure a program’s effect on some construct (e.g., criminality). These effect sizes are usually averaged across all similar studies to provide a summary of program impact. The effect sizes also represent the dependent variable in the meta-analysis, and more advanced syntheses explore the role of potential moderating variables, such as sample size or other study characteristics, on effect size.
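The standardizing-and-averaging step described here can be sketched in a few lines. The effect sizes and variances below are hypothetical, and the inverse-variance weighting shown is the standard fixed-effect approach, offered only as an illustration rather than the procedure of any particular review:

```python
# Fixed-effect, inverse-variance weighted average of standardized effect sizes.
# The effect sizes (d) and their variances are hypothetical values for illustration.
effect_sizes = [0.30, 0.12, 0.25]       # d from three conceptually similar studies
variances = [0.04, 0.01, 0.02]          # squared standard errors of each d

weights = [1.0 / v for v in variances]  # more precise studies receive more weight
pooled = sum(w * d for w, d in zip(weights, effect_sizes)) / sum(weights)
se_pooled = (1.0 / sum(weights)) ** 0.5  # standard error of the pooled estimate

print(f"pooled d = {pooled:.3f} (SE = {se_pooled:.3f})")
```

Because the weighted average divides by the summed weights, large and precisely estimated studies dominate the pooled result, which is one reason meta-analysts also examine sample size as a potential moderator.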
There are many ways to organize the steps involved in conducting a meta-analysis. One such way of organizing was delineated by Boruch, Petrosino, and Morgan (2015):
Specify the Topic Area
At this stage, the meta-analyst must identify the question that specifies the intervention, the outcomes, and the population of interest. For example, for a current project in progress, our team is focused on the impact of school-based law enforcement in primary and secondary schools on behavior, safety perceptions, and learning outcomes (Petrosino et al., 2014).
Develop a Management Strategy and Procedures
Managing a meta-analysis requires a strategy that does not differ in principle from the management requirements of a field study (Boruch et al., 2015). This includes identifying who will do what tasks, when, with what resources, and under what ground rules. Any proposal for funding to support a meta-analysis will essentially serve as a management strategy and will detail the procedures to be used. Independent of funding, C2 requires a plan (known as a protocol) for conducting the review, which lays out the approach for the meta-analysis and indicates the timeline for completing it and for submitting deliverables such as the final draft. Such protocols, especially when published electronically by organizations such as the Campbell and Cochrane collaborations, also provide a level of transparency so that one can determine if and how review teams deviated from the plan. The time required and difficulty encountered in doing a review, and the funding and other resources needed to complete one, are influenced heavily by the size and complexity of the studies examined for the review (Boruch et al., 2015).
Specify the Search Strategy
It is through careful and comprehensive searches of the literature that meta-analysts retrieve the data for their analyses. It is recognized as good practice to go beyond searching peer-reviewed journals to consider other sources by which one might identify research studies. These could include government or unpublished technical reports, dissertations, and conference presentations. An advantage in identifying criminological research is that the two most common bibliographic databases, Criminal Justice Abstracts and the National Criminal Justice Reference Service (NCJRS) abstracts, contain abstracts that go beyond peer-reviewed journals to capture a wide range of unpublished and hard-to-find documents known as gray or fugitive literature.
In addition to searching bibliographic databases such as Criminal Justice Abstracts, researchers can employ several other techniques to ensure they are covering the research literature appropriately. These include inspecting the reference pages of the included studies for potential hits on a topic, since studies often cite earlier work relevant to the meta-analysis. Other search techniques include contacting researchers in the field and visually inspecting the tables of contents and abstracts (known as “hand searches”) of certain high-yield journals (especially since keyword searches of electronic databases can miss relevant studies).
Figure 1 provides a summary of more common techniques. Beyond identifying the target for the literature search, the way the search is conducted should be specified (e.g., what keywords will be used with what electronic search engine and with what electronic databases?). For example, in searching for randomized controlled trials of individually focused criminological intervention, Petrosino (1995) found the following keywords yielded the most studies: random, experiment, controlled, evaluation, impact, effect, and outcome.
Develop Eligibility Criteria for Studies in the Review
A good meta-analysis will clearly define the eligibility criteria so that it is clear what studies are included or left out. For example, are only randomized controlled trials to be included, or will the meta-analysis also contend with studies that use quasi-experimental approaches? Another common discussion for eligibility is the age of the studies. Are studies published 20 years ago or more still relevant to 21st-century policy consideration? Should the meta-analysis be limited to just those studies taking place in the United States? Figure 2 provides an example of the eligibility criteria for a published meta-analysis of Scared Straight programs (Petrosino et al., 2003).
The meta-analyst must be consistent and apply these criteria to all studies. Further winnowing of the sample may also take place as one begins the more intensive coding or extraction of information from a study report. For example, it may become clear that a study broke down so completely in the field that its data are simply not trustworthy enough to include in a meta-analysis.
Cook (2014) and others have supported the idea that meta-analysis theoretically provides stronger external validity than a single study, as it includes data collected from studies reported over several years, conducted in a variety of geographic locations, using various outcome measures by different research teams.
Extracting Data From the Studies
As the meta-analysis team begins retrieving eligible studies, data need to be extracted from each one. For Wilson (2009), coding for a systematic review is akin to “interviewing the studies.” Normal practice is to develop a coding instrument (part of the review protocol) to identify the information that one would collect from each study report. The coding instrument will usually include items that cover characteristics of the report, the investigators, the intervention, the population, the design and methodology, and the outcomes. Figure 3 provides some examples from a coding instrument used in a review that the WestEd Justice & Prevention Research Center is completing on the effect of school-based law enforcement programs.
Farrington (2003) discussed descriptive validity, a common problem facing meta-analysts. In short, published reports sometimes do not provide the key details and data necessary (e.g., to compute effect sizes or to understand the nature and quality of the implementation).
Compute Effect-Size Estimates
Effect size is the quantitative representation of the difference between the treatment and control groups on the outcome of interest. A common metric of effect size is Cohen’s d, also known as the standardized mean difference. In a simple two-group controlled trial, computing Cohen’s d means taking the difference between the treatment and control group means and dividing that difference by the square root of a pooled estimate of the variance within the intervention groups. One advantage to using Cohen’s d is that a variety of conversion formulas are available to help the analyst estimate Cohen’s d from a variety of reported data (significance levels, proportions, t-test or chi-square values, etc.). Standard meta-analysis texts, such as Lipsey and Wilson (2001), have appendices detailing these conversion formulas. Also note that a variety of specialized software programs for meta-analysis have been developed to assist researchers in organizing their data, computing effect sizes, and conducting meta-analysis (e.g., Borenstein, Hedges, Higgins, & Rothstein, 2005).
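The computation just described can be sketched briefly. The group statistics below are hypothetical, and the conversion from a reported t statistic is one of the standard formulas of the kind found in texts such as Lipsey and Wilson (2001):

```python
import math

def cohens_d(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardized mean difference: group difference over the pooled standard deviation."""
    pooled_var = ((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2)
    return (mean_t - mean_c) / math.sqrt(pooled_var)

def d_from_t(t, n_t, n_c):
    """Recover d when a study reports only a t-test value and the group sizes."""
    return t * math.sqrt((n_t + n_c) / (n_t * n_c))

# Hypothetical example: treatment group scores lower (better) on a recidivism index
print(round(cohens_d(2.0, 1.0, 50, 2.5, 1.0, 50), 3))  # -0.5
```

Conversion helpers like `d_from_t` matter in practice because many older evaluation reports provide only test statistics, not group means and standard deviations.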
Cohen’s d is not the only effect-size metric. Odds ratios are common in the health and medical fields and are increasingly used in social science, including criminology (Boruch et al., 2015). Figure 3 provides the results from the Scared Straight review, which used odds ratios as the common metric. Graphic portrayals of meta-analytic results using forest plots, such as the one in Figure 3, are also becoming more common and make it easier for readers to understand the results.
Develop an Analysis Strategy
Most meta-analyses are guided by two major aims: (1) finding the overall effect of the intervention and (2) finding whether there is variation in effect-size estimates across the studies and what explains this variation. For example, in a meta-analysis of Scared Straight programs, Petrosino et al. (2003) located nine randomized trials that compared Scared Straight to a no-treatment control group. Two of these did not report any data that could be used in the meta-analysis, and attempts to retrieve the data from the original study investigators were unsuccessful. The meta-analysis therefore includes seven experimental studies, and Figure 3 provides those results. This is an example of a meta-analysis focused on the overall effect. In the case of Scared Straight, the study authors did not find enough studies to make moderator analyses feasible (Petrosino et al., 2003).
To analyze variation, meta-analysts use some of the coded characteristics of the studies to determine whether effect sizes differ substantially with the values of those characteristics. Losel and Beelman (2003), for example, undertook a meta-analysis of 84 reports on randomized trials designed to estimate the effect of child skills training on antisocial behavior. They conducted a variety of analyses. They found that studies with smaller samples tended to be associated with larger effect sizes. Counter to common notions, the amount of child skills training the youth received (treatment dosage) was not related to the outcome. Interventions were associated with more positive effect sizes when they were implemented by the study authors or research staff (including supervised students).
Interpret and Report the Results
The type of document in which the meta-analysis is reported can vary depending on funder considerations, whether the work is being submitted to a journal or through an editorial group such as the Campbell Collaboration, and the goals of the meta-analysis team (e.g., academicians who need to publish in top-tier journals). Ideally, at least one report would be a transparent and explicit summary of what was done and found in the meta-analysis, so that an external party could determine whether they can replicate it. Such reviews are routinely published by C2. It is also ideal to have a short, plain-language summary to share with people who are not researchers. This can enhance the impact of the meta-analysis beyond the research and academic world to those working in policy and practice.
A Few Examples of Meta-Analysis in Criminal Justice
An Example of Meta-Analysis of a Single Program: Does “Scared Straight” Work?
Petrosino et al. (2003) reported on the effects of Scared Straight and other juvenile awareness programs. These prison programs are meant to deter juvenile delinquents or children at risk for delinquency by making them aware of the grim realities of prison life. Many of these programs feature a rap session in which prisoners brutally describe what institutional life is like in order to deter youngsters from committing crimes. Although researchers have long believed that this type of program was ineffective and possibly harmful, it has remained in use and has even experienced something of a revival in the 21st century. Although other reviewers had included Scared Straight as one of several programs included in their reviews, there was no existing systematic review at the time focusing solely on evaluations of this program.
Petrosino et al. (2003) conducted a rigorous search for randomized experiments that examined the effects of the Scared Straight program on subsequent measures of crime. Their methods included electronic searches of abstracting or bibliographic databases, contact with colleagues and research centers, handsearch, and tracking citations listed in existing reviews. Their techniques located nine randomized experiments reported between 1967 and 1992, including five unpublished studies. All the experiments included a no-treatment control group, and seven of the nine reported data that could be statistically combined in the meta-analysis.
A common approach to analyzing data in meta-analysis is to use a forest plot of the odds ratio for each study. The odds for a group are simply the number of events (such as the number of juveniles failing or being arrested) divided by the number of non-events (the number of juveniles succeeding or not being arrested); the odds ratio divides the treatment group’s odds by the control group’s odds. An odds ratio of 1.0 means that the program neither increased nor decreased a juvenile’s chances of success (not being arrested); it is a precise no-difference effect, or effect of zero. Odds ratios above 1.0 mean that the program increased the failure rate; odds ratios below 1.0 mean the program reduced subsequent arrests.
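The odds-ratio computation can be sketched as follows; the counts below are hypothetical and are not drawn from the Scared Straight trials:

```python
def odds_ratio(events_t, n_t, events_c, n_c):
    """Odds of failure in the treatment group divided by the odds in the control group."""
    odds_t = events_t / (n_t - events_t)  # e.g., arrested / not arrested (treatment)
    odds_c = events_c / (n_c - events_c)  # the same odds for the control group
    return odds_t / odds_c

# Hypothetical counts: 20 of 50 treated juveniles re-arrested vs. 15 of 50 controls.
# A ratio above 1.0 indicates the program increased the failure rate.
print(round(odds_ratio(20, 50, 15, 50), 2))  # 1.56
```

In this hypothetical, the ratio exceeds 1.0, which is the same direction of effect the Scared Straight review reported across its seven studies.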
Figure 3 presents the forest plot for the seven experimental studies of Scared Straight and other juvenile awareness programs. All seven report negative effects for the treatment group. In other words, juveniles in the Scared Straight program did worse than juveniles who did not participate. Petrosino et al. (2003) concluded that Scared Straight methods were not effective in deterring subsequent crime and likely had a negative effect on juveniles.
The negative effects for the Scared Straight studies are rare in criminological intervention with juveniles. For example, Lipsey’s (1992) meta-analysis of nearly 400 experimental or well-controlled quasi-experimental evaluations of preventative or treatment interventions for juvenile delinquency showed that two-thirds (64%) of the studies indicated beneficial impact on subsequent offending.
An Example of a Meta-Analysis Comparing One Program With Others: Does D.A.R.E. Work?
One of the most popular school-based drug prevention programs in the world going into the 21st century was Drug Abuse Resistance Education (D.A.R.E.). Initiated in 1983 as a joint project between the Los Angeles Police Department and Unified School District, the core program used uniformed police officers to deliver a 17-week drug prevention curriculum (lasting one hour per week) to fifth and sixth grade students (i.e., 10–12-year-olds). Several early evaluations were positive, and the program quickly expanded with federal funding to three-fourths of the nation’s school districts (e.g., Rosenbaum & Hanson, 1998).
Given the federal investment in the program, it was only natural that decision makers would want to know whether D.A.R.E. worked to reduce drug use and led to better attitudes toward the police. The National Institute of Justice issued a solicitation for an evaluation of the research on D.A.R.E., and after a peer review process, selected the Research Triangle Institute (RTI) in North Carolina to conduct the study (Ennett, Tobler, Ringwalt, & Flewelling, 1994). RTI followed the tenets of systematically reviewing evidence. They were explicit in their procedures, used methods to reduce bias, and presented a detailed report outlining what they did and why they did it. Although there were many uncontrolled studies on D.A.R.E., their extensive searches turned up only eight evaluations that used either a randomized field trial or rigorous quasi-experimental procedures. They examined the following outcomes: self-reported drug use, attitudes toward police, attitudes toward drugs, and knowledge about drugs. For each of these measures, they created a standardized effect size expressing the difference between the experimental and control groups.
Their results showed that D.A.R.E. had positive impacts on knowledge; however, the findings were less persuasive when it came to attitudes or behavior. Given that the researchers at RTI used standardized mean differences rather than odds ratios, it was difficult to understand how D.A.R.E. was faring without a basis for comparison. To remedy this, they worked with Nancy Tobler, who had conducted several earlier meta-analyses of school-based drug prevention programs. Using the Tobler database, the RTI researchers identified programs delivered to fifth and sixth graders (such as the core D.A.R.E. curriculum) and classified them as interactive or non-interactive. Interactive programs were those that involved role playing and modeling and did not rely on straight lectures providing information. Non-interactive programs involved little more than providing information to youngsters about the harmful effects of drugs. Although the authors did not attempt to define how interactive D.A.R.E. was, the program was weighted toward the officer delivering a standardized curriculum in the classroom and likely fell somewhere in between the interactive and non-interactive groupings.
The comparison data were telling. Although D.A.R.E. did better on some measures than non-interactive programs, the evidence showed that drug prevention programs defined as interactive were far more effective with fifth and sixth grade students than D.A.R.E. This was true across measures of attitude, knowledge, and self-reported drug use. Even though effects on self-reported drug use (which included tobacco, alcohol, and marijuana) were small for all groups, the positive impact for interactive programs was three times the size of D.A.R.E.’s. Without these comparison data, it is unlikely that the review would have generated much controversy (e.g., Elliot, 1995). But given the results, some questioned whether the federal investment in D.A.R.E. was worth it at all and whether these more effective alternatives should be supported.
An Example of a Meta-Analysis of a Class of Interventions
Petrosino, Turpin-Petrosino, and Guckenburg (2010) examined the effects of juvenile system processing on delinquency. Less serious juvenile offenders can be handled with considerable discretion. Juvenile system practitioners can opt to bring the child formally through the juvenile justice system (official processing), divert the child out of the system to a program or service, or release the child to parents or guardians with no further action.
To some observers’ surprise, at least 29 randomized trials were mounted between 1972 and 2009 comparing the assignment of juveniles to an official system processing condition (i.e., petitioned before the court, appearance before a judge, case moving forward in the system) with at least one release or diversion program condition.
Across these 29 experiments, there is considerable variation. The selective reader could cite any single study—or selective number of studies—as evidence for a position that processing has a deterrent effect and reduces subsequent delinquency. Indeed, about 10 studies show positive results for processing. Relying on a selective gathering of evidence may lead decision makers to opt for processing juvenile offenders formally through the court system as a deterrent measure.
The totality of the evidence reviewed by Petrosino and his colleagues, however, paints a different picture. The assembly of evidence suggests that across all 29 studies, the effect size was –.11. Although this is a negative effect, indicating that processing led to an increase in delinquency, this would be considered by most readers to be a small effect size. But keep in mind that juvenile system processing is a more expensive option for most jurisdictions than simple release and likely more expensive than almost all but the most intensive diversion programs. If there is no deterrent impact of official judicial processing but in fact a small negative effect, and if it is a more expensive option, a judge, citizen, or policymaker could clearly ask if it would be better to divert or release less serious juvenile offenders.
Strengths and Challenges of Meta-Analysis
Meta-analysis can combine the results from the most rigorous evaluations to provide policymakers with the best evidence on what works for a variety of interventions relevant to reducing crime and making the justice system fairer and more effective. Drawing from a larger body of research increases statistical power by reducing standard error; individual studies often use small sample sizes, which can result in large margins of error. In addition, conducting meta-analysis can be faster and less resource-intensive than replicating experimental studies. Using meta-analysis instead of relying on an individual program evaluation can help ensure that policy is guided by the totality of evidence, drawing upon a solid basis for generalizing outcomes.
For example, Sherman and Berk (1984) conducted the seminal Minneapolis Domestic Violence Experiment, reporting that arresting misdemeanor domestic violence offenders was more effective than the traditional police strategies of separating the offender and victim for eight hours or attempting an informal mediation between the parties at the scene. If policymakers relied solely upon the Minneapolis study, many jurisdictions would continue to mandate arrest for police officers responding to misdemeanor (non-felony) domestic violence calls. In fact, the number of departments adopting such a policy after the Sherman and Berk (1984) report was staggering (e.g., Sherman & Cohn, 1989). But there were five immediate replications of the Minneapolis study, and their results raised serious questions among researchers and policymakers about whether arrest is an effective response to all misdemeanor domestic violence cases (e.g., Sherman, 1992). To conclude that arrest works because of the earlier Minneapolis experiment, without considering the results of these subsequent replications, would seem misinformed.
However, meta-analysis is not without its critics. The most frequently leveled criticism is the “apples and oranges” critique (e.g., Lipsey & Wilson, 2001). This criticism charges meta-analysis with mixing vastly different studies together (e.g., by including heterogeneous study findings [Eysenck, 1994] or studies of differing methodological quality) to produce a single estimate of treatment effect. Gorman (1995) criticized an earlier meta-analysis of D.A.R.E. (Ennett et al., 1994) by claiming that the review team mixed together apples, oranges, and a few poorly executed studies (or lemons!). But some have argued that the apples and oranges criticism does not apply if the goal of the review is to broadly analyze fruit (e.g., Rosenthal & DiMatteo, 2001).
There have been several advances in methods to address the apples and oranges criticism, specifically study heterogeneity and methodological variability issues. Setting sensible eligibility criteria can reduce some of this variability before the sample of studies is collected and analysis begins. Moreover, reviewers now code the methodological, contextual, and treatment characteristics—often in excruciating detail—and explore how these variations impact estimates of treatment effect in the meta-analysis (e.g., Lipsey & Wilson, 2001). Another common method in meta-analysis is to conduct statistical tests of homogeneity to determine if the effect sizes obtained from the sample of studies are significantly different from what would be expected by chance or sampling error. If the test of homogeneity is significant, then the meta-analyst should determine if there are meaningful subgroup or moderating influences in the database of studies (e.g., Cooper & Hedges, 1994).
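One common homogeneity test of this kind is Cochran's Q, often reported alongside the I² statistic, which expresses heterogeneity as a percentage of total variability. A minimal sketch, again with hypothetical effect sizes and standard errors:

```python
import math

# Hypothetical standardized mean differences and standard errors.
effects = [-0.30, -0.05, 0.10, -0.45, 0.20]
ses = [0.15, 0.20, 0.18, 0.12, 0.25]

# Inverse-variance weights and the fixed-effect pooled estimate.
weights = [1 / se**2 for se in ses]
pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)

# Cochran's Q: weighted sum of squared deviations from the pooled effect.
q = sum(w * (d - pooled)**2 for w, d in zip(weights, effects))
df = len(effects) - 1  # Q is compared against a chi-square with k - 1 df

# I^2: share of variability beyond what sampling error alone would produce.
i_squared = max(0.0, (q - df) / q) * 100

print(f"Q = {q:.2f} on {df} df; I^2 = {i_squared:.0f}%")
```

When Q is much larger than its degrees of freedom (equivalently, when I² is substantial), the effect sizes vary more than sampling error can explain, and the meta-analyst should look for subgroup or moderating influences rather than report a single overall effect.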
Because of these methods, it is now uncommon to uncover meta-analyses that report only a single overall effect size to represent a heterogeneous sample of studies. Note that systematic reviews and meta-analyses attempt to address the apples and oranges criticism with explicit and transparent methods. Narrative and traditional reviews are also subject to the apples and oranges criticism but lack an arsenal of methods to respond to it.
Another challenge is that meta-analyses require periodic updating, as new studies relevant to the research question may be published. This problem is exacerbated by the time lag between when the searches are conducted and when the analyses are published. Organizations such as the international Campbell Collaboration require periodic updates, although getting them done in practice has proven elusive.
The Importance of Meta-Analysis to Policymakers
When meta-analysis is used, it can provide estimates of the average impact across studies, as well as how much variation there is and why. Meta-analysis can also generate clues as to why some programs are effective in some settings and not in others. With the current emphasis on evidence-based policy, 21st-century criminal justice policymakers are under increasing pressure to use research in their decision making. Meta-analysis can provide them with reliable and comprehensive evidence about what works to reduce crime or improve justice. It is therefore important that decision makers become more familiar with this method and more reliant on the information it produces.
Links to Digital Materials
Blaya, C., Farrington, D. P., Petrosino, A., & Weisburd, D. (2006). Revues systematiques dans le champ criminologique et le group crime et justice de la collaboration Campbell [Systematic reviews in criminology and the Campbell Crime and Justice Group]. International Journal on Violence and Schools, 1, 72–80.
Boruch, R. F., Petrosino, A., & Morgan, C. (2015). Systematic reviews, meta-analyses, evaluation syntheses. In J. Wholey, H. Hatry, & K. Newcomer (Eds.), The handbook of practical program evaluation (4th ed., pp. 673–698). San Francisco: Jossey-Bass.
Cooper, H., Hedges, L. V., & Valentine, J. C. (Eds.). (2009). The handbook of research synthesis and meta-analysis (2nd ed.). New York: Russell Sage Foundation.
Farrington, D. P., & Petrosino, A. (2000). Systematic reviews of criminological interventions: The Campbell Collaboration Crime and Justice Group. International Annals of Criminology, 38, 49–66.
Farrington, D. P., Petrosino, A., & Welsh, B. C. (2001). Systematic reviews and cost-benefit analyses of correctional interventions. Prison Journal, 81, 339–359.
Higgins, J. P. T., & Green, S. (Eds.). (2011). Cochrane handbook for systematic reviews of interventions (Version 5.1.0 [updated March 2011]). The Cochrane Collaboration. Retrieved from http://training.cochrane.org/handbook.
Lipsey, M. W., & Wilson, D. B. (2001). Practical meta-analysis. Thousand Oaks, CA: SAGE.
Petrosino, A., & Julia, L. (2007). Improving evidence about what works in corrections: Systematic reviews and meta-analyses. Western Criminology Review, 8(1), 1–15. Retrieved from http://www.westerncriminology.org/documents/WCR/v08n1/petrosino.pdf.
Wolf, F. M. (1986). Meta-analysis: Quantitative methods for research synthesis. Beverly Hills, CA: SAGE.
References
Borenstein, M., Hedges, L., Higgins, J., & Rothstein, H. (2005). Comprehensive meta-analysis (Version 2). Englewood, NJ: Biostat.
Chalmers, I., Hedges, L. V., & Cooper, H. (2002). A brief history of research synthesis. Evaluation & the Health Professions, 25, 12–37.
Cook, T. D. (2014). Generalizing causal knowledge in the policy sciences: External validity as a task of both multiattribute representation and multiattribute extrapolation. Journal of Policy Analysis and Management, 33, 527–536.
Cooper, H., & Hedges, L. V. (Eds.). (1994). The handbook of research synthesis. New York: Russell Sage Foundation.
Elliot, J. (1995). Drug prevention placebo: How D.A.R.E. wastes time, money, and police. Reason, 26(10), 14–21.
Ennett, S. T., Tobler, N. S., Ringwalt, C., & Flewelling, R. (1994). How effective is drug abuse resistance education? A meta-analysis of Project DARE outcome evaluations. American Journal of Public Health, 84(9), 1394–1401.
Eysenck, H. J. (1961). The effects of psychotherapy. In H. J. Eysenck (Ed.), Handbook of abnormal psychology (pp. 697–725). New York: Basic Books.
Eysenck, H. J. (1994). Systematic reviews: Meta-analysis and its problems. British Medical Journal, 309, 789–792.
Farrington, D. P. (2003). Developmental and life course criminology: Key theoretical and empirical issues. Criminology, 41, 201–235.
Glass, G. V. (1976). Primary, secondary and meta-analysis of research. Educational Researcher, 5, 3–8.
Glass, G. V., McGaw, B., & Smith, M. L. (1981). Meta-analysis in social research. London: SAGE.
Glass, G. V., & Smith, M. L. (1978). Meta-analysis of research on the relationship of class size and achievement. San Francisco: Far West Laboratory for Educational Research and Development.
Gorman, D. (1995). The effectiveness of DARE and other drug use prevention programs. American Journal of Public Health, 85(6), 873–874.
Hedges, L. V., & Olkin, I. (1985). Statistical methods for meta-analysis. Orlando, FL: Academic Press.
Hunter, J. E., Schmidt, F. L., & Jackson, G. B. (1982). Meta-analysis: Cumulating research findings across studies. Beverly Hills, CA: SAGE.
Lipsey, M. W. (1992). Juvenile delinquency treatment: A meta-analytic inquiry into the variability of effects. In T. D. Cook, H. Cooper, D. S. Cordray, H. Hartmann, L. V. Hedges, R. J. Light, T. A. Louis, & F. Mosteller (Eds.), Meta-analysis for explanation: A casebook. New York: Russell Sage Foundation.
Lösel, F., & Beelmann, A. (2003). Effects of child skills training in preventing antisocial behavior: A systematic review of randomized evaluations. Annals of the American Academy of Political and Social Science, 587, 84–109.
Mann, C. (1994). Can meta-analysis make policy? Science, 266, 960–962.
Petrosino, A. J. (1995). The hunt for randomized experiments: Search and retrieval techniques for a ‘what works?’ meta-analysis. Journal of Crime and Justice, 18(2), 63–80.
Petrosino, A. J. (2005). From Martinson to meta-analysis: The role of research reviews in the US offender treatment debate. Evidence and Policy, 1(2), 149–172.
Petrosino, A., Guckenburg, S., & Fronius, T. (2014). Protocol for a systematic review: Policing schools’ strategies to reduce crime, increase perceptions of safety, and improve learning outcomes in primary and secondary schools. Oslo, Norway: The Campbell Collaboration.
Petrosino, A., Turpin-Petrosino, C., & Guckenburg, S. (2010). Formal system processing of juveniles: Effects on delinquency. Campbell Systematic Reviews. Retrieved from www.campbellcollaboration.org/lib/download/761/.
Petrosino, A., Turpin-Petrosino, C., & Buehler, J. (2003). Scared Straight and other juvenile awareness programs for preventing juvenile delinquency: A systematic review of the randomized experimental evidence. Annals of the American Academy of Political and Social Science, 589, 41–62.
Rosenbaum, D. P., & Hanson, G. S. (1998). Assessing the effects of school-based drug education: A six-year multi-level analysis of project D.A.R.E. Journal of Research in Crime and Delinquency, 35(4), 381–412.
Rosenthal, R. (1991). Meta-analytic procedures for social research (2nd ed.). Beverly Hills, CA: SAGE.
Rosenthal, R., & DiMatteo, M. R. (2001). Meta-analysis: Recent developments in quantitative methods for literature reviews. Annual Review of Psychology, 52, 59–82.
Sherman, L. W. (1992). Policing domestic violence. New York: Free Press.
Sherman, L. W., & Berk, R. A. (1984). The specific deterrent effects of arrest for domestic assault. American Sociological Review, 49, 261–272.
Sherman, L. W., & Cohn, E. (1989). The impact of research on legal policy: The Minneapolis Domestic Violence Experiment. Law and Society Review, 23(1), 117–144.
Smith, M. L., & Glass, G. V. (1977). Meta-analysis of psychotherapy outcome studies. American Psychologist, 32, 752–760.
Strauss, S. (1991). Meta-analysis: Lies, damn lies and statistics. Globe and Mail, November 2, D10.
Wilson, D. (2009). Systematic coding. In H. Cooper, L. Hedges, & J. Valentine (Eds.), The handbook of research synthesis and meta-analysis (pp. 159–176). New York: Russell Sage Foundation. Retrieved from http://www.jstor.org/stable/10.7758/9781610441384.13.