Compendium 2000 on Effective Correctional Programming

CHAPTER 26

Development of a Program Logic Model to Assist Evaluation

JAMES McGUIRE1


THE IMPORTANCE OF EVALUATION

Today it is virtually a universal requirement of services provided from public funds that they be evaluated and that information be available regarding their overall effectiveness. This is a product of several interacting forces. One of them has its origins in a growing popular and political desire for the most prudent fiscal management of government expenditure. Over recent decades, the adoption of monetarist policies by some countries, coupled with the escalating costs of public services, has led to a drive towards greater accountability. Often, this has been associated with efforts to reduce taxation and to measure the effectiveness (“value for money”) of the services it funds.

A second factor has been growing awareness that a significant volume of research has been published on many aspects of human services, yet much of it has not been adequately synthesized into an accessible format. Were this to become available, it would be much easier for research findings to inform professional practice, service management, and policy formation by government departments.

Third, the possibility of conducting that work has been facilitated by the development of new methods of statistical review of research findings, which, though first devised at the beginning of the twentieth century, only came to be used extensively from approximately 1980 onwards. The findings of such meta-analytic reviews have had particular significance for correctional services in dispelling the therapeutic nihilism of the phrase “nothing works”.

Interest in evaluation can be viewed as one component of evidence-based practice. It may come as a surprise that the applied sectors of a discipline could be anything other than evidence-based; regrettably, that has been the position in several fields for a considerable time. But during the last three decades, many academics and practitioners have themselves become acutely aware of the discrepancies to which this leads. The emphasis on furnishing an evidence base for interventions, particularly in the field of health care, came from within the medical profession itself.

In a seminal paper, the medical epidemiologist Cochrane (1979) raised the question of whether medicine and related health research fields could genuinely claim to have sound empirical foundations, as there was no systematic record of the outcomes of their interventions. Cochrane bemoaned the fact that no critical summary existed of research findings, for example, of randomized controlled trials relevant to a given question about the efficacy of interventions. In the wake of this challenge, medical researchers became progressively more conscious of these limitations. Mulrow (1987) examined a set of 50 review articles published in medical journals over a selected twelve-month period (June 1985 to June 1986). Her survey found major shortcomings in the manner in which reviews were conducted and reported. Remarkably, only one of the reviews clearly specified the source of information on which it was based. Only three reviews employed quantitative methods of synthesizing the information obtained from the original articles they had surveyed. Mulrow concluded that there was a need for sizeable improvement in the manner in which reviews, which play such an essential part in the advancement of knowledge, were carried out.

Concerns such as these played a driving role in the 1993 inauguration of the Cochrane Collaboration, an international network of researchers and reviewers co-ordinated through 15 separate sites in Europe, North and South America, Australia, and South Africa. Over recent years, this has led to a considerable degree of activity in attempting to remedy the deficits identified. Between 1994 and 1999, more than 50 Review Groups were established through the Collaboration, each covering a specific field or branch of inquiry. In each case, their task was to locate, evaluate and integrate the results of well-designed intervention studies, usually randomized controlled trials (RCTs). By 1999 the available set of outcome studies, assembled in the Cochrane Database of Systematic Reviews, and derived from detailed searches of over 1,100 research journals, contained more than a quarter of a million entries. These are accessible to researchers and other users through the Cochrane Library, established on an Internet web-site and updated on a quarterly basis.

The pursuit of a more systematic basis on which to draw conclusions concerning outcomes has not been restricted to the field of health. In education, pioneering work was done in attempting, for example, to clarify the relationship between class size and educational achievement, which despite earlier efforts to detect clear trends had remained unresolved (Glass, McGaw, & Smith, 1981). In social work, despite early reviews that questioned aspects of its effectiveness (Fischer, 1973, 1978), later overviews reported more encouraging results (MacDonald, Sheldon, & Gillespie, 1992; Russell, 1990). There is presently a significant drive to establish it on a firmer and more extensive empirical basis (MacDonald, 1999).

CORRECTIONAL SERVICES

The origins of the present era of interest in evaluations of criminal justice services are often traced to the period 1974-1976, when research reviews were conducted on both sides of the Atlantic. The perceived primary objective of most intervention research in this field is to discover methods of reducing offender recidivism. In the United States, a review was centrally commissioned and published subsuming 231 treatment studies (Lipton, Martinson, & Wilks, 1976). Martinson (1974) heightened the public significance of these findings before their full publication, through the extensive media attention his paper attracted. In the United Kingdom, Brody (1976) reviewed 100 studies of the impact of different types of court sentences and other interventions. The drawing of clear conclusions was hampered in both reviews by the poor quality of the research. But the available findings appeared to point towards very little if any discernible impact in terms of reduced rates of reoffending. Martinson's general summary of findings was that “treatment”, by which he meant any added ingredient in criminal justice agencies such as provision of counselling, education, vocational training or psychological therapy, added nothing to the available network of criminal justice sentences, sanctions, and other formalized legal procedures.

These conclusions were questioned by several critics, principally on the grounds that the reviewers had ignored more positive evidence (Palmer, 1975; Ross & Gendreau, 1980). The latter publication was an edited book containing reports on effective services in which there was evidence of reductions in recidivism. Ironically, Martinson himself (1979) reversed the thrust of the negative conclusions he had initially drawn. Current evidence reviewed elsewhere in this Compendium has firmly demonstrated the invalidity of his earlier claims. On the basis of many hundreds of evaluations and a series of integrative meta-analytic reviews, there is a consensus that knowledge in this field has advanced considerably concerning the ingredients of effective correctional programs.

Recently, a new initiative has been launched, known as the Campbell Collaboration (a parallel to the Cochrane review process), with its primary centre of activity at the University of Pennsylvania (Boruch, Petrosino, & Chalmers, 1999). The focus of its work will be upon social and educational as opposed to medical and health-related interventions. It appears feasible that this collaboration may also bridge the gap between the work of Cochrane Review Groups and of the large-scale reviews undertaken by specialists in the field of offender treatment (Petrosino, Boruch, Rounding, McDonald, & Chalmers, 1999). A new database of studies, the Social, Psychological, and Criminological Trials Register (SPECTR), has been compiled; specialized conferences have been held; and a new set of reviews has begun to be commissioned.

That there is now extensive interest in evaluation of criminal justice services and interventions with offenders can hardly be in doubt. In the United Kingdom, all the agencies working with offenders have begun to pursue this agenda. The provision of programs designed to reduce recidivism was introduced as a key performance indicator by the prison service in 1996. The pressure to adopt evidence-based practice and conduct evaluations has also been felt very keenly in probation and other community-based criminal justice services. One initial source of the latter was a report by the Audit Commission (1989), a body that monitors the spending of other government agencies and local authorities. This report pointed out that while there were many imaginative and apparently valuable schemes in operation within probation services, little work was done to examine them systematically and identify the most useful forms of work. During the late 1990s, interest in this area accelerated. The British government's Inspectorate of Probation embarked on an effective practice initiative to establish the extent to which activities within probation services met the overall goals of public protection and reduction of reoffending. Subsequently, the Audit Commission (1996) published a similarly influential report on youth justice services, questioning the pattern of spending and emphasizing the need to evaluate interventions and other aspects of service provision.

EVALUATION FRAMEWORKS

The present position regarding evaluation is such that it would now be seen as unacceptable to embark on any new departure within corrections without incorporating evaluation proposals in the project specification. Given that premise, this chapter is focused on the logic of the evaluation process, the types of procedures that flow from it, and their assembly in a coherent framework.

In different circumstances or viewed from different perspectives, the goals of evaluation can vary considerably. To practitioners, some approaches to evaluation often appear to be very mechanical or abstract. They seem divorced from the more complex and disorderly real-world setting in which most work with offenders is carried out. At the same time, practitioners often want to carry out an evaluation, yet have only minimal interest in providing results that would interest the wider scientific community. The motives for evaluation are different again when, for example, the managers of a correctional program or those who provided its funding want to evaluate its benefits, to assist them in making decisions over its future.

Posavac and Carey (1997) have subsumed the numerous objectives of evaluation into three broad categories. Formative evaluations are conducted with the aim of strengthening plans for service provision, shaping the nature of services, or improving their efficiency. Summative evaluations are focused upon outcomes, and inform decisions about whether to continue with programs or to choose between alternative forms of service. It should be noted that the outcomes of such evaluations themselves rarely determine decisions about the fate of a program, which will be taken on the basis of a wider range of information. Monitoring is a process of using feedback (and of creating systems that will generate it) to ensure the quality of a program is maintained.

Stecher and Davis (1987) have described evaluation processes applied to social programs, such as offender services, and have proposed a taxonomy of five different approaches to the task. The categories they describe overlap with each other to some extent, but there are important differences between them, stemming principally from different aims that evaluation may be intended to serve. The five approaches are:

  • Experimental. Here, an attempt is made to view a program from outside, and to be as rigorous as possible in the evaluation. The overall aim is to reach conclusions that can be generalized widely in a scientific sense, and that will be of interest to the research community. The outcomes of such an evaluation may be intended to serve the purpose of contributing to the wider field of knowledge of which the study forms a part. The potential audience for such knowledge is worldwide.
  • Goal-oriented. With this, a program's stated aims are examined, criteria for evaluating their achievement are identified in consultation with project staff, and outcomes evaluated accordingly. This involves a process of interaction between researchers and practitioners. The resultant findings are unlikely to be generalizable, but could nevertheless be of external interest when compared with projects with similar objectives.
  • Decision-focused. Adopting this approach, particular attention is paid to discerning the loci of decision-making within an agency or service, and to providing information that will assist program managers in decision-making. This approach bears the strongest similarity to audit as used by service managers, but goes beyond the mere collection of quantitative data (such as numbers of admissions to a penal institution) by examining the processes and decisions influencing such flows.
  • User-oriented. This is intended to supply items of information that will influence the direct use of a program in some respect. It may entail obtaining user feedback on several aspects of a program's performance. “Users” in this case might refer to a range of people or groups. For correctional programs it could include courts, service managers, practitioners, government agencies, the public, or offenders themselves.
  • Responsive. Here, an attempt is made to describe programs from the perspectives of all those involved, and to collect information that will meet each of their needs. This is typically more qualitatively based, but may also employ data sources found in the other four types of approaches.

It is possible in practice to combine these orientations and carry out evaluation with a number of aims simultaneously in mind. If this is done, it is important to have clear guidelines as to the various kinds of data to be collected, the rationales for doing so, and the eventual uses to which any evaluative information will be put. Posavac and Carey (1997) describe a fuller list of eleven different types of evaluation model: traditional, social-science, industrial inspection, “black box”, objectives-based, goal-free, fiscal, accountability, expert opinion, naturalistic, and improvement-focused. In many respects these are sub-divisions of some of the approaches in the above list.

Evaluation projects in correctional services often take a hybrid form in terms of the foregoing scheme. It is more than likely that several aims will be embodied within them simultaneously. Thus, while some attempt might be made to secure results that can be generalized, the likelihood of achieving this is often low, given that the practical day-to-day concerns of agencies are the provision of services to courts and clients. It is the continuing tension between these two sometimes competing concerns that makes evaluation of services recurrently problematic.

For example, the concept of interactive evaluation embodied in the goal-oriented approach may appear alien to those who favour a more distant and detached attitude towards estimating effectiveness. It may be thought that there is a danger that evaluators will be seduced into employing only such measures as are guaranteed to produce good results for program leaders. Evaluators may wish to debate whether the objectives set for a program are the most suitable ones given other aspects of its context. It may be only by considering this that they can account for the program's overall effects.

To circumvent some of these difficulties Posavac and Carey (1997) advocate the use of an improvement-focused model of evaluation. “Improvements can be made in programs where discrepancies are noted between what is observed and what was planned, projected, or needed” (1997, p. 27). In this sense, all evaluation is integral to program delivery and constitutes a feedback loop to its design and delivery. This type of relationship is shown in Figure 26.1.

Figure 26.1 Evaluation provides a feedback loop to programs (adapted from Posavac and Carey, 1997)


The most appropriate resolution of the conflicts that may arise when planning an evaluation, and the best combination of approaches, have to be decided on a project-by-project basis by those conducting the evaluative work. All of this suggests that the first kind of issue to be addressed before embarking on an evaluation is that of why it is being undertaken. Who is asking for it to be done? What is it for?

PROGRAM AIMS AND OBJECTIVES

It is another prerequisite of effective evaluation that there be some objective against which a correctional service or program can be evaluated. Preferably, such objectives should be stated in a form that renders them suitable to an evaluation process. Goals of public policy or of large governmental agencies are commonly stated in fairly general terms, making reference for example to “community safety”, a composite of many factors which requires further analysis to yield outcomes that could be methodically assessed. This applies equally to the kinds of products of official boards or working committees known collectively as “mission statements”. Diffuse, inadequately specified goals are not amenable to proper evaluation.

At the level of intervention programs or projects, it should be feasible to provide objectives that are clear and explicit. The process of arriving at this is beneficial for almost every aspect of the working of an agency and delivery of its services. Clear goals can be communicated to personnel such that each member of staff fully understands his or her task. This supports the achievement of proclaimed objectives both directly, by enabling staff to grasp the requirements of their roles, and indirectly, through its effect on morale and organizational cohesiveness. Without clear aims, there will be difficulties at every level. Explicit, clearly defined objectives are also essential to the process of evaluation. There is a useful acronym here encapsulated in the concept of SMART objectives (specific, measurable, achievable, realistic and time-limited). The closer a program's objectives come to meeting these criteria, the easier it will be to evaluate them. For example, “to reduce the two-year reconviction rate of program completers by ten percentage points relative to a matched comparison group” is far more evaluable than “to promote community safety”.

Correctional programs in particular should be scrutinized for the clarity of their objectives. Once these are agreed, they simultaneously furnish a rationale for other components of the service. Researchers in the field of criminal justice have recognized the importance of this. Criteria for accreditation of programs almost universally include the stipulation that a program should be founded upon an explicit model of change. This presents a target for intervention and a rationale for the methods to be employed. It is thus inextricably linked to the statement of objectives of any program. To sum up, then, the second key question evaluators must ask themselves therefore is: What are the objectives of the program to be evaluated?

RESEARCH LOGIC AND THE DESIGN OF EVALUATIONS

Research work is usually considered to be the exclusive preserve of specialists. This image probably derives from the physical and biological sciences, where costly and elaborate equipment is required for the conduct of most experiments. But large-scale social science studies too can be expensive and may use complex methods of data analysis (sometimes yielding research results that are as robust as those of the “hard” sciences; Hedges, 1987). Whatever the field, research by its very nature is generally seen as an activity separate from the work undertaken by most practitioners, and not accessible to them.

The fundamental principles of research and evaluation are fairly simple: they are attempts to answer questions. Their intricacies arise from two inter-connected problems. First, it is surprisingly difficult to ask questions that are sufficiently clear to allow meaningful answers to be given (Dillon, 1990). Second, unless considerable care is taken in thinking about what given answers will mean, the process of interpreting them can be formidably difficult.

All the complexities of research methods flow from attempts to observe these fundamental points. Research design is a set of established rules or principles that safeguard against the numerous errors that might be made along the way. If research is to be valid and its results are to make sense, careful thought must be given to designing it. Only then will the information obtained provide clear and accurate answers to the questions posed.

Evaluation is commonly based on some notion of change over time. A fundamental assumption then is that information will be gathered on at least two points in time, usually at the beginning and at the end of an intervention. These can be designated in various ways, most commonly, by the phrases pre-test and post-test respectively, but occasionally using some other nomenclature such as T1 and T2. Evaluation studies in criminal justice will usually also have a follow-up point (T3), and in some research there may be several such points (e.g., 12, 24 or 60 months after the intervention). For correctional interventions, it was argued some time ago that a minimum acceptable follow-up period is two years (Logan, 1972).

For the foregoing reasons, controlled experimental designs are widely regarded as the most rigorous and robust form of evaluative research. By systematically controlling for a range of factors collectively known as extraneous variables, such studies allow for the best tests of hypotheses and the drawing of clear conclusions. Ideally, members of the different samples (experimental conditions) in such a study should be allocated on a random basis, creating what is known as a randomized controlled trial or RCT. In an RCT, random allocation is expected to equate the groups on all variables other than their presence in the experimental or control conditions. Any differences then found can therefore be attributed to the researcher-controlled variables that differentiated the groups (i.e., provision of some form of treatment or training).

For evaluating the effectiveness of treatment with offenders, the best designed research involves working along these lines to make controlled comparisons between parallel groups. There are usually two kinds of groups. One, the experimental group, receives the treatment that is the object of the study and which is hypothesized by the investigator to have some desired effect. The details of this treatment should be clearly specified. The other, the control group, should be carefully matched with the first in background characteristics that may be relevant to the outcome. These might include age, gender, ethnicity, numbers or types of previous convictions, and other key demographic or criminological variables. Members of this group do not receive any treatment, and care should be taken to ensure the two groups do not interact. Hence, in well-designed research, the only difference between the two groups will be in the independent variable: the intervention used with one group and not the other. The logic of sound design is thus that any obtained difference in outcome -- which is then designated the dependent variable -- can only be explained in terms of this planned difference in the independent variable.
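
As a concrete illustration of the allocation step, here is a minimal sketch in Python of randomly assigning a referral pool to experimental and control conditions; the offender identifiers and the fixed seed are hypothetical details chosen for illustration and auditability, not drawn from any cited study.

```python
import random

def randomize(offender_ids, seed=2000):
    """Randomly allocate a pool of offenders to two parallel conditions."""
    rng = random.Random(seed)   # fixed seed makes the allocation reproducible for audit
    ids = list(offender_ids)
    rng.shuffle(ids)            # random order removes systematic selection effects
    midpoint = len(ids) // 2
    return {"experimental": ids[:midpoint], "control": ids[midpoint:]}

groups = randomize(range(1, 101))
print(len(groups["experimental"]), len(groups["control"]))  # 50 50
```

In a real trial the resulting groups would still be compared on background characteristics (age, prior convictions, and so on) to confirm that randomization had achieved balance.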

In more elaborate research designs, a third group is added: the attention control or placebo group. This is intended to evaluate the possible impact of being involved in an experimental trial. It is well known that attention and interest can themselves influence people taking part in research. Observed changes may be due to this rather than to the intervention as such. Inclusion of a placebo group helps the researchers to evaluate the potential importance of this factor. The placebo group should receive the same level of input in terms of time as the experimental group, but in research terms this input should be inert, that is it should not contain the methods of intervention whose hypothesized impact is being evaluated.

In summary, there are several factors to take into account in well-conducted evaluation. Figure 26.2 illustrates some of the characteristics of an idealized design for evaluation in a correctional setting.

Figure 26.2 An idealized experimental trial in corrections research


Most evaluation research inevitably falls below the standard implied in Figure 26.2. Not only are the phenomena under investigation intrinsically very complex; many variables are simply beyond the control of researchers. These difficulties notwithstanding, much research fails to observe the principles implicit in this design. Reviewers in the academic journals repeatedly criticize published studies for their lack of methodological rigour. Given the number of variables that can detract from sound design, the task faced by all evaluators is one of minimizing the number of extraneous variables that might otherwise explain the findings. The purpose of good experimental designs is to reduce or eliminate the effects of such variables.

In research terms these factors are called threats to validity. The validity of an evaluation experiment is the extent to which any effects that are observed in the treatment group can be attributed to the effect of the intervention, and the intervention only. Cook and Campbell (1979) have categorized different types of validity in research and identified various kinds of threats to each of them. There are two main types of validity, internal and external, alongside other types that have to do with the valid use of inferential statistics.

Internal validity is a measure of the extent to which, within any single experiment or evaluation, the influence of extraneous variables has been reduced. This may be threatened by several obstacles, including:

  • the possibility that experimental and control groups were not matched in crucial ways;
  • the possibility of contamination between the groups, or between one group and outside factors;
  • the possibility that historical factors and events in the individuals' lives differentially affected members of the experimental, control or placebo groups;
  • different loss or attrition rates in groups between the beginning and the end of an evaluation;
  • changes in the way assessment and evaluation instruments may have worked at different points in time (calibration error).

External validity refers to the extent to which the results of a research study can be generalized outside the experimental sample: to other groups, in other places at other times. There are three sub-types of this form of validity, known as population, ecological, and temporal validity respectively. There are threats to this type of validity also. They include:

  • the use of biased or unrepresentative samples;
  • experimenter effects and the influence of demand characteristics on participants' expectations;
  • multiple-treatment interference effects;
  • usage of analogue participants.

Random-allocation experiments still form only a small proportion of published reports in correctional research. An exception is the study by Ross, Fabiano, and Ewles (1988) of the Reasoning and Rehabilitation program, in which a group of offenders was randomly sub-divided into three sub-samples. One group attended R&R, which was the treatment of interest. The second attended a lifeskills program, which in effect acted as a placebo; the third, placed under conventional probation supervision, acted as a no-treatment control. In this study as in others, “no-treatment” refers to minimal contact that contains no identified program. This is sometimes depicted as “business-as-usual” in correctional intervention experiments.

The reason for the relative scarcity of randomization is of course that decisions to allocate offenders to different disposals are predominantly made by courts of law. Comparisons between samples of offenders sentenced in different ways, or between those who voluntarily participate in a program and those who decline, cannot be true experiments: the respective groups are non-equivalent. Where this occurs, researchers resort to using what are known as quasi-experimental designs (Cook & Campbell, 1979) in which samples are constructed on a non-random basis. McGuire, Broomfield, Robinson, and Rowson (1995) used a design of this kind for evaluation of probation-based group programs.

In reviewing a range of evaluation studies in correctional settings, Sherman, Gottfredson, MacKenzie, Eck, Reuter, and Bushway (1997) developed a scientific methods score in which studies were allocated to one of five groups depending on the level or quality of design used in the evaluation. Scores are allocated as follows (a minimal encoding of the scale is sketched after the list):

  1. Correlational designs. These are the weakest forms of evidence, in which there is only an association between program participation and alterations in rates of offending at a specific point in time.
  2. Single-group pre-post designs, in which program participants are assessed prior to and after participation in the program; or non-equivalent control group designs in which they are compared with a control sample which may differ from them in some important respects.
  3. Equivalent control group design. Here, the experimental or treatment group is compared with a sample that is broadly equivalent on key variables and also on pre-assessment measures.
  4. Control of extraneous variables. In these studies there is closer matching of groups, for example in scores on predictor instruments, and major external influences are controlled.
  5. Randomized experimental design. Individuals are drawn at random from an initial sample for allocation to experimental and control groups.
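
As a minimal illustration of how such a scale might be applied when tagging studies in a review, the five levels above can be encoded as a simple lookup; the key names below are informal paraphrases of the list, not terminology from Sherman et al. (1997).

```python
# Sherman et al. (1997) five-level scientific methods score, encoded as a lookup.
# Key names are informal paraphrases of the design types listed above.
METHODS_SCORE = {
    "correlational": 1,               # association only, at one point in time
    "pre_post_or_nonequivalent": 2,   # single-group pre/post, or non-equivalent controls
    "equivalent_control_group": 3,    # comparison group broadly equivalent on key variables
    "controlled_extraneous": 4,       # close matching; major external influences controlled
    "randomized_experiment": 5,       # random allocation to experimental and control groups
}

def score_study(design: str) -> int:
    """Return the methods score for a study of the given design type."""
    return METHODS_SCORE[design]

print(score_study("randomized_experiment"))  # 5
```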

RESEARCH DESIGN LIMITATIONS

The idealized type of design outlined earlier, which receives the highest value within the experimental evaluation framework (as categorized by Stecher & Davis, 1987), and would receive a rating of 5 in Sherman et al.'s (1997) scoring system, has its epistemological foundations in the way research might be conducted in controlled laboratory conditions. This creates a dilemma: such findings can rarely be extrapolated to the more chaotic setting of correctional environments. Conversely, experiments conducted there almost always have many uncontrolled variables. According to Robson (1993), it is almost as if the respective requirements of internal and external validity work against each other. The better controlled a study is, the safer are the conclusions to be drawn from it: but they may not be applicable elsewhere.

Recognition of the gap between well-controlled evaluative trials and the implementation of their findings in practice has been a topic of major controversy in mental health research (Dobson & Craig, 1998; Persons & Silberschatz, 1998). It has been argued that a distinction must be made between treatment efficacy and service effectiveness. The former is based on evidence that an intervention worked when used in the limited conditions of an RCT. The latter refers to evidence of genuine success of the intervention in real-world conditions. There is an agreed need to find ways to bridge the gap between the two. One proposed solution is to conduct more evaluations that have greater ecological validity. While many correctional evaluations do not achieve the standards of an RCT, paradoxically they may have other advantages. Evaluating a program in the actual conditions in which it will have to survive in practice is a better all-round test of its feasibility as well as its potential effectiveness.

Another possibility often advanced to span the research-practice divide is to make more extensive use of single-case research designs. These represent a fusion of experiment with practice in which an intervention is evaluated with one individual (or a small sample or case series). The design logic runs as follows. If the introduction of an intervention (that is, an attempt to change an individual's behaviour) is uniquely associated with changes in the target variable (that is, there are no changes in it at other times), then the likelihood is reduced that other explanations for the change are true. There are several varieties of such designs, and a number of studies have been published employing them with offenders (McGuire, 1992). Space does not permit more detailed coverage of this issue here. Single-case research designs are described in some detail in books such as those by Barlow and Hersen (1984), and Kratochwill and Levin (1992).
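
To make the design logic concrete, the following minimal sketch compares a baseline (A) phase with an intervention (B) phase for a single individual; the weekly counts of a target behaviour are invented purely for illustration.

```python
import numpy as np

# Hypothetical A-B single-case data: weekly counts of a target behaviour.
baseline     = np.array([7, 8, 6, 7, 9])   # phase A: before the intervention
intervention = np.array([5, 4, 3, 3, 2])   # phase B: after the intervention begins

# If the drop coincides only with the phase change, rival explanations weaken.
print(f"phase means: A = {baseline.mean():.1f}, B = {intervention.mean():.1f}")
```

More rigorous variants (such as ABAB reversal or multiple-baseline designs) strengthen the inference by showing that the target variable changes only when the intervention is present.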

WHAT TO MEASURE AND HOW

The range of information that can be assembled in an evaluation is potentially very wide. Data can be classified in many ways. They may be quantitative or qualitative. They can be defined according to the domain of information-gathering (e.g., demographic/background, behaviour/experience, knowledge, opinion/value; Patton, 1987). Alternatively they may be conceptualized in terms of the kind of method used to obtain them (such as interviews, observation, psychometric, criminological, econometric). The following are some of the principal types of data likely to be sought in correctional evaluations. Few studies would be likely to include all of them.

Demographic and criminological data

Most evaluations of correctional services will likely include some descriptive data on offenders themselves. The main types typically reported include: gender, age, ethnic group, employment and socio-economic status, marital status, years of formal education, family background, and other important developmental information (such as history of contact with welfare services or other agencies). Typical criminological indicators used in research include numbers and types of previous convictions; age at first conviction; sentencing history (numbers and types of court disposals), and changes in patterns of reoffending over time. What is of prime interest of course is whether or not the latter indicators are subsequently influenced by participation in the program being evaluated.

Audit information

Evaluators will generally seek access to information about the organization and delivery of a project. This could encompass data such as numbers of referrals made, numbers of offenders sentenced, numbers commencing in the program, attendance, absconding and completion rates, amounts of time spent on various activities, staff-prisoner ratios, total and per capita running costs. Managers routinely seek evidence of this kind for internal purposes. It will inform an agency's policies on resource levels, including allocation of personnel.

Risk-needs assessment

Over recent years there has been an emphasis on the importance of risk-needs principles in the design of correctional programs. This has been facilitated by the increasing availability of well-validated assessment and predictor scales. They include for example the Level of Service Inventory (Revised) (Andrews & Bonta, 1998); the Manitoba-Wisconsin Risk-needs Classification System (Bonta, 1996); the Offender Group Reconviction Scale (Copas, 1995; Taylor, 1999); or the Violence Risk Appraisal Guide (Quinsey, Harris, Rice, & Cormier, 1998). While some of these are purely actuarial in methodology, others entail particular formats for using actuarial and clinical information in specific combinations. Recently, some commentators have noted the need to include other historical and situational variables, in a procedure called anamnestic risk assessment (Melton, Petrila, Poythress, & Slobogin, 1998).

Recidivism

The measurement of criminal recidivism holds a pivotal place in evaluation of correctional interventions and is usually seen as the ultimate test of their effectiveness. Some commentators have described the search for methods of reducing recidivism as the “secular grail” of research in this field (Lab & Whitehead, 1990). But recidivism itself can be measured in various ways and inconsistencies in this area have been a cause of much misunderstanding and controversy. Depending on the targeted age-group, correctional context or other factors, outcome criteria may vary considerably. The data chosen may consist of rates of: arrest; re-conviction; parole violation or breach of supervision or probation; reincarceration following new convictions; recall to prison whilst on licence; or re-admission to secure hospital.

Also, most research on reoffending focuses simply on the event itself, gauged by one of the preceding methods. Relatively few studies take into account its type or level of seriousness or, with repetitive offending, its distribution over time. One approach to the latter is the use of comparative survival rates (time to reconviction) for different cohorts of offenders. Weekes, Millson, and Lightfoot (1995) used this type of data to evaluate the relationship between offender performance on a substance abuse pre-release program and rates of return to custody following release. Henning and Frueh (1996) also used it to evaluate a cognitive self-change program for violent offenders. Given the effort and time-investment involved, still fewer studies examine the relation of crimes to other events and circumstances in offenders' lives. As Motiuk, Smiley, and Blanchette (1996, p. 12) have remarked, “... research into program effectiveness must look deeper into the nature of recidivism.” To incorporate such factors into evaluations, in-depth information would have to be collected, based on interviews with clients or examination of court depositions.
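
To illustrate the survival-rate approach, here is a minimal sketch of a hand-computed Kaplan-Meier (product-limit) estimate of time to reconviction; the cohort data are invented, and a real evaluation would use a statistical package and far larger samples.

```python
import numpy as np

def kaplan_meier(durations, reconvicted):
    """Kaplan-Meier estimate of P(still offence-free) at each reconviction time.

    durations   -- months to reconviction, or to end of follow-up if none occurred
    reconvicted -- 1 if reconvicted, 0 if censored (follow-up ended offence-free)
    """
    durations = np.asarray(durations, dtype=float)
    events = np.asarray(reconvicted)
    survival, curve = 1.0, []
    for t in np.unique(durations[events == 1]):   # each distinct reconviction time
        at_risk = np.sum(durations >= t)          # cohort still being followed at time t
        failed = np.sum((durations == t) & (events == 1))
        survival *= 1.0 - failed / at_risk        # product-limit step
        curve.append((t, survival))
    return curve

# Hypothetical cohort of eight releasees over a 24-month follow-up.
months = [3, 7, 7, 12, 18, 24, 24, 24]
event  = [1, 1, 1, 1,  1,  0,  0,  0]
for t, s in kaplan_meier(months, event):
    print(f"{t:>4.0f} months: {s:.2f} offence-free")
```

Curves computed this way for program and comparison cohorts can then be contrasted directly, rather than relying on a single reconviction rate at an arbitrary cut-off.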

An example of research of this kind is the work of Zamble and Quinsey (1997). These authors have reported on a follow-up study of 311 men discharged from Canadian prisons who reoffended, and compared them with a much smaller sample (n = 36) who did not. Recidivists reported more problems in the period after release, but had fewer or less effective skills for coping with them. Recidivists more often experienced, and had poorer strategies for managing, negative emotional states such as anger, anxiety and depression. They also thought more frequently about substance abuse and possible crimes, and less about employment and about the future in an optimistic light. They experienced greater fluctuation in emotional states in the 48 hours preceding a reoffence. These findings have potentially enormous value for the design of relapse prevention and other types of both pre-release and post-release intervention with high-risk individuals.

When interpreting recidivism rates, care must be taken to exclude pseudo-reconvictions that are the result of offences committed before the commencement of an intervention (Lloyd, Mair, & Hough, 1994). Ideally, comparisons should be made between the actual recidivism rate for a group of offenders and their projected rate based on predictor scales, as well as with suitable control groups.
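
A minimal sketch of such a pseudo-reconviction screen follows, assuming each conviction record carries the offence date as well as the conviction date; the record layout is hypothetical.

```python
from datetime import date

def true_reconvictions(convictions, program_start):
    """Drop pseudo-reconvictions: convictions for offences committed
    before the intervention began (cf. Lloyd, Mair, & Hough, 1994)."""
    return [c for c in convictions if c["offence_date"] >= program_start]

record = [
    {"offence_date": date(1999, 1, 10), "convicted": date(1999, 6, 1)},   # pre-program offence
    {"offence_date": date(1999, 9, 2),  "convicted": date(2000, 1, 15)},  # genuine reoffence
]
print(len(true_reconvictions(record, program_start=date(1999, 3, 1))))    # 1
```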

Participant feedback

Some evaluations are based on offender or consumer feedback. Measures of attendance are one crude form of this. If offenders have choices over whether or not to attend a program, their level of participation may be one signal of its success or failure. Trends in attendance rates were used as a measure of the effectiveness of a life skills program introduced in a probation centre in the United Kingdom (Priestley, McGuire, Flegg, Barnitt, Welham, & Hemsley, 1984). Verbal or written feedback concerning responses to a program can be collected without too much difficulty using interviews or questionnaires. An example is the evaluation of the Edmonton Institution for Women's Peer Support Program (Eamon, McLaren, Munchua, & Tsutsumi, 1999). Though data of this kind are sometimes perceived as “soft” and unreliable, they can yield valuable information concerning responsivity and may provide explanations for differential impact of program components or degrees of attendance or completion.

Intervening variables

More elaborate evaluations of correctional programs are likely to focus on the extent of change in variables targeted by the program. In most evaluations, the variables so assessed will be ones hypothesized to mediate between the interventions being applied (the independent variable), and actual changes in offenders' behaviour (the dependent variable). Hence, attempts may be made for example to assess knowledge, attitudes, thinking patterns, affective states, behavioural skills, and personality dimensions; or features of lifestyle, such as numbers of criminal associates or levels of conflict with significant others. The choice of measures used will depend on the selected targets of change in a given program. Cognitive skills programs for example are designed to engender changes in such variables as social problem-solving, impulsivity, anger management, social skills, or locus of control. These and other variables can be assessed by an assortment of self-report and observational scales. Robinson, Grossman and Porporino (1991) and Robinson (1995) used this approach in the evaluation of CSC cognitive skills training programs.

Numerous self-report inventories and rating scales exist for assessment of a range of dynamic risk or criminogenic needs factors. Many (though by no means all) of these can be assessed by means of a “psychometric” approach. In selecting specific measures for this purpose, a fairly standard set of criteria is employed. Psychometric assessments are usually judged in terms of their reliability (their freedom from various kinds of measurement error), construct validity (the extent to which a scale measures what it is supposed to measure) and predictive validity (the extent to which it predicts performance on some criterion), amongst other indices. By comparison with colleagues in the fields of education or mental health, correctional researchers still have far fewer well-tested psychometric instruments at their disposal. Evaluation of change in subtler variables such as egocentrism, victim empathy or socio-moral reasoning remains difficult in the absence of well-established measures. There is however a steadily growing literature on the most effective methods of accomplishing this.
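
As one example of these standard criteria, the sketch below computes Cronbach's alpha, a widely used index of internal-consistency reliability; the five-item scale and the responses are invented for illustration.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of total).

    items -- 2-D array, rows = respondents, columns = scale items
    """
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)       # sample variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of the summed scale score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical 5-item attitude scale completed by six offenders.
responses = np.array([
    [3, 4, 3, 4, 3],
    [2, 2, 3, 2, 2],
    [4, 5, 4, 4, 5],
    [1, 2, 1, 2, 1],
    [3, 3, 4, 3, 3],
    [5, 4, 5, 5, 4],
])
print(f"alpha = {cronbach_alpha(responses):.2f}")
```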

A comprehensive plan for data collection in evaluation of a correctional program might therefore entail the following:

  • Compilation of descriptive data on individuals referred to the program, in terms of a standard set of demographic and criminological information; alongside comparisons with other offender groups to provide information on selection and targeting.
  • Audit data concerning rates of referral, commencement, attendance, dropout and completion.
  • Analysis of changes between pre-test and post-test on self-report or observational measures. Inter-group comparisons between program completers and offenders in other experimental or control conditions or other correctional disposals.
  • Examination of inter-correlations between offender characteristics and outcomes.
  • Follow-up of survival rates at designated intervals (e.g., 6, 12, 24 or 60 months); involving comparisons with related program and sentence types, and with pre-selected predictor scores.
  • Given adequate sample sizes, analysis of the impact of the program employing multiple regression analyses or structural equation models. Examination of inter-relationships between offender or setting characteristics, program variables, pre-to-post-test changes and recidivism outcomes (a minimal sketch of this step follows the list).
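
As a minimal sketch of that final analytic step, the code below fits a logistic regression of reconviction on a pre-program risk score and a completion flag; all the data are simulated and the variable names are hypothetical, so it illustrates the form of the analysis rather than any reported result.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200
risk = rng.normal(0, 1, n)          # standardized pre-program risk score (simulated)
completed = rng.integers(0, 2, n)   # 1 = completed the program (simulated)

# Simulated outcome: higher risk raises, and completion lowers, reconviction odds.
logits = 0.9 * risk - 0.7 * completed
reconvicted = rng.random(n) < 1 / (1 + np.exp(-logits))

X = np.column_stack([risk, completed])
model = LogisticRegression().fit(X, reconvicted)
print("coefficients (risk, completion):", model.coef_[0])
```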

This range of data is likely to be collected in relatively large-scale, resource-intensive evaluations employing an experimental paradigm as described by Stecher and Davis (1987; see above), or characterized as the social science research evaluation model by Posavac and Carey (1997). For other types of evaluation, depending on the objectives set, quite different types of data would be required. If the objective were to discover reasons for program attrition, for example, an exploratory interview-based study would be more appropriate. If it were to examine reasons for practitioners' allocation of offenders to different programs, again a different evaluation approach would have to be adopted.

PROGRAM INTEGRITY: LINKING PROCESS AND OUTCOME

It is commonly acknowledged that there is a close association between process and outcome in interventions. Large-scale literature reviews of offender treatment have illustrated the importance of focusing on the manner of delivery of programs. It is vital, if a program is to achieve its declared aims, that it should be executed properly. Accomplishing this involves a number of elements. Cumulatively, these elements are known as program integrity.

Programs in many fields including corrections are known to have failed because their integrity of delivery was compromised. For a variety of reasons programs may become distorted or corrupted, and if this occurs they will be unlikely to achieve their appointed goals. Hollin (1995) has described phenomena such as program drift and program degradation in relation to offender services. More recently Gendreau, Goggin, and Smith (1999) have drawn attention to the importance of program implementation processes and have argued that this has been a comparatively neglected feature in the process of translating research findings into practice. For all these reasons, comprehensive evaluations should include some focus on how integrity may be monitored and safeguarded.

Program integrity

There is, however, no universally agreed definition of program integrity, though Gendreau and Andrews (1996) have identified a number of separate elements that can be considered to compose it. For present purposes, a distinction will be made between two main aspects of integrity. The term program integrity will be taken to refer to external, organizational features of a program that are essential for its proper delivery along the lines planned by its designers and managers. This refers to the presence of trained staff, appropriate referrals, adequate resources, clear objectives, managerial support, and agency policies concerning these issues and others.

Treatment integrity

This concept refers more specifically to internal aspects of the program's mode of delivery: the direct, face-to-face interaction between program staff and offenders. Treatment integrity or fidelity (Moncher & Prinz, 1991) designates the extent to which the theoretical model of the problem being addressed, and of the ways in which it is believed it can be remedied, is visible in the process through which offenders are offered assistance and expected to change.

MONITORING PROCESS

It is important that a set of monitoring processes be adopted within agencies implementing a program. These can be of two principal sorts.

The first will entail systems of recording and monitoring not unlike those that would be utilized in a systematic audit. Data would be held on staff selection processes; staff training events; employment stability and continuity; offender targeting and selection processes; offender attendance and completion rates; reliable availability of material resources; frequency of program planning sessions; frequency of program review sessions; frequency of staff supervision sessions; attendance at relevant staff meetings. Program staff would be provided with adequate time for planning and review. Cumulatively, the total program time will be a multiple of the actual session delivery times. Policy documents related to these features of the program would be available for inspection on request.

The converse of this is that rates of non-attendance, attrition, session cancellations, or the absence of review documents or reports may be indicators of deteriorating or non-existent program integrity. For thorough evaluation, it is necessary to develop and establish systems for logging and monitoring data of this kind to create a system of integrity checks. In addition, decisions should be made within an agency regarding which person has responsibility for collecting, managing and acting on this information. Arrangements should be made such that the person so designated has adequate time for these tasks, and a position of sufficient influence to enable him or her to address any deficiencies effectively.
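
A minimal sketch of such an integrity-check system might look as follows; the indicator names and threshold values are purely illustrative assumptions, not published standards.

```python
# Illustrative integrity checks over audit data; thresholds are assumptions only.
THRESHOLDS = {
    "attendance_rate": 0.80,    # flag if mean session attendance falls below 80%
    "completion_rate": 0.70,    # flag if fewer than 70% of starters complete
    "cancellation_rate": 0.10,  # flag if more than 10% of sessions are cancelled
}

def integrity_flags(audit):
    """Return indicators suggesting deteriorating program integrity."""
    flags = []
    if audit["attendance_rate"] < THRESHOLDS["attendance_rate"]:
        flags.append("low attendance")
    if audit["completion_rate"] < THRESHOLDS["completion_rate"]:
        flags.append("high attrition")
    if audit["cancellation_rate"] > THRESHOLDS["cancellation_rate"]:
        flags.append("frequent cancellations")
    return flags

print(integrity_flags(
    {"attendance_rate": 0.75, "completion_rate": 0.90, "cancellation_rate": 0.05}
))  # ['low attendance']
```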

Second, there is a parallel need to establish procedures for monitoring treatment integrity. This is a subtler and less easily recorded feature of programs. The most clear-cut and publicly accountable way of achieving it is through video-recording the sessions. Staff member and supervisor should then jointly review the tapes at a pre-agreed frequency. Alternatively, an external assessor or program auditor may view the tapes on a sampling basis, and prepare reports on the treatment integrity of sessions.

The presence of treatment integrity is generally judged in terms of two component criteria: adherence to the program model as described in the manual, and style of delivery. Relevant information in evaluating the former includes whether objectives are clearly stated for the program, session, or exercise; whether the contents are being covered; whether session contents and exercises are appropriately used; and whether program tasks are being accomplished. Specific items may be added as a function of the type of program involved. For style of delivery, information will be needed on whether the nature of any tasks is clearly explained, and whether participants' understanding of them is checked. Observational data may be needed on the levels of warmth or liveliness shown by program staff; alongside evidence of offender engagement and participation. For programs delivered to offenders in group settings, information may be needed concerning the creation of an appropriate learning ethos within them (Platt, Perry, & Metzger, 1980).

ACCREDITATION OF CORRECTIONAL PROGRAMS

The contemporary trend in a number of correctional services is towards placing the provision of programs, and the process of auditing and monitoring them, on a formal, mandatory basis. This has led to the establishment of procedures for program accreditation.

In many respects this development mirrors practices that have been present in other spheres of public service for some time, most notably in education. It is taken for granted that college courses or professional training diplomas will be submitted to external scrutiny before they are deemed to be adequate to their purpose. To check that the designated services remain intact and that the required standards of teaching are maintained, the process is repeated at regular intervals.

Recently this type of system has been introduced by both prison and probation services in the United Kingdom. A new set of jointly agreed prison-probation accreditation criteria has been published (Home Office Probation Unit, 1999), building on an earlier set prepared by the prison service (HM Prison Service, 1998). This requires both that all offender programs be inspected and approved by a central, independent panel of expert consultants, and that the delivery of a program at any given site be subjected to a further process of annual auditing. The set of criteria issued by the panel consists of the following 11 items (See Chapter 1 of this Compendium for more details):

  • Model of change. There should be specification of a clear theoretical model describing how the program will have an impact on factors linked to offending behaviour.
  • Dynamic risk factors. Program materials should identify factors linked to offending which if changed will lead to a reduction in risk of reoffending.
  • Range of targets. Given the complexity of factors linked to criminal acts programs should focus on multiple treatment targets in an integrated, multi-modal format.
  • Effective methods. The methods of change utilized in the program should have empirical support concerning effectiveness and be sequenced in an appropriate way.
  • Skills orientated. The skills targeted by the program should have explicit links to risk of reoffending and its reduction.
  • Intensity, sequencing, duration. The mode of delivery of sessions should be appropriate in the light of available evidence and the program's objectives and contents.
  • Selection of offenders. The population of offenders for whom the program is designed should be clearly specified, as should procedures for targeting, selection, and exclusion.
  • Engagement and participation. The program should be designed with reference to the concept of responsivity and materials, methods and manner of delivery planned accordingly.
  • Case Management. The program should be inter-linked with other elements of the offender's supervision and case management, and guidelines provided for implementation within services.
  • Ongoing monitoring. Procedures and processes should be established for collection and review of integrity monitoring data.
  • Evaluation. There should be a framework and agreed methods for evaluation of the overall delivery and impact of the program.

Lipton, Thornton, McGuire, Porporino, and Hollin (2000) have discussed the implementation and impact of this process itself. It is integral to such systems that procedures be in place for the collection of data for both ongoing monitoring of process, and evaluation of outcomes. In the United Kingdom, a system is currently being developed for the management of all data generated by the application of programs in offender services. The importance of such a system for our present purposes is the prospect it creates of considerably facilitating the entire process of program evaluation.

ECONOMETRICS OF CORRECTIONAL PROGRAMS

As stated at the beginning of this chapter, one of the principal reasons why it has become imperative to evaluate correctional programs is a concern with their impact relative to the resources invested in them. It is incumbent on managers of services to ensure that facilities are used in the most efficient way possible. To do this, monetary costs are computed for all forms of investment in programs, whether of practitioners' time, provision of physical resources, or learning materials. This may be used to inform two types of evaluation (Posavac & Carey, 1997). The first is known as a cost-benefit analysis. This entails calculation of the expenditures required in the provision of a program or service, and a comparison with the sum of the direct and indirect benefits of the program (to the extent that these can be computed in monetary terms). The second type of study is a cost-effectiveness analysis. Here, the focus is upon whether objectives were achieved, including ones whose monetary value may be difficult to estimate. Comparisons are then made between the resource costs of different types of programs; cost-effectiveness refers to the relationship between the outcomes achieved and the resources expended. Though comparatively few studies of either kind have been reported in criminal justice research, they have a potential significance far beyond the number of them published.2

General estimates that will allow global comparisons between different forms of criminal justice provision are not difficult to make. Official data can be used to compare costs of imprisonment versus community sentences.
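
The arithmetic underlying both kinds of analysis is straightforward, as the sketch below shows; every figure in it is hypothetical and serves only to make the two calculations concrete.

```python
# Illustrative cost-benefit and cost-effectiveness arithmetic; figures are invented.
program_cost_per_offender = 2_000   # delivery cost of a community program
reduction_in_reconviction = 0.08    # absolute drop in the 2-year reconviction rate
cost_per_reconviction = 30_000      # estimated downstream cost of one reconviction

# Cost-benefit: monetized benefit per offender against program cost.
benefit = reduction_in_reconviction * cost_per_reconviction
print(f"benefit-cost ratio: {benefit / program_cost_per_offender:.2f}")   # 1.20

# Cost-effectiveness: cost per percentage-point reduction in reconviction.
print(f"cost per point: {program_cost_per_offender / (reduction_in_reconviction * 100):.0f}")  # 250
```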

A PROGRAM LOGIC MODEL FOR EVALUATION

The ground covered in this chapter can be summarized in a step-wise sequence for planning evaluations, the program logic model. The crux of the model is a recognition of the pivotal relationship between the objectives of evaluation, and those of the program or service to be evaluated, on the one hand; and of the approach, design and methodology of the evaluation process on the other. Clarification of the first significantly elucidates the nature of the second, and in many instances virtually dictates it. Figure 26.3 illustrates this relationship.

Following this model, evaluators are recommended to ask several types of question prior to commencing work. They relate respectively to the objectives of the evaluation itself, and to the objectives of the program or service being evaluated. When these have been considered, a choice of evaluation framework can then be made that is most apposite for achieving both sets of goals.

Figure 26.3 A program logic model for evaluation

[Figure not reproduced in this archived version.]

The outcome of that process should in turn make certain types of research design more obvious choices. Additionally, having clarified the objectives and the questions to be answered, evaluators can consider how the validity of any conclusions can be assured and threats to validity minimized. These decisions will determine the best methods of data collection.

Different elements of evaluation design, then, are inter-dependent. Note that these questions are being addressed here at a conceptual level; no account is being taken of the numerous practical issues that might affect the feasibility of different options. Realistic evaluation is an attempt to reconcile the principles of sound evaluation with the realities of program delivery, while still emerging with something that can shed light on previously unanswered questions.
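One way of making this inter-dependence explicit is to treat the model as an ordered planning checklist. The sketch below (in Python) records each step of the sequence described above; the field names and sample entries are purely illustrative and are not part of the model itself.

    # Illustrative sketch: the program logic model as an ordered checklist,
    # in which later steps are settled only after the earlier ones.
    from dataclasses import dataclass, field

    @dataclass
    class EvaluationPlan:
        evaluation_objectives: list[str] = field(default_factory=list)
        program_objectives: list[str] = field(default_factory=list)
        framework: str = ""          # chosen once both sets of objectives are clear
        research_design: str = ""    # follows from the framework
        validity_safeguards: list[str] = field(default_factory=list)
        data_collection: list[str] = field(default_factory=list)

    plan = EvaluationPlan(
        evaluation_objectives=["Estimate impact on reconviction"],  # hypothetical
        program_objectives=["Improve problem-solving skills"],      # hypothetical
    )
    # Objectives drive design: only now are the remaining steps filled in.
    plan.framework = "outcome evaluation"
    plan.research_design = "quasi-experimental comparison"
    plan.validity_safeguards = ["matched comparison group", "attrition tracking"]
    plan.data_collection = ["pre/post psychometric measures", "official reconviction data"]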

SYNTHESIZING EVALUATION DATA

Meta-analytic review involves the integration of data from separate primary studies (intervention experiments or evaluations) into a higher-order statistical analysis. However, reviewers of research repeatedly comment that the process of conducting reviews and interpreting trends within them is dogged by the poor quality of many evaluation studies or reports. In several reviews (Lipsey, 1992; Lipton, Pearson, Cleland, & Yee, 1997; Sherman et al., 1997), procedures have been introduced for categorizing program evaluations according to their design quality. Given the vagaries of real-world evaluation, there will probably always be difficulties in achieving the maximal standards of research design. But this by no means invalidates the rationale for conducting evaluations and attempting to do so as well as possible. On the contrary, that rationale is now stronger than ever.
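For readers unfamiliar with the mechanics, the sketch below (in Python, with hypothetical effect sizes) shows a standard fixed-effect, inverse-variance method of pooling effect sizes from separate primary studies. It is offered as a generic illustration of meta-analytic integration, not as the specific procedure used in the reviews cited above.

    # Fixed-effect meta-analysis by inverse-variance weighting: each study's
    # effect size is weighted by its precision (1 / variance).
    import math

    # (effect size d, variance of d) for each primary study -- hypothetical values
    studies = [(0.30, 0.02), (0.15, 0.01), (0.45, 0.05)]

    weights = [1.0 / v for _, v in studies]
    pooled = sum(w * d for (d, _), w in zip(studies, weights)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))   # standard error of the pooled estimate

    # 95% confidence interval for the pooled effect
    lo, hi = pooled - 1.96 * se, pooled + 1.96 * se
    print(f"Pooled d = {pooled:.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")

Design-quality categorization of the kind described above can then inform which studies enter such a pooling, or how heavily each is weighted.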

FURTHER SOURCES

Numerous aspects of program evaluation cannot be covered in a single chapter. However, many useful texts and sourcebooks exist on research and evaluation. For a general introduction to practitioner research, see Robson (1993); for a general introduction to criminological research, see Jupp (1989). There is a wide range of books on research designs in psychology and the behavioural sciences; see, for example, Shaughnessy and Zechmeister (1997). Another useful resource is the nine-volume Program Evaluation Kit produced by Sage Publications.


1 University of Liverpool

2 For more information on this subject, please see Cost-effective correctional treatment by Shelley Brown, Chapter 27 of this Compendium.


REFERENCES

Andrews, D. A., & Bonta, J. (1998). The psychology of criminal conduct. 2nd edition. Cincinnati, OH: Anderson.

Audit Commission (1989). The probation service: Promoting value for money. London, UK: Her Majesty's Stationery Office.

Audit Commission (1996). Misspent youth: Young people and crime. Abingdon, UK: Audit Commission Publications.

Barlow, D. H., & Hersen, M. (1984). Single case experimental designs: Strategies for studying behavior change. New York, NY: Pergamon.

Bonta, J. (1996). Risk-needs assessment and treatment. In A. T. Harland (Ed.), Choosing correctional options that work: Defining the demand and evaluating the supply. Thousand Oaks, CA: Sage.

Boruch, R. F., Petrosino, A. J., & Chalmers, I. (1999). The Campbell collaboration: A proposal for systematic, multi-national and continuous reviews of evidence. Background paper for the Cochrane Collaboration meeting, London, UK: School of Public Policy, University College London, July.

Brody, S. (1976). The effectiveness of sentencing. Home Office Research Study No. 35. London, UK: HMSO.

Cochrane, A. L. (1979). 1931-1971: A critical review, with particular reference to the medical profession. Medicines for the Year 2000. London, UK: Office of Health Economics.

Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation: Design and analysis issues for field settings. Boston, MA: Houghton Mifflin.

Copas, J. (1995). On using crime statistics for prediction. In M. A. Walker (Ed.), Interpreting crime statistics. Oxford, UK: Clarendon Press.

Dillon, J. T. (1990). The practice of questioning. London, UK: Routledge.

Dobson, K. S., & Craig, K. (Eds.) (1998). Empirically supported treatments: Best practice in professional psychology. Thousand Oaks, CA: Sage.

Eamon, K. C., McLaren, D. L., Munchua, M. M., & Tsutsumi, L. M. (1999). The Peer Support Program at Edmonton Institution for Women. Forum on Corrections Research, 11(3), 28-30.

Fischer, J. (1973). Is casework effective? A review. Social Work, 18, 5-20.

Fischer, J. (1978). Does anything work? Journal of Social Service Research, 3, 213-243.

Gendreau, P., & Andrews, D. A. (1996). Correctional Program Assessment Inventory. 6th edition. Saint John, NB: University of New Brunswick.

Gendreau, P., Goggin, C., & Smith, P. (1999). The forgotten issue in effective correctional treatment: Program implementation. International Journal of Offender Therapy and Comparative Criminology, 43, 180-187.

Glass, G. V., McGaw, B., & Smith, M. L. (1981). Meta-analysis in social research. Newbury Park, CA: Sage Publications.

Hedges, L. V. (1987). How hard is hard science, how soft is soft science? The empirical cumulativeness of research. American Psychologist, 42, 443-455.

Henning, K. R., & Frueh, B. C. (1996). Cognitive-behavioral treatment of incarcerated offenders: An evaluation of the Vermont Department of Corrections' Cognitive Self-Change Program. Criminal Justice and Behavior, 23, 523-542.

HM Prison Service (1998). Criteria for accrediting programs 1998-99. London, UK: Offending Behaviour Programs Unit, HM Prison Service.

Hollin, C. R. (1995). The meaning and implications of program integrity. In J. McGuire (Ed.), What works: Reducing reoffending: Guidelines from research and practice. Chichester, UK: John Wiley & Sons.

Home Office Probation Unit (1999). What works initiative: Crime reduction programme. Joint prison and probation accreditation criteria. London, UK: Home Office.

Jupp, V. (1989). Methods of criminological research. London, UK: Unwin Hyman.

Kratochwill, T. R., & Levin, J. R. (1992). Single-case research design and analysis: New directions for psychology and education. Hillsdale, NJ: Lawrence Erlbaum Associates.

Lab, S. P., & Whitehead, J. T. (1990). From 'nothing works' to 'the appropriate technology works': The latest stop on the search for the secular grail. Criminology, 28, 405-417.

Lipsey, M. W. (1992). Juvenile delinquency treatment: A meta-analytic inquiry into the variability of effects. In T. D. Cook, H. Cooper, D. S. Cordray, H. Hartmann, L. V. Hedges, R. J. Light, T. A. Louis, & F. Mosteller (Eds.), Meta-analysis for explanation: A casebook. New York, NY: Russell Sage Foundation.

Lipton, D. S., Martinson, R., & Wilks, J. (1976). The effectiveness of correctional treatment: A survey of treatment evaluation studies. New York, NY: Praeger.

Lipton, D. S., Pearson, F. S., Cleland, C., & Yee, D. (1997). Synthesizing correctional treatment outcomes: Preliminary CDATE findings. Paper presented at the 5th Annual National Institute of Justice Conference on Research and Evaluation in Criminal Justice, Washington, DC, July.

Lipton, D. S., Thornton, D., McGuire, J., Porporino, F., & Hollin, C. R. (2000). Program accreditation and correctional treatment. Substance Use & Misuse, 35, 1705-1734.

Lloyd, C., Mair, G., & Hough, M. (1994). Explaining reconviction rates: A critical analysis. Home Office Research Study No. 136. London, UK: Her Majesty's Stationery Office.

Logan, C. H. (1972). Evaluation research in crime and delinquency: A reappraisal. Journal of Criminal Law, Criminology and Police Science, 63, 378-387.

MacDonald, G. (1999). Evidence-based social care: Wheels off the runway? Public Money & Management, January-March, 25-32.

MacDonald, G., Sheldon, B., & Gillespie, J. (1992). Contemporary studies of the effectiveness of social work. British Journal of Social Work, 22, 615-643.

Martinson, R. (1974). What works? Questions and answers about prison reform. The Public Interest, 35, 22-54.

Martinson, R. (1979). New findings, new views: A note of caution regarding sentencing reform. Hofstra Law Review, 7, 243-258.

McGuire, J. (1992). Interpreting treatment-outcome studies of anti-social behaviour: Combining meta-analyses and single-case designs. (Abstract) International Journal of Psychology, 27, 446.

McGuire, J., Broomfield, C., Robinson, C., & Rowson, B. (1995). Short-term impact of probation programs: An evaluative study. International Journal of Offender Therapy and Comparative Criminology, 39, 23-42.

Melton, G., Petrila, J., Poythress, N., & Slobogin, C. (1998). Psychological evaluations for the courts: A handbook for lawyers and mental health practitioners. 2nd edition. New York, NY: Guilford Press.

Moncher, F. J., & Prinz, R. J. (1991). Treatment fidelity in outcome studies. Clinical Psychology Review, 11, 247-266.

Motiuk, L. L., Smiley, C., & Blanchette, K. (1996). Intensive programming for violent offenders: A comparative investigation. Forum on Corrections Research, 8(3), 10-12.

Mulrow, C. D. (1987). The medical review article: State of the science. Annals of Internal Medicine, 106, 485-488.

Palmer, T. (1975). Martinson re-visited. Journal of Research in Crime and Delinquency, 12, 133-152.

Patton, M. Q. (1987). How to use qualitative methods in evaluation. Newbury Park, CA: Sage.

Persons, J. B., & Silberschatz, G. (1998). Are the results of randomized controlled trials useful to psychotherapists? Journal of Consulting and Clinical Psychology, 66, 126-135.

Petrosino, A. J., Boruch, R. F., Rounding, C., McDonald, S., & Chalmers, I. (1999). A Social, Psychological, Educational and Criminological Trials Register (SPECTR) to facilitate the preparation and maintenance of systematic reviews of social and educational interventions. Background paper for the Cochrane Collaboration meeting, London, UK: School of Public Policy, University College London, July.

Platt, J. J., Perry, G. M., & Metzger, D. S. (1980). The evaluation of a heroin addiction treatment program within a correctional environment. In P. Gendreau & R. R. Ross (Eds.), Effective correctional treatment. Toronto, ON: Butterworths.

Posavac, E. J., & Carey, R. G. (1997). Program evaluation: Methods and case studies. 5th edition. Upper Saddle River, NJ: Prentice Hall.

Priestley, P., McGuire, J., Flegg, D., Barnitt, R., Welham, D., & Hemsley, V. (1984). Social skills in prisons and the community: Problem-solving for offenders. London, UK: Routledge.

Quinsey, V. L., Harris, G. T., Rice, M. E., & Cormier, C. A. (1998). Violent offenders: Appraising and managing risk. Washington, DC: American Psychological Association.

Robinson, D. (1995). The impact of Cognitive Skills Training on post-release recidivism among Canadian federal offenders. Research Report R-41. Ottawa, ON: Correctional Service of Canada.

Robinson, D., Grossman, M., & Porporino, F.J. (1991). Effectiveness of the Cognitive Skills Training Program: From pilot project to national implementation. Research in brief B-07. Ottawa, ON: Correctional Service of Canada.

Robson, C. (1993). Real world research: A resource for social scientists and practitioner-researchers. Oxford, UK: Blackwell.

Ross, R. R., Fabiano, E. A., & Ewles, C. D. (1988). Reasoning and rehabilitation. International Journal of Offender Therapy and Comparative Criminology, 32, 29-35.

Ross, R. R., & Gendreau, P. (Eds.) (1980). Effective correctional treatment. Toronto, ON: Butterworths.

Russell, M. N. (1990). Clinical social work. Newbury Park, CA: Sage Publications.

Shaughnessy, J. J., & Zechmeister, E. B. (1997). Research methods in psychology. 4th edition. New York, NY: McGraw-Hill.

Sherman, L., Gottfredson, D., MacKenzie, D. L., Eck, J., Reuter, P., & Bushway, S. (1997). Preventing crime: What works, what doesn't, what's promising. Washington, DC: Office of Justice Programs.

Stecher, B. M., & Davis, W. A. (1987). How to focus an evaluation. Newbury Park, CA: Sage.

Taylor, R. (1999). Predicting reconvictions for sexual and violent offences using the revised offender group reconviction scale. Research Findings No. 104. London, UK: Home Office Research, Development and Statistics Directorate.

Weekes, J. R., Millson, W. A., & Lightfoot, L. O. (1995). Factors influencing the outcome of offender substance abuse treatment. Forum on Corrections Research, 7(3), 8-11.

Zamble, E., & Quinsey, V. L. (1997). The criminal recidivism process. Cambridge, UK: Cambridge University Press.
