The Statistical Information on Recidivism - Revised 1 (SIR-R1) Scale : A Psychometric

Mark Nafekh
Laurence L. Motiuk

Research Branch
Correctional Service of Canada
November 2002


The Statistical Information on Recidivism - Revised 1 (SIR-R1) scale combines 15 items in a scoring system that yields probability estimates of re-offending within three years of release. Each item is a measure of a demographic or criminal history characteristic, statistically scored. The present study re-examined the SIR-R1 for reliability, predictive validity and practical utility on federally sentenced male offenders. The study also examined the creation of a proximal measure for federally sentenced Aboriginal and women offender populations.

The aforementioned proxy scale was also re-calibrated and tested for any predictive gain over the SIR-R1.

A re-examination of the SIR-R1 was conducted on the population of federally sentenced non-Aboriginal males released from a federal institution between 1995 and 1998 who were available for a three-year follow-up period (N = 6,881). Measures of reliability, predictive validity and practical utility included Cronbach's alpha reliability coefficient, Relative Improvement Over Chance (RIOC), Receiver Operating Characteristics (ROC) and Prevalence Value Accuracy (PVA) plot analysis.

Results showed that the SIR-R1 is internally reliable and valid in predicting general and violent recidivism in the male federal offender population. In conjunction with other studies, the SIR-R1 has proven to be a consistent and valid predictor of post-release outcome over time. Practical utility tests also demonstrated the SIR-R1 scale was an efficient actuarial tool. Empirically derived costs associated with using the scale (false-positive and false-negative predictions) were found to be 17% better than chance.

Currently, the SIR-R1 is not administered to federally sentenced women and Aboriginal offenders. Practice guidelines were set after construction studies were unable to confirm predictive validity for these two special groups. Consequently, a proxy measure of the SIR-R1 (SIR-Proxy) scale was developed for this investigation to assess the applicability of this type of scale to federally sentenced women and Aboriginal offenders. The SIR-Proxy was found to be highly correlated to the SIR-R1, yielding equivalent or better results on tests of reliability, predictive validity and practical utility. For Aboriginal male offenders, the SIR-Proxy was not predictive of post-release outcome (i.e. return to federal custody with a new offence within 3 years of release). However, for federally sentenced women, the SIR-Proxy was predictive of post-release outcome. Consequently, the SIR-Proxy could serve as a basic tool to guide a more comprehensive actuarial instrument for federally sentenced women offenders.

Finally, a re-calibration of the proxy scale was undertaken and examined for any predictive gain over the SIR-R1. The re-calibration process replicated that of the original SIR-R1, in that the Burgess method was used to score individual scale items on half the sample of male offenders. The re-calibrated scale was then tested on the other half. Test results were consistent in that the re-calibrated SIR-R1 scale was an effective actuarial tool for male non-Aboriginal and women offenders, but not for Aboriginal offenders. However, there was no significant gain in predictive accuracy of the re-calibrated scale over the SIR-R1.

The study reaffirms the application of the SIR-R1 to federally sentenced male non-Aboriginal offenders. Results also suggest that there is good potential for improved predictive accuracy for a similar scale developed for women offenders.

For male Aboriginal offenders, more comprehensive research is required to aid the development of an actuarial tool that supports re-integration efforts for that particular group.


The General Statistical Information on Recidivism Scale (GSIR; Nuffield, 1982) was developed as part of the "Parole Decision Making Project" initiated by the National Parole Board in 1975 and has been endorsed as a component of Pre-Release Decision Policies for male offenders (National Parole Board, 1988).

Joan Nuffield developed the GSIR in 1982 as a predictive tool measuring recidivism among offenders released from Canadian penitentiaries. Recidivism was defined as re-arrest for an indictable offence during a post-release follow-up period of three years. The GSIR was constructed by weighting items that had a statistically significant relationship with recidivism. Scores were assigned to each of the 15 items and their sub-levels using a weighted Burgess method, also referred to as the simple summation technique. Elements of the GSIR were scored based on differences between the offender re-arrest rate within each item and that of the overall sample. Scores were then clustered to create five groups, roughly of equal size, representing risk categories ranging from 'very good' to 'poor'. The group containing the lowest scores of the sample (and therefore the 'most likely to succeed' group) represented the 'very good' risk category; conversely, the 'poor' risk category contained those with the highest scores.

In 1996, the GSIR was revised to improve face validity and reflect changes in legislation. Item 13 of the GSIR (Previous Convictions for Sex Offences) scored those with previous convictions for sex offences as lower risk than those without. In a 3.5-year follow-up study of sex offenders, Motiuk and Brown (1996) found previous sex offences to be one of the most salient factors for sexual recidivism. They concluded that more longitudinal research is required to firmly establish relevant risk factors for sexual recidivism. This suggested the scoring of item 13 of the GSIR was a statistical artifact of the original calibration: not many repeat sex offenders were released on parole in the 1970s. Item 13 of the GSIR was therefore modified to reflect these findings. Specifically, the scoring was reversed such that repeat sex offenders are assessed as higher risk.

As mentioned, the GSIR was also revised to reflect changes in legislation. Scoring guidelines pre-dated the Young Offenders Act. The Corrections and Conditional Release Act (CCRA, 1992) changed the definitions of mandatory supervision and statutory release. Item scores were also changed in such a way that a positive score indicated a higher probability of success rather than failure.

The modified GSIR, reflecting the above improvements and legislative changes, became the Statistical Information on Recidivism - Revised 1 (SIR-R1) scale.

Today, the SIR-R1 is an evaluation tool used by the Correctional Service of Canada (CSC) in the intake assessment, intervention and decision components of the re-integration process. It is normally completed at the beginning of an offender's sentence and is also used in re-assessing offender re-integration potential (Motiuk & Nafekh, 2001). Like its predecessor, the SIR-R1 combines measures of demographic characteristics and criminal history in a scoring system that yields probability estimates of success or failure within three years of release (CSC, Standard Operating Practice #61, 700-04).

Since their development, numerous studies have demonstrated these measures to be established, stable tools capable of forecasting post-release recidivism of federal offenders (Bonta, Harman, Hann & Cormier, 1996; Hann & Harman, 1988; Hann & Harman, 1992; Luciani, Motiuk & Nafekh, 1996; Motiuk & Belcourt, 1995; Wormith & Goldstone, 1984).

Hann & Harman (1989) validated the GSIR on a sample of 534 male inmates who were admitted on warrant of committal and released in 1983-1984. With a 2.5-year follow-up period, it was found that the GSIR was able to distinguish high-risk offenders from low-risk offenders. In 1992 the results were replicated, expanding the sample to 2,998 male offenders and extending the follow-up period to 3 years (similar to the follow-up period of the original Nuffield study). Again, the GSIR retained the predictive accuracy obtained in the initial conceptualization (Hann & Harman, 1992).

Motiuk and Porporino (1989) examined a sample of 231 offenders who either had successfully completed their parole or mandatory supervision or had their parole or mandatory supervision revoked during 1985. These researchers found that the GSIR accurately identified offenders who failed on parole, as parole failures in their sample increased proportionally with the level of risk designated by the GSIR. Similarly, Grant et al. (1996) found that the GSIR is a relatively effective tool in assisting case management officers in predicting success on day parole. In a sample of 444 offenders released on ordinary day parole between 1990 and 1991, 11% of the low-risk offenders failed while on day parole compared to 25% of the high-risk offenders.

Research has also demonstrated that the SIR-R1 differentiates between various offender groups. For instance, Motiuk and Belcourt (1996a) calculated proxy SIR-R1 scores for a sample of 424 detained offenders who had been released from custody for at least one year. These researchers found that the percentage of cases in the "poor risk" SIR category was greater for offenders detained after having their "one-chance" parole revoked compared to the other two groupings. The differences in SIR-R1 risk group categories among the three detained groups were also found to be statistically significant.

Luciani, Motiuk and Nafekh (1996) found convergent validity between the Custody Rating Scale (CRS) and the SIR-R1. The CRS is a security classification instrument that predicts initial security placement of federal offenders. In brief, the CRS bases classification predictions on a combination of Institutional Adjustment and Security Risk ratings. For a sample of 3,656 offenders, the correlation between the SIR and the CRS was statistically significant. Thus, SIR-R1 groupings moved from 'very good' risk to 'poor' risk as security classification moved from minimum to maximum.

Studies have repeatedly validated the GSIR and the SIR-R1 for predicting general recidivism. However, research shows results tend to be less adequate when utilizing these measures to predict future violent behavior. The primary difficulty in utilizing the SIR-R1 for violent behavior prediction has been attributed to relatively low base rates. Low violent recidivism rates place more stringent demands on the SIR-R1 in terms of predictive value. This situation narrows the improvement over chance that the SIR-R1 may provide in attempting to isolate a small group of violent offenders (see note 1).

In Nuffield’s (1982) original sample, the violent recidivism rate was 12.6%. None of the 15 separate SIR items showed a significant correlation with violent recidivism. Bonta (1992) hoped to produce a more adequate assessment of the SIR in predicting violent behavior. With a violent recidivism rate (similar to Nuffield’s definition) of 18.6% in a release sample of 3,267 federal inmates, results showed a modest yet limited improvement over chance levels of prediction.

Serin (1996) also examined the ability of the SIR to predict violent recidivism in a sample of 79 offenders from the Ontario Region. Seventy-five percent of the sample were classified as violent offenders (robbery, assault, manslaughter, sexual assault, and murder) and the overall violent recidivism rate after a 5-year follow-up period was 10%. Results showed that, using a statistical index of association that controls for base rates and selection ratios (Relative Improvement Over Chance or RIOC), the SIR had a weak association with violent recidivism (RIOC = 9%).

The presence of primarily static risk variables in the SIR raises concerns as to whether changes in individual or situational determinants are reflected in the predictive accuracy of the scale. Current research strategies aimed at improving the ability to predict offender behavior include incorporating dynamic risk factors into the risk assessment model (Andrews, 1983; Andrews and Bonta, 1995; Baird, Heinz, & Bemus, 1979; Grant, Motiuk, Brunet, Lefebvre & Couturier, 1996; Motiuk and Porporino, 1989).

1 Current analyses allow for predictive validity and practical utility tests that are independent of base rates, such as Receiver-operating Characteristics (ROC) and Prevalence-Value Accuracy Plots (PVA).

The Present Study

The purpose of this study is to assist the Correctional Service of Canada's (CSC's) continuing offender intervention and re-integration efforts. An actuarial tool such as the SIR-R1 can be scrutinized via a validation process and tailored for use in populations such as women and male Aboriginal offenders. In combination with the professional judgments of all those involved in the reintegration process, it is hoped that the scale will support the Service in its mission. Using the scale to help identify offenders to whom available programming resources should be directed helps the Service contribute to the protection of society by adequately preparing high-potential candidates for safe release. In identifying the risk groupings upon intake, the scale also assists CSC's effectiveness in exercising reasonable, safe, secure and humane control.

This study examined the ability of the SIR-R1 to predict general recidivism (return to federal custody with a new offence) among non-Aboriginal male offenders. The predictive accuracy of the scale was examined using an assortment of statistical techniques commonly employed in previous studies (Nuffield, 1982; Hann & Harman, 1992; Bonta et al., 1996). Although some techniques have been hailed as better than others, multiple procedures were examined, namely Pearson correlation coefficients, the Relative Improvement Over Chance (RIOC), the Receiver Operating Characteristics (ROC) and Prevalence Value Accuracy (PVA) plot analysis.

The study also examined the extension of a proximal measure for use with the federally sentenced Aboriginal and women offender populations. As the SIR-R1 scale is not currently applied to the women and male Aboriginal federal offender populations (Standard Operating Practice 700-4), a proxy of the SIR-R1 was created for all three groups. The proxy was then compared against actual SIR-R1 scores for the male non-Aboriginal sample to test for accuracy.

The present study also re-calibrated the SIR-R1 and compared changes in scoring for the different SIR-R1 items and groupings with the original calibration.

Statistical analyses tested for any predictive gain over the SIR-R1.


Sample Composition

For the purposes of this research paper, all available data for federally sentenced offenders were extracted from CSC’s automated database (Offender Management System; OMS). As of May 2000, information pertaining to risk and need was available for 8,434 offenders released from federal institutions between 1995 and 1998 and available for a follow up period of 3 years. Of those, 4.06% (342) were women offenders, 14.36% (1,211) were male Aboriginal offenders, and 81.59% (6,881) were non-Aboriginal male offenders.


I) The SIR-R1

The SIR-R1 combines 15 items in a scoring system that yields probability estimates of re-offending within three years of release (Appendix A). Each item is a measure of a demographic or criminal history characteristic, statistically scored using the Burgess method. This method applies positive or negative scores to individual items, based on differences between endorsed item and population success rates. Simple summation of SIR-R1 item scores yields a total ranging from -30 (poor risk) to +27 (very good risk). Total scores are then clustered into five SIR-R1 groupings, ranging from very good (4 out of 5 offenders predicted to succeed) to poor (1 out of 3 predicted to succeed).

II) The SIR-Proxy

The proxy measure of the SIR-R1 scale was computed primarily using data drawn from the Offender Intake Assessment (OIA). These data provided information on each offender’s criminal history, social situation, and other factors equivalent or approximate to individual items of the SIR-R1. A detailed description outlining the methodology used in developing the SIR-Proxy is discussed in the procedures section that follows.

III) The Recalibrated SIR

The recalibrated SIR-R1 for federally sentenced male offenders was derived by randomly dividing the sample into two equal groups; the first sub-sample (N = 4,045) was used for the purpose of re-calibration. The Burgess method was used to re-weight the scale items on this sample. Next, the re-calibrated SIR was validated using the second equal sized sub-sample. Cronbach's alpha reliability coefficient, Pearson correlation coefficients, Relative Improvement over Chance (RIOC), Receiver Operating Characteristics (ROC) and Prevalence-Value-Accuracy (PVA) statistics were used to measure the reliability, predictive validity and practical utility of the recalibrated SIR.


I) The SIR-R1

SIR-R1 scores and risk groupings for the federally sentenced male non-Aboriginal population released between 1995 and 1998 were obtained from CSC's automated Offender Management System (OMS). Offender identifiers for this group were matched to those in OMS data relations containing SIR-R1 information.

II) The SIR-Proxy

The primary source of information used to develop the SIR-Proxy was data derived from the Offender Intake Assessment (OIA) process. The OIA is a comprehensive and integrated evaluation of the offender at the time of admission to the federal system (Motiuk, 1997). It involves the collection and analysis of information on each offender’s criminal and mental health history, social situation, education, and other factors relevant to determining criminal risk and identifying offender needs. Briefly, the OIA consists of two core components: Criminal Risk Assessment (CRA), and Dynamic Factors Identification and Analysis (DFIA). In addition, a suicide risk potential with nine indicators is included in the assessment process.

The Criminal Risk Assessment (CRA) component of the OIA provides specific information pertaining to past and current offences. The CRA is based primarily on the criminal history record but may also include case-specific information regarding any other pertinent details pertaining to individual risk factors. Based on these data, the OIA provides an overall global risk rating for each offender at admission to federal custody.

The Dynamic Factors Identification and Analysis (DFIA) involves the identification of the offender’s criminogenic needs. More specifically, it considers a wide assortment of case-specific aspects of the offender’s personality and life circumstances, and data are clustered into seven target domains with multiple indicators for each: employment (35 indicators), marital/family (31 indicators), associates/social interaction (11 indicators), substance abuse (29 indicators), community functioning (21 indicators), personal/emotional orientation (46 indicators), and attitude (24 indicators) (see note 2).

Using the DFIA, offenders are rated on each target domain along a four-point continuum. Ratings are commensurate with the assessment of need, ranging from "asset to community adjustment" (not applicable to substance abuse and personal/emotional orientation), to "no need for improvement", to "some need for improvement", to "significant need for improvement". After careful consideration of all indicators in each need domain, case management officers provide an estimate of overall need level. This is provided for each of the seven target areas.

The 15 items of the SIR-R1 were matched to specific dichotomous OIA indicators. Endorsed OIA items were given the equivalent SIR-R1 score. For example, SIR-R1 item 15 (employment status at arrest) was scored on the SIR-Proxy with a +1 if item 16 in the employment domain of the OIA (was employed at time of arrest) was endorsed. Of all items on the SIR-R1, only item 11 (number of dependents at most recent admission) was not approximated. See Appendix B for actual vs. computed scores broken down by SIR-R1 item.

III) The Recalibrated SIR

2 See Correctional Service Canada's Standard Operating Practice 700-04 for a complete listing of indicators.

The SIR-Proxy was used as the instrument for recalibration. First, the male offender release cohort was randomly divided into two equal sized groups. Next, the Burgess method was used to recalibrate the SIR-Proxy items. This method scores endorsed items based on differences between the overall sample recidivism rate and that of those endorsing a particular item. The scoring technique assigns a +/- 1 for every 5% between the overall and 'item-associated' rates. This scoring system begins with differences greater than or less than the overall sample mean plus/minus 5%. For example, the recidivism rate (return with a new offence) of the random sample was 23.58%. Offenders who endorsed item 16 of the OIA for this sample (employed at time of arrest) had a recidivism rate of 14.58%. Thus item 15 of the recalibrated SIR was scored as follows:

Item 15 score = ((23.58-5)- 14.58)/5 = 0.8; rounded up = 1
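The scoring rule in the worked example can be sketched in a few lines of Python. This is an illustrative reading of the description above, not the authors' code: the function name and parameters are hypothetical, "rounded up" is assumed to mean rounding to the next whole number, and the mirror-image treatment of items with rates above the overall rate (negative scores) is an assumption based on the "+/- 1" wording.

```python
import math

def burgess_score(overall_rate, item_rate, band=5.0, step=5.0):
    """Score one item from recidivism rates given in percent.

    Positive scores mean the item's rate is below the overall rate
    (better outcome); negative scores mean it is above (assumed).
    """
    if item_rate < overall_rate - band:
        gap = (overall_rate - band) - item_rate
        sign = 1
    elif item_rate > overall_rate + band:
        gap = item_rate - (overall_rate + band)
        sign = -1
    else:
        return 0  # within the +/- 5% band: no points
    # Round to 6 decimals first so floating-point noise cannot push an
    # exact multiple of `step` up to the next integer.
    return sign * math.ceil(round(gap / step, 6))

# Item 15 from the text: overall rate 23.58%, item rate 14.58%.
item_15 = burgess_score(23.58, 14.58)
print(item_15)  # 1
```

With the rates quoted in the text, the sketch reproduces the score of 1 for item 15.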

Next, risk groupings were established by ranking the recalibrated scores into five equal clusters. Corresponding cut-off scores were created based on these groupings.

Finally, the recalibrated item scores and risk groupings were applied to the second equal sized sample. A comparison of SIR-Proxy and recalibrated items can be found in Appendix C.
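The grouping step can be illustrated as follows. This is a sketch only: the scores are hypothetical, the group labels are borrowed from the SIR-R1, and it assumes (as with the SIR-R1) that higher scores indicate better risk. Cut-offs are derived on the calibration half and then applied unchanged to other scores.

```python
from bisect import bisect_right
from statistics import quantiles

# Assumed ordering: lowest scores are the poorest risk.
LABELS = ["poor", "fair/poor", "fair", "good", "very good"]

def quintile_cutoffs(scores):
    """Four cut points that split the scores into five equal-sized bands."""
    return quantiles(scores, n=5)

def risk_group(score, cutoffs):
    """Label for a score, given cut points from the calibration sample."""
    return LABELS[bisect_right(cutoffs, score)]

calibration_scores = list(range(1, 101))  # hypothetical scores
cutoffs = quintile_cutoffs(calibration_scores)

# Apply the calibration cut-offs to scores from a second sample.
validation_groups = [risk_group(s, cutoffs) for s in (3, 45, 97)]
print(validation_groups)
```

With real integer scale scores, ties at the cut points mean the five groups will only be approximately equal in size.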


I) Validation of the SIR-R1

As the SIR-R1 scale is not currently administered to women and male Aboriginal federal offenders (SOP 700-4) the SIR-R1 was re-examined for federally sentenced male non-Aboriginal offenders (N = 6,881). The scale was assessed in terms of reliability, validated in its ability to predict any return to federal custody with a new offence, and evaluated for practical utility. A variety of statistical techniques were used to provide a basis of comparison with other studies and to reflect current practice.

To assess the internal consistency of the SIR-R1, Cronbach's alpha reliability coefficient was used. Standardized and raw alphas were 0.75 and 0.77 respectively. The ability of the SIR-R1 to discriminate between successful and non-successful cases was then examined via techniques used in previous studies.
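For readers unfamiliar with the statistic, the raw form of Cronbach's alpha can be sketched as below. The data are hypothetical (the report does not publish item-level scores) and the function name is illustrative; the standardized alpha, which works from inter-item correlations, is not shown.

```python
from statistics import pvariance

def cronbach_alpha(rows):
    """Raw Cronbach's alpha for a list of per-person item-score lists."""
    k = len(rows[0])  # number of items
    item_vars = [pvariance([row[i] for row in rows]) for i in range(k)]
    total_var = pvariance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical scores: four offenders, three items.
demo = [[1, 2, 1], [0, 0, -1], [2, 3, 2], [-1, -1, 0]]
alpha = cronbach_alpha(demo)
print(round(alpha, 3))
```

Alpha approaches 1 as the items vary together, which is why it is read as a measure of internal consistency.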

Simple Pearson correlation coefficients indicated a strong relationship between the SIR-R1 groupings and general recidivism (r = 0.36, p < .0001). Next, the Relative Improvement Over Chance (RIOC) statistic was calculated. RIOC summarizes, with a single index, the degree to which the values in a two-by-two table deviate from chance assignment, and it corrects for the maximum percentage of correct predictions attainable (Farrington & Loeber, 1989). Table 1 illustrates the four components. Note that specificity refers to those offenders who are paroled and succeed on conditional release. Sensitivity refers to those denied parole who would fail.

Table 1: Components of the SIR-R1 Decision Matrix

Decision       Success                        Fail
Deny Parole    False Positive                 True Positive (Sensitivity)
Parole         True Negative (Specificity)    False Negative
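One common formulation of RIOC for a two-by-two table such as Table 1 is sketched below: the improvement of the observed number of correct predictions over chance, divided by the largest improvement the row and column totals would allow. The function name and the counts are hypothetical, and the exact variant used in this study is not stated, so results from this sketch need not match the reported figures.

```python
def rioc(tp, fp, fn, tn):
    """Relative Improvement Over Chance for a 2x2 prediction table."""
    n = tp + fp + fn + tn
    predicted_fail = tp + fp   # e.g. denied parole
    actual_fail = tp + fn      # e.g. recidivists
    correct = tp + tn
    # Correct predictions expected by chance from the marginals.
    chance = (predicted_fail * actual_fail
              + (n - predicted_fail) * (n - actual_fail)) / n
    # Most correct predictions the marginals allow.
    maximum = (min(predicted_fail, actual_fail)
               + min(n - predicted_fail, n - actual_fail))
    return (correct - chance) / (maximum - chance)

example = rioc(tp=30, fp=20, fn=10, tn=40)  # hypothetical counts
print(example)
```

A value of 0 means the predictions do no better than chance; a value of 1 means they are as accurate as the marginal totals permit.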

Typically, a cutoff risk percentage of 50% is assumed to satisfy the underlying assumptions behind a RIOC analysis; that is if the success rate is expected to be above 50% within a particular risk grouping, all in that group are paroled.

Accordingly, Table 2 below is collapsed in Table 3 to facilitate calculation of the RIOC statistic.

Table 2: SIR-R1 Risk Groupings by Outcome (General Recidivism)

SIR-R1 Risk Group    Successes      Failures       Total
Poor                 866 (56%)      673 (44%)      1,539 (100%)
Fair/Poor            583 (69%)      265 (31%)      848 (100%)
Fair                 768 (76%)      246 (24%)      1,014 (100%)
Good                 765 (84%)      142 (16%)      907 (100%)
Very Good            2,382 (94%)    141 (6%)       2,523 (100%)
Total                5,364 (79%)    1,467 (21%)    6,831 (100%)

Note: Percentages may not add due to rounding
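The correlation between SIR-R1 grouping and outcome can be recomputed from the counts in Table 2. The sketch below assumes the five groups are coded 1 (Poor) to 5 (Very Good) and that outcome is coded 1 for failure and 0 for success; the report does not state the coding used, so this is an illustration rather than a reconstruction of the original analysis.

```python
from math import sqrt

# (successes, failures) per SIR-R1 risk group, from Table 2.
table2 = {
    "Poor": (866, 673),
    "Fair/Poor": (583, 265),
    "Fair": (768, 246),
    "Good": (765, 142),
    "Very Good": (2382, 141),
}

def grouped_pearson(table):
    """Pearson r between group code (1..k) and failure (0/1)."""
    n = sx = sxx = sy = sxy = 0
    for code, (succ, fail) in enumerate(table.values(), start=1):
        size = succ + fail
        n += size
        sx += code * size
        sxx += code * code * size
        sy += fail          # y is 1 for each failure, 0 otherwise
        sxy += code * fail
    cov = sxy / n - (sx / n) * (sy / n)
    var_x = sxx / n - (sx / n) ** 2
    var_y = sy / n - (sy / n) ** 2   # y*y == y for a 0/1 variable
    return cov / sqrt(var_x * var_y)

r = grouped_pearson(table2)
print(round(abs(r), 2))  # 0.36
```

The sign is negative under this coding (better risk group, fewer failures); its magnitude agrees with the r = 0.36 reported in the text.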

Table 3: Decision Cut-Offs for SIR-R1 Risk Groupings

Decision       Successes      Failures       Total
Deny Parole    1,449 (21%)    938 (14%)      2,387 (35%)
Parole         3,915 (57%)    529 (8%)       4,444 (65%)
Total          5,364 (79%)    1,467 (21%)    6,831 (100%)

Note: Percentages may not add due to rounding
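The collapse of Table 2 into a two-row decision table can be checked directly. The sketch assumes that the Poor and Fair/Poor groups form the "Deny Parole" row, which is inferred from the row total of 2,387 (1,539 + 848); summing those two groups gives 1,449 successes and 938 failures.

```python
# (successes, failures) per SIR-R1 risk group, from Table 2.
table2 = {
    "Poor": (866, 673),
    "Fair/Poor": (583, 265),
    "Fair": (768, 246),
    "Good": (765, 142),
    "Very Good": (2382, 141),
}
deny_groups = ("Poor", "Fair/Poor")  # assumed from the Table 3 row totals

def collapse(table, deny):
    """Sum (successes, failures) for denied and paroled groups."""
    totals = {"Deny Parole": [0, 0], "Parole": [0, 0]}
    for group, (succ, fail) in table.items():
        row = totals["Deny Parole" if group in deny else "Parole"]
        row[0] += succ
        row[1] += fail
    return {name: tuple(counts) for name, counts in totals.items()}

table3 = collapse(table2, deny_groups)
print(table3)
```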

The resulting RIOC statistic for the SIR-R1 was 24% for male non-Aboriginal offenders.

A popular alternative to the RIOC statistic for assessing predictive validity is the Receiver Operating Characteristic, or ROC (Swets, 1986). The advantage of ROC over the preceding measures is its independence of base rates and selection ratios.

ROC was used to calculate true positive and false positive rates for the SIR-R1 cutoff scores corresponding to each risk category. Plotting the associated rates along an XY axis produced an ROC curve. The area under the curve or AUC (between 0 and 1) measures the probability that non-recidivists would score higher on the SIR-R1 scale than recidivists. An AUC of 1 indicates perfect discrimination between recidivists and non-recidivists, while an AUC of 0.5 or less indicates the scale has no power to discriminate. AUC results for federally sentenced male non-Aboriginal offenders were good at 0.745 (see Figure 1).
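The AUC can be recomputed from the grouped counts in Table 2 by treating each risk-group boundary as a cut-off and applying the trapezoidal rule to the resulting ROC points. This is a sketch of the general technique with an illustrative function name, not the software used in the study.

```python
# (successes, failures) per risk group, highest risk first (Table 2).
groups = [(866, 673), (583, 265), (768, 246), (765, 142), (2382, 141)]

def grouped_auc(rows):
    """Trapezoidal area under the ROC curve for ordered risk groups."""
    total_succ = sum(s for s, _ in rows)
    total_fail = sum(f for _, f in rows)
    auc = 0.0
    fpr = tpr = 0.0
    for succ, fail in rows:
        new_fpr = fpr + succ / total_succ   # successes flagged so far
        new_tpr = tpr + fail / total_fail   # failures flagged so far
        auc += (new_fpr - fpr) * (tpr + new_tpr) / 2
        fpr, tpr = new_fpr, new_tpr
    return auc

auc = grouped_auc(groups)
print(round(auc, 3))  # 0.745
```

Applied to the five groupings, the calculation returns the 0.745 reported for male non-Aboriginal offenders.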

Next, Prevalence-Value Accuracy (PVA) analysis was performed on the SIR-R1. Where the ROC analysis tests predictive accuracy, PVA tests the practical utility of a measure. Practical utility is evaluated by incorporating outcome rates and the cost of misclassifications into a quantifiable formula. In this study, this formula is a function of general recidivism rates and associated costs of false-positive and false-negative predictions. By plotting minimum misclassification costs over a range of success rate and false-positive/false-negative ratio combinations, PVA analysis derives a cost-surface. Analogous to the area under the curve (AUC) for ROC analysis, the volume beneath this cost surface (cost-volume index) is an index of test performance (Remaley et al., 1999). A perfect test would have no misclassification costs and would therefore have a volume of 0 (see note 3).

The cost-volume index for the SIR-R1 was .06946, or 17% more efficient than chance prediction.

3 A test that cannot differentiate between success and failure (i.e. a chance test) would have a volume of 0.08334.
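The "17% more efficient than chance" figure follows from the two volumes quoted above, taken as the relative reduction in cost volume against a chance test. A short check, with illustrative names:

```python
def gain_over(reference, value):
    """Fractional reduction of `value` relative to `reference`."""
    return (reference - value) / reference

CHANCE_VOLUME = 0.08334   # chance test (note 3)
SIR_R1_VOLUME = 0.06946   # reported cost-volume index for the SIR-R1

efficiency = gain_over(CHANCE_VOLUME, SIR_R1_VOLUME)
print(f"{efficiency:.1%}")
```

The result is just under 17%, consistent with the rounded figure in the text.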

Finally, the analyses explored the ability of the SIR-R1 to predict other outcome measures: return to federal custody with a new violent offence and return to federal custody with a sex offence. Briefly, for violent recidivism the AUC was good for the SIR-R1 at 0.71. For sexual re-offending, low recidivism rates and small sample sizes within risk groupings may explain the inability of the SIR-R1 to discriminate between recidivists and non-recidivists (AUC = 0.54).

As demonstrated, the SIR-R1 is internally reliable. The data confirm the accuracy of the scale in identifying offenders likely to return to federal custody with a new offence within three years. The SIR-R1 is also practical in that misclassification costs are less than those associated with pure chance.

IIa) The SIR-Proxy: Derivation and Validation

Internal consistency of the SIR-Proxy was tested using Cronbach's alpha reliability coefficient. This yielded standardized and raw alphas of 0.77 and 0.78 respectively. The SIR-Proxy was next measured against the SIR-R1 to determine how accurately it reflects actual SIR-R1 scores and groupings. Simple Pearson correlation coefficients indicated strong correlations with SIR-R1 total scores (r = .90). The cutoff scores for the SIR-Proxy groupings were established to reflect the same sample distribution amongst actual SIR-R1 groupings. The resulting SIR-Proxy groupings were also highly correlated with the actual SIR-R1 groupings (r = .85).

When validated on outcome as a measure of predictive validity, the SIR-Proxy fared the same as the SIR-R1. This was not surprising given the SIR-Proxy's high correlation to the SIR-R1. Results of the various statistics in comparison to the SIR-R1 are presented in Table 4.

Table 4: SIR-R1 versus SIR-Proxy

Measure                                       SIR-R1                       SIR-Proxy
Cronbach's alpha reliability coefficient      Alpha = 0.77                 Alpha = 0.78
Relative Improvement Over Chance (RIOC)       RIOC = 0.28                  RIOC = 0.29
AUC and correlation: General Recidivism       AUC = 0.745, r = 0.36***     AUC = 0.752, r = 0.36***
AUC and correlation: Violent Recidivism       AUC = 0.708, r = 0.14***     AUC = 0.726, r = 0.14***
AUC and correlation: Sexual Recidivism        AUC = 0.540, r = 0.01 NS     AUC = --, r = 0.01 NS

Notes: NS=not significant, *p<.05; **p<.01; ***p<.001

Practical utility tests revealed no significant difference between the SIR-Proxy and the SIR-R1. A topographical view of the PVA plots illustrates how all combinations of recidivism rates and unit cost ratios are virtually the same for all misclassification costs (see Figure 3). A more quantitative analysis revealed there is no more than a 4% difference in minimum misclassification costs between the two measures at all points on the cost surface. Note that all points on the cost surfaces of both measures were associated with minimum misclassification costs that were lower than those of a chance test.

IIb) The SIR-Proxy: Application to Women and Male Aboriginal Offenders

The SIR-R1 is not currently applied to the women and male Aboriginal federal offender populations. However, as shown, it is possible to successfully approximate the SIR-R1 for both these populations. The SIR-Proxy, derived for women and male Aboriginal federal offenders, can thus be tested.

i) Women Offenders

Validation methods performed on the SIR-Proxy for women offenders paralleled those of the above analyses. Using post-release outcome as a measure of predictive validity, results yielded significance for women offenders. The SIR-Proxy score was correlated with general recidivism (r = 0.32, p<.0001) and the area under the curve showed the scale groupings accurately discriminated between recidivists and non-recidivists (AUC = 0.767). For violent re-offending, the SIR-Proxy discriminated between groups with an AUC of 0.725. No validity measures were tested against the SIR-Proxy to discriminate for sexual re-offending amongst women offenders since there were no observable outcomes as such for this group.

ii) Male Aboriginal Offenders

SIR-Proxy scores were correlated with general recidivism at r = 0.32 (p<.0001) for the male Aboriginal group. ROC analysis revealed that the scale's ability to discriminate between recidivism groupings ranged from weak to poor (AUC = 0.683 for general recidivism, AUC = 0.645 for violent re-offending and AUC = 0.599 for sexual re-offending). Results differed from those found in an earlier study of male Aboriginal offenders released in 1983/84 (see note 4). The AUC for the 1983/84 sample was moderate at 0.708 (Hann & Harman, 1993). The discrepancy in validity between the current and previous sample could possibly be attributed to the changing nature of the federal Aboriginal male population. Aboriginal youth are one of the fastest growing demographic sectors in Correctional Service of Canada's (CSC's) offender population. Significant differences between younger and older Aboriginal offenders have been identified, namely in the areas of static and dynamic risk and admitting offence (Nafekh, 2002). Consequently, differences in these areas would also be reflected in SIR-Proxy scores for older and younger Aboriginal populations.

4 Robert G. Hann & William G. Harman, Predicting Release Risk for Aboriginal Penitentiary Inmates 1993, No. 1993-12

III) The Re-calibrated SIR

Results revealed there was no significant gain in predictive accuracy of the recalibrated SIR over the SIR-Proxy or the SIR-R1. The AUC for the recalibrated SIR was 0.754, compared to 0.745 and 0.752 for the SIR-R1 and SIR-Proxy respectively. A test to determine whether there were gains in efficiency showed no advantage as such. The cost volume index was 0.06819 for the recalibrated SIR (only 2% and 0.2% less than that of the SIR-R1 and the SIR-Proxy respectively).

Results of the recalibrated SIR for women were similar to those for males, as there was no significant difference in predictive accuracy or efficiency between the re-calibrated version and the SIR-Proxy. The AUC for the recalibrated SIR was 0.784 compared to 0.767 for the SIR-Proxy, and there was no more than a 0.1% difference in cost volume indices for the two scales.

The ability of the recalibrated SIR to predict general recidivism for the Aboriginal male population was found to be marginally better than the SIR-Proxy. The AUC was 0.718 and the cost volume index was 5.5% less than that of chance decisions. These results were not surprising given that the SIR-Proxy was found to be weak for this population.


Results show that the SIR-R1 was internally reliable and accurately identified risk groupings in the male non-Aboriginal federal offender population released between 1995 and 1998. This assists the Service in exercising reasonable, safe, secure and humane control. Using the SIR-R1 scale upon intake also helps identify those offenders to whom available programming resources should be directed, thereby preparing high-potential candidates for safe release. Findings support previous studies and reaffirm the wealth of validity data for the SIR-R1.

Using current estimation and evaluation techniques, the report derived a SIR-Proxy that was shown to be just as effective as the SIR-R1. In addition, the misclassification costs associated with the SIR-Proxy do not exceed those of the SIR-R1. Hence, given that the SIR-Proxy was derived primarily from the Offender Intake Assessment database in CSC’s Offender Management System (OMS), the SIR-R1 could conceivably be replaced by the SIR-Proxy via an automated process.

The report also concludes that, when applied to women offenders, the SIR-Proxy accurately discriminates amongst the five risk groupings. Past research has not supported the use of the SIR-R1 or GSIR with the federal women offender population (Blanchette, 1996; Hann & Harman, 1989b). As the original SIR scale was developed on a sample of men, the results suggest that there is good potential for improved predictive accuracy for a scale developed in a parallel fashion on a sample of women offenders.

For the male Aboriginal population, the study found results contrary to those for non-Aboriginal males and women. The finding that the SIR-Proxy is weak in accurately discriminating between risk groupings for Aboriginal males also contradicts previous findings (see Hann & Harman, 1989). This could possibly be attributed to trends affecting the nature of this population. For example, the increase in Aboriginal youth within CSC’s offender population may also be reflected in differences between SIR-R1 scores of current and previous samples.

In addition, significant differences between age groupings within the male Aboriginal offender population have previously been identified (Nafekh, 2002). It has also been noted that this population is not homogeneous, as its members differ in aspects ranging from cultural diversity to constitutional and legal status (National Parole Board, 1988). Predictive accuracy of an actuarial tool, such as the SIR-R1, may be maximized by means of a very specialized process. Such a process should include consultation with Elders and other experts, and take into consideration all factors and trends relevant to the male Aboriginal offender population.

SIR-Proxy re-calibration efforts yielded little gain in predictive accuracy or efficiency. Given the static nature of the scale, trends in the federal offender population, and the ability of the scale to retain its predictive accuracy over time, it is likely that the SIR-R1 need never be re-tooled for the male non-Aboriginal population.

In conclusion, the study shows that the SIR-R1 continues to assist the Service in achieving its mission. The study also identifies a comparable scale, the SIR-Proxy, for which scores and risk groupings can be automatically calculated from Offender Intake Assessment information. The SIR-Proxy is currently not an adequate tool for use with Aboriginal males. The Proxy could, however, serve as a basic tool to guide a more comprehensive assessment of reintegration potential for the women offender population. Finally, the utility of the SIR-R1/SIR-Proxy could be extended to assist in the successful reintegration of offenders. Such an undertaking would require the addition of dynamic risk factors into the SIR-R1 model. These risk factors would reflect individual or situational determinants that may contribute to the success of the offender upon their release into the community.


Andrews, D. A., & Bonta, J. (1995). LSI-R: The Level of Service Inventory - Revised. Toronto, ON: Multi-Health Systems, Inc.

Andrews, D. A. (1983). The assessment of outcome in correctional samples. In M.L. Lambert, E.R. Christensen, & S.S. Dejulio (Eds.), The measurement of psychotherapy outcome. NY: Wiley.

Baird, S. C., Heinz, R. C., & Bemus, B. J. (1979). The Wisconsin Case Classification and Staff Development Project: A Two Year Follow-up Report. Wisconsin: Division of Corrections.

Blanchette, K. (1996). The Relationship between Criminal History, Mental Disorder, and Recidivism Among Federally Sentenced Female Offenders. Unpublished Masters Thesis, Carleton University, Ottawa, ON.

Bonta, J., Harman, W. G., Hann, R. G., & Cormier, R. B. (1996). The prediction of recidivism among federally sentenced offenders: A re-validation of the SIR scale. Canadian Journal of Criminology, 38, 61-79.

Bonta, J., Pang, B., & Wallace-Capretta, S. (1995). Predictors of Recidivism among Incarcerated Female Offenders. The Prison Journal, 75, 277-294.

Bonta, J., Lipinski, S., & Martin, M. (1992). Characteristics of Federal Inmates Who Recidivate. Ottawa, ON: Statistics Canada.

Farrington, D. P., & Loeber, R. (1989). Relative improvement over chance (RIOC) and phi as measures of predictive efficiency and strength of association in 2 by 2 tables. Journal of Quantitative Criminology, 5, 201-213.

Grant, B. A., Motiuk, L. L., Brunet, L., Lefebvre, L., & Couturier, P. (1996). Day Parole Program Review: Case Management Predictors of Outcome. Research Report R-52. Ottawa, ON: Correctional Service of Canada.

Hann, Robert G., & Harman, William G. (1988). Release Risk Prediction: A Test of the Nuffield Scoring System. A Report of the Parole Decision Making and Release Risk Assessment Project. Ottawa, ON: Ministry of the Solicitor General of Canada.

Hann, Robert G., & Harman, William G. (1992). Predicting General Release Risk for Penitentiary Inmates. User Report. Ottawa, ON: Solicitor General Canada.

Hann, Robert G., & Harman, William G. (1993). Predicting Release Risk for Aboriginal Penitentiary Inmates. Submitted to the Ministry of the Solicitor General of Canada by The Research Group, No. 1993-12.

Hann, Robert G., & Harman, William G. (1989a). Release Risk Prediction: A Test of the Nuffield Scoring System. Corrections Research. Ottawa, ON: Secretariat of the Ministry of the Solicitor General of Canada.

Hann, Robert G., & Harman, William G. (1989b). Release Risk Prediction: A Test of the Nuffield Scoring System for Native and Female Inmates. Corrections Research. Ottawa, ON: Secretariat of the Ministry of the Solicitor General of Canada.

Luciani, F. P., Motiuk, L. L., & Nafekh, M. (1996). An operational review of the custody rating scale: Reliability, validity and practical utility. Research Report R-47. Ottawa, ON: Correctional Service Canada.

Motiuk, L. L., & Belcourt, R. L. (1995). Statistical Profiles of Homicide, Sex, Robbery and Drug Offenders in Federal Corrections. Ottawa, ON: Correctional Service Canada.

Motiuk, L. L., & Porporino, F. J. (1989). Offender Risk/Needs Assessment: A Study of Conditional Releases. Research Report R-1. Research and Statistics Branch. Ottawa, ON: Correctional Service Canada.

Motiuk, L. L. (1997). Classification for correctional programming: The Offender Intake Assessment (OIA) process. Forum on Corrections Research, 9(1), 18-23.

Motiuk, L. L., & Nafekh, M. (2001). Using reintegration potential at intake to better identify safe release candidates. Forum on Corrections Research, 13(1), 11-13.

Motiuk, L. L., & Belcourt, R. L. (1996). Prison Work Programs and Post-release Outcome: A Preliminary Investigation. Research Report R-43. Ottawa, ON: Correctional Service of Canada.

Motiuk, L. L., & Porporino, F. J. (1989). Field Tests of the Community Risk/Needs Management Scale: A Study of Offenders on Caseload. Research Report R-06. Ottawa, ON: Correctional Service of Canada.

Nafekh, M. (2002). An Examination of Youth and Gang Affiliation within the Federally Sentenced Aboriginal Population, Research Report R-121. Ottawa, ON: Correctional Service of Canada.

National Parole Board of Canada. (1988). Policy and procedures manual (revised). Ottawa, ON: National Parole Board of Canada.

Nuffield, J. (1989). The SIR scale: Some reflections on its applications. Forum on Corrections Research, 11(2), 19-22.

Nuffield, J. (1982). Parole decision making in Canada: Research towards decision guidelines. Ottawa, ON: Solicitor General of Canada.

Remaley, A. T., Sampson, M. L., DeLeo, J. M., Remaley, N. A., Farsi, B. D., & Zweig, M. H. (1999). Prevalence-Value-Accuracy Plots: A New Method for Comparing Diagnostic Tests Based on Misclassification Costs. Clinical Chemistry, 45, 934-941.

Serin, R. (1996). Violent recidivism in criminal psychopaths. Law and Human Behaviour, 20, 207-217.

Swets, J. A. (1986). Form of empirical ROCs in discrimination and diagnostic tasks: Implications for theory and measurement of performance. Psychological Bulletin, 99, 181-198.

Wormith, J. S., & Goldstone, C. S. (1984). The clinical and statistical prediction of recidivism. Criminal Justice and Behaviour, 11, 3-34.