
Structured risk assessment of (sexual) violence in forensic clinical practice

de Vogel, V.


Citation for published version (APA): de Vogel, V. (2005). Structured risk assessment of (sexual) violence in forensic clinical practice. Amsterdam: Dutch University Press.



The HCR-20 and SVR-20 in Dutch forensic psychiatric patients

© Vivienne de Vogel, 2005

ISBN 90 3619 302 8

Cover: Toscane, Jan van Kempen, 2002

Key words: risk assessment, violence, sexual violence, HCR-20, SVR-20

All rights reserved. Save exceptions stated by the law, no part of this publication may be reproduced, stored in a retrieval system of any nature, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, included a complete or partial transcription, without the prior written permission of the publishers, application for which should be addressed to the publishers:

Dutch University Press, Bloemgracht 82hs, 1015 TM Amsterdam, The Netherlands

Tel.: + 31 (0) 20 625 54 29

Fax: + 31 (0) 20 620 33 95

E-mail: [email protected]

www.dup.nl

Dutch University Press in association with Purdue University Press, West Lafayette, Ind. U.S.A & Rozenberg Publishers, The Netherlands



Academic dissertation

submitted to obtain the degree of doctor

at the Universiteit van Amsterdam, on the authority of the Rector Magnificus

Prof. mr. P.F. van der Heijden, to be defended in public before a committee appointed by the Doctorate Board,

in the Aula of the University

on Wednesday 25 May 2005, at 14:00

by Vivienne de Vogel, born in Dordrecht

Supervisor (promotor): Prof. dr. C. de Ruiter, Universiteit van Amsterdam, Faculty of Social and Behavioural Sciences

This thesis is dedicated to the memory of my grandmother Truus Baan,

the one who truly shared my interests

List of publications

This thesis is based on the following papers:

Vogel, V. de, & Ruiter, C. de (2004). Differences between clinicians and researchers in assessing risk of violence in forensic psychiatric patients. The Journal of Forensic Psychiatry and Psychology, 15, 145-164. (Chapter 2)

Vogel, V. de, Ruiter, C. de, Beek, D. van, & Mead, G. (2004). Predictive validity of the SVR-20 and Static-99 in a Dutch sample of treated sex offenders. Law and Human Behavior, 28, 235-251. (Chapter 3)

Vogel, V. de, Ruiter, C. de, Hildebrand, M., Bos, B., & Ven, P. van de (2004). Type of discharge and risk of recidivism measured by the HCR-20. A retrospective study in a Dutch sample of treated forensic psychiatric patients. International Journal of Forensic Mental Health, 3, 149-165. (Chapter 4)

Vogel, V. de, & Ruiter, C. de (in press). The HCR-20 in personality disordered female offenders: A comparison with a matched sample of males. Clinical Psychology and Psychotherapy, 12. (Chapter 5)

Vogel, V. de, & Ruiter, C. de (submitted for publication). Structured professional judgment of violence risk in forensic clinical practice: A prospective study into the predictive validity of the Dutch HCR-20. (Chapter 6)

The papers were printed or pre-printed with kind permission from Kluwer Academic / Plenum Publishers, Brunner-Routledge, Wiley, and the International Association of Forensic Mental Health Services.

Contents

Acknowledgements

Introduction

Chapter 1 Violence risk assessment: State of the art

Chapter 2 Differences between clinicians and researchers in assessing risk of violence in forensic psychiatric patients

Chapter 3 Predictive validity of the SVR-20 and Static-99 in a Dutch sample of treated sexual offenders

Chapter 4 Type of discharge and (risk of) recidivism measured by the HCR-20: A retrospective study in a Dutch sample of treated forensic psychiatric patients

Chapter 5 The HCR-20 in personality disordered female offenders: A comparison with a matched sample of males

Chapter 6 Structured professional judgment of violence risk in forensic clinical practice: A prospective study into the predictive validity of the Dutch HCR-20

Chapter 7 General discussion

Summary

Samenvatting

References

Appendixes

I List of abbreviations

II HCR-20 coding form

III SVR-20 coding form

IV FWC

V Incidents registration

Curriculum vitae

Acknowledgements

First and foremost, I would like to thank my supervisor Corine de Ruiter. Her expertise, intelligence,

professional insights, but above all, her continuous and sincere passion for her work deeply impressed

me; she is a great source of inspiration to me. In supervising me, she has always set the example, but at

the same time, encouraged me to use my own words and do things in my own way.

Second, there are many colleagues from the Dr. Henri van der Hoeven Kliniek who contributed to this

thesis and I would like to acknowledge them all for their efforts, enthusiasm and support. I thank the

board of directors, Henri Wiertsema and Jan Gerrits for providing the opportunity to conduct this

research, and the heads of the Research department, Daan van Beek and Judith de Boer for their

continuing support and involvement in this project. The work of all treatment supervisors, group

leaders and researchers who participated in this research was much appreciated. I am grateful to Ellen

van de Broek, Cécile Vandeputte van de Vijver, Pascal Wolters and Stefan Zwartjes for pioneering

with me the first 60 risk assessments and Petra Geerligs, Anke Weenink and Pascalle van der Wolf for

taking it up to the next level. Their expertise, enthusiasm, and moral support have certainly been

stimulating to me. Special thanks go to Cécile Vandeputte van de Vijver, with whom I provided the

risk assessment workshops during the first period of this project. Her warm involvement, clinical

expertise and knowledge inspired me in many ways. Furthermore, I am grateful to Martin Hildebrand

for his overall involvement in this project and for his ‘sharp eye’; he is the one who always stimulated

me to think further. Brechje Bos, Gwen Mead, Esther Oosterhof and Peter van de Ven are kindly

thanked for their contributions to the retrospective studies in this thesis. I appreciate Wineke Smid for

her useful comments in the final stage of this project. Furthermore, several colleagues have been

helpful in a practical way. I thank Quinta Appeldoorn and Harry Houtman for their help with the files

from the retrospective studies, Judith Heine and Lidy Ewijk for their help gathering literature, and

Gerald Louws and Gijs Viester for endlessly transferring documents from disk to system and vice versa.

Special regards go to Henriëtte van der Maeden, Karlijn Vercauteren, Ine Kusters, Ingrid Verduijn,

and Saskia Luyten. I thank them for their moral support, fun, and friendship throughout the years; it

was, and still is, very important to me.

Third, several colleagues from the forensic field have inspired me in my work; researchers from other

forensic psychiatric hospitals in The Netherlands as well as colleagues from abroad. The work of

researchers like Stephen Hart, Chris Webster and Kevin Douglas impressed and inspired me. The

meetings of the European Network of Structured Risk Assessment (ENSRA) were stimulating and

useful (and Mats is also thanked for showing us every Irish pub in all the cities where we met!). Furthermore, I

thank all researchers of the ‘dissertation researchers group’ for the evenings full of interesting

discussions, good food and fun. I specifically thank my two paranimfen: Lieke van Domburgh for her

ingenuity and enthusiasm, and Jacky Das for her reflectiveness, friendship, and all the good times we

had during international conferences.

Last, I want to acknowledge the support of my beloved ones. I wish to express my sincere gratitude to

my parents for their unconditional support and love, their belief in me, and for always having given

me the opportunity to develop. Arno is the one who taught me most about how to separate the

essential from the trivial. I thank him for his talent to see and put things in perspective and, of

course, for his love and unconditional support. Finally, for showing me the true importance of life, I

am grateful to our sparkling little girl Luna and her two little brothers who are on their way.

Introduction


Background

The assessment of risk of future (sexual) violence is one of the most important tasks of mental health

professionals in forensic practice. In The Netherlands, society is regularly confronted with serious

violent recidivism by forensic psychiatric patients during probationary leave or after discharge

(Hilterman, 2001, 2004). Violent (re)offending has severe consequences for the victims and causes

strong feelings of fear, anger, and concern in society. In the past decade, Dutch society has become

increasingly intolerant of (sexual) violent offenses and political parties demand more repressive

policies regarding, for instance, leaves from forensic psychiatric hospitals. Prevention of (sexual)

violent recidivism is thus high on the political, public and clinical agenda. A carefully conducted risk

assessment before a probationary leave or termination of (mandatory) treatment can help to appraise

and manage the risk of recidivism in an adequate way and thereby prevent serious (sexual) violent

offenses (Douglas & Webster, 1999a). Until recently, the best known and most widely used method in

forensic practice, at least in The Netherlands, was the unstructured clinical judgment approach that is

exclusively based on the professional expertise of the mental health professional. However, research

has revealed some important limitations of this unstructured clinical judgment, such as poor reliability

and validity (Lidz, Mulvey, & Gardner, 1993; McNiel & Binder, 1995; Monahan, 1981; see Chapter

1). Therefore, several authors have recommended the use of more structured risk assessment

procedures in order to optimize the accuracy of violence risk assessments (Borum, 1996; Webster,

Douglas, Eaves, & Hart, 1997a).

Over the past two decades, research into risk factors for (sexual) violence, the development of structured

risk assessment tools, and research into the psychometric properties of these tools have expanded

enormously. To date, numerous structured risk assessment tools are available for mental health

professionals working in forensic or general psychiatry or in the penitentiary system. An important

distinction among structured risk assessment tools can be made between the actuarial and the

structured professional judgment (SPJ) approach. Actuarial instruments are developed on the basis of

risk factors that are empirically related to (sexual) violent behavior. These instruments are relatively

simple to code - according to fixed rules and not necessarily by a forensic expert - and contain

predominantly static, non-changeable factors. The scores on the factors are added up according to a

fixed algorithm to reach a (fixed) conclusion on the risk of recidivism. Although risk assessment with

actuarial instruments is a simple and time-effective procedure, there are some important disadvantages

to this approach, most importantly with respect to its usefulness in treatment settings where the aim is

reduction of violence risk (see Chapter 1). Based on the criticism of both the unstructured clinical

judgment and the actuarial approach, a new risk assessment approach was developed that integrates

empirical and clinical knowledge. In this SPJ model, the risk assessment is performed by an


experienced forensic clinician by means of a standardized checklist containing empirically derived risk

factors for (sexual) violence, historical as well as dynamic. The essential difference between the

actuarial and the SPJ approach lies in how the final risk judgments are reached: in actuarial

instruments by a fixed algorithm and in SPJ guidelines by (structured) human decision-making.
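The difference between the two approaches described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the item names, weights and cut-off scores are invented for the example and do not come from the HCR-20, the SVR-20 or any published actuarial instrument. The sketch shows only where the final judgment comes from in each approach: a fixed rule in the actuarial case, and the assessor in the SPJ case.

```python
# Illustrative sketch only: item names, weights and cut-offs are hypothetical
# and do not come from any published risk assessment instrument.

ITEM_WEIGHTS = {
    "prior_violent_offense": 2,
    "young_age_at_first_offense": 1,
    "early_maladjustment": 1,
}

# (minimum total score, risk label), in ascending order of minimum score
RISK_BANDS = [(0, "low"), (2, "moderate"), (4, "high")]


def actuarial_judgment(items_present):
    """Actuarial approach: a fixed algorithm sums the item scores and maps
    the total to a risk label; no human judgment enters the conclusion."""
    total = sum(
        weight
        for item, weight in ITEM_WEIGHTS.items()
        if items_present.get(item, False)
    )
    label = RISK_BANDS[0][1]
    for minimum, band_label in RISK_BANDS:
        if total >= minimum:
            label = band_label
    return total, label


def spj_judgment(items_present, clinician_judgment):
    """SPJ approach: the checklist structures which risk factors are
    considered, but the final judgment is supplied by the clinician
    rather than computed from the items."""
    considered = sorted(item for item, present in items_present.items() if present)
    return {
        "risk_factors_considered": considered,
        "final_judgment": clinician_judgment,
    }


patient = {
    "prior_violent_offense": True,
    "young_age_at_first_offense": False,
    "early_maladjustment": True,
}
print(actuarial_judgment(patient))    # total of 3 falls in the "moderate" band
print(spj_judgment(patient, "high"))  # the clinician's judgment is an input
```

In the sketch, the same set of risk factors yields a fixed "moderate" label under the actuarial rule, whereas the SPJ record simply stores whatever final judgment the clinician reaches after considering those factors.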

In this thesis, the value of two risk assessment guidelines according to the SPJ model for Dutch

forensic practice is studied: the Historical, Clinical, Risk Management-20 (HCR-20; Version 1;

Webster, Eaves, Douglas, & Wintrup, 1995; Version 2; Webster, Douglas, Eaves, & Hart, 1997b) for

general violence and the Sexual Violence Risk-20 (SVR-20; Boer, Hart, Kropp, & Webster, 1997) for

sexual violence. Research in various populations and settings in different countries has demonstrated

good interrater reliability and predictive validity for the HCR-20 (Douglas, Guy, & Weir, 2005) and to

a lesser extent for the SVR-20 (de Vogel, 2003). However, research into the psychometric properties

of the Dutch versions of the HCR-20 (Philipse, de Ruiter, Hildebrand, & Bouman, 2000) and SVR-20

(Hildebrand, de Ruiter, & van Beek, 2001) is limited. The main goal of this thesis is to examine if the

HCR-20 and SVR-20 are suitable for the prediction of future (sexual) violence in Dutch forensic

psychiatric patients.

Aims of this thesis

The aims of this thesis are:

1. To establish the interrater reliability and predictive validity of the Dutch versions of the HCR-20 and SVR-20 in forensic practice in The Netherlands and, thus, to examine whether these guidelines are suitable for the prediction of future (sexual) violence in Dutch forensic psychiatric patients;

2. To explore differences between independent researchers and treating clinicians in performing risk assessments of the same patients with the HCR-20, and in the accuracy of their predictions;

3. To compare the predictive validity of an actuarial risk assessment instrument for sexual violence (Static-99; Hanson & Thornton, 1999) to the predictive validity of the SVR-20 in treated Dutch sexual offenders;

4. To compare the predictive validity of unstructured clinical judgment, as stated in hospital staff’s advice to the court, to that of HCR-20 risk judgments in treated Dutch forensic psychiatric patients;

5. To explore differences between female and male Dutch forensic psychiatric patients regarding mean HCR-20 scores and interrater reliability and predictive validity of the HCR-20.

Setting

All studies in this thesis were conducted at the Dr. Henri van der Hoeven Kliniek, a Dutch forensic

psychiatric hospital with 100 beds and 40 transmural places in Utrecht, a city with 265,000 inhabitants

in the center of The Netherlands. Patients are admitted under the judicial measure

terbeschikkingstelling (tbs) which can be translated as ‘disposal to be treated on behalf of the state’.


The tbs-order is imposed by court on offenders who have committed a serious violent offense and are

considered to have diminished responsibility for it because of severe psychopathology. Dutch law

requires that at least two mental health experts from different disciplines report on the defendant (the

so-called ‘pro Justitia’ reports) before the trial court can decide to impose the tbs-order. The main goal

of the tbs-order is to protect society from high risk offenders, directly through the mandatory

admission of these offenders to secure forensic psychiatric hospitals and indirectly through treatment

aimed at reducing violence risk. The tbs-order is of indefinite duration; every one or two years the

court re-evaluates the patient to determine whether the risk of recidivism is still too high and treatment

needs to be continued. The hospital staff provides the court with a detailed evaluation of a patient’s

treatment and gives their judgment about the risk of recidivism. The decision to terminate the tbs-order

can only be made by the court.1

The Dr. Henri van der Hoeven Kliniek was founded in 1955 and is one of 13 forensic psychiatric

institutions in The Netherlands. The hospital admits both men and women and provides a variety of

treatment activities, for instance, psychotherapy, job training, education, sports, and creative arts. Over the

past five years, pharmacotherapy has gained a substantial role in treatment, for example, in sexual

offenders. The treatment model of the hospital is eclectic with an emphasis on sociotherapy and

relapse prevention. The ‘no cure but control’ principle dominates (Laws, Hudson, & Ward, 2000). The

emphasis of treatment is not on changing the personality of the offender, but on reducing / managing

risk factors for recidivism. The patients live together in living groups supervised by group leaders and

have a shared responsibility for their environment. A group of patients (one delegate per living group)

and hospital staff meet daily in the so-called hospital council in which they discuss, among other things,

safety issues in the hospital. The current average length of treatment is about five years; about 3 years

of residential treatment and about 2 years of transmural treatment (Dr. Henri van der Hoeven

Stichting, 2003). During treatment, patients can gradually gain more liberties. When hospital staff

considers it feasible, a patient can apply for supervised leaves, and subsequently for unsupervised

leaves. Ideally, these gradual expansions of freedom finally result in a resocialisation phase, called

transmural treatment. During the transmural phase, the patient lives outside the hospital, but is still

treated and supervised by a specialized team from the hospital.

The majority of the patients (about 85%) in the hospital suffer from one or more Axis II personality

disorders (according to the fourth edition of the Diagnostic and Statistical Manual of Mental disorders;

DSM-IV; APA, 1994), about 5% suffer from a pure DSM-IV Axis I disorder (excluding substance

abuse / dependence), and about 25% suffer from both Axis I and Axis II disorders.2 A history of substance

abuse is very common: about 70% of the patients have abused substances, of whom 21% drugs, 20%

alcohol, 51% alcohol and drugs, and 8% other multiple substances, for instance, alcohol and

1 See de Ruiter and Hildebrand (2003) for more detailed information on the Dutch tbs-order.

2 These numbers are based on a database of 155 patients admitted to the hospital between 1995 and 2004. See Hildebrand and de Ruiter (2004) for details.


medication. About 25% of the patients are of non-Dutch origin, mostly Moroccan or Surinamese. The

proportion of female patients in the hospital is about 15%. Female patients do not stay on a separate

ward but reside among the men in living groups, although the policy is that there should be at least two

women in one living group. There are specific treatment activities for female patients, such as female

sports and a therapy group that meets weekly. In this therapy group, themes relevant to female patients

are discussed, for instance, what it is like to live in a predominantly male environment, victimization,

and sexuality.
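The substance abuse figures reported above are nested: the breakdown by type of substance applies to the roughly 70% of patients with a history of abuse, not to all patients. As a small worked example, the sketch below converts that breakdown into approximate shares of the whole patient group. It assumes that the four subtype percentages are fractions of the 70%, which is how the wording reads; the variable and function names are illustrative only.

```python
# Percentages as reported in the text; names are illustrative.
ANY_SUBSTANCE_ABUSE_PCT = 70  # share of all patients with a history of abuse

# Breakdown within the group with a history of substance abuse
SUBTYPE_PCT = {
    "drugs": 21,
    "alcohol": 20,
    "alcohol and drugs": 51,
    "other multiple substances": 8,
}


def share_of_all_patients(overall_pct, subtype_pct):
    """Convert within-group percentages into percentages of all patients."""
    return {name: overall_pct * pct / 100 for name, pct in subtype_pct.items()}


shares = share_of_all_patients(ANY_SUBSTANCE_ABUSE_PCT, SUBTYPE_PCT)
for name, pct in shares.items():
    print(f"{name}: about {pct:.1f}% of all patients")
```

Under this reading, for example, the 51% who abused both alcohol and drugs correspond to roughly 36% of all patients in the hospital.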

This dissertation research started in January 2001. All researchers, treatment supervisors, group

leaders and social workers of the Dr. Henri van der Hoeven Kliniek were trained during one-day

workshops in coding the HCR-20 and SVR-20 and started to use these checklists as of May 2001.

Before May 2001, structured risk assessments were not conducted on a regular basis in the hospital.

Decisions regarding extensions of freedom, such as unsupervised leave or the start of the transmural phase,

were made collectively in staff meetings. This staff meeting is scheduled every day from 9.00 until

10.00 a.m. and staff members of all disciplines are invited to participate in it. The procedure is: the

patient writes a proposal in which he or she describes the requested leave (e.g., location, time period,

frequency), treatment progress, the relationship with his / her living group, hospital staff and family,

self-perceived risk factors, and risk of recidivism or risk of escape during leave. Subsequently, the

patient’s treatment team and living group as well as the hospital council evaluate this proposal and

advise the hospital staff, who decide in the staff meeting. Finally, the Ministry of Justice reviews a

proposal written by the patient’s treatment team and gives formal permission for the requested leave.

Since the implementation of structured risk assessment in 2001, decisions regarding extensions of

leave are still made collectively in the staff meetings, but it became a preset condition that a structured

risk assessment with the HCR-20 (and in case of sexual offenders also with the SVR-20) must have

taken place beforehand (see Chapters 2 and 6 for details regarding how risk assessments are

performed). As of July 2005, the Ministry of Justice mandates all forensic psychiatric institutions in

The Netherlands to perform a structured risk assessment before unsupervised leave or discharge of

patients and to base their judgment on these structured methods.

Thesis outline

Chapter 1 provides an outline of the state of the art with respect to risk assessment of future (sexual)

violence. Three approaches to risk assessment are discussed, as well as the actuarial - clinical debate.

Furthermore, research is discussed on violence risk factors in four specific forensic psychiatric

subgroups, i.e., patients with a major mental disorder, personality disordered patients, sexual offenders

and female offenders. Violence risk management and violence risk communication are also reviewed.

In Chapter 2, the interrater reliability of the Dutch HCR-20 as coded by three raters is established in a

group of 60 patients admitted to the Dr. Henri van der Hoeven Kliniek. Differences between treating


clinicians and researchers in coding the HCR-20 are explored. In this study, we also examined if

clinicians’ feelings towards their patients as measured by the Feeling Word Checklist (FWC; Whyte,

Constantopoulos, & Bevans, 1982) were related to their risk assessments.

Chapter 3 presents a retrospective study into the predictive validity of the SVR-20 and the Static-99,

an actuarial risk assessment instrument for sexual violence, in a group of 122 sexual offenders who

were admitted to the Dr. Henri van der Hoeven Kliniek between 1974 and 1996.

In Chapter 4, a retrospective study is presented into the predictive validity of the HCR-20 in a group of

120 patients who were discharged between 1993 and 1999 from the Dr. Henri van der Hoeven Kliniek.

In this study, we compared the unstructured clinical judgment of risk of violent recidivism as stated in

the hospital staff’s advice to the court to HCR-20 numerical scores and final risk judgments.

Chapter 5 examines differences between 42 female forensic psychiatric patients and a matched group

of 42 male forensic psychiatric patients from the hospital regarding the mean HCR-20 scores,

interrater reliability and predictive validity for violent outcome.

Chapter 6 further explores differences between treating clinicians and researchers in performing risk

assessments by linking their HCR-20 scores and final risk judgments of 127 male patients to incidents

of physical violence during treatment. In this prospective study, the predictive validity was established

not only for the HCR-20 codings of the different raters, but also for the codings that were agreed upon

by the researchers and clinicians in consensus meetings.

Finally, in Chapter 7, the main findings of the studies in this thesis are discussed and recommendations

for (sexual) violence risk assessment in forensic practice and future research are provided.

CHAPTER 1

Violence risk assessment: State of the art

In this chapter, the literature on risk assessment for future (sexual) violence in adult forensic psychiatric patients is reviewed. First, the concepts of violence, violence risk and violence risk assessment are delineated. Subsequently, three approaches to risk assessment are discussed, i.e., unstructured clinical judgment, actuarial judgment and structured professional judgment (SPJ). Two SPJ guidelines - the HCR-20 and SVR-20 - are described. Next, specific risk factors and the implications for violence risk assessment are set out for four different forensic psychiatric groups: patients with a major mental disorder, patients with a personality disorder, patients who committed sexual offenses and female patients. Finally, two topics are discussed that have not yet received much empirical attention: violence risk communication and violence risk management.

The concept of violence risk assessment

There are many definitions of the concept of violence. In this thesis, the definition of violence provided in the HCR-20 is adopted: “violence is actual, attempted, or threatened harm to a person or persons” (Webster et al., 1997b, p. 24). This definition is flexible and allows users to capture a broad spectrum of violent acts and to pare down conceptualization into meaningful categories (Douglas, Cox, & Webster, 1999). The definition of sexual violence is adopted from the SVR-20: “sexual violence is actual, attempted, or threatened sexual contact with a person who is nonconsenting or unable to give consent” (Boer et al., 1997, p. 9).

The term ‘dangerousness’ has often been used in relation to the prediction of violence. Dangerousness is a well-accepted legal concept; however, it is debatable whether the concept is clinically useful, because it is an ‘all or nothing term’ and difficult to operationalize (Prins, 1996; Snowden, 1997; Steadman et al., 1993). Otto (2000) stated that predicting ‘dangerousness’ is typically focused on individual factors while ignoring environmental factors and with little consideration of the possibility for change over time. Over the past two decades, there has been a reconceptualization from the legal concept of dangerousness to the concept of risk, which implies predictions on a continuum rather than a simple dichotomy, as well as decision-making (risk management) (Steadman et al., 1993). Snowden (1997) argued that risk is a more attractive term than dangerousness, because it does not contain pejorative connotations and invites more objective and robust analysis as it leads to questions such as, ‘risk of what?’ and ‘what is the severity and frequency of risk?’

Violence risk assessment can be described as the process of evaluating individuals to characterize the risk that they will commit violence in the future and to develop interventions to manage or reduce that risk (Hart, 2001b; Monahan, 1981). Risk assessment is not only a central concept in the field of forensic psychiatry, but also in, for instance, business, insurance, medicine and engineering (Bernstein,

1996). The clinical task of violence risk assessment is to understand how and why people chose to act

violently and to determine whether these or possibly other factors may lead them to make similar

choices again (Hart, 2001a, 2001b). Hart has described some more specific goals of violence risk

assessment for clinical practice: to prevent violence, guide intervention, improve consistency and

transparency of decisions, and protect clients’ rights. Another important point made by Hart is that the

concept of violence risk is multi-faceted. Violence risk assessment should not only consider the

likelihood of violence, but also the nature, severity, imminence, and frequency of violence.

Some general notes on the concept of violence risk assessment and the use of violence risk assessment

instruments should be made here. First, violence risk assessment inherently bears uncertainties.

Bernstein (1996) defined the concept of risk as a hazard that is incompletely understood and whose

occurrence therefore can be forecast only with uncertainty. Second, violence risk also depends on

the context in which the person lives. Hart (2001a, 2001b) stated that we can never know someone’s

risk of acting violently for sure; we can merely estimate it assuming certain contexts. For instance,

violence risk assessment for a patient in a structured setting like a forensic psychiatric hospital is quite

different from violence risk assessment for a patient who is on parole and living in the community.

Third, several authors have raised ethical or moral issues with respect to violence risk assessment and

the use of violence risk assessments. Grisso and Appelbaum (1991) stressed that clinicians who have

to testify before court about the risk of a person should consider the potential effects of their testimony

and question whether the law’s use of their testimony violates their professional ethical standards. A

fourth note is a concern that is often raised by lawyers, i.e., that the accuracy of structured risk

assessment tools is affected by the quality of the data used to complete the instrument (Layde, 2004).

Price (1997) emphasized that risk assessors should be aware that clinical records and correctional files

contain numerous errors.

Approaches to violence risk assessment

Unstructured clinical judgment1

The unstructured clinical judgment relies on human judgment and is based on informational

contemplation and clinical expertise of a mental health professional. Grove and Meehl (1996) have

described the unstructured clinical judgment as informal, subjective and impressionistic. In

unstructured clinical judgment there are no constraints on the evaluation process (any information can

be considered and gathered in any manner) and decisions (results can be weighed, combined and

communicated in any manner) (Hart, 2001b; Meehl, 1996). Some clinicians might base their judgment

on results of psychodiagnostic instruments, clinical interviews or reviewing personal / situational

factors of the patient (this is also referred to as anamnestic risk assessment, see Otto, 2000), whilst

1 By some referred to as unaided or unguided clinical judgment (e.g., Hanson & Morton-Bourgon, 2004).



others prefer to rely on their professional skills and personal experiences with the patient. Clinical risk

assessment can be done by individual clinicians or by teams. Murphy (2002) argued that collective

clinical judgment is the optimal method of assessing risk, because group processes of debate, dialogue

and reflection ameliorate the risk assessment. Advantages of the unstructured clinical judgment are

flexibility, a focus on the person (idiographic) and violence prevention, and minimal costs in terms of

time and other resources (Hart, 1998b; Kropp, Hart, & Lyon, 2002; Snowden, 1997). The unstructured

clinical judgment method of violence risk assessment is well known and widely used.

Criticism of the unstructured clinical judgment approach

The unstructured clinical judgment approach has been criticized on a number of grounds. One of the

most important criticisms is that there is no systematic empirical support for the validity of the

unstructured clinical judgment (Faust & Ziskin, 1988; Hart, 2001b; Meehl, 1996; Monahan, 1981).

The foundation of unstructured clinical judgment is unclear and therefore not verifiable. In 1981,

Monahan published his influential book The clinical prediction of violent behavior in which he

reviewed studies into the clinical prediction of violence published up to that date. From these studies,

Monahan identified several disadvantages of the unstructured clinical judgment, such as lack of

consistency across clinicians regarding how assessments are conducted and how and on what grounds

decisions regarding risk are reached. He found that, in general, clinicians tend to overpredict the risk

of violence (see also Belfrage, 1998b; Doren, 1998; Webster, Harris, Rice, Cormier, & Quinsey, 1994)

and ignore empirical knowledge on base rates2 of violent recidivism and situational cues in their risk

assessments. Base rates vary widely across different populations and mental health professionals

should be cognizant of these base rates in order to make a good violence risk judgment in individual

cases (Doren, 1998). Monahan concluded that “psychiatrists and psychologists are accurate in no more

than one out of three predictions of violent behavior over a several year period” (p. 92). Based on his

review, Monahan argued that clinicians should systematize their predictions. He also provided

suggestions for structured risk assessment, for instance, by listing variables, such as previous violence,

that might have power as actuarial predictors. Furthermore, Monahan (1981, 1984, 1988) called for

‘second generation’ research into risk assessment and set up the MacArthur Risk Assessment Study, a

multi-site longitudinal study following psychiatric inpatients after discharge into the community with

the goal of establishing robust markers for violence risk (see Monahan et al., 2001; Steadman et al.,

1994).
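
The role of base rates discussed above can be made concrete with a small worked example. The sketch below is illustrative only: the sensitivity and specificity values and the population counts are hypothetical and are not taken from Monahan (1981) or any other study cited here. It shows how the same predictor yields a very different share of correct "violent" predictions depending on the base rate.

```python
def base_rate(n_violent: int, n_population: int) -> float:
    """Prevalence of the defined behavior in a defined population and period."""
    if n_population <= 0:
        raise ValueError("population size must be positive")
    return n_violent / n_population


def positive_predictive_value(rate: float, sensitivity: float, specificity: float) -> float:
    """Share of 'violent' predictions that turn out to be correct (Bayes' rule)."""
    true_positives = sensitivity * rate
    false_positives = (1.0 - specificity) * (1.0 - rate)
    return true_positives / (true_positives + false_positives)


# Hypothetical populations: 10 versus 50 violent persons per 100.
low_rate = base_rate(10, 100)
high_rate = base_rate(50, 100)

# Hypothetical predictor quality, identical in both populations.
ppv_low = positive_predictive_value(low_rate, sensitivity=0.7, specificity=0.7)
ppv_high = positive_predictive_value(high_rate, sensitivity=0.7, specificity=0.7)

print(f"base rate {low_rate:.2f}: PPV {ppv_low:.2f}")
print(f"base rate {high_rate:.2f}: PPV {ppv_high:.2f}")
```

Under these assumed values, roughly one in five "violent" predictions is correct at the lower base rate, against about seven in ten at the higher one, which is one way to see why ignoring base rates tends to produce overprediction.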

Researchers have heeded Monahan’s calls; since the 1980s more systematic research into the (clinical)

prediction of violence has been conducted (Douglas & Webster, 1999a). Recent studies into the predictive

accuracy of unstructured clinical judgment have demonstrated more optimistic results compared to

Monahan’s review (1981). However, although these studies have demonstrated unstructured clinical

2 The base rate is the prevalence of the defined behavior (e.g., violence) within a defined population over a defined time period.


accuracy to be significantly better than chance, especially for short term predictions (Binder, 1999),

unstructured clinical judgment is still liable to systematic biases. For example, clinicians were found to

be accurate in predicting risk of recidivism in white male cases with a violent history, but less accurate

in predicting risk of violence in female psychiatric patients (underestimation of risk) and nonwhite

men (overestimation of risk) (Lidz et al., 1993; McNiel & Binder, 1995). A prospective study in a

sample of 183 male forensic psychiatric patients demonstrated that clinicians were significantly more

accurate than chance in predicting violence, but failed to use the dual diagnosis of schizophrenia and

substance abuse as a predictor (Hoptman, Yates, Patalinjug, Wack, & Convit, 1999). Skeem, Mulvey

and Lidz (2000) found that clinicians’ violence predictions were significantly, but only moderately,

more accurate than chance and that clinicians were not able to discriminate between patients who are likely

to become violent during periods in which they drink and those who are not.

Other criticisms of the unstructured clinical judgment method are the assignment of nonoptimal weights to

cues, failure to properly assess covariation and lack of feedback on the accuracy of clinicians’

judgments (Garb, 1998; Grove, Zald, Lebow, Snitz, & Nelson, 2000). Dernevik et al. (2001) argued

that the unstructured clinical judgment is liable to cognitive biases such as the conjunction fallacy

(tendency to correlate information intuitively rather than by laws of probability) and the illusory

correlation (tendency to view unrelated events as correlated or overestimate the significance of a weak

correlation). Furthermore, irrational influences or problems concerning information processing and

decision-making, for instance, the relatively small capacity of working memory, the amount of

information, training and experience may influence the accuracy of unstructured clinical judgment (see

Dernevik, Falkheim, Holmqvist, & Sandell, 2001). Harris, Rice and Cormier (2002) suggested that

clinicians have just too much information at their disposal and without statistical aid have no way to

ascertain which information is relevant to risk of violence.

Studies that examined how to optimize accuracy and validity of the unstructured clinical judgment

found that multi-disciplinary consensus predictions were more accurate than individual predictions

(Fuller & Cowan, 1999). Huss and Zeiss (2004) found that individual clinicians had poor ability to

predict violence, but that the risk assessments are much improved when aggregated as “group”

decisions. It should be noted that in this study, the clinicians did not actually meet and discuss; their

ratings were aggregated by researchers. McNiel, Lam, and Binder (2000) examined if the predictive

accuracy of clinical assessments of violence risk improved when there was agreement between

multiple clinicians (physicians and nurses). This study found that when two clinicians reached similar

conclusions, these were more accurate in predicting violence than the conclusions of either clinician

alone when their assessments disagreed. When clinicians have a high degree of confidence in their

violence risk evaluation, the predictive accuracy of the evaluation is stronger (McNiel, Sandberg, &

Binder, 1998).

Summarizing, although the conclusions of Monahan (1981) seem to be too pessimistic and more

recent studies have demonstrated unstructured clinical judgment to predict significantly better than



chance, there is ample room for improvement of the unstructured clinical judgment. The use of

standardized, validated risk assessment tools is recommended for better validity and reliability of

violence risk assessment and for successful treatment of mentally disordered patients (Borum, 1996;

Kropp & Hart, 1997; Müller-Isberner & Hodgins, 2000).

Actuarial judgment

The actuarial judgment method is described as mechanical and algorithmic (Grove & Meehl, 1996)

and has a strong empirical basis. The items in actuarial instruments are selected on the basis of their

association with the outcome (i.e., violent recidivism) as found in empirical studies. In these types of

risk assessment instruments, the items are weighed and combined according to a fixed algorithm in

order to reach a conclusion regarding the likelihood of violence over some period of time. It should be

noted that actuarial instruments are specifically designed to predict violence and not to provide a

comprehensive assessment of violence risk, since they neglect potentially relevant variables such as

dynamic risk factors (see Hanson & Thornton, 2000). Examples of actuarial instruments are the

Violence Risk Appraisal Guide (VRAG; Harris & Rice, 1997; Harris, Rice, & Cormier, 1993) for

violent behavior, the Rapid Risk Assessment of Sexual Offense Recidivism (RRASOR; Hanson, 1997)

and the Static-99 / Static-2002 for sexual violence (Hanson & Thornton, 1999, 2002). Another

example of actuarial risk judgment is the Iterative Classification Tree (ICT) method that is derived

from empirical data of the MacArthur Risk Assessment Study (see Monahan et al., 2001; Steadman et

al., 2000). This method uses two thresholds to classify a person as high-risk or low-risk based on a

sequence established by a classification tree. In this sequence, a question is asked and contingent upon

the answer the next question is posed and so on, until the person is classified as high or low risk.
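
The classification-tree procedure described above can be sketched as follows. This is a minimal illustration, not the published ICT model: the questions, the leaf estimates and the two threshold values are all hypothetical, and leaving cases between the thresholds unclassified is an assumption of this sketch.

```python
from typing import Dict, Optional

# Hypothetical tree: each internal node asks one yes/no question; each leaf
# holds an assumed estimate of the likelihood of violence for that branch.
TREE = {
    "question": "prior_violence",
    "yes": {
        "question": "substance_use_problems",
        "yes": {"estimate": 0.45},
        "no": {"estimate": 0.20},
    },
    "no": {"estimate": 0.05},
}

LOW_THRESHOLD = 0.10   # assumed: at or below this, classify as low-risk
HIGH_THRESHOLD = 0.35  # assumed: at or above this, classify as high-risk


def classify(answers: Dict[str, bool], tree: dict = TREE) -> Optional[str]:
    """Walk the tree; the answer to each question selects the next question."""
    node = tree
    while "estimate" not in node:
        node = node["yes"] if answers[node["question"]] else node["no"]
    if node["estimate"] >= HIGH_THRESHOLD:
        return "high"
    if node["estimate"] <= LOW_THRESHOLD:
        return "low"
    return None  # between the two thresholds: not classified in this sketch


print(classify({"prior_violence": True, "substance_use_problems": True}))
print(classify({"prior_violence": False}))
```

The point of the structure is that not every question is asked of every person: a "no" at the first node ends the sequence immediately, whereas a "yes" leads to a further question.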

Advantages of the actuarial judgment method are standardization, transparency, objectivity, and

empirical support for the inclusion of risk factors. Besides, risk assessment with actuarial instruments

is a rather simple and time-effective procedure that usually does not require specific training. Hanson

and Thornton (2000) argued that simple, actuarial instruments can be a cost-effective option for

decision-makers with limited information or resources. Research has demonstrated that, in general,

actuarial risk judgment has high predictive validity (Harris, Rice, & Cormier, 2002, see also The

clinical-actuarial debate, p. 15).

Criticism of the actuarial judgment approach

The actuarial judgment approach has been criticized, most importantly on the practical use of the

instruments in treatment settings where the aim is reduction of the risk of recidivism. A first point of

criticism is that most of the actuarial instruments focus on static risk factors and do not include


situational or dynamic risk factors.3 The focus on static risk factors can lead to pessimism in both

mental health professionals and patients because it results in a life-time risk judgment implying that

the patient’s risk of violence can never change for the better. A second point of criticism is that

actuarial instruments are primarily empirically-based and do not examine or provide insight into

causes of behavior since they are a-theoretical4 (Grubin, 1997b; Krauss, Sales, Becker, & Figuered,

2000). Due to their primary emphasis on empiricism, actuarial instruments possibly include risk

factors which are not causally related to violence and / or are unacceptable on legal grounds.

Moreover, this emphasis can lead to the exclusion of possibly important risk factors that are logical but of unknown

validity because they have hardly been empirically studied (e.g., homicidal ideation or intent) (Hart,

1998b). Hart (2001b) has demonstrated this problem by constructing a fictitious risk assessment

instrument containing four items that have been demonstrated to be significantly associated with

violence: young age, being a male, dense facial hair, and big feet. Although these items have an

empirical basis, it is obvious that they cannot provide insight into the question why a person is violent

nor be used for realistic intervention strategies. A third point of criticism is that actuarial instruments

specifically aim at predicting the likelihood of violence. This focus on the probabilistic facet of risk

ignores important questions regarding the nature, frequency, severity and imminence of violence.

Again, this limits the practical use of actuarial instruments. Fourth, generalization towards populations

other than the type of samples in which the instrument was developed is limited (Grubin & Wingate,

1996; Hart, 1998b; Price, 1997). Fifth, most actuarial risk assessment instruments consist only of risk

factors and disregard protective factors (Rogers, 2000). Finally, the use of actuarial instruments

in, for instance, court may lead to pseudo-objectivity / pseudo-science (Hart, 2001b). Grubin

(1997b) stated that results of actuarial instruments may easily be misinterpreted by the court and that this

can lead to wrong conclusions, for instance, judging someone as low risk because of low scores on

an actuarial instrument, whilst individual risk factors and important issues such as homicidal ideation are

neglected.

Evaluation of actuarial judgment in practice

Despite the relatively high predictive accuracy of actuarial instruments, clinicians tend to avoid using

them, mostly because they question the relevance of statistical findings for their daily clinical practice

(Gardner et al., 1996; Webster & Cox, 1997). Elbogen, Calkins Mercado, Scalora and Tomkins (2002)

conducted a study to examine how mental health practitioners perceive the utility of empirically

derived risk factors and actuarial methods in clinical practice and if they are willing to use actuarial

instruments. They asked 134 clinicians in actual practice to complete surveys in which they rated the

3 Two exceptions are the Level of Service Inventory-Revised (LSI-R; Andrews & Bonta, 2000) and the Sex Offender Need Assessment Rating (SONAR; Hanson & Harris, 2000), which was developed as a supplement to the Static-99.

4 Two exceptions are the LSI-R, which is based on the theoretical model of social learning, and the SONAR, which is based on social cognitive theory.



relevance of empirically derived risk factors (mainly adopted from the MacArthur Risk Assessment

Study) and additional behavior factors. They found that although clinicians were aware of and

perceived empirical risk factors to be relevant, nearly every clinician perceived dynamic, behavioral

variables to be significantly more relevant than research-based factors. Thus, although clinicians

indicated they wanted to use the actuarial risk factors, they had difficulty integrating this information

with clinical variables that are potentially subject to daily change. In a study of 187 cases into the

influence of actuarial risk assessment on legal decisions, it was found that the best predictor of tribunal

decision was the senior clinician’s testimony (unstructured clinical judgment) and not the actuarial risk

score on the VRAG (Hilton & Simmons, 2001). There was no significant association between the

actuarial scores and the tribunal decision nor between the actuarial scores and the clinician’s

testimony.

The clinical-actuarial debate

Since Meehl (1954) first brought the controversy to light, there has been an ongoing debate about the relative merits of

empirically driven actuarial and clinical decision-making (Douglas & Webster, 1999a). Many authors

state that the bulk of evidence supports static variables as most strongly predictive of violence and that

the use of actuarial instruments improves the consistency, reliability and validity of violence risk

assessments (Douglas, Cox et al., 1999; Harris et al., 2002; Hart, 1998b). Research comparing

actuarial judgment to unstructured clinical judgment has demonstrated the superiority of the former in predicting

violence (Dawes, Faust, & Meehl, 1989; Gardner, Lidz, Mulvey, & Shaw, 1996; Grove & Meehl,

1996; Grove et al., 2000; Harris et al., 2002; Monahan et al., 2001). Mossman (1994) reviewed 44

published studies into violence prediction and concluded that although unstructured clinical judgment

was substantially more accurate than chance, past behavior alone was a better long-term predictor of

future violent behavior than unstructured clinical judgment. Grove and Meehl (1996) performed a

meta-analysis on 136 studies comparing the accuracy of actuarial and unstructured clinical judgment

in predicting violence and found 64 studies in favor of the actuarial judgment, 8 studies in favor of the

unstructured clinical judgment and 64 studies that found no difference between the two approaches.

Hanson and Morton-Bourgon (2004) reviewed research into recidivism risk in sexual offenders and

concluded that actuarial risk instruments were consistently more accurate than unstructured clinical

judgment in predicting sexual recidivism. Harris and colleagues (2002) reviewed the literature

regarding risk assessment methods and concluded that the evidence favoring the use of static risk

factors and the actuarial judgment method is increasing. They stated that available evidence suggests

that dynamic variables contribute little to the question who poses the greatest violence risk, but might

be valuable in predicting when a person is likely to act violently.
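
The study counts reported above for Grove and Meehl (1996) can be checked and expressed as shares with a few lines of arithmetic. The counts are those given in the text; the category labels are shorthand introduced here.

```python
# Study counts as reported in the text for Grove and Meehl (1996).
counts = {
    "actuarial more accurate": 64,
    "clinical more accurate": 8,
    "no difference": 64,
}

total = sum(counts.values())
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}

print(total)
for label, pct in shares.items():
    print(f"{label}: {pct}%")
```

The three categories add up to the 136 studies mentioned, with just under half favoring the actuarial method, just under half showing no difference, and about 6% favoring unstructured clinical judgment.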

Others have argued that the superiority of actuarial methods has not yet been convincingly proven

(e.g., Litwack, 2001). With the current body of research it is still difficult to make meaningful

comparisons of the unstructured clinical and the actuarial judgment approaches (Hart, Laws, & Kropp,


2003; Sturdisson, Haggård-Grann, Lotterberg, Dernevik, & Grann, 2004). An important objection to

studies into the predictive validity of unstructured clinical judgment is that they were not optimally

designed to assess the reliability and validity of risk assessments, for instance, because they used

vignettes rather than actual cases (Litwack & Schlesinger, 1999; Sturdisson et al., 2004). Furthermore,

Litwack (2001, 2002) argued that a clinical assessment of violence risk is not equivalent to a

prediction of violence, because clinical assessment aims at decreasing risk and preventing violence and

thus cannot be compared to actuarial prediction. Hart (1998a) stated that predictions of violence are

not passive assessments, but decisions that influence services delivered to individuals: “Clinicians are

bound - morally, ethically, and legally - to try to prove themselves wrong when they predict violence

and take every reasonable action to prevent violence” (p. 365). Litwack (2002) made a strong plea for

more descriptive, qualitative and narrative studies of both clinical and actuarial violence assessments

in order to better evaluate and compare the two approaches.

Several authors have argued that both approaches have value for violence risk assessment in clinical

practice and there should not be a debate between researchers and clinicians, but rather an integration

of unstructured clinical and actuarial judgment (Blumenthal & Lavender, 2000; Borum, 1996;

Douglas, Cox et al., 1999; Otto, 2000; Pagani & Pinard, 2001; Sreenivason, Kirkish, Garrick,

Weinberger, & Phenix, 2000; Webster & Cox, 1997; Webster et al., 1997b). Douglas, Cox, and

Webster (1999) pleaded for adoption of the scientist-practitioner model, which posits that the practice of

psychology should be informed by scientific research. They advise mental health professionals to

critically evaluate empirical research into violence risk assessment and adopt reliable and valid

findings that could be reasonably expected to generalize to the setting at hand.

Structured professional judgment (SPJ)

Recognizing the criticism of both the unstructured clinical judgment and actuarial judgment approach,

several authors have argued for an integration of the two approaches (Borum, 1996; Douglas, Cox et al.,

1999), by some referred to as ‘third generation research’ (Fuller & Cowan, 1999). Webster and

colleagues (1997b) stated that otherwise there is a strong chance that the world of the clinic and the

world of research will continue on their separate courses. Therefore, in the mid-nineties, researchers

from Simon Fraser University in Vancouver, Canada developed the structured professional judgment

(SPJ) model. Their goal was to bridge the gap between clinical practice and empirical knowledge by

developing a guideline for violence risk assessment that standardizes the clinical judgment – and thus

increases interrater reliability and validity – and can be used by mental health professionals in their

day-to-day practice and responsibilities. They believed that these guidelines should have a firm

connection to the scientific literature, be empirically based and testable but also have grounding in

clinical reality. Other important aspects in the development of the guidelines were that they contain

and integrate both static and dynamic risk factors, are easy to administer, understand and score and

that they provide suggestions for risk management. The SPJ model was specifically designed to



prevent instead of predict violence (Hart, 2001b). The use of structured guidelines in clinical practice

can improve transparency and consistency in decision-making processes regarding, for instance, leave

or discharge or needed level of security.

The HCR-20 for the assessment of future violence risk was one of the first published checklists based on the

SPJ model and is now widely used for research purposes and clinical practice in

different countries. Researchers from several countries, for instance, Sweden, Germany, France and

The Netherlands have translated and adapted the HCR-20 for use in their country. Research in

different settings and countries has demonstrated that the HCR-20 can be used reliably and validly (see

for an overview of studies Douglas et al., 2005). Associated guidelines are the SVR-20 for the

assessment of sexual violence risk in adults, the Spousal Assault Risk Assessment guide (SARA;

Kropp, Hart, Webster, & Eaves, 1999) for the assessment of relational violence risk, the Structured

Assessment of Violence Risk in Youth (SAVRY; Borum, Bartel, & Forth, 2002) for the assessment of

violence risk in adolescents, the Early Assessment Risk List for Boys (EARL-20B; Augimeri, Koegl,

Webster, & Levene, 2001) for the assessment of antisocial behavior in boys up to the age of 12, and

the Early Assessment Risk List for Girls (EARL-21G; Levene et al., 2001) for the assessment of

antisocial behavior in girls up to the age of 12. The guidelines are developed from a thorough

consideration of the empirical literature and the clinical expertise of a number of forensic mental

health professionals and should all be seen as ‘work in progress’. The authors strive for refinement and

further development of the guidelines in the future, based on, for instance, developments in the

scientific and professional literature or feedback from users of the guidelines. The guidelines are not to

be seen as formal psychological tests but guides that structure the assessment of risk; the terms

checklists and aide-mémoire are often employed to emphasize the idea behind the SPJ model.

Furthermore, the guidelines are to be viewed as checklists that encourage discussion among and

between colleagues - mental health professionals, researchers and administrators - rather than being

viewed as capable of offering violence risk estimates (Webster et al., 1997b). The guidelines are not

meant to be a substitute for theoretical or clinical knowledge on the nature, causes, prediction and

management of violence and on the relationship between psychopathology and violence. An important

aspect of the SPJ model is that the rater has to follow the coding procedure as described in the manuals

of the guidelines. In the description of the HCR-20 (see below), this coding procedure is described.

The design and coding procedure of the associated guidelines are highly comparable.

The HCR-20

The HCR-20 is a structured professional guideline designed for the assessment of risk of future

violence in adult offenders with a violent history and / or a major mental disorder and / or personality

disorder. The HCR-20 consists of 20 items, divided into three subscales: Historical scale, Clinical

scale and Risk management scale that relate to risk factors in the past, present and future, respectively


(see Table 1 and Appendix I). The Historical items are static, unchangeable factors5, whilst the

Clinical and Risk management factors are considered to be changeable, for instance, due to clinical

intervention. The items have to be coded on a three-point scale: ‘0’ the item does not apply according to

the available information, ‘1’ the item probably or partially applies, and ‘2’ the item definitely applies.

Information needed to code the HCR-20 includes, for example, criminal records / police files,

psychological reports, interviews with significant others, and observations and is preferably from

different sources and gathered with different methods. Aside from the 20 items, the HCR-20 offers the

possibility to code other considerations, that is, case-specific risk factors that do not fit within the item

descriptions.
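
The three-point coding scheme and the division into subscales described above can be sketched as a small data check. This is an illustration only, not part of the HCR-20 manual: the function name is hypothetical, the example codes are invented, and requiring all 20 items to be coded is a simplifying assumption. The numerical totals computed here are not the final risk judgment, which is a structured professional judgment rather than a sum.

```python
from typing import Dict

# Item numbers per subscale, following the HCR-20 item list (Table 1).
SUBSCALES = {
    "Historical": range(1, 11),        # items 1-10
    "Clinical": range(11, 16),         # items 11-15
    "Risk management": range(16, 21),  # items 16-20
}
VALID_CODES = {0, 1, 2}  # does not apply / probably or partially / definitely


def subscale_totals(codes: Dict[int, int]) -> Dict[str, int]:
    """Validate 20 item codes (0, 1 or 2) and sum them per subscale."""
    if set(codes) != set(range(1, 21)):
        raise ValueError("expected codes for items 1 through 20")
    if any(code not in VALID_CODES for code in codes.values()):
        raise ValueError("each item must be coded 0, 1 or 2")
    totals = {
        name: sum(codes[item] for item in items)
        for name, items in SUBSCALES.items()
    }
    totals["Total"] = sum(codes.values())
    return totals


# Invented example: every item coded 1, except item 1 (coded 2) and item 20 (coded 0).
example = {item: 1 for item in range(1, 21)}
example[1] = 2
example[20] = 0
print(subscale_totals(example))
```

With 20 items coded 0 to 2, the numerical total can range from 0 to 40; case-specific "other considerations" fall outside this sketch.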

Table 1. HCR-20 items

Historical items
1. Previous violence
2. Young age at first violent incident
3. Relationship instability
4. Employment problems
5. Substance use problems
6. Major mental illness
7. Psychopathy
8. Early maladjustment
9. Personality disorder
10. Prior supervision failure

Clinical items
11. Lack of insight
12. Negative attitudes
13. Active symptoms of major mental illness
14. Impulsivity
15. Unresponsive to treatment

Risk management items
16. Plans lack feasibility
17. Exposure to destabilizers
18. Lack of personal support
19. Noncompliance with remediation attempts
20. Stress

Note. From Webster et al. (1997b).

The final risk judgment has to be rated as low, moderate, or high and is valid for a specific time

period, for instance, during a specific treatment phase or for a given context (e.g., inpatient versus

outpatient). The key question for the low, moderate or high judgment is: what level of effort, attention,

and intervention is required to prevent this person from perpetrating violence? The risk judgment depends

not only on the simple summation of item scores, but also on specific combinations of factors or

other considerations. For example, research has demonstrated that the combination of a diagnosis of a

major mental disorder and substance abuse severely increases the risk of violence (e.g., Swanson,

1994, see further Specific risk factors for violence in mentally disordered patients, p. 25). In some

cases, only one or two items may be sufficient to justify the judgment high-risk, for example, when a

patient has active psychotic symptoms (e.g., auditory command hallucinations that instruct the patient

5 This is not completely true: Historical items can change in an unfavorable way, for instance, item 10 when a patient violates the rules by escaping from the hospital.



to commit homicide). The final risk judgment can be considered as a structured professional judgment

that is arrived at through the process of coding the checklist and integrating all available information.

The HCR-20 has to be coded by an experienced forensic mental health professional, who should use

all available information on the offender. The rater should be trained and experienced in interviewing,

administering and interpreting standardized tests and be cognizant of the empirical and professional

literature on the nature, causes, prediction and management of violence. In addition, the rater should

have received training in coding the HCR-20 with case examples and also be trained in coding the

PCL-R or the Psychopathy Checklist: Screening Version (PCL:SV; Hart, Cox, & Hare, 1995), because

the HCR-20 and several of the associated guidelines include the item ‘Psychopathy’ as measured by

the PCL-R or PCL:SV. The HCR-20 item descriptions in the manual should be read by the rater at

every individual risk assessment. The authors further stress that the HCR-20 should be employed and

interpreted with great caution and in consultation with the authors or other colleagues familiar with

this and related schemes. Collaboration among colleagues - both researchers and mental health

professionals - is strongly advised. They need to spend a considerable amount of time in discussing the

meaning of the items.

The coding of the HCR-20 should be viewed as a first step in the assessment process; the guideline is

neither exhaustive nor fixed. In any given risk assessment, there can be additional, case-specific risk

factors that are important. Essential questions for the raters who code the HCR-20 are not only about

the likelihood of future violent behavior, but also questions regarding the nature, imminence, severity

and frequency of violence (Hart, 2001a). For example: What kind of violence might this person

commit? What would be the physical harm to victims? Who is a likely victim? How soon might the

violence occur? What can be done to prevent the violence? Are there possible protective factors

that can diminish the risk of violence? (see Appendix II). Finally, the authors of the HCR-20 do not

consider the risk of future violence as a stable characteristic of a person. Therefore, the guideline

contains dynamic risk factors that are changeable, for instance, due to clinical intervention. Repeated

measurements with the HCR-20, for instance, when the context of the assessment changes, are

recommended to monitor a person’s risk of violence and to provide mental health professionals with

the opportunity to monitor treatment progress.

Research in various psychiatric and forensic settings in different countries - mainly Canada, the USA

and Sweden (see also the special issue of Psychology, Crime and Law; Hart, 2002) - has demonstrated

good interrater reliability and predictive validity for the HCR-20 (Belfrage, 1998a; Belfrage, Fransson,

& Strand, 2000; Douglas, Ogloff, Nicholls, & Grant, 1999; Strand, Belfrage, Fransson, & Levander,

1999).6 For example, Douglas, Ogloff and Hart (2003) found good predictive validity for the HCR-20

in a sample of 100 forensic psychiatric patients. Moreover, they demonstrated that the structured final

risk judgments added incremental validity to the HCR-20 used in a numerical sense, i.e., the simple

6 See Douglas et al. (2005) for a comprehensive review of studies into the HCR-20.


addition of item scores. The same was found for the SARA (Kropp & Hart, 2000) and the SVR-20

(Dempster, 1998). Recently, Douglas, Yeomans and Boer (in press) carried out a study to compare the

predictive accuracy of the HCR-20, two actuarial instruments - the VRAG and the Violent Offender

Risk Assessment Scale (VORAS; Howells, Watt, Hall, & Baldwin, 1997) - the PCL-R and the

PCL:SV. They found strong support for the validity of the VRAG, HCR-20 total score and HCR-20

final risk judgment. In this study, multivariate analyses showed that the HCR-20 was more

consistently related to violence than the other measures. Research demonstrated that changes in risk

during clinical intervention can be measured with the Clinical and Risk management items of the

HCR-20 (Belfrage & Douglas, 2002). Regarding the Dutch version of the HCR-20, only one study has been

published so far. Philipse, van Erven, and Peters (2002) conducted a retrospective study in a group of

69 forensic psychiatric patients and found that particularly the Historical factors and the final risk

judgment predicted relapse in violent offending.

The SVR-20

The SVR-20 is a structured professional guideline designed for the assessment of risk of sexual

violence in adult sexual offenders. The SVR-20 consists of 20 items, divided into three domains:

Psychosocial adjustment, Sexual offenses and Future plans (see Table 2 and Appendix II), that have to

be coded by an experienced forensic mental health professional. The coding procedure of the SVR-20

is similar to the procedure of the HCR-20.

Table 2. SVR-20 items

Psychosocial adjustment
1. Sexual deviance
2. Victim of child abuse
3. Psychopathy
4. Major mental illness
5. Substance use problems
6. Suicidal / homicidal ideation
7. Relationship problems
8. Employment problems
9. Past nonsexual violent offenses
10. Past nonviolent offenses
11. Past supervision failure

Sexual offenses
12. High density sex offenses
13. Multiple sex offense types
14. Physical harm to victim(s) in sex offenses
15. Uses weapons or threats of death in sex offenses
16. Escalation in frequency or severity of sex offenses
17. Extreme minimization or denial of sex offenses
18. Attitudes that support or condone sex offenses

Future plans
19. Lacks realistic plans
20. Negative attitude toward intervention

Note. From Boer et al. (1997).



Dempster (1998, see also Dempster & Hart, 2002) examined the predictive validity of five risk

assessment instruments: the SVR-20, PCL-R, VRAG, Rapid Risk Assessment of Sexual Offense

Recidivism (RRASOR; Hanson, 1997) and the Sex Offender Risk Assessment Guide (SORAG;

Quinsey, Harris, Rice, & Cormier, 1998). These instruments were rated for 95 sexual offenders from

several correctional institutions in Canada. All instruments predicted general violence, but only the

RRASOR and the SVR-20 final risk judgment were significant in predicting sexual violence. The

SVR-20 final risk judgment had incremental predictive validity relative to the SVR-20 actuarial

scores. Sjöstedt and Långström (2002) compared four risk assessment instruments, including the

SVR-20, in a sample of 51 rapists and found a significant relation between the subscale Psychosocial

adjustment and non-sexual violent recidivism, but no significant predictive validity of the SVR-20

subscales and total scores for sexual offenses. However, the authors urge caution in generalizing

their results because this study had some important limitations, such as a small sample size and poor

interrater reliability for the SVR-20 (average Cohen’s kappa = .36).
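
As an illustration of the interrater statistic mentioned above, Cohen's kappa corrects the observed agreement between two raters for the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e). The following Python sketch is a minimal example under stated assumptions: the function name and the item codes of the two raters are hypothetical and are not taken from Sjöstedt and Långström (2002) or any other study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters coding the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must code the same non-empty set of cases.")
    n = len(rater_a)

    # Observed agreement: proportion of cases given the same code by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Expected agreement: chance overlap implied by each rater's marginal totals.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[code] * counts_b[code] for code in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical code; kappa is undefined.
        raise ValueError("Kappa is undefined when expected agreement equals 1.")
    return (observed - expected) / (1 - expected)


# Hypothetical codes (0, 1 or 2) given by two raters to one item in ten cases.
rater_a = [0, 1, 2, 1, 0, 2, 1, 1, 0, 2]
rater_b = [0, 1, 1, 1, 0, 2, 2, 1, 1, 2]

kappa = cohens_kappa(rater_a, rater_b)
print(f"kappa = {kappa:.2f}")
```

With these hypothetical codes the raters agree on 7 of 10 cases, while the agreement expected by chance is .35, so kappa is about .54.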

Criticism of the SPJ approach

The SPJ approach has been criticized on a number of grounds. A first comment is that the guidelines

largely lack a theoretical basis. The same, however, is true for most actuarial risk assessment

instruments. Second, as in actuarial instruments, there is a sole focus on negative characteristics of a

person and his / her social environment. Possible protective factors that diminish the risk of violence

are not taken into account (Hart, 2001a; Rogers, 2000). An exception is the SAVRY for violence risk

assessment in youth that contains six protective factors, for instance, resilient personality, social

support, and positive attitude towards intervention. A third comment on the SPJ model concerns

practical considerations, such as the need for investment in new material, training and technology and

the need for high quality (clinical) information to code the guidelines. For example, for coding the

item ‘Psychopathy’, the rater must code the PCL-R (and be trained and experienced in coding the

PCL-R), which is rather time consuming.

Several book reviews have commented in detail on the HCR-20 (see Arbisi, 2004; Buchanan, 2001;

Cooper, 2004; Mossmann, 2000; Witt, 2000). Arbisi (2003) argued that directions for administration

are somewhat vague, for instance, the manual provides little guidance for coding the item

‘Unresponsive to treatment’. Buchanan (2001) raised some moral issues with respect to the coding of

the HCR-20. For example, he expressed his concern that the person who is being assessed, or the

relatives of this person, would not co-operate with the risk assessor (for instance, by providing

information) if they knew that the risk assessment could lead to an extension of the detention. However, these types of

moral or ethical issues concern risk assessment in general and are not only valid for the SPJ approach.

Witt (2000) mentioned that there is some overlap in the items of the PCL-R and the HCR-20, for

instance, impulsivity and prior supervision failure, which could lead to overrating the risk of violence.

Witt also described a possible problem for (inexperienced) risk assessors, i.e., that it is unclear what


rules one can use to combine the items to reach a conclusion about the violence risk, for example, whether some items

should have more weight than others. However, the SPJ model specifically states that the guidelines

are not designed to use algorithms, but it is the mental health professional who by structuring his / her

thinking should arrive at the final risk judgment. The mental health professional is strongly advised to

use current empirical knowledge to help him / her in this process. For instance, research has

demonstrated that the combination of psychopathy and sexual deviance severely increases the risk of

sexual recidivism (Hildebrand, de Ruiter, & de Vogel, 2004; Rice & Harris, 1997), and such empirical

knowledge should guide the structured risk judgment.

A final thought that should be noted here - although it is not really a comment on the SPJ model - is

that researchers who were involved in developing the SPJ model are concerned about quality

assurance when SPJ guidelines are used in practice. Hart (2001a, p. 21) stated that “Basic research to

develop risk assessment procedures is important, but it is naïve to assume that any procedure will

function similarly in the field”. Webster, Müller-Isberner and Fransson (2002) expressed their concern

about erratic use of the HCR-20 and associated guidelines and stressed that risk assessors should study the

HCR-20 with great care before adopting it into practice and strictly follow the coding procedure as

described in the manual. They stated that sometimes the HCR-20 is misapplied, for instance,

improperly coded, interpreted or used for decision-making. Erratic use of the HCR-20 could be due to

the risk assessor, for example, when he or she is not qualified or competent, is biased, has no expertise

in coding the instrument or does not follow the coding procedure as described in the manual. Hare

(1998a) and Edens (2001) have described the same with respect to the potential misuse of the PCL-R

in mental health and criminal justice systems.

Summary: Approaches to violence risk assessment

Both the unstructured clinical judgment approach and the actuarial approach were demonstrated to have

some major limitations with respect to the assessment of violence risk in forensic clinical practice. The

SPJ model seems to incorporate the advantages of both approaches. The SPJ guidelines possess a

strong empirical basis like the actuarial instruments, but are also suitable for practical use in

treatment settings, because they focus on dynamic, changeable risk factors and provide guidelines for

treatment. However, the SPJ guidelines also share some disadvantages of both approaches: for

instance, they neglect protective factors, as actuarial instruments do, and their accuracy depends

partly on the qualities of the risk assessor, as in unstructured clinical judgment. The SPJ guidelines

should still be seen as work in progress; more research into the reliability and validity of the

guidelines is needed.

Risk assessment in specific forensic psychiatric samples

The aim of this section is to provide state-of-the-art knowledge of specific risk factors and violence

risk assessment in different forensic psychiatric samples: patients with a major mental disorder (i.e.,



Axis I disorder according to the DSM-IV), personality disordered patients (i.e., DSM-IV Axis II

disorder), sexual offenders and female offenders. The four groups are very heterogeneous, so the

relevance of risk assessment lies in identifying those (subgroups of) patients who pose a high

risk of violence and distinguishing them from patients who pose a low or moderate risk of violence. In

this section, the four groups are described as separate groups. In practice, the groups obviously

overlap.

Major mental disorders

The relationship between major mental disorders and violence

There is a long-standing public perception that patients with a major mental disorder (MMD) are prone

to violence (Bonta, Law, & Hanson, 1998; Hiday, 1995; Link & Stueve, 1994; Monahan, 1992; Phelan

& Link, 1998). The General Social Survey examined opinions of 1332 Americans and found that most

of them viewed patients with a MMD, especially those with alcohol or drug dependence, to be likely to

be violent toward others (Pescolido, Monahan, Link, Stueve, & Kikuzawa, 1999). The question is whether

this perception is accurate and whether there is a significant (causal) relationship between MMD and

violence. In 1983, Monahan and Steadman reviewed over 200 studies on the association between

violence and MMDs and concluded that when controlling for demographic and criminal characteristics

the association tended to disappear. This finding was replicated by others in more recent studies. For

example, Bonta et al. (1998) conducted a meta-analysis into predictors of recidivism and failed to

demonstrate a significant relationship between MMDs and violence after controlling for demographic

characteristics or criminal history variables. The same was found in a six-year longitudinal study in

728 jail inmates (Teplin, Abram, & McClelland, 1994). However, Monahan (1992) returned to the

subject and stated that the conclusion from 1983 was at least premature and might well have been

wrong because controlling for factors that are highly related to mental disorder is problematic and also

because new studies found a consistent albeit modest relationship between MMDs and violence (e.g.,

Klassen & O’Connor, 1988, 1990; Swanson, Holzer, Ganju, & Tsutomu Jono, 1990).

Studies supporting the association between MMDs and violence offered three types of evidence. First,

research has demonstrated a comparatively high prevalence of MMDs in prison inmates and forensic

psychiatric samples convicted of a violent offense (Coté & Hodgins, 1992; Monahan, 1992; Mullen,

1997; Teplin, 1990). Second, studies into violent behavior in mentally disordered patients have found

higher rates of violent behavior compared to community controls (Lindqvist & Allebeck, 1990;

Wessely, Castle, Douglas, & Taylor, 1994). Third, epidemiological studies have demonstrated a

significant relationship between violence and MMDs (Link, Andrews, & Cullen, 1992; Swanson et al.,

1990; Tiihonen, Isohanni, Rasanen, Koiranen, & Moring, 1997). For example, Swanson et al. (1990)

conducted a large-scale community study in a sample of approximately 10,000 people in three

metropolitan areas in the USA. In this Epidemiologic Catchment Area (ECA) study, self-report data

regarding violence during the 12 months preceding the assessment were linked to psychiatric disorders


as measured with the Diagnostic Interview Schedule (DIS; Robins, Helzer, Croughan, & Ratcliff,

1981), an instrument yielding DSM-III-R (APA, 1987) psychiatric diagnoses. Persons with a MMD

compared to persons with no MMD were found to be significantly more violent after controlling for

sex, age and socioeconomic status (prevalences of violence during the past year: no mental disorder:

2%; affective disorders: 12%; schizophrenic disorders: 13%; alcohol abuse: 25%; drug abuse: 35%).

They also found a significant interaction effect between MMD and substance abuse (see the section

Specific risk factors for violence in mentally disordered patients). Furthermore, epidemiological

studies of birth cohorts followed from pregnancy to adulthood indicated that persons with a MMD are

at increased risk of committing violent offenses compared to non-disordered persons (Hodgins, 1992;

Hodgins, Mednick, Brennan, Schulsinger, & Engberg, 1996).
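
To make the size of the ECA differences concrete, the one-year prevalences reported above can be expressed as ratios relative to the group with no mental disorder. The sketch below is simple arithmetic on the percentages quoted in the text; the function name is hypothetical, and the resulting ratios are crude, unadjusted comparisons rather than the controlled estimates reported by Swanson et al. (1990).

```python
# One-year prevalence of violence (in percent) as quoted in the text for the ECA study.
ECA_PREVALENCE_PERCENT = {
    "no mental disorder": 2,
    "affective disorders": 12,
    "schizophrenic disorders": 13,
    "alcohol abuse": 25,
    "drug abuse": 35,
}


def prevalence_ratios(prevalence, reference):
    """Divide each group's prevalence by the prevalence of the reference group."""
    base = prevalence[reference]
    if base <= 0:
        raise ValueError("The reference prevalence must be positive.")
    return {
        group: value / base
        for group, value in prevalence.items()
        if group != reference
    }


ratios = prevalence_ratios(ECA_PREVALENCE_PERCENT, "no mental disorder")
for group, ratio in ratios.items():
    print(f"{group}: {ratio:.1f} times the reference prevalence")
```

Read this way, the reported prevalence for affective and schizophrenic disorders is roughly six times that of the reference group, and for drug abuse more than seventeen times, which is in line with the remark that substance abuse itself carries a very high relative risk.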

It should be noted that most studies into the relationship between MMD and violence have been

conducted in samples with mainly schizophrenic patients. Several studies have demonstrated that

patients with schizophrenia, especially paranoid schizophrenics, are more likely to be violent compared

to patients with other MMDs (excluding substance abuse) (Benezech, Bourgeois, & Yevasage, 1980;

Krakowski, Volavka, & Brizer, 1986; Wessely, Castle, Douglas, & Taylor, 1994). Tiihonen (2001)

stated that mentally disordered violent offenders are usually schizophrenics with comorbid substance

abuse and insufficient insight into their mental illness. Few studies have specifically addressed the

association between, for instance, major affective disorders or mental retardation and violence

(Douglas & Webster, 1999a; Hodgins, 2002). In a birth cohort study, Hodgins (1992) found that

intellectually handicapped men and women were significantly more often registered for a violent

offense than men and women with no MMD or intellectual handicap. In a birth cohort study in

Finland, patients with an organic disorder compared to a control group were found to pose a

significantly higher risk of minor non-violent offenses, but not of violent offenses (Tiihonen et al.,

1997). Estroff and Zimmer (1994) compared patients with schizophrenia to patients with affective

disorders and found the odds of making violent threats not to differ, but the odds of acting violently

were 10 times greater for schizophrenics than for those with affective disorders. However, others have

found lower rates of violence in schizophrenics compared to patients with affective disorders

(Monahan et al., 2001). Kausch and Resnick (1999) stated that many patients in a manic state threaten

others, but that they rarely commit serious violence.

Is the relationship between MMDs and violence causal?

The lines of research described above have consistently found a significant association between

MMDs and violence, although the strength of this relationship differs per MMD diagnosis. It should be

noted, however, that the results of many studies were limited by methodological shortcomings, such as

confounding, lack of control for demographic or historical variables, use of retrospective data, and

reliance on self-report (see Hodgins, 2001). Furthermore, factors that are correlated with or predict

violence are not necessarily involved in causing these behaviors (Hodgins, 1997, 1998). Therefore,



several authors have questioned the existence of a causal connection between MMDs and violence.

They suggested that the increased relative risk of violence is in large part accounted for by prior

history of violence and that comorbid substance abuse or personality disorder are the most important

contributors to, and explanations of, violence in patients with a MMD (Arboleda-Floréz & Stuart, 2000; Rice &

Harris, 1992; Rice et al., 2002; Wallace et al., 1998). Hodgins (2001, 2002) stated that most

schizophrenics do not commit violent offenses and suggested that in schizophrenics who are violent, a

stable pattern of antisocial behavior present from at least early adolescence is the etiological factor for

violence. Some authors even believe the diagnosis of MMD to be a protective factor against violence. For

instance, in the VRAG, a diagnosis of schizophrenia is inversely related to risk (Harris & Rice, 1997).

On the other hand, others have argued that the risk from comorbid substance abuse appears to be

additive and not causal because there also is an increase in violence risk in mentally disordered

patients without comorbid substance abuse (Arsenault, Moffitt, Caspi, Taylor, & Silva, 2000). Link

and Stueve (1994) reviewed studies into the relationship between MMDs and violence and concluded

that “the robust pattern of findings across studies using different designs, samples and measures tips us

in favor of the conclusion that there is a causal connection between some types of mental illness and

violence” (p. 142).

In addition to criminal history and comorbid substance abuse or personality disorder, there are two other important

issues with respect to a causal association between MMDs and violence: the social context in which

mentally disordered patients live and victimization. Particular social circumstances that surround many

patients, for instance, living in a neighborhood with high unemployment, high crime rates and readily

available drugs and weapons, seem to provide a possible explanation for the association between

MMDs and violence (Barratt & Slaughter, 1996; Hiday, 1995; Mullen, 1997). Silver, Mulvey and

Monahan (1999) found neighborhood poverty to have an impact over and above the effects of

individual level risk factors (e.g., criminal history, age at admission) in identifying cases who become

violent. Many mentally disordered patients are frequently victims of violence rather than perpetrators,

not only because of the social context of their lives, but also because of their visible disabilities

(Hiday, Swanson, Swartz, Borum, & Wagner, 2001). Hiday et al. (2001) found that in a group of 331

persons with a MMD, perpetrated violence was significantly associated with being criminally

victimized after controlling for demographic variables. Bonta and colleagues (1998) suggested that

risk assessment and management in mentally disordered patients can be enhanced with more attention

to the social psychological criminological literature and less reliance on models of psychopathology.

Specific risk factors for violence in mentally disordered patients

Different studies have demonstrated that the diagnosis of a MMD itself is not so important, but that the

risk of violence in mentally disordered patients strongly increases in case of: 1) acute psychotic

symptoms; 2) comorbid substance abuse; and 3) a comorbid personality disorder. In general, the


higher the psychiatric comorbidity, the higher the risk of violent behavior (Bourgeois & Benezech,

2001).

1. Acute psychotic symptoms. Research has demonstrated that in patients with a MMD, acute psychotic

symptoms, such as delusions and hallucinations were predictive of violence (Hiday, 1995; Link,

Stueve, & Phelan, 1998; McNiel, 1994; Mullen, 1997; Swanson, Borum, Swartz, & Monahan, 1996).

Delusions are most strongly related to violence (Link et al., 1998; Taylor et al., 1998), especially

persecutory delusions (Wessely et al., 1993). Taylor et al. (1998) found in a record survey of 1740

mentally disordered patients that more than 75% of those with a psychosis were recorded as being

driven to offend by their delusions. In the absence of delusions, hallucinations had no such effect.

Buchanan et al. (1993) found that in a group of 79 psychiatric inpatients, those who reported being

aware of evidence supporting the delusion and those who actively sought out such evidence were

significantly more likely to act upon their delusions. Link and Stueve (1994) have described the

principle of rationality within irrationality. This principle posits that objectively irrational psychotic

symptoms are experienced by the patient as real and, thus, violence is a rational response for the

psychotic patient. This is especially the case when a patient experiences threat/control-override

(TCO) symptoms. These symptoms are characterized by feelings of threat that cannot be controlled by

the patient, for instance, thoughts that are put into one’s head or the feeling that one’s mind is

dominated by forces beyond one’s control. Link and Stueve (1994) found that in a sample of 232

psychiatric patients, TCO symptoms were a significant predictor of violence after controlling for

sociodemographic and criminal variables and other psychotic symptoms (e.g., feelings of non-

existence, feelings of having special powers). Moreover, TCO symptoms accounted for differences

between patient groups and community groups. These results have been corroborated by Swanson and

colleagues (1996) in the ECA study. They found patients with TCO symptoms to be about twice as

likely to engage in violent behavior compared to patients with other psychotic symptoms and five

times as likely as those with no MMD. When these symptoms co-occurred with substance abuse

disorders, the effect was even stronger. However, Milton et al. (2001) examined 166 patients with a

first episode psychosis and found no empirical evidence for the predictive accuracy of TCO symptoms

or other specific psychotic symptoms for violence. In the MacArthur Risk Assessment Study, it was

found that violence by mentally disordered persons was linked to their perceptions and experiences of

hostility and being threatened by significant others (Estroff, Swanson, Lachiotte, Swartz, & Bolduc,

1998), but specific TCO symptoms as assessed by interviewers were not found to be predictive of

violence (Appelbaum, Robbins, & Monahan, 2000; Monahan et al., 2001). However, when the

researchers used patients’ self-reported TCO symptoms, they found significantly higher rates of

violence in the previous 10 weeks and at follow up evaluations compared to patients who did not

report TCO symptoms. Appelbaum and colleagues (2000) concluded that their data should not be

taken as evidence that TCO delusions never cause violence, because “it is clear from clinical



experience and many other studies that they can and do” (p. 571). A problem is that studies into this

subject all use different criteria for TCO symptoms and that most studies have important

methodological limitations, for instance, the use of self-report to measure psychotic symptoms and

violence.

Regarding hallucinations, Rudnick (1999) reviewed studies on the relationship between command

hallucinations and violence and concluded that this relationship is complex and that there are possible

mediating factors such as benevolence of the content and familiarity of the commanding voice. Hersh

and Borum (1998) summarized the literature regarding hallucinations and violence and concluded that

rates of compliance with command hallucinations ranged from 39% to 89%. They named several factors associated with higher

compliance such as familiarity and trustworthiness of hallucinated voices. Junginger (1990, 1995)

found that psychiatric inpatients who could identify the hallucinated voice, who experienced less

dangerous commands or who reported hallucination-related delusions were more likely to comply with

command hallucinations.

2. Comorbid substance abuse. Numerous studies in different settings and countries have consistently

demonstrated that in patients with a MMD, the risk of violence is severely increased in case of

comorbid substance abuse (Arsenault et al., 2000; Bland & Orn, 1986; Milton et al., 2001; Monahan et

al., 2001; Rice & Harris, 1995; Smith & Hucker, 1994; Steadman et al., 1998; Swanson et al., 1990;

Swartz et al., 1998; Tardiff, Marzuk, Leon, Protera, & Weiner, 1997; Tiihonen et al., 1997; Wallace et

al., 1998; Walsh, Buchanan, & Fahy, 2002). For instance, Swanson (1994) examined data from the

ECA study regarding MMD diagnoses and self-report data on violence from a five-item index of the

DIS designed to identify assaultive behavior and antisocial personality disorder. He found mentally

disordered patients with comorbid substance abuse significantly more likely to be violent than those

with MMD alone (see Figure 1). Substance abuse itself was associated with a very high relative risk of

violence.


Figure 1. Lifetime prevalence of violent behavior by current MMD diagnoses

Note. Adapted from Swanson (1994). The index of violence relates to five items in the DIS: 1) have used a weapon; 2) have been in a fight; 3) hitting a child; 4) hitting a spouse; and 5) fighting while drinking. ‘Two-item’ means that the respondent answered ‘yes’ to items 1 and 2. ‘Four-item’ means that the respondent answered ‘yes’ to items 1, 2, 3, and 4. ‘Five-item’ means that the respondent answered ‘yes’ to all items.

Medication noncompliance seems to further intensify the interaction between MMDs and substance

abuse. Swartz et al. (1998) performed a study into the joint effect of substance abuse and medication

noncompliance in a sample of 331 involuntarily admitted patients with a severe mental illness. In this

study, it was found that the combination of medication noncompliance and substance abuse was

significantly associated with serious violence in the community after demographic and clinical

characteristics were controlled for. Research has indicated a high rate of comorbidity and a complex

interaction of MMD and substance abuse. Regier et al. (1990) found a high rate of comorbidity of

mental disorders and addictive disorders in prison populations, most notably with schizophrenia,

antisocial personality disorder and bipolar disorder. Stålenheim and von Knorring (1996) found a high

rate of comorbidity between substance abuse, Axis I and Axis II disorders in 61 Swedish forensic


3. Comorbid personality disorder. The relationship between MMDs and violence was demonstrated to

be stronger in case of a comorbid personality disorder, especially the antisocial personality disorder or

psychopathy (Abram & Teplin, 1991; Blackburn & Coid, 1999; Rice & Harris, 1992; Tardiff, Marzuk,



Leon, & Protera, 1997). Ghandi et al. (2001) conducted a follow up study in 155 patients with severe

mental illness, and found that patients with a comorbid personality disorder, especially the antisocial

and borderline personality disorder, were six times more likely to have police contacts than those with

no comorbid personality disorder. Comorbidity of Axis I and Axis II disorders is quite common in

violent offenders (Hildebrand & de Ruiter, 2004). Coid (2003) found high rates of comorbidity of Axis

I and Axis II disorders in a sample of 260 offenders in British high security hospitals. For instance,

61% of the offenders with borderline personality disorder and 51% of the offenders with antisocial

personality disorder were diagnosed with a lifetime diagnosis of depression.

Nature and severity of violence in mentally disordered patients

Research has indicated that relatives and significant others (e.g., mental health professionals) are at

higher risk of becoming a victim of violence by mentally disordered patients and that violence by

mentally disordered patients usually takes place at their residence and not in public settings (Estroff et

al., 1998; Estroff, Zimmer, Lachiotte, & Benoit, 1994; Milton et al., 2001; Monahan et al., 2001;

Steadman et al., 1998; Straznickas, McNiel, & Binder, 1993; Tardiff, Marzuk, Leon, & Protera, 1997;

Tardiff, Marzuk, Leon, Protera, & Weiner, 1997). Mothers who live with an adult schizophrenic son

with comorbid substance abuse run the highest risk of being the target of (repeated) violence (Estroff

et al., 1994). A possible explanation is that relatives or mental health professionals bear considerable

responsibility and are both psychologically and physically closest to the patient. Many patients with a

MMD have a small social network, are unemployed and financially dependent on their family. The

relatives’ or mental health professionals’ involvement in the patient’s daily living and vulnerability

creates the opportunity for violence (Estroff et al., 1994).

Patients with a MMD were demonstrated to pose a high risk of threatening behavior towards others,

but overall, they commit less serious violence compared to personality disordered patients or violent

offenders without a mental disorder (Lindqvist & Allebeck, 1990; van Panhuis & Dingemans, 2000).

Hochstedler Steury and Choinski (1995) compared frequency and nature of violence between 32

patients with a MMD and 82 violent offenders with no MMD and concluded that violence by mentally

disordered patients generally is less serious, but more unpredictable.

Summary and implications for risk assessment in mentally disordered patients

The lines of research described above have convincingly demonstrated a significant association

between MMDs and violence, although this association appears to be relatively modest, very complex,

and influenced by several factors or mediators, such as comorbidity, current symptomatology,

neighborhood characteristics, and victimization. The majority of patients with a MMD do not pose a

high risk of violence, but research has identified subgroups with an elevated risk. Thus, risk

assessment should not solely focus on the general diagnosis of MMD, but take into consideration the

specific risk factors in patients with a MMD, i.e., acute psychotic symptoms, especially TCO


symptoms, comorbid substance abuse or personality disorder and medication noncompliance. For

example, the risk assessor should inquire specifically into the contents, frequency and recency of

delusions and / or command hallucinations. Other important issues for risk assessors to bear in mind

are the nature and context of violence in patients with a MMD: they usually commit violence against

relatives or significant others at their own residence. Finally, risk assessors should not only focus on

the individual, but also take into account social circumstances, such as neighborhood characteristics,

because these could be important dynamic risk factors in mentally disordered patients.

Personality disorders

The relationship between personality disorders and violence

Personality disorders (PDs) are usually seen by mental health professionals as an important risk factor

for violent behavior (Berman, Fallon, & Coccaro, 1998). This belief is supported by three lines of

research: 1) research into the prevalence of PDs in forensic psychiatry and the penitentiary system; 2)

psychological assessment research in personality disordered patients; and 3) research that specifically

examines the relationship between PDs and violence.

First, research has demonstrated that PDs, particularly cluster B disorders (according to the DSM-III-R

or DSM-IV) are highly prevalent in forensic psychiatric populations (Coid, Kahtan, Gault, & Jarman,

1999; de Ruiter & Greeven, 2000). Hart, Dutton and Newlove (1993) found that 80% to 90% of 85

domestic violence offenders attending treatment groups suffered from a DSM-III-R PD, most

frequently antisocial personality disorder (ASPD), borderline personality disorder (BPD) or sadistic

personality disorder (SPD). Coid (2000, 2003) found a high prevalence of cluster B disorders in a

sample of 260 males and females detained in maximum security hospitals in England. In this sample,

ASPD was strongly comorbid with paranoid personality disorder and BPD. Warren et al. (2002) found

a high prevalence of (comorbidity among) cluster B disorders in a sample of 261 incarcerated women,

especially between ASPD and BPD.

Second, research has shown that personality disordered patients obtain higher scores on measures of

aggressive behavior, hostility and anger compared to control groups (Coccaro, Berman, & Kavoussi,

1997; Dougherty, Bjork, Huckabee, Moeller, & Swann, 1999; Else, Wonderlich, Beatty, Christie, &

Staton, 1993; Gardner, Leibenluft, O’Leary, & Cowdry, 1991). For example, in a sample of 46 patients

with BPD significantly higher scores on the Buss Durkee Hostility Inventory (BDHI; Buss & Durkee,

1957) were found compared to a control group of normal volunteers (Gardner et al., 1991).

Third, several studies have specifically addressed the relationship between PDs and violence and

found a significant association. Warren et al. (2002) found a significant association between PDs and a

history of violent offenses or institutional violence in a sample of 261 incarcerated women, especially

for ASPD and narcissistic personality disorder (NPD). They also found cluster A pathology to be

significantly associated with incarceration for a violent offense. Widiger and Trull (1994) reviewed the

literature regarding the relationship between ASPD and BPD and violence and concluded that a

DSM-III-R diagnosis of these PDs does provide a risk factor for violent behavior in persons with a history of

violence. They described childhood trauma as an etiological factor regarding the relationship between

BPD and violence. Studies of victims of childhood (sexual) abuse have consistently demonstrated an

association between childhood abuse and adult BPD (Brown & Anderson, 1991; Links, Steiner,

Offord, & Eppel, 1988; Perry & Herman, 1993; Zanarini, Gunderson, Marino, Schwartz, &

Frankenburg, 1989). Since childhood abuse is correlated with antisocial behavior in adolescence or

adulthood (Smith & Thornberry, 1995; Widom, 1989a, 1989b), it is likely that the diagnosis of BPD

constitutes a risk factor for violent behavior (see also Tardiff, 2001). Regarding ASPD, it has

consistently been found that a diagnosis of ASPD predicts (sexual) violence in all kinds of samples

(Widiger & Trull, 1994). Comorbid substance abuse, especially alcohol and cocaine abuse (Bland &

Orn, 1986; Tardiff, 2001) and brain damage (Tardiff, 2001) further increase the risk of violence in

patients with ASPD.

Coid (2000) conducted a study into the association between PDs and (motivations for) serious criminal

behavior in a sample of 260 males and females detained in maximum security hospitals in England. He

found several significant associations between cluster B disorders and index offenses, for instance,

between BPD and arson, ASPD and robbery, and NPD and homicide. There were several significant

associations between PDs and motivational variables, for instance, hyperirritability and financial gain

in ASPD, displaced aggression and relief of tension in BPD, blow to self-esteem and need for power

or control over a victim in NPD, and avoidance of arrest and financial gain in histrionic personality

disorder. In a clinical observation study into violence in patients with NPD, it was found that recent

narcissistic injury and an inability to acknowledge or express painful or angry affect related to the

injury were seen as important precursors to violence (Schulte, Hall, & Crosby, 1994).

Summarizing, research has found: 1) a high prevalence of cluster B disorders in forensic psychiatric

samples; 2) personality disordered patients to score higher on measures of aggression and hostility;

and 3) a significant association between BPD, ASPD, NPD, cluster A disorders and violence.

However, some cautionary notes regarding the relationship between PDs and violence should be made

here. First, for PDs such as ASPD and BPD, the link with violence seems partly tautological, because

several of the defining criteria of these disorders are empirically and / or logically related to violence.

For example, aggressiveness, violation of the rights of others, and impulsivity are criteria for the

diagnosis of ASPD, and irritability, impulsivity and difficulty in controlling anger for the diagnosis of

BPD. The same is true for some of the criteria of other PDs, for example, hostility in paranoid

personality disorder and lack of empathy in NPD. Second, an important limitation of studies into the

relationship between PD and risk of violence is that most of these studies have been conducted in

prison or forensic psychiatric subjects who have already proven to be violent. Besides, most of these

studies failed to control for the presence of other mental disorders and to use control groups (Berman

et al., 1998; Widiger & Trull, 1994). Epidemiological research into the relationship between PDs and

risk of violence is scarce. Berman et al. (1998) examined the relationship between DSM-III-R PDs and

aggressive behavior as measured with the Life History of Aggression scale (LHA; Coccaro et al.,

1997), a semi-structured interview, in 137 adult research volunteers recruited from the community.

The authors controlled for gender and co-existing mental disorders. In this study, aggressive behavior

had a significant positive correlation with ASPD, BPD, NPD, histrionic, paranoid, and

passive-aggressive personality disorder, and a significant negative correlation with schizoid personality

disorder. When all PDs were entered simultaneously in a hierarchical multiple regression analysis, the

paranoid and passive-aggressive personality disorder were the only significant predictors of aggressive

behavior. Cluster B disorders appeared to account for overlapping variance in the prediction of

violence, but paranoid and passive-aggressive symptoms seemed to provide additional, independent

information. The authors concluded that the relationship between personality pathology and violence

is complex and that personality disordered patients constitute a heterogeneous group with respect to

violent behavior. Johnson et al. (2000) conducted a community-based study into the association

between adolescent PDs and violence during adolescence and early adulthood in 717 youths from

upstate New York.7 Personality disorders were assessed with the Diagnostic Interview Schedule for

Children (DISC; Costello, Edelbrock, Duncan, & Kalas, 1984) and the Personality Diagnostic

Questionnaire (Hyler, Rieder, Williams, Spitzer, Hendler, & Lyons, 1988), whose items were modified

so that they were age-appropriate and could be administered in an interview format. Adolescents with

a greater number of DSM-IV cluster A or B PD symptoms were significantly more likely than other

adolescents to have shown violent behavior during adolescence and early adulthood. Paranoid,

narcissistic and passive-aggressive symptoms were independently associated with risk of violence

after controlling for demographic characteristics and comorbid Axis I disorders.

Psychopathy as a risk factor

One specific PD – although it is not in the DSM – that was found to have a strong relationship to

violence is psychopathy (Hemphill, Templeman, Wong, & Hare, 1998; Salekin, Rogers, & Sewell,

1996; Serin & Amos, 1995). Psychopathy is a personality disorder defined by a constellation of

affective, interpersonal, and behavioral characteristics, for instance, egocentricity, lack of empathy,

guilt or remorse, deceitfulness, superficial charm, impulsivity and manipulativeness (Cleckley, 1976;

Hare, 1998b). The clinical concept of psychopathy is clearly linked to violence, of great theoretical

importance in the explanation of criminal violence and of great practical importance in the assessment

of risk of violence (Hart, 1998a). The most widely validated instrument for assessing psychopathy is

the PCL-R. Hart (1998b) stated that psychopathy should be considered in any assessment of violence

risk and that the concept of psychopathy should be assessed with the PCL-R, because in almost all

research that supports the predictive validity of psychopathy assessments for violence the PCL-R was

used. Hare (1980) developed the PCL, followed in 1991 by a revised version, the PCL-R, because

7 ASPD was not assessed because the offenders were younger than 18.


he believed that the traditional construct of psychopathy is not well represented by the criteria for

ASPD described in the DSM. The PCL-R comprises two factors: Factor 1 which has been labeled

‘selfish, callous and remorseless use of others’, and Factor 2 which represents ‘a chronically unstable

and antisocial lifestyle’ (Hare, 1991). Cooke and Michie (2001) have subjected the PCL-R items to

Item Response Theory (IRT) analyses and demonstrated that a hierarchical three-factor model

(interpersonal, affective and behavioral factors) provides a better understanding of the multifaceted

concept of psychopathy than the two-factor model of the first-edition PCL-R. Recently, Hare (2003)

published the second edition of the PCL-R. In this second edition, Factor 1 and 2 are both divided into

two empirically derived and validated factors: Interpersonal and Affective, and Lifestyle and

Antisocial, respectively (see Table 3). The PCL-R consists of 20 items that are coded on a

three-point scale (‘0’ the item does not apply, ‘1’ the item probably or partially applies, and ‘2’ the item

definitely applies) on the basis of a semi-structured interview and collateral information. The total score can

range from 0 to 40 and reflects an estimate of the degree to which an individual matches the

prototypical psychopath. The cutoff score for the diagnosis of psychopathy is generally 30, but in

several European countries, for instance, Scotland, England, Sweden and The Netherlands, a cutoff

score of 25 or 26 has proven useful (Hare, Clark, Grann, & Thornton, 2000; Hildebrand, 2004).

Ideally, the PCL-R is coded on the basis of both a semi-structured interview and file information;

however, previous research showed that for research purposes, PCL-R ratings can be done reliably on

file information (Grann, Långström, Tengström, & Stålenheim, 1998; Hildebrand, de Ruiter, & de

Vogel, 2004).
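The scoring rule described above (20 items, each coded 0, 1 or 2, summed to a total between 0 and 40 and compared with a cutoff) can be illustrated with a minimal sketch. The function names, the example codes, and the convention that a total at or above the cutoff counts as meeting it are assumptions made for illustration only. The sketch also assumes that all 20 items have been coded and does not cover the handling of omitted items.

```python
from typing import Sequence

PCL_R_ITEM_COUNT = 20      # number of PCL-R items, as stated in the text
VALID_CODES = (0, 1, 2)    # three-point coding scale, as stated in the text


def pcl_r_total(item_codes: Sequence[int]) -> int:
    """Sum 20 item codes into a total score between 0 and 40.

    Assumes every item has been coded; omitted items are not handled.
    """
    if len(item_codes) != PCL_R_ITEM_COUNT:
        raise ValueError(f"expected {PCL_R_ITEM_COUNT} item codes, got {len(item_codes)}")
    for code in item_codes:
        if code not in VALID_CODES:
            raise ValueError(f"invalid item code: {code!r}")
    return sum(item_codes)


def meets_cutoff(total: int, cutoff: int = 30) -> bool:
    """Return True if the total is at or above the cutoff (assumed convention)."""
    if not 0 <= total <= 2 * PCL_R_ITEM_COUNT:
        raise ValueError(f"total out of range: {total}")
    return total >= cutoff


# Hypothetical codes for one assessment: ten items coded 2, eight coded 1, two coded 0.
example_codes = [2] * 10 + [1] * 8 + [0] * 2
total = pcl_r_total(example_codes)

# The same total can fall below the general cutoff of 30 and above a European cutoff of 26.
below_general_cutoff = not meets_cutoff(total)
above_european_cutoff = meets_cutoff(total, cutoff=26)
```

Keeping the cutoff as a parameter reflects the point made in the text that the appropriate threshold differs between countries.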

Table 3. PCL-R items

Interpersonal factor                         Affective factor
Glibness / superficial charm                 Lack of remorse or guilt
Grandiose sense of self-worth                Shallow affect
Pathological lying                           Callous / lack of empathy
Conning / manipulative                       Failure to accept responsibility for own actions

Lifestyle factor                             Antisocial factor
Need for stimulation / proneness to boredom  Poor behavioral controls
Parasitic lifestyle                          Early behavioral problems
Lack of realistic, long-term goals           Juvenile delinquency
Impulsivity                                  Revocation of conditional release
Irresponsibility                             Criminal versatility

Additional items
Many short-term marital relationships
Promiscuous sexual behavior

Note. From Hare (2003). Additional items are items that do not load on one of the factors.

Numerous studies and several meta-analyses have demonstrated that the PCL-R total score is a strong

predictor of general, sexual and violent recidivism in both prison and general / forensic psychiatric

populations (Hemphill et al., 1998; Salekin et al., 1996; Serin & Amos, 1995). Therefore, psychopathy

as measured by the PCL-R is included as one of the risk factors in several structured risk assessment

instruments, such as the VRAG and most SPJ guidelines. The ability of the PCL-R to predict

recidivism was shown to have cross-cultural generalizability (Hare et al., 2000). In the studies in this

thesis, the Dutch version of the PCL-R (Vertommen, Verheul, de Ruiter, & Hildebrand, 2002) was

administered. Research in the Dr. Henri van der Hoeven Kliniek rendered a good interrater reliability

for the Dutch PCL-R (Hildebrand, de Ruiter, de Vogel, & van der Wolf, 2002). Furthermore, PCL-R

scores were significantly related to disruptive behavior in a sample of 92 male forensic psychiatric


inpatients in the Dr. Henri van der Hoeven Kliniek (Hildebrand, de Ruiter, & Nijman, 2004). In addition,

PCL-R psychopathy significantly predicted violent and sexual recidivism in treated sexual offenders.8

Summary and implications for risk assessment in personality disordered patients

The research described above indicates that a diagnosis of PD is a risk factor for violence. However,

the risk assessor should take into account which PD was diagnosed, because not all PDs were

demonstrated to be predictive of violence. For instance, no significant association was found between

violence and cluster C pathology, and a negative association was found for schizoid personality disorder.

Psychopathy, DSM cluster B PDs, and paranoid and passive-aggressive PD were found to be most

relevant to violence risk assessment.

Sexual offenders

Base rates of sexual recidivism

Research has demonstrated that sexual offenders are a very heterogeneous group and that base rates of

sexual reoffending differ strongly per type of sexual offender (Doren, 1998; Greenberg, 1998; Grubin,

1997a; Prentky, Lee, Knight, & Cerce, 1997). In order to make an accurate assessment of the risk of

sexual reoffending, the risk assessor must rely on reliable estimates of base rates of recidivism among

subgroups of sexual offenders (Prentky et al., 1997). However, there are at least three issues that

complicate establishing clear-cut base rates per subgroup of sexual offenders: 1) methodological

limitations of studies into sexual recidivism; 2) overlap in subgroups of sexual offenders; and 3) the

dark number of sexual offenses.9 First, many studies into base rates of sexual recidivism have

methodological limitations, for instance, short follow-up periods and use of inadequate definitions of

recidivism. These limitations all potentially lead to an underestimate of the true base rate of sexual

recidivism (Doren, 1998; Furby, Weinrott, & Blackshaw, 1989; Prentky et al., 1997). Moreover,

methodological variability among studies into sexual recidivism, for instance, in length of follow-up

period, definition of sexual recidivism, sample selection, study design, and use or non-use of survival

analyses that take into account time at risk, makes it very difficult to compare the derived base rates

and arrive at generally valid base rates (see Furby et al., 1989). Second, the distinction between

different types of sexual offenders is not always clear; there is overlap. Harris, Rice and Quinsey

(1998) stated that there is a sizeable proportion of sexual offenders who have offended against both

adults and children. It has been suggested that sexual offenders whose victims come from all

categories – adult women, boys, girls – pose the highest sexual recidivism risk of all sexual offenders

(Rice & Harris, 1997). Third, reported rates of sexual reconviction or arrest in empirical studies should

all be taken as a conservative approximation because many sexual offenses go undetected and not all

8 For more information on research with the Dutch PCL-R, see Hildebrand (2004).

9 It should be noted that these methodological issues are valid in recidivism research in general, not only for sexual recidivism research.

sexual offenders are apprehended and arrested (Groth, Longo, & McFadin, 1982; Johnson & Sacco,

1995; Weinrott & Saylor, 1991).

Doren (1998) reviewed the literature regarding base rates of sexual recidivism in rapists and child

molesters and concluded that, in general, child molesters reoffend more than rapists. Other reviews,

however, found rapists to reoffend more than child molesters (Hanson & Bussière, 1998; Quinsey,

Lalumière, Rice, & Harris, 1995). Doren (1998) argued that rapists do have higher reconviction rates

within the first five years at risk, but that child molesters reoffend equally or more than rapists when

studies had a long follow-up period or used a less stringent definition of recidivism. Doren (1998)

stated that the study of Prentky et al. (1997) yielded base rates closest to reality because this study is

methodologically the soundest, with a lengthy follow-up period (25 years) and use of survival analyses

and different definitions of recidivism. Prentky and colleagues reported a base rate of sexual

recidivism of 39% in rapists and 52% in child molesters, over a period of 25 years at risk. Notable is

that they found that both rapists and child molesters not only reoffended with sexual offenses, but also

with non-sexual violent offenses (base rates 49% and 23%, respectively) and general offenses (base

rates 74% and 75%, respectively). This finding resembles the conclusions of the meta-analysis of

Hanson and Bussière (1998) and the updated meta-analysis of Hanson and Morton-Bourgon (2004).

Generally, rapists show higher rates of non-sexual violent and general offenses than sexual offenses

(Hanson & Bussière, 1998; Quinsey, Rice, & Harris, 1995; Rice & Harris, 1997; Sjöstedt &

Långström, 2002). Compared to child molesters, rapists were found to have a more antisocial lifestyle

(Firestone, Bradford, Greenberg, & Serran, 2000; Simon, 2000). This seems to be particularly true

when compared to incest offenders. Child molesters who are referred to maximum security institutions

appear to be relatively versatile offenders like the rapists (Quinsey, Lalumière et al., 1995).

Furthermore, research has shown that some sexual offenders, particularly child molesters, may

reoffend after a long period of non-offending (Hagan & Gustbrey, 1999; Hanson, Steffy, & Gauthier,

1993; Prentky et al., 1997; Quinsey, Lalumière et al., 1995).

Regarding the group of child molesters, an important distinction must be made between those with

intrafamilial victims and those with extrafamilial victims, as well as between child molesters with girl

victims and those with boy victims. Most studies into sexual recidivism do not differentiate between

different types of child molesters, but those that did found important differences. A robust finding is

that child molesters with extrafamilial boys as victims reoffend significantly more than all other types

of child molesters (Hanson et al., 1993; Harris & Hanson, 2004; Prentky et al., 1997; Quinsey,

Lalumière et al., 1995). For example, Quinsey, Lalumière and colleagues (1995) analyzed sexual

recidivism data from 17 independent samples of child molesters (N = 4483) and found a weighted

average sexual reconviction rate of 35% for homosexual child molesters versus 18% for heterosexual

child molesters and 9% for incest offenders. In a study into long-term recidivism rates of 197 child

molesters released from prison, Hanson and colleagues (1993) used survival analyses and found even

higher rates for child molesters with boy victims, i.e., about 60%. Furthermore, several studies have


demonstrated relatively low sexual recidivism rates (4% - 20%) for incest offenders, especially for

those with female victims, as compared to child molesters with extrafamilial victims (Alexander,

1999; Firestone et al., 1999; Hanson et al., 1993; Harris & Hanson, 2004; Quinsey, Lalumière et al.,

1995). It should be borne in mind, however, that there is possible overlap between child molesters with

intrafamilial and extrafamilial victims. For example, Studer (2000) asked 150 incest offenders about

their past and found more than half of them reported sexual offenses with extrafamilial children in

their past. While some state that intrafamilial child molesters have low recidivism rates because they

rarely show deviant sexual preferences (Barbaree & Seto, 1997), others have found that both

intrafamilial and extrafamilial child molesters show deviant arousal patterns (as measured by

phallometric tests) to stimuli depicting prepubertal children (Barsetti, Earls, Lalumière, & Bélanger,

1998; Rice & Harris, 2002).
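The weighted average reconviction rates cited above from Quinsey, Lalumière and colleagues (1995) pool several samples so that larger samples count more heavily. A minimal sketch of that calculation is given below. The function name and the sample figures are hypothetical and are not taken from any of the cited studies.

```python
from typing import Sequence, Tuple


def weighted_average_rate(samples: Sequence[Tuple[int, float]]) -> float:
    """Pool per-sample rates, weighting each rate by its sample size.

    Each entry is (sample_size, reconviction_rate), with the rate as a proportion.
    """
    total_n = sum(n for n, _ in samples)
    if total_n <= 0:
        raise ValueError("total sample size must be positive")
    return sum(n * rate for n, rate in samples) / total_n


# Hypothetical samples: a small one with a low rate and a large one with a high rate.
hypothetical_samples = [(100, 0.10), (300, 0.30)]

pooled_rate = weighted_average_rate(hypothetical_samples)
unweighted_rate = sum(rate for _, rate in hypothetical_samples) / len(hypothetical_samples)
```

In this illustration the pooled rate lies closer to the rate of the larger sample than the simple mean of the two rates does, which is why weighted and unweighted averages across studies should not be compared directly.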

Few studies have specifically addressed the group of exhibitionists, but studies that did showed high

sexual recidivism rates (see Alexander, 1999; McGrath, 1991). For example, Alexander (1999)

reported that 57% of untreated exhibitionists were rearrested for a new sexual offense. Again, it should

be noted that there is overlap between exhibitionists and other types of sexual offenders. Research has

demonstrated that about a quarter of convicted exhibitionists had had at least one conviction for a

contact sexual offense (Abel, Becker, Cunningham-Rathner, Mittelman, & Rouleau, 1988; Sugarman,

Dumughn, Saad, Hinder, & Bluglass, 1994).

Concluding, although it is not simple to report clear-cut base rates of sexual recidivism per subgroup

of sexual offenders, it is possible to make a hierarchy, which can be useful for the risk assessor.

Generally, child molesters with extrafamilial boy victims and exhibitionists pose the highest chance of

sexual recidivism, then rapists and child molesters with extrafamilial girl victims, and heterosexual

incest offenders pose the lowest risk. Sexual offenders who offended against different categories of

victims are suggested to pose the highest risk of sexual recidivism.

Specific risk factors in sexual offenders

Various studies and meta-analyses have indicated that there are specific risk factors for sexual violence

apart from those for general violence (Hanson & Bussière, 1998; Hanson & Morton-Bourgon, 2004;

Quinsey, Lalumière et al., 1995; Rice & Harris, 1997). Hanson and Morton-Bourgon (2004) reviewed

95 studies into recidivism risk factors in sexual offenders and concluded that the most important

predictors of sexual recidivism are antisocial lifestyle and deviant sexual interests, i.e., a relatively

stable pattern of sexual arousal to inappropriate stimuli such as prepubescent victims, or the suffering

of female adult victims. Other significant predictors of sexual recidivism are prior sexual offenses,

intimacy deficits, attitudes tolerant of sexual assault, failure to complete treatment or cooperate with

supervision (Hanson & Bussière, 1998; Hanson & Morton-Bourgon, 2004; Langton, 2003), hostility

(Prentky, Knight, Lee, & Cerce, 1995; Rice, Quinsey, & Harris, 1991), having victims from multiple

categories, seriousness of victim injury (Rice & Harris, 1997), problems with sexual self-regulation

(Hanson, Gizzarelli, & Scott, 1994), and emotional identification with children (Wilson, 1999). There

are indications that not all empirically derived risk factors are equally relevant to the different types of

sexual offenders. For example, rapists compared to child molesters were found to have a more

antisocial lifestyle (Firestone et al., 2000; Simon, 2000). Whereas this general criminal deviance or

antisocial lifestyle may be more important for rapists, sexual deviance may be a more important

predictor for child molesters (Rice & Harris, 1997). Barbaree and Marshall (1989) stated that sexual

deviance is an important predictor and probably causal factor of sexual recidivism especially in child

molesters with extrafamilial boy victims.

Based on their meta-analysis, Hanson and Morton-Bourgon (2004) concluded that none of the

predictors alone is sufficiently predictive of sexual recidivism to be used in isolation. A combination

of risk factors that has been proven to be strongly predictive of sexual recidivism is sexual deviation

and psychopathy as measured by the PCL-R (Hildebrand, de Ruiter, & de Vogel, 2004; Rice & Harris,

1997; Serin, Mailloux, & Malcolm, 2001). Hildebrand, de Ruiter, and de Vogel (2004) conducted a

study into the role of the PCL-R and sexual deviance as measured by the item ‘Sexual deviance’ of the

SVR-20 in a sample of 94 convicted rapists from the Dr. Henri van der Hoeven Kliniek. They found

psychopathic sexual offenders with deviant sexual preferences to be at substantially greater risk of

sexual recidivism compared to other sexual offenders (see Figure 2).


Figure 2. Kaplan-Meier survival curves for sexual recidivism for psychopathic (PCL-R > 26) and nonpsychopathic (PCL-R < 26) rapists subdivided into those with and without deviant sexual preferences

Note. Adapted from Hildebrand (2004, p. 162).
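Survival curves of the kind shown in Figure 2 take time at risk into account by estimating, at each point in time, the proportion of offenders who have not yet recidivated among those still under observation. The following minimal sketch of the Kaplan-Meier (product-limit) estimator illustrates the idea. The function name and the follow-up data are hypothetical and do not reproduce the analyses of the cited studies.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(durations: Sequence[float], events: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return (time, estimated survival proportion) at each time with a recidivism event.

    durations: follow-up time per person (time at risk).
    events: True if the person recidivated at that time, False if follow-up
            ended without recidivism (censored observation).
    """
    if len(durations) != len(events):
        raise ValueError("durations and events must have the same length")

    at_risk = len(durations)
    survival = 1.0
    curve: List[Tuple[float, float]] = []

    for t in sorted(set(durations)):
        recidivists = sum(1 for d, e in zip(durations, events) if d == t and e)
        censored = sum(1 for d, e in zip(durations, events) if d == t and not e)
        if recidivists > 0:
            # Multiply by the proportion of those still at risk who did not recidivate at time t.
            survival *= (at_risk - recidivists) / at_risk
            curve.append((t, survival))
        # Both recidivists and censored persons leave the risk set after time t.
        at_risk -= recidivists + censored

    return curve


# Hypothetical follow-up data in years for six persons.
example_durations = [2, 3, 3, 5, 8, 10]
example_events = [True, False, True, True, False, False]
example_curve = kaplan_meier(example_durations, example_events)
```

Persons whose follow-up ends without recidivism still contribute to the risk set up to that point, which is what distinguishes this estimate from a simple percentage of recidivists in the whole sample.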

Regarding the predictor sexual deviance, the question is how to adequately assess deviant sexual

preferences. A widely used assessment tool for sexual deviance is penile plethysmography, also

referred to as phallometric tests, in which changes in penile tumescence are measured during

presentations of pictures or audiotaped vignettes of sexual scenarios. In The Netherlands, however,

this method is hardly used. The use of penile plethysmography as a reliable indicator of sexual

deviance is controversial. The reliability and validity have been questioned and ethical concerns have

been raised (see Laws, 2003; Marshall & Fernandez, 2000). Standardized procedures are still lacking

with respect to, for instance, duration of the presentation of the stimuli and the type and intrusiveness

of the stimuli presented. Research has demonstrated that the validity of phallometric tests can be

affected, for instance, by faking by participants and by the method of scoring (Harris et al., 1998).

Harris and colleagues (1998) believe, however, that under limited circumstances (see Harris & Rice,

1996) phallometric tests can be valid and reliable procedures. They reviewed the literature into risk of

recidivism in sexual offenders and concluded that sexual deviance as measured with phallometric tests

can discriminate with high accuracy between sexual offenders and non-sexual offenders.

[Figure 2 plot: cumulative proportion surviving (0.0 to 1.0) against time in years (0 to 20); legend: deviant psychopaths, deviant nonpsychopaths, nondeviant psychopaths]

Summary and implications for risk assessment in sexual offenders

The risk assessor should be cognizant of specific risk factors for sexual violence and of empirical base

rates of sexual recidivism in different types of sexual offenders. Generally, child molesters with

extrafamilial boy victims, exhibitionists, and sexual offenders who offended against different

categories of victims pose the highest risk of sexual recidivism, then rapists and child molesters with

extrafamilial girl victims, and heterosexual incest offenders pose the lowest risk. The most important

predictors of sexual recidivism are antisocial lifestyle and deviant sexual preferences. A very

important combination of risk factors in sexual offenders is sexual deviance and psychopathy as

measured by the PCL-R. Repeated risk assessments are important because research has demonstrated

that some sexual offenders, especially child molesters, reoffend after a long period of time.

Furthermore, the risk assessor should not only focus on the risk of sexual violence, but also consider

the risk of non-sexual violent and / or general offenses.

Female offenders

Gender is one of the most significant predictors of violence; regardless of age, ethnicity, culture and

socioeconomic status, men are significantly more often convicted for violent offenses than women

(Archer & McDaniel, 1995; Monahan et al., 2001). However, research also suggests that mental

disorder reduces the gender gap in violence, especially for inpatient aggression. Among psychiatric

patients, the base rate for (inpatient) violence is not significantly different for male and female patients

(Lidz et al., 1993; McNiel & Binder, 1990; Newhill, Mulvey, & Lidz, 1995; Nicholls, 2001; Tardiff,

Marzuk, Leon, Protera, & Wiener, 1997). Ross, Hart and Webster (1998) found no sex differences in

the occurrence of inpatient aggression between a sample of 82 male and 49 female psychiatric patients.

However, regarding violence in the community after treatment, male patients were found to be four

times more likely than female patients to express any aggression. Research has demonstrated that the

diagnoses of schizophrenia, substance abuse and antisocial personality disorder have a stronger effect

on women than on men with respect to violence risk (Beck & Wencel, 1998; Bland & Orn, 1986;

Eronen, Angermeyer, & Schulze, 1998; Hodgins, 1992; Lindqvist & Allebeck, 1990; Modestin, 1995;

Wallace et al., 1998). For example, in a study of 1423 homicide offenders in Finland, schizophrenia

with a secondary diagnosis of alcohol abuse increased the risk of homicide by 16.6 times in men and

84.6 times in women (Eronen, Hakola, & Tiihonen, 1996).

Research has demonstrated that unstructured clinical judgment of violence risk is sensitive to sex-

based biases; mental health professionals tend to underestimate the risk of violence in female

psychiatric patients (Lidz et al., 1993; McNiel & Binder, 1995). Use of structured risk assessment

instruments is recommended to avoid these types of biases (Borum, 1996). However, existing

structured risk assessment instruments are largely developed based on violence risk research primarily

in male samples. Thus, the question arises if the risk factors for violence found in male samples are

also valid for females and consequently, if the existing structured risk assessment instruments are


suitable for use with female patients. Several authors have argued that risk factors for violence in

female samples are generally the same as in male samples and that existing risk assessment

instruments are likely valid for use with females (Blanchette, 1994; Harer & Langan, 2001; Simourd &

Andrews, 1994; Strand & Belfrage, 2001). Loucks and Zamble (1999) compared the characteristics of

100 female offenders to a sample of male offenders10, and although they found some differences in the

occurrence of important life experiences, these differences were not predictive of criminal behavior. In

contrast, others have argued that assessing risk of violence is different for women compared to men

because risk factors for women are closely linked to their unique experiences as women, for instance,

victimization (Chesney-Lind, 1989; Scarth & McLean, 1994) or to the fact that social bonds are of

greater importance to women than to men and that women are thus more sensitive to disruptions in

close relationships (see Funk, 1999; Odgers & Moretti, 2002). Funk (1999) tested risk factors for

reoffending in 388 male and 112 female juvenile delinquents on probation and found several risk

factors (e.g., child abuse or neglect, running away from home) that were significantly predictive for

females but not for males. Therefore, she concluded that risk factors for females differ substantially

from those of their male counterparts, that risk assessment instruments fail to identify most female risk

factors, and that separate risk assessment instruments for males and females should improve

classifications for risk of reoffending. Only one structured risk assessment instrument has been

developed especially for the assessment of risk of antisocial behavior in females: the EARL-21G for

girls between 6 and 12 years old (Levene et al., 2001). Vitale and Newman (2001) stated that existing

risk assessment instruments have not yet been adequately tested to determine their generalizability to

women.

Nature and severity of violence in female offenders

Research has shown that in general, the nature, severity and victims of violent offenses committed by

women are different from those committed by men. Female violence is less often sexual in nature, less

often characterized as instrumental and more often as reactive, less often resulting in serious injury,

more often relational and more often occurring in the residence (Monahan et al., 2001; Nicholls, 2001;

Odgers & Moretti, 2002). However, regarding seriousness of injury, it has also been found that

violence by women against partners and children is more likely to lead to the death of the victim than

is the case with men (Logan, 2004).

The HCR-20 and PCL-R in female samples

The HCR-20 was primarily developed on the basis of research in male samples and most research into

the psychometric properties of the HCR-20 has been conducted in male samples. Therefore, the

question whether the HCR-20 is also suitable for use with females seems important. Nicholls (2001)

10 The authors do not mention the number of males and whether the males were matched to the females.

conducted a retrospective study to evaluate the validity of the HCR-20 and the PCL:SV for assessing

female patients’ risk of inpatient and community violence. She compared the results of 47 female

patients to a matched sample of 47 male patients admitted to a forensic psychiatric hospital and found

the distribution of the mean HCR-20 and PCL:SV scores to be comparable. The HCR-20 showed good

predictive accuracy for inpatient aggression for both male and female patients. The predictive accuracy

of the HCR-20 for community aggression was modest for both samples. Strand and Belfrage (2001)

conducted a retrospective study to investigate the utility of the HCR-20 in a female forensic

psychiatric sample. They compared HCR-20 scores of 63 female and 85 male patients admitted to two

forensic psychiatric hospitals in Sweden and found some significant differences in mean individual

item scores. However, the mean subscale scores and total score did not differ significantly. The

authors thus concluded that the HCR-20 is suitable for use in female forensic psychiatric patients,

particularly to assess inpatient violence. A limitation of this research is that the authors did not

examine the predictive validity of the HCR-20 scores for violent outcome, which is the most important

aspect to conclude whether the HCR-20 is adequate for female patients.

Several studies have been conducted into the use of the PCL-R in female samples. In general, a lower

prevalence of psychopathy among females compared to males was found (Grann, 2000; Salekin,

Rogers, & Sewell, 1997; Vitale, Smith, Brinkley, & Newman, 2002; Warren et al., 2003). Vitale and

Newman (2001) reviewed the literature regarding the PCL-R in female samples and found good

support for its reliability, but modest support for its predictive validity. They concluded that whereas

the PCL-R might be able to postdict violent behavior in the past, there is no evidence that the PCL-R

can predict future violence in women. The issue whether the PCL-R is suitable for the assessment of

psychopathy in women is not settled. Some have argued that the PCL-R is adequate for assessing

psychopathy in women, since they found a considerable degree of similarity to the construct of

psychopathy in male offenders (Salekin et al., 1997; Warren et al., 2003). In contrast, Vitale and

colleagues (2002) believe that the findings thus far are not sufficiently convincing to conclude that

the PCL-R structure is similar across gender. They express concern that some PCL-R items do

not adequately assess the construct of psychopathy as it is expressed in women.

Summary and implications for risk assessment in female offenders

The above suggests that the assessment of violence risk differs at least to a certain degree between female and

male patients, and that the utility of the existing structured risk assessment instruments in female

samples has yet to be convincingly proven. Risk factors that seem more relevant to violence risk in

women are, for instance, child abuse and relationship difficulties. Risk assessors should be cognizant

of the base rates of female violence and the different nature of violence in women. When risk assessors

use the current risk assessment instruments developed for men in female populations, they should

exercise great caution in the interpretation of the results.



Violence risk communication

Violence risk communication is usually understood as the written or verbal report of the results of a

violence risk assessment by a risk assessor to a decision-maker (e.g., the court, mental health

professionals). Besides, violence risk can be communicated to the (relatives of the) person whose risk

has been judged or to possible victims in order to warn them. Despite considerable advances in

research on violence risk assessment, empirical research into violence risk communication is still in its

infancy (Heilbrun, Dvoskin, Hart, & McNiel, 1999; Heilbrun, Philipson, Berman, & Warren, 1999;

Slovic, Monahan, & Macgregor, 2000). In 1996, Monahan stated that over the next 20 years, risk

communication should become an essential adjunct to risk assessment. Heilbrun, Dvoskin et al. (1999)

offered a number of justifications for studying violence risk communication, for instance, the

importance of effective risk communication as a link between risk assessment and risk management

and the impact of risk communication on legal decision-making. The authors stated that “The only

way risk assessors can influence decisions is by effectively communicating their findings to the legal

and clinical actors whose decisions they wish to influence” (p. 94). The major challenge of risk

assessors who communicate violence risk is the translation of nomothetic empirical data into

understandable idiographic, case-specific conclusions.

Preferences of mental health professionals in violence risk communication

Heilbrun and colleagues conducted several studies into risk assessors’ practices and preferences in

violence risk communication. In a group of 55 clinicians it was found that most of them preferred to

specify factors that raise or lower risk and / or to use categories (low, moderate, high) in

communicating violence risk (Heilbrun, Philipson et al., 1999). Only one clinician used numerical

probability figures. The most common reasons for the clinicians not to use numbers were that they felt

that the state of the research literature did not justify using specific numbers and that they did not feel

that precise. In a study of 71 experienced psychologists and psychiatrists it was found that the most

highly valued form of risk communication was categorical and had explicit implications for risk

management (Heilbrun, O’Neill, Strohman, Bowman, & Philipson, 2000). Huss and Zeiss (2004) also

found empirical support for clinicians’ preferences to use risk categories and reluctance to use

numerical probabilities in estimating violence risk. Heilbrun and colleagues (2000) provided risk

assessors with twelve recommendations to communicate the results of the violence risk assessment to

decision-makers, among others, to use plain language, clearly state the purpose of the assessment,

describe the procedures used in the assessment, describe the results in terms of consistency with other

sources, describe not only the risk level, but also the nature and imminence of violence risk and

identify management recommendations.

Slovic and colleagues (2000) examined the effect of using frequencies or probabilities in risk

communications. They asked 479 experienced clinicians to judge vignettes in which the violence risk


was given in frequencies or probabilities. They found that when the violence risk was communicated

as a relative frequency (e.g., 2 out of 10) this led to much higher perceived risk than when it was

communicated as a comparable probability (e.g., 20%). The authors suggested that the reason for this

might be that it is easier to visualize numbers, and that visualizing violent mental patients may evoke

the emotion of fear in clinicians who subsequently react by keeping the patients in the hospital. More

recently, a study was conducted to replicate these findings and to examine if there were differences

between clinicians who were asked to evaluate a vignette with a pallid outcome depiction versus those

with a vivid outcome depiction. It was found that both frequency and vivid depiction of violent

behavior resulted in more conservative risk management decisions by clinicians working in forensic

settings (Monahan et al., 2002). Regarding the use of probabilistic estimates of violence risk, research

demonstrated that when clinicians had to rate their conclusion on a response scale allowing more

discriminability among smaller probabilities this led to lower risk judgments (i.e., when the response

scale ranged from 1 to 100% the risk was judged as higher by clinicians than when it ranged from 1 to

40%; Slovic & Monahan, 1995; Slovic et al., 2000). Slovic and colleagues (2000) advise clinicians to

employ multiple formats for communicating violence risk (i.e., to communicate both frequencies and

probabilities, see also Monahan & Steadman, 1996) or to use a categorical format (i.e., low to very

high).
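The advice to communicate an estimate in more than one format can be illustrated with a minimal sketch. The helper below is hypothetical: its name, the default reference group of 10 persons, and the example value are assumptions made for illustration and are not taken from Slovic and colleagues (2000). It only renders one and the same estimate as a relative frequency and as a probability.

```python
def risk_in_both_formats(probability: float, group_size: int = 10) -> dict:
    """Render one estimate as a relative frequency and as a percentage."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    # Number of persons out of `group_size` expected to show the outcome.
    count = round(probability * group_size)
    return {
        "frequency": f"{count} out of {group_size}",
        "probability": f"{probability:.0%}",
    }


# Illustrative value only; it mirrors the "2 out of 10" versus "20%" example in the text.
example = risk_in_both_formats(0.20)
```

Both strings describe the same estimate; the studies discussed above suggest that readers may nevertheless perceive them differently, which is why presenting them side by side is recommended.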

Violence risk communication to judicial decision-makers

The studies described above examined how mental health professionals prefer and perceive risk

communications but did not consider how decision-makers (e.g., the court) or policy makers evaluate

violence risk communications. Besides, in these studies vignettes were used instead of real cases. The

question is how risk assessors communicate violence risk in practice and if decision-makers are able to

understand, interpret and act upon the violence risk communications in the way the risk assessors

intended it. Grann and Pallvik (2002) examined 142 existing written reports in forensic psychiatric

evaluations to the court in Sweden. They found that in the majority of the cases risk of criminal

recidivism was communicated (86%), but that the risk was primarily communicated when the risk was

perceived to be high. The identification of specific risk factors and providing suggestions for risk

management was common, but describing factors that decrease risk or specifying the imminence of the

risk was rare. Studies that addressed preferences of the court regarding violence risk communication

have demonstrated that judges consistently preferred individualized data containing behavioral

observations and historical data, instead of statistical information (Melton, Petrila, Poythress, &

Slobogin, 1997). In a study of 187 cases examining the influence of actuarial risk assessment, it was found

that the best predictor of tribunal decision was the senior clinician’s testimony (unstructured clinical

judgment) and not the actuarial risk score on the VRAG (Hilton & Simmons, 2001). There was no

significant association between the actuarial scores and the clinician’s testimony or the tribunal

decision. The authors concluded that providing decision-makers with actuarial results did not alter



long established patterns of forensic decision-making. This means that a categorical format is most

likely the preferred mode of risk communication for decision-makers.

Violence risk communication according to the SPJ model

According to the SPJ model raters are strongly advised to not merely provide percentages or numbers,

but to be descriptive in their risk communications. The formulation of the risk assessment should be

clear and unambiguous to avoid misunderstanding. In the HCR-20 manual, Webster and colleagues

(1997b) recommend that raters follow the historical (past), clinical (present), risk management (future)

and final risk judgment structure in their report. In the historical section, the rater should describe the

psychiatric and criminal history of the patient. In the clinical section, a description of the current

mental state of the patient and possible active symptoms is important. In the risk management section,

the rater should provide guidelines for the reduction of violence risk, for instance, intervention

strategies and needed level of security. Finally, in the section about the final risk judgment, the rater

should conclude upon the patient’s risk of future violence based on the previous three sections of the

report. The final risk judgment should not be in numerical scores, but in terms of low, moderate or

high. Important issues are not only the likelihood of violence, but also its nature, severity,

frequency and imminence.

In the SVR-20 manual, Boer and colleagues (1997) recommend that raters address the following basic

questions in their communications: 1) What is the likelihood that the individual will engage in (sexual)

violence, if no efforts are made to manage risk?; 2) What is the probable nature, frequency, and

severity of any future (sexual) violence?; 3) Who are the likely victims of any future (sexual)

violence?; 4) What steps could be taken to manage the individual’s risk of (sexual) violence?; 5) What

circumstances might exacerbate the individual’s risk of (sexual) violence? The rater should indicate

the time frame for which the risk assessment is valid and how possible changes in the situation of the

patient could alter the risk assessment. Finally, the rater can put his or her findings in a larger

perspective, for instance, by referring to empirical or professional literature on base rates for the type

of (sexual) violence that is assessed.

Violence risk management

Violence risk management can be defined as all intervention strategies aimed at reducing violence risk

developed on the basis of the results of violence risk assessment. Several authors view violence risk

management as incorporated in violence risk assessment (Hart, Webster, & Douglas, 2001; Dernevik,

Grann, & Johansson, 2002; Reed, 1997). “The ultimate goal of violence risk assessment is violence

prevention, that is, minimizing the likelihood of and negative consequences stemming from any future

violence” (Hart et al., 2001, p. 15). Others have argued that risk assessment and risk management

should be separate processes (Heilbrun, 1997; Monahan et al., 2001). These latter scholars believe that


risk assessment should focus on static risk factors and risk management on dynamic risk factors. The

advocates of the SPJ model of risk assessment on the other hand believe that “historical factors are no

less important and must be considered in the development of any sensible risk management plan”

(Hart et al., 2001a, p. 15). Despite important advances in violence risk assessment, relatively little

attention has been paid to the joining of violence risk assessment and risk management in the empirical

literature (Dernevik et al., 2002). Douglas and Kropp (2002) stated that research that addresses

whether risk assessments actually lead to the prevention or reduction of violence is sorely lacking.

The link between violence risk assessment and management is important because many contexts call

for management rather than risk prediction (Douglas & Kropp, 2002; Dvoskin & Heilbrun, 2001;

Heilbrun, 1997; Otto, 2000). Douglas and Kropp (2002) offered suggestions for research linking

violence risk assessment and management, for instance, prospective repeated measures designs using

survival analyses with time dependent covariates or experimental clinical trials.

Andrews, Bonta and Hoge (1990) integrated violence risk assessment and management. They

emphasized that the level of service should be matched with the assessed risk of recidivism and

developed the Level of Service Inventory-Revised (LSI-R; Andrews & Bonta, 2000) to identify

dynamic areas of risk and need that may be addressed by intervention programs to reduce risk.

Andrews and colleagues (1990) described four principles of classification for effective rehabilitation:

1) risk; higher levels of services are reserved for higher risk cases; 2) need; targets of service are

matched with the criminogenic needs of offenders; 3) responsivity; styles and modes of service are

matched to the learning styles and abilities of the offender; and 4) professional override; having

considered risk, need and responsivity, decisions are made as appropriate under present conditions.

Another example of a model developed to minimize violence risk based on risk assessment is provided

by Otto (2000). This model includes three different types of intervention. First, the implementation of

treatment designed to affect underlying disorders, symptoms, thinking patterns or behaviors which

increase violence risk. The second intervention comprises target hardening, i.e., intervention that

focuses on the potential victim(s), for instance, warning victims, increasing security of victims, or

providing risk education and symptom monitoring to, for instance, relatives of a psychiatric patient.

The third intervention includes incapacitation of the person with a high violence risk. This

incapacitation does not address factors underlying the threatened violence and thus will only be helpful

in the short run. Sheldrick (1999) provided some general suggestions for violence risk management.

For instance, she suggested that mental health professionals should respond as rapidly as possible to

high-risk cases and that they should not only think about risk factors but also about protective factors

or strengths of the patient when constructing a risk management plan (see also Rogers, 2000). Pinard

and Pagani (2001) also listed some important considerations in risk management strategies, for

instance, periodic re-evaluation of the risk, collaboration and communication between professionals,

preventive intervention programs, pharmacological intervention, and coordination of services. For

more information on (effectiveness of) offender treatment (programs), the reader is referred to a

number of review papers (e.g., Gendreau & Andrews, 1990; Green, Pedley, & Whittingham, 2004;

Hollin, 1999; Lösel, 1998; Hodgins & Müller-Isberner, 2000; Ward & Eccleston, 2004).

Violence risk management according to the SPJ model

The SPJ model specifically aims at gaining insight into violence risk (factors) in the individual case in

order to prevent future violent behavior by developing risk management strategies. Kropp and Hart

(1997) offered some examples of violence risk management strategies suggested by SARA items, for

instance, couples counseling for the item ‘Recent relationship problems’. Douglas and Kropp (2002)

also listed some examples of violence risk factors and related prevention strategies, for instance, court-

ordered abstinence or urinalysis for patients who have abused substances. The HCR-20 Companion

guide (Douglas, Webster, Hart, Eaves, & Ogloff, 2001) was developed with the purpose of informing

users of the HCR-20 how best to reduce risk of aggression and violence. In this HCR-20 Companion

guide, detailed guidelines for risk management that follow directly from the risk factors are provided

for mental health professionals to be used in their day-to-day practice. The authors suggested that risk

reduction strategies may be most effective when they are the product of cooperation among clinicians,

researchers and administrators. Four basic types of intervention that risk management typically

comprises are described in the HCR-20 Companion guide: monitoring, treatment, supervision, and

victim safety planning. First, monitoring strategies include, for instance, repeated measures with the

HCR-20 to monitor treatment progress. Second, treatment should be aimed at improving deficits in the

individual’s psychosocial adjustment, for instance, it can be directed at the mental disorder that is

causally related to the individual’s history of violence or at the reduction of acute life stresses. Third,

supervision involves the restriction of the individual’s rights or freedoms, for instance, incarceration or

community supervision. The intensity of the supervision should be commensurate with the level of

violence risk of the individual as indicated by the structured risk assessment. Fourth, victim safety

planning includes improving the victim’s dynamic and static security resources, also referred to as

target hardening. Victim safety planning can include, for example, training in self-protection,

improving visibility of the victim’s house, or installing alarms in the victim’s house or garden.

In the HCR-20 Companion guide, the main emphasis is on the dynamic, changeable Clinical and Risk

management items, although the authors suggest that some Historical factors also have potential for

change. For instance, items such as ‘Relationship instability’ or ‘Employment problems’ might change

for the better or for worse, whilst items such as ‘Prior supervision failure’ can only change in a

negative way. Also, new information, or re-evaluation of old information could alter the coding of the

Historical items. However, the focus of risk management should be on the dynamic or criminogenic

risk factors - in other words - on the changeable and causally related factors to violence (Douglas &

Kropp, 2002). For all of the Clinical and Risk management items, intervention and management

strategies are provided aimed at reducing the risk factor.

Differences between clinicians and researchers in assessing risk of violence in forensic psychiatric patients

This chapter is a slightly revised version of Vogel, V. de, & Ruiter, C. de (2004). Differences between clinicians and researchers in assessing risk of violence in forensic psychiatric patients. The Journal of Forensic Psychiatry and Psychology, 15, 145-164. The authors wish to thank all clinicians and researchers who participated in this study. Special thanks go to Cécile Vandeputte-van de Vijver who functioned as workshop trainer together with the first author and also participated as a researcher in the study.


CHAPTER 2

Differences between clinicians and researchers in assessing risk of violence in forensic psychiatric patients

The assessment of risk of violence is an important task of psychologists working in forensic practice.

A carefully conducted risk assessment before a probationary leave, parole decision, or termination of

(mandatory) treatment can help to appraise the risk of recidivism in an adequate way and thereby assist

in risk management (Douglas & Webster, 1999a). To date, the best known and most widely used

method in practice, at least in The Netherlands, is the unstructured clinical judgment approach which

is exclusively based on the professional expertise of the clinician. However, research has revealed

some important limitations of this unstructured clinical judgment, such as poor reliability and validity

(for a discussion of these disadvantages, see Quinsey et al., 1998, pp. 55-72). The employment of

structured risk assessment procedures is highly recommended (Borum, 1996; Webster et al., 1997a).

One of the most promising risk assessment instruments at the moment is the Historical, Clinical, Risk

management-20 (HCR-20; Webster et al., 1997b). This instrument consists of 20 items representing

risk factors for violence in the past, present and future. Research in various psychiatric and

penitentiary settings in different countries has demonstrated good interrater reliability and predictive

validity for the HCR-20 total score (Belfrage, 1998; Belfrage et al., 2000; Douglas et al., 2005; Strand

et al., 1999). Furthermore, it was demonstrated that changes in risk during clinical intervention can be

measured with the HCR-20 (Belfrage & Douglas, 2002). However, an important disadvantage of many

studies into the HCR-20 is their retrospective design. So far, only a few prospective studies have been

conducted into the predictive validity of the HCR-20 (e.g., Belfrage et al., 2000; Douglas, Cox et al.,

1999).

Do clinicians and researchers differ in their violence risk assessment of the same patient? In this study, the Dutch version of the HCR-20 was coded by two independent researchers and two independent clinicians (treatment supervisor and group leader) for 60 patients admitted to the Dr. Henri van der Hoeven Kliniek. The aim of the study was threefold: (1) to establish the interrater reliability of the Dutch HCR-20; (2) to gain insight into differences between researchers and clinicians in coding the HCR-20; and (3) to examine the relationship between clinicians’ feelings towards their patients and their risk judgment. Overall, the interrater reliability of the HCR-20 was good. The group leaders gave significantly lower HCR-20 scores than the researchers. There were no significant differences between the mean HCR-20 scores of treatment supervisors and researchers, but there was a significant difference in the interpretation of the scores: treatment supervisors had more ‘low risk’ judgments than researchers. Furthermore, it was found that feelings of clinicians towards their patients were associated with their risk judgment. Feelings of being controlled and manipulated by the patient were related to higher HCR-20 scores, whereas positive feelings (helpful, happy, relaxed) were related to lower risk judgments.

Another disadvantage of many studies into the HCR-20 concerns the ecological validity, i.e., its

relevance to actual clinical risk assessment practice. In most published studies, the HCR-20 was coded

by independent researchers, not by practicing clinicians. Generally, the researchers did not know the

patient personally and coded the HCR-20 solely on the basis of file information. When using the HCR-

20 in clinical practice - the assessment of risk of future violence and the use in leave decisions - coding

by experienced clinicians is required (Webster et al., 1997b). Also, in clinical practice it is usually the

treating clinicians who are responsible for leave decisions (Dernevik et al., 2001). However, there is

some doubt about the objectivity of clinicians, especially clinicians who are closely involved in the

treatment of the patient (Dernevik et al., 2001; Litwack & Schlesinger, 1999). Are treating clinicians

who know the patient well and who have invested a lot in his treatment capable of putting aside their

personal feelings towards the patient when they assess the patient’s risk of violence? Several authors have

argued that these feelings inevitably lead to subjectivity and as a consequence the roles of forensic

assessor and treating clinician are irreconcilable (Ackerman, 1999; Greenberg & Schuman, 1997;

Litwack & Schlesinger, 1999). These authors suggest that more distant raters, not directly involved in

the treatment, should conduct forensic evaluations. On the other hand, it can be argued that the treating

clinician has the most detailed knowledge of the patient, which is necessary to perform forensic

evaluations. Also, the question has been raised whether these feelings of countertransference are truly

irrelevant or irrational or may actually contribute to a more accurate risk assessment. Furthermore, it

can be reasoned that because of the structured guidelines for risk assessment, such as the HCR-20,

countertransference feelings are less likely to interfere. From all of these arguments it can be

concluded that it is important to examine what exactly is the role of feelings towards the patient and if

there are differences between researchers and clinicians in performing risk assessments, as well as in

the accuracy of these assessments.

Dernevik et al. (2001) studied the relationship between clinicians’ feelings as measured by the Feeling

Word Checklist (FWC; Whyte, Constantopoulos, & Bevans, 1982) and HCR-20 scores in a forensic

psychiatric institution in Sweden. Forty experienced nurses who had attended a one-day training

workshop in risk assessment, coded the HCR-20 for one of eight patients. Every patient was thus

judged by five nurses. The risk of these eight patients was also assessed by a number of independent

experts in risk assessment. The nurses gave a significantly higher mean HCR-20 total score than the

experts. Furthermore, this study demonstrated “the nurses’ scores on the HCR-20 assessments to be

influenced by their scores on the FWC” (R = .66, R2 = .43; Dernevik et al., 2001, p. 94). Feeling close

and accepting towards the patient was associated with a higher HCR-20 score and feelings of

helpfulness and autonomy were associated with a lower score. Douglas and Belfrage (2001)

commented on this study that it cannot be concluded that the FWC scores actually caused differences

in HCR-20 total scores rather than merely being correlated with them. However, they acknowledged



the possible effect of bias and considered this research a reminder for raters. Recently, Dernevik and

Douglas (2002) conducted a follow-up on this study and demonstrated that after a period of two years,

the nurses showed good predictive validity for inpatient violence, but not for (violent) recidivism after

discharge from the hospital. In contrast, the experts were more accurate in their predictions of

violence in the long term, that is, (violent) recidivism after discharge.

In this chapter, results are presented of a prospective study which started in January 2001. The

authorized Dutch version of the HCR-20 (Philipse et al., 2000) was coded for 60 patients admitted to

the Dr. Henri van der Hoeven Kliniek by both clinicians and independent researchers. The aim of the

study was to establish the interrater reliability of the HCR-20, to gain insight into differences between

clinicians and researchers in coding the HCR-20, and to examine if clinicians’ feelings towards their

patients as measured by the FWC were related to the risk assessments. Results into the predictive

validity of the Dutch HCR-20 as well as differences between clinicians and researchers in risk

assessment accuracy are not presented in the present study (see Chapter 6).

Method

Setting

This study was conducted in the Dr. Henri van der Hoeven Kliniek, a Dutch forensic psychiatric

hospital in Utrecht, The Netherlands (see Introduction, p. 2).

Subjects

The current sample comprised 53 men and seven women. The mean age at the time of the risk

assessment was 36.6 years (SD = 8.0, range 22-54). Seventy percent of the subjects had been

convicted before their tbs-order, with an average number of 4.9 (SD = 13.0, range 0-15) prior

convictions. The index-offenses were: 57% (attempted) homicide, 20% sexual offenses, 17% other

violent offenses and 7% arson. In 8% of the patients a DSM-IV (APA, 1994) Axis I disorder was

diagnosed, 52% met the criteria for one or more Axis II disorders, and in 35% there was comorbidity

of Axis I and II disorders. Axis I diagnoses were lifetime clinical diagnoses based on consensus

between four raters (see Hildebrand & de Ruiter, 2004); Axis II disorders were diagnosed with the

Structured Interview for DSM-IV Personality (SIDP-IV; Pfohl, Blum, & Zimmerman, 1995).

Instruments

HCR-20

The HCR-20 is a structured professional guideline (checklist) designed for the assessment of risk for

violence in adult offenders (see Chapter 1, p. 17 for a detailed description).


FWC

The FWC was administered to map the clinicians’ feelings towards their patients. This list was

originally developed by Whyte and colleagues (1982) and adapted by Holmqvist and Armelius (1994)

and Holmqvist and Fogelstam (1996). Table 1 shows the 30 items of the instrument relating to

‘feelings’ that have to be coded on a four point scale according to the extent to which the clinician has

the specific feeling towards the patient whose risk is being assessed (see also Appendix IV). The items

are grouped into eight subscales which can be viewed as four continuous dimensions: (1) Helpful

versus Unhelpful; (2) Accepting versus Rejecting; (3) Close versus Distant; (4) Autonomy versus

Controlled.

Table 1. Feeling Word Checklist

When I think about ……… I feel:

 1. Helpful          16. Surprised
 2. Happy            17. Tired
 3. Angry            18. Threatened
 4. Enthusiastic     19. Receptive
 5. Anxious          20. Objective
 6. Strong           21. Overwhelmed
 7. Manipulated      22. Bored
 8. Relaxed          23. Motherly
 9. Cautious         24. Confused
10. Disappointed     25. Embarrassed
11. Indifferent      26. Interested
12. Affectionate     27. Aloof
13. Suspicious       28. Sad
14. Sympathetic      29. Inadequate
15. Disliked         30. Frustrated

Note. From Whyte et al. (1982). Subscale items: Helpful: 1, 2, 8; Unhelpful: 3, 10, 15, 18, 24; Close: 12, 16, 21, 23; Distant: 9, 11, 27; Accepting: 4, 14, 19, 26; Rejecting: 13, 17, 22, 30; Autonomy: 6, 20; Controlled: 5, 7, 25, 28, 29.
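The subscale structure given in the note to Table 1 can be made concrete with a minimal sketch. The item-to-subscale mapping is copied from the note; everything else is an assumption made for illustration: the function name, the coding of the four point scale as 0 to 3, and scoring by simple summation are not specified in the text.

```python
# Item numbers per subscale, copied from the note to Table 1.
FWC_SUBSCALES = {
    "Helpful": [1, 2, 8],
    "Unhelpful": [3, 10, 15, 18, 24],
    "Close": [12, 16, 21, 23],
    "Distant": [9, 11, 27],
    "Accepting": [4, 14, 19, 26],
    "Rejecting": [13, 17, 22, 30],
    "Autonomy": [6, 20],
    "Controlled": [5, 7, 25, 28, 29],
}


def fwc_subscale_scores(ratings: dict) -> dict:
    """Sum the item ratings per subscale.

    `ratings` maps item number (1-30) to a rating. Coding the four point
    scale as 0-3 and scoring by summation are assumptions, not the
    documented scoring rule.
    """
    missing = [item for item in range(1, 31) if item not in ratings]
    if missing:
        raise ValueError(f"missing item ratings: {missing}")
    return {
        name: sum(ratings[item] for item in items)
        for name, items in FWC_SUBSCALES.items()
    }


# Toy input in which every feeling is rated 1; this is not real data.
toy_scores = fwc_subscale_scores({item: 1 for item in range(1, 31)})
```

Because the subscales differ in length (two to five items), summed scores are not directly comparable across subscales; dividing by the number of items would be one way to put them on a common footing.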

Procedure

First, all raters in this study were trained in coding the HCR-20 during a one-day workshop given by a

senior clinical psychologist and a research psychologist. During this workshop, relevant literature was

discussed and the HCR-20 coding procedure was practiced on the basis of file information and

videotapes of actual cases. All raters were instructed to use the HCR-20 manual and all available file

information in all cases they had to rate.

During treatment, a number of specific phases can be distinguished in which the liberties of a patient

expand and therefore the risk of violence needs to be (re-)evaluated. These phases are when a patient



has his first leave from the hospital without supervision and when a patient is about to enter the

transmural phase. Since January 2001, the HCR-20 was coded for all patients who were in the above

two phases, and for all patients who were already in the transmural phase. In addition, the HCR-20

was coded for all patients who were newly admitted to the hospital to assess the risk of inpatient

violence. Two researchers, a group leader and the treatment supervisor independently coded the HCR-

20 for each case. When the patient was a sexual offender, the Dutch version of the SVR-20 (Boer et

al., 1997; authorized Dutch version: Hildebrand et al., 2001) was coded in addition to the HCR-20.¹

All raters had access to file information that, in general, consisted of psychological reports, reports to

the court regarding treatment progress and recommendations for termination or prolongation,

treatment plans and evaluations. Subsequently, the two researchers agreed upon a consensus score in a

case conference, and one of the researchers took this to a joint meeting with the two clinicians. The

group leader and treatment supervisor independently filled out the FWC before this consensus

meeting. In general, the consensus meetings lasted about one hour and were considered very useful by

both the researchers and clinicians. The results of the risk assessments were not communicated to the

patients because at the time of this study, we still considered the Dutch HCR-20 as a research

instrument with insufficiently established predictive validity.

Raters

Forty-four raters participated in the present study: five researchers, seven treatment supervisors and 32

group leaders. The researchers were all Master’s level clinical psychologists of the Research

department, who are responsible for psychological assessment and empirical research in the hospital.

The researchers are not in a treatment relationship with patients and do not have intensive contact with

them, but they all know the patients to some extent. The treatment supervisors have a supervising and

planning role in the treatment of around 20 patients; they were all senior clinicians, mostly clinical

psychologists or psychotherapists. The professional background of the group leaders was diverse, but

most of them had relevant higher vocational or academic training (e.g., in nursing, social work,

psychology).

Statistical analyses

The interrater reliability of the HCR-20 was examined by means of the intraclass correlation

coefficient (ICC), using the two-way random effect variance model and consistency type (McGraw

and Wong, 1996). Critical values for single measure ICCs were: ICC ≥ .75 = excellent; .60 ≤ ICC <

.75 = good; .40 ≤ ICC < .60 = moderate; ICC < .40 = poor (Fleiss, 1986).
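The reliability analysis described above can be sketched in a few lines of Python. This is a minimal illustration, assuming complete data (every rater scores every patient) and using the consistency-type, single-measure formula ICC(C,1) = (MS_rows - MS_error) / (MS_rows + (k - 1) MS_error) from McGraw and Wong (1996). The function names and the example scores are hypothetical and are not taken from the study.

```python
from statistics import mean


def icc_consistency_single(ratings):
    """ICC(C,1): two-way model, consistency type, single measure.

    `ratings` holds one row per patient and one column per rater.
    Assumes no missing values (hypothetical helper, not the study's code).
    """
    n = len(ratings)       # number of patients (rows)
    k = len(ratings[0])    # number of raters (columns)
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    # Partition the total sum of squares into patients, raters and error.
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


def classify_icc(icc):
    """Label a single-measure ICC with the critical values cited from Fleiss (1986)."""
    if icc >= 0.75:
        return "excellent"
    if icc >= 0.60:
        return "good"
    if icc >= 0.40:
        return "moderate"
    return "poor"


# Hypothetical total scores of five patients, each coded by two raters.
example_scores = [[20, 22], [25, 24], [31, 33], [18, 21], [28, 27]]
example_icc = icc_consistency_single(example_scores)
print(round(example_icc, 2), classify_icc(example_icc))
```

Because the consistency type ignores systematic level differences between raters, two raters who always differ by a constant still obtain an ICC of 1.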

1 Eighteen of the 60 patients were sexual offenders. We consider this group too small and will not present results on the interrater reliability of the SVR-20 at this moment.


The F-test was used to examine differences between researchers, group leaders and treatment

supervisors on HCR-20 subscales and total scores. For differences in HCR-20 risk judgments we used

chi-square tests. The relationships between FWC subscales and HCR-20 total scores / risk judgments

were analyzed using Pearson product-moment correlations and stepwise multiple regression analyses.

Results

Risk judgments

The mean HCR-20 total score as agreed upon during the final consensus meetings was 26.1 (SD = 6.5,

range = 10-37), the mean score for the Historical scale 14.6 (SD = 3.3, range = 6-19), for the Clinical

scale 5.3 (SD = 2.2, range = 0-9) and for the Risk management scale 6.1 (SD = 2.1, range = 2-10). On

average, 3.3 (SD = 1.8, range = 0-8) other considerations were identified.

The risk of 17 patients was judged as low (mean total score = 20.8, range = 10-28), of 24 as moderate

(mean total score = 24.7, range = 16-34), and of 19 as high (mean total score = 32.4, range = 23-37).

The differences in mean total scores of the low, moderate and high categories were significant (F (1,

60) = 31.6, p < .01). The overlap in ranges of total scores for the risk judgments low, moderate and

high was rather large.

Other considerations

Frequently coded other considerations were sadistic fantasies, social desirability / elusiveness, social

isolation, financial problems and lack of prospects.

Interrater reliability

Table 2 shows the single measure ICCs for the HCR-20 subscales, total score and risk judgment for

the different groups of raters. Overall, the interrater reliability of the HCR-20 subscales and total score

was good. Particularly for the Historical scale we found excellent interrater reliability. However, two

Historical items showed poor interrater reliability: ‘Previous violence’ and ‘Early maladjustment’.

Good interrater reliability was demonstrated for the Clinical items, except for the item ‘Impulsivity’.

The Risk management scale revealed moderate interrater reliability. ‘Exposure to destabilizers’,

‘Noncompliance with remediation attempts’, and ‘Stress’ were items with poor interrater reliability.

We found some differences in interrater reliability between the different rater groups. Overall, the

interrater reliability among the researchers was excellent. The interrater reliability among the

researchers and group leaders, as well as among the researchers and treatment supervisors was good,

except for the Risk management items. In contrast, the interrater reliability among group leaders and

treatment supervisors was moderate for both the Clinical and Risk management items.



Table 2. Intraclass Correlation Coefficients (ICCs) single measure

Raters                                                  H items   C items   R items   Total score   Risk judgment
Researcher 1 and 2                                      .85       .75       .70       .85           .76
Researchers and treatment supervisors                   .83       .67       .56       .79           .65
Researchers and group leaders                           .78       .70       .58       .82           .68
Treatment supervisors and group leaders                 .86       .54       .58       .77           .63
Researchers, treatment supervisors and group leaders    .82       .64       .57       .79           .65

Note. All ICCs above 0 (p < .001). Researchers = consensus researcher 1 and 2. H items = Historical items. C items = Clinical items. R items = Risk management items.

Differences between raters in risk assessments

Table 3 presents the mean HCR-20 scores and overall risk judgments of the different rater groups.

Group leaders gave significantly lower scores on the Historical items, Risk management items, other

considerations and total scores. There were no significant differences in mean HCR-20 scores between

the researchers and the treatment supervisors, except for the number of other considerations. However,

there was a significant difference in risk judgments: treatment supervisors more often judged patients

as ‘low risk’ compared to researchers.
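As a rough illustration of how such a difference in risk judgments can be tested, the sketch below computes a Pearson chi-square statistic from the counts reported in Table 3. The text does not state how the test was set up, so two assumptions are made here: the judgments are collapsed into ‘low’ versus ‘moderate or high’, and the two rater groups are treated as independent samples, although both groups actually rated the same 60 patients. The function name is hypothetical.

```python
def pearson_chi_square(table):
    """Pearson chi-square statistic for a contingency table (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rater group and judgment.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    return chi2


# Counts from Table 3, collapsed into [low, moderate or high] (an assumption).
researchers = [8, 28 + 24]
treatment_supervisors = [18, 21 + 21]

chi2 = pearson_chi_square([researchers, treatment_supervisors])
CRITICAL_VALUE_05_DF1 = 3.841  # chi-square critical value for df = 1, alpha = .05

print(round(chi2, 2), chi2 > CRITICAL_VALUE_05_DF1)
```

Under these assumptions the statistic exceeds the critical value, which agrees with the significance level reported in the note to Table 3. A test for paired judgments would be more appropriate if the patient-level ratings were available.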

At the end of the consensus meetings, the raters were asked how much time they had spent coding the

HCR-20, and which information they had used as the basis for their risk assessment. Researchers said they had

spent on average 120 minutes per risk assessment, group leaders 30 minutes and treatment supervisors

15 minutes. In addition, the researchers stated they based their risk assessments predominantly on file

information whereas the group leaders and treatment supervisors mostly relied on their personal

experiences with the patient. Several treatment supervisors stated that they did not need to read the file

information, because they were already familiar with it and sometimes they had written the

information themselves, for instance, a treatment plan or report to the court.


Table 3. Risk judgments (N = 60)

                         Mean scores                                   Risk judgments
                         H items   C items   R items   Total   Other   Low   Moderate   High
Researchers              14.5a     5.2       6.2c      25.9e   2.4g    8j    28         24
Treatment supervisors    13.8      5.0       5.8       24.6    0.8h    18k   21         21
Group leaders            13.2b     4.5       5.0d      22.7f   0.5i    12    28         20
Consensus                14.6      5.3       6.1       26.1    3.3     17    24         19

Note. a > b p < .05. c > d p < .01. e > f p < .01. g > h > i p < .05. k > j p < .05. H items = Historical items. C items = Clinical items. R items = Risk management items. Other = other considerations.

Clinicians’ feelings towards their patients and the risk assessments

First, Pearson product-moment correlations were computed between the FWC subscales scores and the

HCR-20 total scores / risk judgments (see Table 4). The subscales Unhelpful, Distant, Rejecting and

Controlled showed significant positive correlations with HCR-20 total scores and risk judgments. In

contrast, the subscales Helpful and Accepting demonstrated significant negative correlations with

HCR-20 total scores and risk judgments.

Table 4. Pearson product-moment correlations FWC subscales and HCR-20 total score and risk judgment

                        Helpful     Unhelpful   Close       Distant     Accepting   Rejecting   Autonomy    Controlled  HCR-20 total score
Helpful                 -
Unhelpful               -.45**      -
Close                   .19*        .07         -
Distant                 -.48**      .50**       -.10        -
Accepting               .72**       -.49**      .36**       -.42**      -
Rejecting               -.35**      .69**       .07         .55**       -.44**      -
Autonomy                .24**       .05         .21         -.03        .11         -.02        -
Controlled              -.40**      .63**       .27**       .42**       -.24**      .56**       -.06        -
HCR-20 total score      -.28**      .38**       .12         .20*        -.19*       .34**       .04         .46**       -
HCR-20 risk judgment    -.34**      .33**       .19*        .40**       -.23*       .34**       -.12        .37**       .56**

Note. * p < .05 (two-tailed). ** p < .01 (two-tailed).

Next, stepwise multiple regression analyses were conducted. Feelings of being controlled or

manipulated by the patient significantly predicted high HCR-20 total scores: 21% of the variance in

HCR-20 total scores was explained by the subscale Controlled (see Table 5; F (1, 60) = 31.1, p <

.001). None of the other FWC subscales yielded significant prediction for the HCR-20 total score.



Three subscales were significant predictors for the HCR-20 risk judgments: the subscales Distant and

Close predicted high risk judgments, whereas the subscale Helpful predicted low risk judgments.

Together, these three subscales explained 23% of the variance in HCR-20 risk judgments (see Table 6;

F (1, 60) = 13.0, p < .001).

Table 5. Stepwise multiple regression FWC subscales as predictor variables for HCR-20 total scores

Variable FWC    B      SE B   β      sign. T   R2     Adjusted R2
Controlled      5.8    1.0    .46    < .01     .21    .21

Note. B = Regression coefficient. SE B = Standard error of B. β = Beta. R2 = Squared multiple correlation.

Table 6. Stepwise multiple regression FWC subscales as predictor variables for HCR-20 risk judgments

Variables FWC   B      SE B   β      sign. T
Distant         .46    .14    .31    < .01
Close           .35    .11    .26    < .01
Helpful         -.34   .13    -.24   < .01

Multiple R      .50
R2              .25
Adjusted R2     .23

Note. B = Regression coefficient. SE B = Standard error of B. β = Beta. R2 = Squared multiple correlation.
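The reported proportions of explained variance can be checked against the coefficients in Tables 5 and 6. In a regression with a single predictor, the standardized coefficient β equals the Pearson correlation between predictor and outcome, so R2 is simply β squared; for several predictors, R2 is the square of the multiple R. The short sketch below applies this arithmetic to the published values. The variable names are illustrative only.

```python
# Table 5: Controlled is the only predictor of the HCR-20 total score,
# so beta equals the Pearson correlation (.46 in Table 4) and R2 = beta ** 2.
beta_controlled = 0.46
r2_total_score = beta_controlled ** 2

# Table 6: three predictors of the risk judgment; R2 is the squared multiple R.
multiple_r_risk_judgment = 0.50
r2_risk_judgment = multiple_r_risk_judgment ** 2

print(round(r2_total_score, 2), round(r2_risk_judgment, 2))
```

Both values agree with the R2 entries in the tables (.21 and .25). The 23% mentioned in the text for the risk judgments is the adjusted R2.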

Discussion

The present study demonstrated good interrater reliability for the Dutch HCR-20, provided insight into

differences between researchers and clinicians in coding the HCR-20 and showed that clinicians’

feelings towards their patient were related to their risk assessments. The mean HCR-20 scores we

found resemble those found in previous studies conducted in forensic psychiatric institutions in other

countries (e.g., Belfrage, 1998; Strand et al., 1999).

Overall, the interrater reliability of the HCR-20 was good and this corresponds to previous findings

(Belfrage, 1998; Douglas & Webster, 1999b). Moreover, the differences in interrater reliability

between the subscales - excellent for the Historical scale, good for the Clinical scale, and moderate for


the Risk management scale - were also demonstrated before (Belfrage, 1998). Some individual items,

however, showed poor interrater reliability. We suggest three possible causes for this. First, the

inexperience of some raters in coding standardized instruments such as the HCR-20 could have led to

the low interrater reliability for these items. For example, the item ‘Previous violence’ was erroneously

rated by some clinician raters: they did not rate the index offense as an instance of previous violence.

However, during the training workshop it had been emphasized that previous violence refers to all

violence prior to assessment. In contrast, the researchers, who were more experienced in the use

of standardized instruments, obtained almost perfect interrater reliability on this item. Second, some

HCR-20 item descriptions are unclear or so global that they are easily open to multiple interpretations.

A number of raters argued that the items ‘Impulsivity’ and ‘Exposure to destabilizers’ are items with a

rather broad and imprecise definition. Third, differences in degree of clinical experience and personal

attitudes possibly contributed to poorer interrater reliability. A recurrent discussion during the

consensus meetings was whether the patient’s problems were serious enough to warrant a code of ‘2’ instead

of ‘1’, for instance, when coding the items ‘Early maladjustment’ and ‘Stress’. More experienced

clinicians, such as the treatment supervisors, tended to view the problems of the patients as less

serious, because they compared them to even more severely disturbed patients they had treated before.

The present study revealed some important differences between researchers and clinicians, not only in

their HCR-20 scores, but also in their way of coding, i.e., the time taken to code the HCR-20 and the

information used for the coding. Clinicians usually relied on their personal experiences with the

patient and made almost no use of file information, whereas researchers predominantly relied on file

information, naturally because they did not know the patient well enough personally. Future research

will have to establish whether these differences impact on the accuracy of the risk assessments. We

also found some differences between the three rater groups. First, although researchers and treatment

supervisors did not significantly differ in their mean HCR-20 scores, there was a substantial difference

in their risk judgments. Treatment supervisors more often judged the overall risk as ‘low’, and thus

seemed more optimistic in their interpretation of HCR-20 scores compared to researchers. A possible

reason is that treatment supervisors experience pressure to let patients pass to the transmural phase as

soon as possible.2 It could be that - despite the structured way of assessing risk - treatment supervisors

are susceptible to cognitive distortions / biases, such as the tendency to correlate information

intuitively rather than by laws of probability (‘conjunction fallacy’) or the tendency to view unrelated

events as correlated (‘illusory correlation bias’; see for further explanation Dernevik et al., 2001). The

researchers were more distant from the patients and their treatment, and not involved in leave

decisions and were therefore probably less susceptible to these biases.

2 There are capacity problems in the Dutch tbs-system, resulting in a strong call on the forensic psychiatric institutions to complete the treatment of patients within a specific time period.



A second difference was that group leaders gave lower HCR-20 scores compared to researchers, and -

albeit not significantly - also compared to treatment supervisors. A possible reason is the group

leaders’ daily interaction with the patients; continuous awareness of the risk these patients pose would

probably get in the way of a therapeutic interaction. In addition, the frequent interaction can induce

emotional ties and involvement with the patient and as a consequence more access to the ‘nicer sides’

of the patient, which the group leaders then took into account when assessing risk. In contrast,

researchers may have emphasized the negative aspects of patients because file information usually

focuses on the risks and problems of patients. Thus far, only one other study has been conducted that

compared HCR-20 ratings of clinicians and researchers. Contrary to our findings, Dernevik et al.

(2001) found a higher mean HCR-20 total score for nurses compared to independent experts (26.3

versus 22.7). However, these nurses are hardly comparable to our group leaders because of substantial

differences in the mean number of work years in the forensic institution (24.6 versus 6.2).

A third difference lies in the interrater reliabilities of the rater groups. Notable is the merely moderate

agreement for the Clinical and Risk management items among treatment supervisors and group

leaders. A possible cause is the different roles in treatment they fulfill: treatment supervisors have a

supervising and planning role whilst the group leaders conduct the daily and practical supervision and

spend most of the time with the patients. However, these differences in item scores did not seem to

interfere substantially with the complete risk assessment since the mean scores of the treatment

supervisors and group leaders did not differ significantly.

The feeling of being controlled and manipulated by the patient was strongly related to high HCR-20

total scores. Since psychopaths (as measured by the PCL-R) in particular are known to be capable of

evoking these types of feelings in clinicians (Hare et al., 2000; Lösel, 1998), this would be an

interesting issue for further study. Negative feelings, such as frustration, disappointment,

suspiciousness and rejection were related significantly to high HCR-20 total scores. Interestingly, both

feelings of closeness and distance / aloofness were related to high risk judgments. A possible

explanation is that these two feelings are both at the ends of a continuum, in which the middle presents

adequate professional attachment to the patient. Positive feelings like helpfulness, happiness and

relaxation were related to low risk judgments. The conclusion of the present study that feelings of

clinicians towards their patients are related to the risk assessment is in line with the findings of

Dernevik and colleagues (2001).

A number of limitations to the present study should be mentioned. First of all, the small sample size is

a limitation. For example, because of the small sample size it was difficult to examine possible

differences between the supervisors and group leaders in their feelings toward the patients. Second, the

exact meaning and implications of the relationship between clinicians’ feelings towards patients and

their risk assessment remain unclear. The question to be addressed is whether these feelings do actually

interfere with the accuracy of the risk judgments. It is possible that patients who receive a high score

on the HCR-20 also tend to evoke aversive feelings in the people around them, e.g., because of their


psychopathic traits or (personality) disorder. Dernevik and Douglas (2002) found that clinicians were

able to accurately predict inpatient violence, whereas experts showed good predictive validity for

violence after discharge. Hence, the question of who is most suitable to conduct risk assessments – the

treating clinician or the independent researcher – cannot be answered at this moment (see Chapter 6).

Another limitation is the large number of raters who participated in this study, especially group

leaders. Almost no group leader performed more than two risk assessments. Consequently, mistakes

such as those described above with the item ‘Previous violence’ were made quite regularly.

Based on the results and experiences of the present study, we would like to make some

recommendations for the implementation of standardized risk assessment instruments in clinical

practice. First, for clinical use, we recommend the use of consensus risk assessments with both

clinicians and researchers in order to rule out possible effects of rater bias, to discuss risk management

strategies and to identify possible additional risk factors or protective factors. In the present study, the

combination of the more distant, objective researcher and the treating clinicians who know the patient

well seemed to benefit the discussion about risk factors. Second, a thorough and repeated training in

coding risk assessment instruments is essential. The number of persons trained in such a workshop

should not exceed 10, because this facilitates group discussion and attention to individual biases. The

time period between the workshop and the first risk assessment should not exceed six months,

otherwise the obtained knowledge and skills may have been lost. Continuous training can be achieved

by organizing ‘return days’ to discuss questions and pitfalls and also to provide the raters with

feedback on the accuracy of their risk assessments, for instance, when a patient who is living outside

the hospital has recidivated with a violent offense.

Future research will have to demonstrate the predictive validity of the Dutch HCR-20 and to examine

whether there are differences in predictive validity of HCR-20 assessments of researchers and clinicians,

both in the short and longer term (see Chapter 6). In addition, the relationship between clinicians’

feelings towards their patients and their risk judgment needs to be clarified, for example, by studying

the relationship between FWC scores and recidivism or measures of inpatient violence. Prospective

research is strongly recommended, although a number of problems might be encountered. The most

important problem is that clinical goals of risk assessment (i.e., risk management) will interfere with

prospective research into predictive validity. Hart (1998a) stated that predictions of violence are not

passive assessments, but decisions that influence services delivered to individuals: “Clinicians are

bound - morally, ethically, and legally - to try to prove themselves wrong when they predict violence

and take every reasonable action to prevent violence” (Hart, 1998a, p. 365). Thus, when clinicians

perform HCR-20 risk assessments it is very likely that the outcome influences decisions concerning

probationary leave or termination of (mandatory) treatment and high-risk patients will not be let out of

the hospital. In conclusion, continuous effort in research will be needed to clarify the processes

underlying the coding of structured risk assessment instruments and to improve the accuracy of risk

assessment procedures in clinical practice.

CHAPTER 3

Predictive validity of the SVR-20 and Static-99 in a Dutch sample of treated sexual offenders

This chapter is a revised version of Vogel, V. de, Ruiter, C. de, Beek, D. van, & Mead, G. (2004). Predictive validity of the SVR-20 and Static-99 in a Dutch sample of treated sex offenders. Law and Human Behavior, 28, 235-251. The authors wish to thank Ine Kusters, M.Sc. and Quinta Appeldoorn for their assistance in retrieval of archival hospital data.

In this retrospective study, the interrater reliability and predictive validity of two risk assessment instruments for sexual violence are presented. The SVR-20, an instrument for structured professional judgment (SPJ), and the Static-99, an actuarial risk assessment instrument, were coded from file information of 122 sexual offenders who were admitted to the Dr. Henri van der Hoeven Kliniek between 1974 and 1996 (average follow-up period 140 months). Recidivism data (reconvictions) from the Ministry of Justice were related to the risk assessments. The base rate for sexual recidivism was 39%, for non-sexual violent offenses 46%, and for general offenses 74%. Predictive validity of the SVR-20 was good (total score: r = .50, AUC = .80; final risk judgment: r = .60, AUC = .83), of the Static-99 moderate (total score: r = .38, AUC = .71; risk category: r = .30, AUC = .66). The SVR-20 final risk judgment was a significantly better predictor of sexual recidivism than the Static-99 risk category.

The assessment of risk for (sexual) violence is an important task of psychologists working in forensic

practice. Sexual violent (re)offending often has severe consequences for the victims and causes strong

feelings of fear, anger, and concern in society. A carefully conducted risk assessment before a

probationary leave, parole decision, or termination of (mandatory) treatment can help to appraise the

risk of recidivism in an adequate way and thereby prevent serious (sexual) violent offenses (Douglas

& Webster, 1999a). To date, the best known and most widely used method in practice, at least in The

Netherlands, is the unstructured clinical judgment approach that is exclusively based on the

professional expertise of the clinician. However, research has revealed some important limitations of

this unstructured clinical judgment, such as poor reliability and validity (Monahan, 1981; see for a

discussion of these disadvantages Quinsey et al., 1998, pp. 55-72). Although more recent studies have

demonstrated clinical accuracy to be significantly better than chance, unstructured clinical judgment is

liable to systematic biases. For example, clinicians were found to be accurate in predicting risk of

recidivism in cases with a violent history, but less accurate in predicting risk of violence in female

psychiatric patients (underestimation of risk) and nonwhite men (overestimation of risk) (Lidz et al.,

1993; McNiel & Binder, 1995).1 Therefore, several authors recommend employing more structured risk

assessment procedures in order to optimize accuracy and validity (Borum, 1996; Webster et al.,

1997a).

An important distinction among structured risk assessment instruments can be made between the

actuarial and the structured professional judgment (SPJ) approaches. Actuarial instruments are

developed on the basis of risk factors that are empirically related to (sexual) violent behavior. These

instruments are relatively simple to code - according to fixed rules and not necessarily by a forensic

expert - and contain predominantly static, non-changeable factors that are added up according to a

fixed algorithm to reach a conclusion on the risk of recidivism. Examples are the Violence Risk

Appraisal Guide for violent behavior (VRAG; Harris & Rice, 1997) and the Static-99 for sexual

violent behavior (Hanson & Thornton, 1999).2 Although risk assessment with actuarial instruments is a

simple and time-effective procedure, there are some important disadvantages to this approach. Most of

the actuarial instruments3 do not include situational or dynamic risk factors and do not offer guidelines

for treatment, which makes them useless in treatment settings where the aim is reduction of the risk of

recidivism. Furthermore, generalization towards populations other than the type of samples in which

the instrument was developed is limited (Grubin & Wingate, 1996; Hart, 1998). Based on this

criticism, a new risk assessment approach was developed, called structured professional judgment

(SPJ). In this approach, the risk assessment is performed by a forensic clinician by means of a

standardized checklist, containing empirically derived risk factors for (sexual) violence, historical as

well as dynamic factors. The essential difference between the actuarial and the SPJ approach is in how

the final risk judgments are arrived at; in actuarial instruments by a fixed algorithm and in SPJ

guidelines by (structured) human decision-making. Examples of SPJ guidelines are the HCR-20

(Webster et al., 1997b) and the SVR-20 (Boer et al., 1997). Research in several populations and

settings has demonstrated good interrater reliability and predictive validity of the above-mentioned

risk assessment instruments (e.g., Belfrage et al., 2000; Dempster, 1998; Douglas et al., 2005).

However, these results have been obtained with North American samples and some European samples,

predominantly Swedish (see also the special issue of Psychology, Crime and Law; Hart, 2002); the

psychometric properties of the Dutch translations of these risk assessment instruments are unknown.

Sexual offenders are considered as a special group for the assessment of risk of recidivism. Various

studies and meta-analyses have indicated there are specific risk factors for sexual violence, aside from

risk factors for general violence (e.g., psychopathy, criminal history), such as sexual deviance and

prior sexual offenses (Hanson & Bussière, 1998; Hanson et al., 1993). Most of the risk assessment

schemes for sexual violence were developed based on findings from these studies. Furthermore,

research has demonstrated that the group of sexual offenders is a heterogeneous one (Doren, 1998;

Greenberg, 1998; Prentky et al., 1997), which makes it difficult to develop a risk assessment

instrument that predicts accurately for all types of sexual offenders. For example, there are major

differences in the base rate for sexual reoffending between rapists, child molesters with extrafamilial

boys as victim, child molesters with extrafamilial girls as victims and incest offenders. Child molesters

1 For a detailed discussion about the clinical-actuarial controversy, we refer the reader to two reviews: Douglas, Cox, and Webster (1999) and Litwack (2001).

2 Recently, a revised version of the Static-99 has been developed: the Static-2002 (Hanson & Thornton, 2002).

3 An exception is the Sex Offender Need Assessment Rating (SONAR), which consists entirely of dynamic factors. Hanson and Harris (2000) developed this instrument as a supplement to the Static-99.

with extrafamilial boys as victims reoffend more than rapists and child molesters with extrafamilial

girls as victims (Hanson et al., 1993; Quinsey et al., 1995). Furthermore, some sexual offenders,

particularly child molesters, may reoffend after a long period of non-offending (Hanson et al., 1993;

Prentky et al., 1997). This makes it necessary to re-assess the risk of recidivism regularly. When

studying the literature on sexual recidivism, one can conclude that there are a number of subgroups of

sexual offenders that reoffend frequently and seriously. Risk assessment can assist in detecting these

subgroups and distinguish them from sexual offenders who pose a low or moderate risk of recidivism.

In this chapter, we will present findings from a retrospective study on the interrater reliability and

predictive validity of two risk assessment instruments for sexual violence - the Static-99 and the SVR-

20 - in a group of sexual offenders who were admitted to the Dr. Henri van der Hoeven Kliniek

between 1974 and 1996. The aim of the present study was to determine the value of these instruments

for the prediction of sexual violence in The Netherlands, and to compare the predictive value of the

actuarial instrument with the guideline according to the SPJ approach.

Method

Setting



Subjects

The group of sexual offenders consisted of 95 rapists and 27 child molesters, all male. The child

molester group comprised 16 child molesters with extrafamilial girls as victim, 10 child

molesters with extrafamilial boys as victim, and one incest offender. Table 1 presents the demographic

characteristics of the sample. The majority of the sexual offenders were Dutch, single and had no work

at the time of the index offense. More than half of the group did not complete their hospital treatment;

in 36% of the cases, the tbs-order was terminated by court against the hospital’s advice, and 29% of

the sexual offenders were readmitted to another forensic psychiatric institution. Reasons for these

transfers differed, but the most common was that the therapeutic relationship between the patient

and hospital staff was disturbed to such an extent that further treatment was considered impossible.

The table also shows that there were a number of significant differences between rapists and child

molesters. Child molesters more often than rapists grew up in foster or children’s homes, had been

admitted to inpatient psychiatric hospitals, and obtained lower scores on intelligence scales.

Furthermore, child molesters more often than rapists knew their victim and had more than one victim.


Table 1. Sample characteristics

                                                               Rapists         Child molesters   Total
                                                               n = 95          n = 27            N = 122
Demographic
Mean age upon admission                                        24.6            25.6              24.8
Dutch nationality                                              84 (88%)        25 (93%)          109 (89%)
Upbringing in foster or children’s home                        48 (51%)        19 (70%)**        67 (55%)
Single (at the time of the index offense)                      74 (78%)        25 (93%)          99 (82%)
No education after primary school                              53 (56%)        14 (52%)          67 (55%)
Special education                                              19 (20%)        12 (44%)**        31 (25%)
Unemployed (at the time of the index offense)                  49 (52%)        13 (48%)          62 (51%)

Psychiatric
No psychiatric history                                         27 (28%)        2 (7%)*           29 (24%)
Out-patient treatment(s)                                       21 (22%)        4 (15%)           25 (21%)
Inpatient admission(s)                                         47 (50%)        21 (78%)**        68 (56%)
Alcohol abuse                                                  34 (36%)        10 (37%)          44 (36%)
Drug abuse                                                     3 (3%)          1 (4%)            4 (3%)
Multiple substance abuse                                       21 (22%)        3 (11%)           24 (20%)
Mean intelligence score                                        97.1b           85.8a             93.8

Offenses
Victim was not a stranger                                      15 (16%)        10 (37%)*         25 (21%)
Number of victims more than one                                35 (40%)        16 (59%)*         51 (42%)
Previously convicted for sex offense(s)                        60 (63%)        20 (74%)          80 (66%)
Previously convicted to the tbs-order                          5 (5%)          5 (19%)           10 (8%)
At the time of the study still or again under the tbs-order    4 (4%)          4 (15%)*          8 (7%)

Treatment
Mean duration of treatment in months                           51.4            62.9              54.0
Treatment included a probationary period                       38 (40%)        11 (41%)          49 (40%)
Mean duration of probationary period in months                 12.6            17.0              13.6
Readmitted to another institution                              29 (31%)        6 (22%)           35 (29%)
Termination of tbs-order against the hospital’s advice         33 (35%)        11 (41%)          44 (36%)

Note. ** p < .01, * p < .05 (chi-square analysis, two-tailed). a < b, p < .01 (t (59) = 2.8). Intelligence scores were available for 42 rapists and 17 child molesters. Special education is for children with learning disabilities and / or conduct problems.

Instruments

Static-99

The Static-99 is a brief actuarial instrument for the assessment of risk for sexual violence in adult

sexual offenders. The instrument is derived from a fusion of two previously developed risk assessment

instruments, the RRASOR (Hanson, 1997) and the Structured Anchored Clinical Judgment (SACJ-Min;

Grubin, 1998). The Static-99 is composed of 10 historical risk factors (see Table 2) that have to

be coded from file information. The factors add up to a maximum total score of 12 that is subsequently

translated into four risk categories: low (0-1), medium low (2-3), medium high (4-5) and high (6 or

more) (Hanson & Thornton, 1999).
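The translation from total score to risk category described above is a fixed lookup, which can be sketched in a few lines. This is an illustration only: the function name is hypothetical, and the cut-offs are simply those quoted in the text (0-1, 2-3, 4-5, 6 or more).

```python
def static99_risk_category(total_score: int) -> str:
    """Map a Static-99 total score (0-12) to the four risk categories quoted in the text."""
    if not 0 <= total_score <= 12:
        raise ValueError("Static-99 total score must lie between 0 and 12")
    if total_score <= 1:
        return "low"
    if total_score <= 3:
        return "medium low"
    if total_score <= 5:
        return "medium high"
    return "high"  # 6 or more


# Illustrative scores only, not patient data.
for score in (0, 3, 5, 6):
    print(score, static99_risk_category(score))
```

Because the category follows mechanically from the item scores, two raters who agree on the items will always agree on the category.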



Table 2. Items of the Static-99

1. Prior sex offenses
2. Prior sentencing dates (excluding index offense)
3. Any convictions for non-contact sex offenses
4. Index non-sexual violence
5. Prior non-sexual violence
6. Any unrelated victims
7. Any stranger victims
8. Any male victim
9. Young (18 – 24 years)
10. Single (ever lived with lover for two years or more?)

Note. Adopted from Hanson & Thornton (1999).

Hanson and Thornton (2000) tested the predictive accuracy of the Static-99 in a group of 1301 sexual

offenders from four different institutions in Canada and England and found a moderate predictive

validity for sexual violence (r = .33, Receiver Operating Characteristics (ROC)4: Area Under the

Curve (AUC) = .71) and violent recidivism (r = .32, AUC = .69). In this study, sexual offenses were

included in the definition of violent recidivism. Similarly, Sjöstedt and Långström (2001) found

moderate to good predictive validity in a group of 1400 Swedish prisoners (sexual recidivism: r = .22,

AUC = .76; violent (including sexual offenses) recidivism: r = .30, AUC = .74). This study rendered a

good interrater reliability (Cohen’s Kappa = .90).

SVR-20

The SVR-20 is a structured professional guideline (checklist) designed for the assessment of risk for

sexual violence in adult sexual offenders (see Chapter 1, p. 20 for a detailed description).

Psychopathy Checklist-Revised

The PCL-R was designed to assess the construct of psychopathy (Hare, 1991, 2003, see Chapter 1, p.

33 for a description of the instrument).

Procedure

File information was gathered on 123 sexual offenders who were admitted to the hospital between 1974 and 1996 (release dates between 1977 and 2000). In general, these files consisted of psychological reports, reports to the court regarding treatment progress and recommendations for termination or prolongation, treatment plans and evaluations. Next, we rated the Dutch versions of the SVR-20 (Hildebrand et al., 2001), PCL-R (Vertommen et al., 2002), and Static-99 (van Beek, de Doncker, & de Ruiter, 2001) on the basis of all available file information. In order to establish the interrater reliability, three raters (in different compositions out of a group of four raters) independently rated 30 cases. During a case meeting, raters discussed their scores, and agreed upon a consensus score. The case meetings provided raters with an opportunity to sharpen their understanding of the individual SVR-20 items. The consensus score was used for the analyses on predictive validity. Subsequently, the remaining cases were divided among the four raters. The rating procedure was conducted while all raters were blind to the outcome.5 One case was not rated, because this patient died within two months after admission, and one case was not included in the analyses because this patient was an illegal inhabitant of The Netherlands who immediately returned to his native country after termination of the tbs-order.

4 Receiver operating characteristics (ROC) is a statistical method to assess predictive validity. See Statistical analyses for an explanation.

Recidivism data

Data on recidivism were retrieved from the Judicial Documentation register of the Ministry of Justice

after all the files had been coded. Sexual recidivism was defined as a new conviction for a sex offense

in accordance with Dutch criminal law, and comprises both hands-on (e.g., rape, sexual assault, child

molestation) and hands-off (e.g., exhibitionism, possession of child pornography) offenses.

Furthermore, we explored new convictions for non-sexual violent and general offenses. The follow-up

period, starting on the date of release from the hospital or readmission to another institution and

ending on the date of data gathering (November 2001), varied from 20 to 291 months with an average

of 140 months.


Statistical analyses

Student’s t-tests and chi-square tests were used to examine differences between the rapists and child

molesters in demographic characteristics. Survival analyses, more specifically the Kaplan-Meier

method, were used to calculate recidivism rates (Schmidt & Witte, 1988; Tabachnick & Fidell, 2001).

These analyses take into account the time the offender has been at risk. Thus, it is possible to calculate

the recidivism rate for a specific period despite the fact that the follow-up periods of the offenders

diverge. The log rank statistic was used to test differences between the rapists and child molesters. In

order to determine the effect of termination of treatment, the Cox proportional hazard method was

used, which results in the hazard ratio (e^B) that can be interpreted as the relative risk.
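To make concrete how the Kaplan-Meier method accounts for unequal follow-up periods, the following is a minimal sketch of the product-limit estimator. It is not the software used in the study; the function name and the six follow-up records are hypothetical and serve only to show the mechanics.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival probability)] at each distinct event time.

    durations: months at risk for each offender
    events:    True if the offender was reconvicted at that time,
               False if follow-up simply ended (censored)
    """
    event_counts = Counter(t for t, e in zip(durations, events) if e)
    leaving = Counter(durations)  # everyone leaves the risk set when their follow-up ends
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = event_counts.get(t, 0)
        if d:
            # Conditional probability of remaining offense-free past time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Hypothetical follow-up data (months, reconvicted?), not taken from the study.
durations = [12, 24, 24, 36, 60, 60]
events = [True, False, True, True, False, False]

curve = kaplan_meier(durations, events)
raw_rate = sum(events) / len(events)   # simple proportion reconvicted
km_rate = 1.0 - curve[-1][1]           # estimated recidivism by the last event time
print(curve, raw_rate, km_rate)
```

In this toy example the survival-based recidivism estimate exceeds the raw proportion, because offenders with short follow-up are removed from the risk set instead of being counted as non-recidivists for the whole period.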

5 One of the four raters is a clinician who knew (about the outcome of) some of the patients. This clinician coded files of patients he did not know and the other files were randomly divided among the other raters who had no information about the outcome.

The interrater reliability of the SVR-20 and Static-99 was examined by means of the intraclass correlation coefficient (ICC), using the two-way random effect variance model and consistency type (McGraw and Wong, 1996). Critical values used for single measure ICCs were: ICC ≥ .75 = excellent; .60 ≤ ICC < .75 = good; .40 ≤ ICC < .60 = moderate; ICC < .40 = poor (Fleiss, 1986). The predictive validity of both instruments was established with receiver operating characteristics (ROC) analyses (Mossman, 1994; Rice & Harris, 1995). The major advantage of this statistical method is its insensitivity to base rates. The ROC analyses result in a plot of the true positive rate (sensitivity) against the false positive rate (1 minus specificity) for every possible cut-off score of the instrument. The area under the curve (AUC) can be interpreted as the probability that a randomly selected recidivist would score higher on the instrument than a randomly selected non-recidivist. An AUC of .50 represents chance prediction, and an AUC of 1.0 perfect prediction. In general, AUC values of .70 and above are considered moderate, and above .75 good (Douglas et al., 2005). To compare the obtained AUC values of the SVR-20 and Static-99, we used the software program AccuROC (Vida, 1997) which applies the non-parametric method described by DeLong, DeLong and Clarke-Pearson (1988). Pearson point-biserial correlations between SVR-20 and Static-99 scores and the dichotomous outcome variables were calculated for comparative purposes.
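The probabilistic reading of the AUC given above can be computed directly by comparing every recidivist with every non-recidivist. The sketch below does exactly that; counting tied scores as one half is a common convention assumed here, and the scores are invented for illustration. It is not AccuROC.

```python
def auc(recidivist_scores, non_recidivist_scores):
    """Probability that a randomly chosen recidivist outscores a randomly chosen
    non-recidivist; ties count as one half (an assumed, commonly used convention)."""
    wins = 0.0
    for r in recidivist_scores:
        for n in non_recidivist_scores:
            if r > n:
                wins += 1.0
            elif r == n:
                wins += 0.5
    return wins / (len(recidivist_scores) * len(non_recidivist_scores))


# Hypothetical instrument scores, not data from the study.
recidivists = [7, 6, 8, 5]
non_recidivists = [5, 4, 6, 3]
print(auc(recidivists, non_recidivists))
```

Because the statistic depends only on the pairwise ordering of scores between the two groups, changing the proportion of recidivists in the sample does not shift its expected value, which is the sense in which ROC analysis is insensitive to base rates.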

Results

Reconviction rates

Table 3 shows the base rates of sexual recidivism, both official reconviction rates as registered by the

Judicial registration system and rates computed with survival analyses. The Judicial registration

system reported 89 new convictions for sexual offenses, 10 for homicide offenses, and 77 for non-

sexual violent offenses. The most frequently reported sexual and violent offenses were: rape (25),

sexual assault (12), exhibitionism (16), assault (19), threat (19) and unlawful confinement (16). We

compared the sexual reconviction rates of rapists to those of child molesters, and the rates of offenders

who completed hospital treatment to those who did not. Child molesters were significantly more often

than rapists reconvicted for a sexual offense (log rank = 8.0, p < .01). The sexual reconviction rate for

child molesters with extrafamilial boys as victim was highest of all offender types (80%). Sex

offenders who were readmitted to other institutions or whose tbs-order was terminated by court against

the hospital’s advice had more sexual reconvictions (e^B = 2.24, 95% Confidence Interval (CI) = 1.08 -

4.64, p < .05) and property offenses (e^B = 3.67, 95% CI = 1.55 - 8.67, p < .01) compared to those who

completed their hospital treatment. Further analysis indicated that for sexual reconviction this was only

the case for the child molesters who did not complete treatment (e^B = 4.77, 95% CI = 1.07 - 21.30, p <

.05) and not for the rapists (e^B = 1.71, 95% CI = .73 - 3.96, p = .21). In the analyses, we checked if the

Cox proportional hazard assumption was valid (i.e., if the hazard ratio was constant during the entire

follow-up period, which is the case when the log minus log curves are parallel), and found this to be

true.
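The hazard ratios above are exponentiated Cox regression coefficients. As a hedged illustration of how such a ratio relates to its confidence interval, the sketch below assumes a Wald-type interval that is symmetric on the log scale; the text does not state how the intervals were computed, and the helper names are hypothetical.

```python
import math


def hazard_ratio_ci(b, se, z=1.96):
    """Hazard ratio e^B with a Wald-type 95% interval computed on the log scale."""
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)


def coefficient_from_report(hr, lower, upper, z=1.96):
    """Recover B and its standard error from a reported hazard ratio and interval,
    assuming the interval is symmetric around B on the log scale."""
    b = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return b, se


# Values reported in the text for sexual reconviction (non-completers vs. completers).
b, se = coefficient_from_report(2.24, 1.08, 4.64)
hr, lo, hi = hazard_ratio_ci(b, se)
print(round(b, 3), round(se, 3), round(hr, 2), round(lo, 2), round(hi, 2))
```

Under that assumption the reported interval is reproduced to within rounding. Because its lower bound lies only slightly above 1, the estimate is compatible with anything from a small to a more than fourfold increase in risk.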


Table 3. Reconviction rates of subgroups of sexual offenders

                                             Sexual recidivism   Violent offending   General offending
                                       N     %       S%          %       S%          %       S%
Rapists                                94    33a     43a         44      54          72      90a
Child molesters                        27    59b     70b         52      52          78      92
  Extrafamilial ♀                      16    44c     56          50      54          69      82
  Extrafamilial ♂                      10    80d     89b         50      50          90      100b
Completed hospital treatment           41    22e     27e         37      48          71      77
Did not complete hospital treatment    80    48f     58f         50      68          75      84
Total                                  121   39      48          46      63          74      91

Note. a < b p < .01. a < d p < .01. c < d p < .05. e < f p < .01. % = reconviction rate as registered by the Ministry of Justice, differences tested with chi-square (two-tailed). S% = reconviction rates calculated with survival analyses, differences tested with log rank. Violence = excluding sexual offenses. Child molesters = child molesters with extrafamilial girls as victim + child molesters with extrafamilial boys as victim + one incest offender. Did not complete hospital treatment = termination of treatment by court against the hospital’s advice or readmitted to another forensic psychiatric institution.


Interrater reliability

The interrater reliability of the SVR-20 subscales and total score was good to excellent (ICC single

measure: SVR-20 total score = .75, Psychosocial adjustment = .74, Sexual offenses = .74 and Future

plans = .78). The interrater reliability of the final risk judgment was moderate (ICC = .48). In two of

30 cases (6.7%), one rater judged ‘high risk’ whereas another rater judged ‘low risk’. Two items,

‘Sexual deviance’ and ‘Relationship problems’, demonstrated poor interrater reliability (ICCs = .38

and .29, respectively). Three items, ‘Victim of child abuse’, ‘Employment problems’ and ‘Extreme

minimization or denial of sex offenses’, yielded moderate interrater reliability (ICCs = .49, .48, and

.42, respectively).

The overall interrater reliability of the Static-99 was excellent (ICC total score = .80). The risk

category demonstrated good interrater reliability (ICC = .61). The ICC for item 6 could not be

computed due to lack of variance, but the percentage of agreement was 98.9%.
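For readers who want to see how a single-measure, consistency-type ICC of the kind reported above is obtained from a cases-by-raters table, here is a minimal sketch. It uses the mean-square formula for the consistency definition, (MS_rows - MS_error) / (MS_rows + (k - 1) MS_error), together with the Fleiss (1986) bands quoted in the Statistical analyses section. The function names and the small ratings matrix are hypothetical.

```python
def icc_consistency_single(ratings):
    """Single-measure, consistency-type ICC from a two-way layout.

    ratings: list of rows, one per case, each holding the k raters' scores.
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


def fleiss_label(icc):
    """Verbal label using the critical values quoted in the text (Fleiss, 1986)."""
    if icc >= 0.75:
        return "excellent"
    if icc >= 0.60:
        return "good"
    if icc >= 0.40:
        return "moderate"
    return "poor"


# Hypothetical ratings: the second rater always scores two points higher.
ratings = [[1, 3], [2, 4], [3, 5], [4, 6]]
print(icc_consistency_single(ratings), fleiss_label(0.48))
```

The toy data show what the consistency type means: a rater who is systematically stricter or more lenient by a constant amount does not lower the coefficient, because only the relative ordering and spacing of cases is compared.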

Risk judgments

The mean total scores on the SVR-20, Static-99, and PCL-R were 23.7 (SD 6.7), 6.0 (SD 1.7) and 21.9

(SD 7.2), respectively. There were no significant differences between the rapists and child molesters,

except for the average number of other considerations in the SVR-20 (3.6 versus 4.5, p < .05). Table 4

shows the final risk judgments of the SVR-20 and the risk categories of the Static-99. More than half

of the group of sexual offenders was judged to pose a ‘high risk’ by both instruments. Furthermore,



sexual recidivists obtained significantly higher total scores on both instruments (SVR-20: 27.6 versus

21.3; Static-99: 6.7 versus 5.5, p < .01) and were more often judged to pose a high risk for sexual

reoffending compared to sexual non-recidivists (SVR-20: 91% versus 30%; Static-99: 79% versus

50%, p < .01). When a PCL-R total score of 30 was used as cut-off score for the diagnosis of

psychopathy, 21% of the sexual offenders could be classified as psychopathic, and when 26 was used

as cut-off score 36%.

Table 4. Final risk judgment SVR-20 / risk category Static-99

                 Rapists     Child molesters   Total
                 n = 95      n = 27            N = 122
SVR-20
  Low            17 (18%)    2 (7%)            19 (15%)
  Moderate       27 (29%)    9 (33%)           36 (30%)
  High           51 (54%)    16 (59%)          67 (55%)
Static-99
  Low            -           -                 -
  Medium low     5 (5%)      2 (7%)            7 (6%)
  Medium high    36 (38%)    5 (19%)           41 (33%)
  High           54 (57%)    20 (74%)          74 (61%)

Predictive validity

Figure 1 presents the ROC curve of the SVR-20 and Static-99 for sexual reoffending. The SVR-20

exhibited good predictive validity for sexual reoffending: all AUC values and Pearson correlations for

the subscales, the total score, and the final risk judgment were significant (see Table 5). The total

score, the final risk judgment, and the subscales Psychosocial adjustment and Future plans also

predicted violent (excluding sexual offenses) and general offending. Next, we conducted Cox

regression analyses. The SVR-20 subscale scores were entered in block 1. The SVR-20 final risk

judgment was entered in block 2 by using the forward conditional method. In block 1, the SVR-20

subscale scores produced a significant model fit (χ² (3, N = 121) = 28.7, p < .001). In block 2, the SVR-20

final risk judgment produced a significant improvement to the model’s fit (χ² change (1, N = 121) = 15.7, p

< .01).

The predictive validity of the Static-99 for sexual reoffending was moderate. The Static-99 was not

predictive for violent (excluding sexual offenses) and general offending. However, when we included

sexual offenses in the definition of violence, like Hanson and Thornton (2000) and Sjöstedt and

Långström (2001) did in their studies, we found a significant predictive validity of the Static-99 total

score for violent offenses (AUC = .62, p < .05). When we used AccuROC to compare the AUC values

of the SVR-20 and Static-99, we found a marginally significant difference between the AUC values of

the SVR-20 total score and Static-99 total score (χ² (1, N = 121) = 2.8, p = .09). The difference in AUC

values of the SVR-20 final risk judgment and Static-99 risk category, however, was significant (χ² (1,

N = 121) = 15.0, p < .001).

Figure 1. ROC curves for SVR-20 and Static-99 (N=121). Axes: 1-Specificity (horizontal) and Sensitivity (vertical). Curves shown: SVR-20 total score, SVR-20 final risk judgment, Static-99 total score, Static-99 risk category, and reference line.

Table 5. Predictive validity of the SVR-20 and Static-99 (N=121)

                            Sexual recidivism       Violent offending       General offending
                            AUC      SE    r        AUC      SE    r        AUC      SE    r
SVR-20
  Psychosocial adjustment   .68***   .05   .30**    .71***   .05   .38**    .71***   .05   .33**
  Sexual offenses           .79***   .05   .49**    .61      .05   .21*     .60      .06   .16
  Future plans              .76***   .04   .47**    .70***   .05   .38**    .73***   .05   .38**
  Total score               .80***   .04   .50**    .71***   .05   .38**    .71***   .05   .33**
  Final risk judgment       .83***   .03   .60**    .67**    .05   .31**    .69**    .05   .30**
Static-99
  Total score               .71***   .05   .38**    .57      .05   .16      .57      .06   .13
  Risk category             .66**    .04   .30**    .51      .05   .05      .55      .06   .08

Note. * p < .05, ** p < .01, *** p < .001 (two-tailed). AUC = Area under the curve. SE = Standard error. r = Pearson point-biserial correlation. Violent offending = excluding sexual offenses.


Other considerations that were frequently coded on the SVR-20 are lack of coping skills,

suggestibility, impulsivity, failure of prior treatment(s), social isolation, lack of social and emotional

support, financial problems, and preoccupation with non-deviant sex (hypersexuality).

Discussion

The present study is the first one in The Netherlands to assess the predictive validity of risk assessment

instruments for sexual violence. The base rates for sexual recidivism found in this study are

comparable to those found in other studies with long follow-up periods (e.g., Prentky et al., 1997).

Furthermore, the present study yielded good interrater reliability for the SVR-20 total and subscale

scores and the Static-99 items and total score. However, for the final risk judgment and some

individual SVR-20 items we found only a moderate or - for two items - poor interrater reliability. An

explanation for this was lack of information in the files to code specific items, for example, item 17

(‘Extreme minimization or denial of sex offenses’). In addition, the description of some items is rather

broad and therefore open to multiple interpretations. In particular, coding item 1 (‘Sexual deviance’)

for rapists raised a lot of discussion because clear and objective criteria for sexual deviance are

lacking. This is especially true for forensic assessment in The Netherlands where phallometric

methods to assess sexual deviance are not generally used.

Regarding the predictive accuracy, we found a difference between the actuarial instrument and the SPJ

guideline. Although the difference in AUC values of the SVR-20 total score and Static-99 total score

was only marginally significant, the SVR-20 final risk judgment was significantly more accurate in

predicting sexual recidivism than the Static-99 risk category. Although both instruments were


specifically designed to predict sexual violence, it turned out that the SVR-20 also had significant

predictive accuracy for non-sexual violent and general offenses. The SVR-20 subscale Sexual offenses

was specifically predictive for sexual reoffending, whereas the subscales Psychosocial adjustment and

Future plans also predicted non-sexual violent and general offenses. Furthermore, the final risk

judgment added significant incremental validity to the SVR-20 subscale scores. Thus, using the

SVR-20 as an SPJ method seems to yield greater accuracy than using it as an actuarial tool (i.e.,

summing the individual item scores). This finding is in line with Dempster (1998) who found

incremental predictive validity for the SVR-20 structured final risk judgment relative to the actuarial

SVR-20 (i.e., the SVR-20 total score). The same was found for the SARA (Kropp et al., 1999), a SPJ

guideline for the assessment of relational violence (Kropp & Hart, 2000) and the HCR-20 (Douglas et

al., 2003; de Vogel & de Ruiter, submitted for publication, see Chapter 6).

The Static-99 exhibited a moderate predictive accuracy for sexual reoffending, and no significant

predictive validity for non-sexual violent or general offenses. The predictive validity for sexual

reoffending resembles that documented by Hanson and Thornton (2000) and Sjöstedt and Långström

(2001). Contrary to their findings, we found no predictive validity for non-sexual violent and general

offenses. However, these authors adopted a different definition of violent offenses, which included

sexual offenses. We decided to exclude sexual offenses because we wanted to make a clear distinction

between sexual and non-sexual violent offenses. When we included sexual offenses in our violent

recidivism definition, we also found a significant predictive validity of the Static-99 for violent

offending. Furthermore, the Static-99 risk categories did not differentiate in our sample: 94% of the

sexual offenders were classified as medium high or high by this instrument. The mean Static-99 total

score in our sample (6.0) is extremely high (see Harris, Phenix, Hanson, & Thornton, 2003). Perhaps

the Static-99 will show better differentiation and predictive accuracy in other populations, such as

sexual offenders in outpatient settings. We suggest that – especially for settings where time, staff and

information about patients is limited - the Static-99 could serve as an instrument for a first, global

screening of sexual recidivism risk to decide whether more elaborate risk assessment with the SVR-20

is desirable.

There are a number of limitations to the present study. The first limitation has to do with a possible

rater effect, which may subsequently influence the generalizability of the findings. Only four raters

participated in this study, of which rater 1 (VdV) coded the majority of the cases. We believe this is a

greater limitation for the coding of the SVR-20. The Static-99 is rather straightforward to code and the

final risk category is the result of a fixed algorithm, thus, we do not believe that the rater effect can be

strong for this instrument. This hypothesis is confirmed by the excellent interrater reliability of the

Static-99 and the more moderate reliability of the SVR-20 final risk judgment. Another issue

concerning the raters is that some of them may have held (strong) views regarding the superiority of

the SPJ method to the actuarial method, which may have impacted the results. However, all raters

were blind to the recidivism data when they coded the patient file data and it is unlikely that a positive



expectancy regarding the SVR-20 alone could account for the large difference in predictive validity

between the two instruments. In future research, numerous different raters who are blind to the

outcome and to the hypothesis of the study are needed. Second, the sexual offenders formed a select

group, because they had severe psychological problems and had committed serious offenses for which

involuntary treatment was considered necessary. Although this group of sexual offenders is

representative of Dutch sexual offenders with a tbs-order (Van Emmerik & Brouwers, 2001), it is

probably not representative of sexual offenders in general. Both the SVR-20 and Static-99 show that the majority of

the sexual offenders in our study are a high-risk group. Third, recidivism data were retrieved from only

one source, the Judicial Documentation register of the Ministry of Justice. As a consequence, the

reconviction rate is inevitably an underestimation of the actual recidivism rate, because research has

revealed that many sexual offenses go undetected and not all sexual offenders are apprehended and

arrested (Groth et al., 1982; Weinrott & Saylor, 1991). Moreover, the Judicial Documentation data

cannot be considered as completely reliable for long term follow-up studies like the present study,

because - as stated by Dutch Criminal law - offenses that occurred 20 years or longer ago have to be

removed from the register (Dutch Criminal Code, Act of Judicial Documentation, section 7). Lastly,

there are limitations that relate to the retrospective design of this study. The quality of the files varied

substantially. Since the 1990s, treatment progress was documented much more carefully than in the

seventies and eighties, which may have influenced coding. This may have especially affected coding

the SVR-20 items, because these items comprise quite complex constructs such as sexual deviance and

escalation in sexual offending. In contrast, the Static-99 items are relatively easy to code from file

information. Furthermore, some of the sexual offenders were subjected to outdated treatment methods

that do not correspond to current best practice.

Future research will have to focus on groups of sexual offenders across different settings and contexts,

for instance, in the prison system and outpatient settings. Prospective research is recommended,

although a number of problems might be encountered. The most important problem is that prospective

predictive research will be hampered by the clinical goals of risk assessment, i.e., risk management

and prevention. Hart (1998a) stated that predictions of violence are not passive assessments, but

decisions that influence services delivered to individuals: “Clinicians are bound - morally, ethically,

and legally - to try to prove themselves wrong when they predict violence and take every reasonable

action to prevent violence” (Hart, 1998a, p. 365). Thus, when clinicians perform SVR-20 or HCR-20

risk assessments it is very likely that the outcome influences decisions concerning probationary leave

or termination of (mandatory) treatment and high-risk patients will not be released from the hospital.

Therefore, retrospective studies such as the present study are particularly suitable to assess

psychometric properties, most notably their predictive validity, of risk assessment instruments.

Risk assessment instruments should be regarded as ‘work in progress’ and further improvement of

these instruments is desirable (Webster et al., 1997b). More specifically, attention needs to be paid to

the development and refinement of dynamic risk factors and protective factors, as well as theoretical


models that explain the relationship between risk factors and actual offending. Our study has directed

attention to a number of other risk factors, some of which might be valuable additions to the SVR-20,

for instance, hypersexuality and social isolation. Finally, we should bear in mind that one of the most

important goals of structured risk assessment is to gain insight into strategies to diminish risk for

sexual and violent behavior.

CHAPTER 4

Type of discharge and risk of recidivism measured by the HCR-20: A retrospective study in a Dutch sample of treated forensic psychiatric patients

This chapter is a slightly revised version of Vogel, V. de, Ruiter, C. de, Hildebrand, M., Bos, B., & Ven, P. van de (2004). Type of discharge and risk of recidivism measured by the HCR-20. A retrospective study in a Dutch sample of treated forensic psychiatric patients. International Journal of Forensic Mental Health, 3, 149-165. The authors wish to thank Harry Houtman and Quinta Appeldoorn for their assistance in retrieving and screening file information.

This retrospective study examined the predictive validity of the HCR-20, a violence risk assessment instrument. The HCR-20 as well as the PCL-R were coded on the basis of file information of 120 patients discharged from the Dr. Henri van der Hoeven Kliniek between 1993 and 1999 (average follow-up period 72.5 months). The patients were divided into four groups according to their type of discharge: 1) discharge by the court in line with the hospital staff’s advice and after a transmural phase; 2) discharge by the court in line with the hospital staff’s advice, but without a preceding transmural phase; 3) discharge by the court against the hospital staff’s advice; and 4) readmission to another institution. Recidivism data (reconvictions) from the Ministry of Justice were related to the risk assessments. The base rate for violent recidivism was 36%, and 52% for general recidivism. The HCR-20 and PCL-R total scores demonstrated good predictive validity for violent recidivism (AUC = .82 and .75, respectively). The HCR-20 was a significantly better predictor of violent recidivism than unstructured clinical judgment stated in hospital staff’s advice to the court. In addition, the HCR-20 total score predicted significantly better than the PCL-R total score, although the difference in AUC values was no longer significant when the item ‘Psychopathy’ was removed from the HCR-20 total score.

When is a forensic psychiatric patient ready to leave the secured institution without posing a serious risk to society? In The Netherlands, society is regularly confronted with serious violent recidivism by forensic psychiatric patients during probationary leave or after discharge (Hilterman, 2001, 2004). Violent (re)offending by patients who are admitted under a judicial order causes strong feelings of fear, anger, and concern in society. A carefully conducted risk assessment before a probationary leave, parole decision, or termination of (mandatory) treatment can help to appraise the risk of recidivism in an adequate way and thereby prevent serious violent offenses (Douglas & Webster, 1999a). To date, the most widely used method in forensic practice, at least in The Netherlands, is the unstructured clinical judgment approach that is exclusively based on the professional expertise of the clinician. However, research has revealed some important limitations of this unstructured clinical judgment, such as poor reliability and validity (Monahan, 1981; see for a discussion of these limitations Quinsey et al., 1998, pp. 55-72). Although more recent studies have demonstrated clinical accuracy to be significantly better than chance, unstructured clinical judgment is liable to systematic biases. For example, clinicians were found to be accurate in predicting risk of recidivism in male cases with a violent history, but they underestimated the risk of violence in female psychiatric patients and overestimated the risk of violence in nonwhite men (Lidz et al., 1993; McNiel & Binder, 1995).1 Therefore, several authors have recommended employing structured risk assessment procedures in order to optimize accuracy and validity (Borum, 1996; Webster et al., 1997a).

A risk assessment instrument that has drawn considerable international attention is the HCR-20

(Webster et al., 1997b). The HCR-20 is a checklist consistent with a structured professional judgment

(SPJ) approach and consists of 20 items representing risk factors for violence in the past (Historical

scale), present (Clinical scale) and future (Risk management scale). Research in various psychiatric

and forensic settings in different countries has demonstrated good interrater reliability and predictive

validity for the HCR-20 (Douglas et al., 2005). For instance, Douglas and colleagues (2003) found

good predictive validity for the HCR-20 in a sample of 100 forensic psychiatric patients. Moreover,

they demonstrated that the SPJ final risk judgments added incremental validity to the HCR-20 used in

an actuarial sense.

The way a forensic psychiatric patient leaves a treatment institution inevitably impacts on the risk of

recidivism. Research suggests that involuntary outpatient commitment following residential treatment

or a resocialization period in which the patient is supervised by probation officers, results in less

recidivism (Niemantsverdriet, 1993; Swanson et al., 2000). In The Netherlands, the transmural phase –

a resocialization phase in which the patient lives outside the secure forensic hospital, but is still

supervised and treated by staff from the hospital – is gaining in popularity. The rationale behind this

form of treatment is that patients are gradually and thoroughly prepared for their return to society

resulting in better integration and less relapse in violent behavior. Since 1991, patients involuntarily

admitted to the Dr. Henri van der Hoeven Kliniek can be discharged after having passed this

transmural treatment phase. Although the clinical experiences with this form of treatment are positive,

no systematic research examining the relationship between transmural treatment and (the risk of)

recidivism has been conducted thus far.

In this chapter, we present findings from a retrospective study on the interrater reliability and

predictive validity of the HCR-20 in a group of patients who were discharged between 1993 and 1999

from the Dr. Henri van der Hoeven Kliniek. Most of the patients in this forensic psychiatric hospital

were admitted under the judicial order terbeschikkingstelling (tbs) which can be translated as ‘disposal

to be treated on behalf of the state’. The tbs-order is imposed by the court on offenders who committed

a serious offense and are considered to have diminished responsibility for it because of severe

psychopathology. According to the Dutch Criminal Code, the court has to re-evaluate the patient every

one or two years (the latter period being set by the previous sentence) to determine whether the risk of

recidivism is still too high and treatment needs to be continued. At these annual or biennial reviews, the

hospital has to provide the court with a detailed description and evaluation of a patient’s treatment and

1 For a detailed discussion of the clinical-actuarial controversy, see the reviews of Douglas, Cox, and Webster (1999) and Litwack (2001).



a judgment about the risk of recidivism. The decision to terminate the tbs-order can only be made by

the court. We compared (the risk of) recidivism between four groups of offenders who are categorized

according to their type of discharge: 1) discharge by the court in line with the hospital staff’s advice

and after a transmural phase; 2) discharge by the court in line with the hospital staff’s advice, but

without a preceding transmural phase; 3) discharge by the court against the hospital staff’s advice,

with or without a transmural phase; and 4) readmission to another secure institution. These types of

discharge reflect different unstructured clinical judgments. Discharge in line with the hospital staff’s

advice after a transmural phase reflects the lowest judgment of risk, readmission to another secure

institution is considered the highest level of risk. Patients who are readmitted to another secure

forensic hospital have generally exhibited severe disruptive behavior (e.g., escape, aggressive

incidents), which could not be managed by hospital staff by other means (e.g., medication, highly

structured/individualized treatment).

The main objective of the present study was to determine the value of the HCR-20 in the prediction of

violence in Dutch forensic psychiatric patients and to identify differences in (risk of) recidivism

between the four groups. We hypothesized that patients who were discharged after a successful

transmural phase would recidivate less than patients in the other three groups because of a solid

preparation before return to society. In this way, it is possible to retrospectively examine if there is a

significant association between actual recidivism and the hospital staff’s perceived risk of recidivism

as stated in the advice to the court. We employed the PCL-R (Hare, 1991, 2003), because psychopathy

is one of the important historical risk factors in the HCR-20, and also examined its predictive validity.

Method

Setting



Different types of discharge

In 1991, a new form of resocialization was started in the Dr. Henri van der Hoeven Kliniek: the

transmural treatment phase. Before 1991, patients were usually discharged after a probationary leave

in which they were supervised by probation officers. However, for most patients the change from

intramural treatment to probationary leave was too abrupt because of discontinuity in care. The goal of

the new transmural treatment is to allow patients a gradual adjustment to society. The hospital has

purchased and rented several houses in the city of Utrecht and also established a collaborative

agreement with a sheltered housing organization. During the transmural phase, the patient lives outside

the hospital, but is still treated and supervised by a specialized team from the hospital, sometimes in

collaboration with staff from the sheltered housing organization. The task of this specialized team,


which has regular contacts with the patient, is to supervise the patient and to be attentive to possible

precursors of criminal or violent relapse. With the team’s help, the patient can practice living on his

own, learn to resist temptations, build a social network and leisure activities, and apply the insights

and skills from the relapse prevention plan that was made during psychotherapy in the hospital.

Every one or two years, the court decides to terminate or prolong the tbs-order. There are two

possibilities: discharge in line with the hospital staff’s advice or discharge against the hospital staff’s

advice. Patients can be discharged after having passed the transmural phase or probation period or

directly from the hospital without a resocialization phase. One reason for the court to terminate the tbs-

order against the hospital staff’s advice is the principle of proportionality in which the court considers

the duration of treatment no longer reasonable and / or compatible with the (maximum) length of

imprisonment applicable to the index offense committed. Another reason may be that the judges do not

agree with the hospital staff’s appraisal of the recidivism risk of the patient. The fourth type of

discharge occurs if the hospital decides to ask the Ministry of Justice for a readmission to another

forensic institution. This usually takes place in cases of severe disruptive incidents and when the

relationship between the patient and hospital staff is disturbed to such an extent that a positive effect of

further treatment is considered highly unlikely. It should be noted that most of the patients in this latter

group suffer from severe personality disorders, not merely from Axis I disorders such as

schizophrenia.

In the present study, we identified four types of discharge:

1) Transmural. The patient was discharged by the court in line with the hospital staff’s advice, after having passed the transmural phase;

2) Conform. The patient was discharged by the court in line with the hospital staff’s advice, without a preceding transmural phase;

3) Contrary. The patient was discharged by the court against the hospital staff’s advice, in some cases with a transmural phase;

4) Readmission. The treatment was not terminated; instead, the patient was readmitted to another forensic psychiatric hospital or to a penitentiary institution.

Subjects

The sample consisted of 120 patients who were treated during a period of at least one year² in the Dr.

Henri van der Hoeven Kliniek and were discharged between January 1993 and December 1999. This

time period was chosen because the first transmural patients were discharged in the year 1993, and a

follow-up period of at least three years is recommended (Dolan & Coid, 1993). Between 1993 and

1999, 150 patients were discharged from the hospital. Of these 150 patients, 30 were discharged by the

2 There were three exceptions: two patients from the Contrary group stayed in the hospital for four and seven months, respectively. One patient left the hospital after ten months. These three patients were admitted from other forensic psychiatric institutions where they had been treated during a minimum period of three years.



court in line with the hospital staff’s advice after a transmural phase. Subsequently, we selected 90

patients who could be divided into three groups of 30 in accordance with the above-described types of

discharge: Conform, Contrary and Readmission. It should be noted that the majority of the non-selected

patients could not be included in one of the groups because they had a different judicial status. From

the Transmural group, three patients moved to sheltered housing, the rest to their own or their family’s

home. Five patients of the Conform group moved voluntarily to a non-secure psychiatric institution for

further treatment, two to sheltered housing and the rest to their own or their family’s home. All

patients from the Contrary group moved to their own or their family’s home. Eleven patients from this group

were discharged while they were still in the transmural phase, and 19 while they were still in the

hospital. Patients from the Readmission group were transferred to another forensic psychiatric hospital

(6), a selection institution for forensic psychiatric patients (12), or a penitentiary institution (11). The

residence of one patient was unknown. At the end of the study, December 2002, the place of residence

of the readmitted patients was searched in a national computer system containing data on all offenders

sentenced to the tbs-order: 17 patients were still in a forensic psychiatric hospital under the tbs-order,

eight patients were discharged because the court had terminated their tbs-order, two patients had

unauthorized absences and one patient had died in a forensic psychiatric hospital. The residence of two

patients could not be retrieved from the computer system. Table 1 presents demographic, criminal and

treatment characteristics of the sample.

The majority of the patient sample was male, Dutch, single and unemployed at the time of the index

offense. Most patients in Dutch forensic psychiatric hospitals suffer from comorbid Axis II disorders

(according to the DSM-IV; APA, 1994), particularly cluster B disorders (see Hildebrand & de Ruiter,

2004; de Ruiter & Greeven, 2000). In general, substance use disorders occur in about 60% of all cases,

often in combination with Axis I and / or Axis II disorders; pure Axis I disorders (i.e., schizophrenia,

affective disorders, paraphilia) are present in about 5% of the patients. Forty percent of the sample had

committed homicide or attempted homicide (in 67% resulting in the death of the victim), 25% a sexual

offense, 21% a violent offense, 1% a property offense and 13% arson (in 94% with danger to persons).



Table 1. Demographic, criminal and treatment characteristics of the sample

| Characteristic | Transmural (n = 30) | Conform (n = 30) | Contrary (n = 30) | Readmission (n = 30) | Total (N = 120) |
|---|---|---|---|---|---|
| Demographic | | | | | |
| Mean age upon admission | 26.7 | 26.1 | 25.7 | 23.4 | 25.5 |
| Male | 26 (87%) | 24 (80%)^a | 28 (93%) | 29 (97%)^b | 107 (89%) |
| Dutch nationality | 28 (93%) | 24 (80%) | 26 (87%) | 23 (77%) | 101 (84%) |
| Upbringing in foster or children’s home | 12 (40%) | 14 (47%) | 13 (43%) | 18 (60%) | 57 (48%) |
| Single (at the time of the index offense) | 21 (70%) | 23 (77%) | 21 (70%) | 24 (80%) | 89 (74%) |
| No education after primary school | 11 (37%) | 14 (47%) | 12 (40%) | 17 (57%) | 54 (45%) |
| Unemployed | 12 (40%)^a | 13 (43%) | 16 (53%) | 21 (70%)^b | 62 (52%) |
| Psychiatric | | | | | |
| Out-patient treatment(s) | 8 (27%) | 6 (20%) | 10 (33%) | 3 (10%) | 27 (23%) |
| Inpatient admission(s) | 10 (33%)^c | 10 (33%)^c | 12 (40%) | 20 (67%)^d | 52 (43%) |
| Substance(s) abuse | 17 (57%)^c | 19 (63%)^a | 24 (80%) | 26 (87%)^b,d | 86 (72%) |
| Axis I disorder | 4 (13%) | 7 (24%) | 1 (3%) | 7 (24%) | 19 (16%) |
| Axis II disorder | 23 (77%) | 19 (63%) | 26 (87%) | 20 (69%) | 88 (74%) |
| Mean intelligence score | 106.0^b,f | 104.8^b | 100.1^e | 94.4^a | 101.1 |
| Index offenses | | | | | |
| (Attempted) homicide | 14 (47%) | 11 (37%) | 9 (30%) | 14 (47%) | 48 (40%) |
| Sex offense | 4 (13%)^a | 6 (20%) | 14 (47%)^b | 6 (20%) | 30 (25%) |
| Violent offense | 4 (13%) | 7 (23%) | 5 (17%) | 9 (30%) | 25 (21%) |
| Property offense | 1 (3%) | - | - | - | 1 (1%) |
| Arson | 7 (23%)^b | 6 (20%)^b | 2 (7%) | 1 (3%)^a | 16 (13%) |
| Victim was not a stranger | 13 (48%) | 14 (47%) | 11 (37%) | 13 (48%) | 51 (43%) |
| Mean duration of imprisonment in months | 17.1 | 16.9 | 19.0 | 15.5 | 17.1 |
| Mean number of previous convictions | 1.7^a | 1.6^a,e | 3.0^f | 4.2^b | 2.6 |
| Mean age at first conviction | 22.2^d | 21.1^d | 18.7^f | 16.3^c,e | 19.6 |
| Treatment | | | | | |
| Mean duration of treatment in months | 66.0^b | 54.4 | 67.1^b | 47.3^a | 58.7 |
| Treatment included a probationary period | 18 (60%)^d | 15 (50%)^d | 16 (53%)^d | 4 (13%)^c | 53 (44%) |
| Serious incidents during treatment | 10 (33%)^c | 14 (47%)^c | 15 (50%)^c | 26 (87%)^d | 65 (54%) |
| Secluded in isolation room during treatment | 2 (7%)^a | 6 (20%)^a | 5 (17%)^a | 14 (47%)^b | 23 (19%) |
| Escaped from the hospital | 7 (23%)^c | 7 (23%)^c | 8 (27%)^c | 23 (77%)^d | 45 (38%) |

Note. a < b, p < .05. c < d, p < .01. e < f, p < .05 (two-tailed). The differences were investigated with the F-test or chi-square analysis. Number of patients whose intelligence scores were available: 17 transmural, 14 conform, 14 contrary, 18 readmission. Serious incidents = incidents for which the patient was secluded for at least two days in own room, recovery room or isolation room.

The table shows a number of significant differences between the four groups. Overall, the Readmission

group had more unfavorable demographic characteristics, especially when compared to the

Transmural and Conform groups: they more often had no work, multiple substance abuse, prior

admissions to inpatient psychiatric hospitals, obtained lower scores on intelligence scales and, albeit

not significantly, more often grew up in foster care or institutional care and had a lower level of

education. Regarding criminal characteristics, it is notable that arsonists are over-represented in the

Transmural and Conform group and sex offenders in the Contrary group. Furthermore, we found

significant differences in the number of previous convictions and mean age at first conviction.



Readmitted patients compared to the other groups had more previous convictions and were younger at

their first conviction. Concerning behavior during treatment, the Readmission group compared to the

other groups had more unauthorized absences (escapes from hospital or not returning from leave for at

least two days) and had caused more serious incidents (e.g., violent behavior or drug dealing) for

which they were secluded for at least two days in their own room, a recovery room or an isolation

room.

Instruments

HCR-20






Procedure

File information was gathered on 120 patients who were discharged between January 1993 and

December 1999 (admission between February 1984 and December 1996). In general, these files

consisted of psychological reports, reports to the court regarding treatment progress, treatment plans

and evaluations. Prior to coding, the files were screened by a research assistant who removed the

outcome of the hospital staff’s advice to the court. Next, we coded the Dutch versions of the HCR-20

and PCL-R on the basis of the file information. All information until the moment the patient left the

intramural setting was used. All raters were trained in coding the HCR-20 and PCL-R. The rating

procedure was performed while all raters were blind to the outcome (i.e., recidivism) and to the type of

discharge. It should be emphasized that the clinical decisions were not influenced by the HCR-20 and

PCL-R codings. At the time the patients in this sample were in treatment, hospital staff did not have

access to PCL-R scores, and the HCR-20 was not yet used in the hospital. In order to establish the

interrater reliability, three raters (in different compositions out of a group of four raters) independently

rated 30 cases that were randomly selected from the 120 cases and agreed upon a consensus score.

This consensus score was used for the analyses on predictive validity. Subsequently, the remaining

cases were divided among three raters.

Recidivism data

After all the files had been coded, recidivism data were retrieved from the Judicial Documentation

register of the Dutch Ministry of Justice. Recidivism was defined as a new conviction by the court for

an offense in accordance with Dutch criminal law. For the identification of violent offenses, we used


the HCR-20 definition of violence: “violence is actual, attempted, or threatened harm to a person or

persons” (Webster et al., 1997b, p. 24). Furthermore, we explored new convictions for general

offenses, including property offenses, traffic offenses and drug-related offenses. The follow-up period,

starting on the date of discharge by the court or readmission to another institution and ending on the

date of data gathering (December 1, 2002), varied from 36 to 114 months with an average of 72.5

months (SD = 22.7, Median = 71.0). The follow-up period for the Transmural group (60.9 months)

was significantly shorter compared to the period of the Conform (78.1 months; F (58, 60) = 5.9, p <

.01) and Readmission group (79.9 months; F (57, 59) = 4.3, p < .01), but not the Contrary group (70.8

months; F (58, 60) = 15.7, p = .07).


Survival analysis, also referred to as the Kaplan-Meier method, was used to calculate recidivism rates

(Schmidt & Witte, 1988). This type of analysis takes into account the time the offender has been at

risk. Thus, it is possible to calculate the recidivism rate for a specific period despite the fact that the

follow-up periods of the patients diverge. The log rank statistic was used to test differences between

the four groups. The F-test was used to examine differences between the four groups in PCL-R and

HCR-20 mean scores; for differences in HCR-20 final risk judgments and psychopathy diagnoses

(PCL-R ≥ 26) we used chi-square analysis. The interrater reliability was examined by means of the

intraclass correlation coefficient (ICC), using the two-way random effect variance model and

consistency type (Shrout & Fleiss, 1979). Critical values we applied for single measure ICCs were:

ICC ≥ .75 = excellent; .60 ≤ ICC < .75 = good; .40 ≤ ICC < .60 = moderate; ICC < .40 = poor (Fleiss,

1986). The predictive validity of both instruments and the unstructured clinical judgment was

established with receiver operating characteristics (ROC) analyses (see for a description of this method

Douglas et al., 2005; Mossman, 1994; Rice & Harris, 1995; see also Chapter 3, p. 71). To compare the

obtained AUC values of the HCR-20, PCL-R and unstructured clinical judgment, we used the software

program AccuROC (Vida, 1997) which applies the non-parametric method described by DeLong et al.

(1988). Pearson point-biserial correlations were calculated for comparative purposes. Furthermore,

Cox regression analyses, which result in the hazard ratio (e^B) that can be interpreted as the relative

risk, were conducted to evaluate whether the HCR-20 and PCL-R total scores and HCR-20 final risk

judgments add incremental validity to type of discharge as a predictor of violent recidivism. All

analyses were conducted using SPSS version 11.
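The Kaplan-Meier method described above can be illustrated with a minimal sketch. This is not the SPSS procedure used in the study; the function name `kaplan_meier` and the toy follow-up data are hypothetical and only serve to show how censored follow-up times (patients who had not been reconvicted when data gathering ended) enter the estimate.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] for each time with an event.

    durations: follow-up time per patient (e.g., months at risk)
    events:    True if the patient was reconvicted at that time,
               False if the observation was censored (no reconviction yet)
    """
    event_counts = Counter(t for t, e in zip(durations, events) if e)
    survival = 1.0
    curve = []
    for t in sorted(event_counts):
        # Patients still under observation just before time t
        at_risk = sum(1 for d in durations if d >= t)
        survival *= 1.0 - event_counts[t] / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical toy data, NOT taken from the study
months = [12, 20, 20, 36, 48, 60, 72, 90]
reconvicted = [True, True, False, False, True, False, True, False]

curve = kaplan_meier(months, reconvicted)
raw_rate = sum(reconvicted) / len(reconvicted)   # ignores time at risk
failure_rate = 1.0 - curve[-1][1]                # accounts for time at risk
```

In this toy example the survival-based failure rate is higher than the raw proportion of reconvicted patients, because censored patients leave the risk set before later events occur. The same mechanism explains why a recidivism rate computed with survival analysis can exceed the raw percentage.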

Results

Reconviction rates

Figure 1 presents the survival curves for violent reconvictions. As can be seen from the starting point

of these curves, a proportion of the patients already recidivated during their tbs-order. This was



especially the case for the Readmission group; eleven of seventeen still detained patients violently

recidivated during their tbs-order, either during their admission to the Dr. Henri van der Hoeven

Kliniek or to another institution.

In total, 36% of the patients recidivated with a violent offense, and when we accounted for the time the

patients had been at risk and used survival analysis this percentage was 39. There were significant

differences in the failure rates (computed with survival analyses) between the four groups on violent

recidivism: the Readmission group recidivated significantly more compared to the Transmural,

Conform and Contrary groups (violent recidivism rates 67 versus 27, 19 and 44, respectively; log rank

(3, 119) = 23.3, p < .001). The recidivism rate of the Contrary group was significantly higher

compared to the Conform group (log rank (1, 60) = 4.4, p < .05), but not to the Transmural group.

However, when we included the eleven patients who were discharged by the court against the hospital

staff’s advice while they were still in the transmural phase in the Transmural group instead of in the

Contrary group, the difference in violent recidivism between the Transmural and Contrary group was

significant (violent recidivism rates 22 versus 59, log rank (1, 60) = 5.2, p < .05). Patients with a PCL-

R score of 26 or above recidivated significantly more than patients with a lower score than 26 (violent

recidivism rates 69 versus 31, log rank (1, 119) = 19.7, p < .001; Odds ratio = 5.4, CI = 2.1 – 13.5),

and patients who scored above the HCR-20 median (= 26) recidivated significantly more compared to

those who scored below the median (violent recidivism rates 64 versus 28, log rank (1, 119) = 25.3, p

< .001; Odds ratio = 8.4, CI = 3.5 – 20.3). Furthermore, 52% of the total group was re-convicted for

any offense (all offenses), and when we accounted for time at risk and used survival analyses, this

percentage was 72. There were significant differences between the Transmural and Conform groups on the

one hand and the Readmission group on the other (50 and 63 versus 79, respectively; log rank

(3, 119) = 11.7, p < .01). The recidivism rate of the Contrary group (71%) did not significantly differ

from the other three groups.
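For readers unfamiliar with the odds ratios and confidence intervals reported above, the following sketch shows one common way to obtain them from a 2 x 2 table. The chapter does not state which interval method was used; the log-based (Woolf) interval below is an assumption, and the cell counts are hypothetical, not the study's data.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI from a 2 x 2 table.

    a, b: recidivists / non-recidivists in the high-score group
    c, d: recidivists / non-recidivists in the low-score group
    Uses the log (Woolf) method, which is an assumption here.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only
odds_ratio, lower, upper = odds_ratio_with_ci(18, 10, 25, 66)
```

An interval that excludes 1, as in the results reported above, indicates that the odds of violent recidivism differ significantly between the two score groups.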


Figure 1. Kaplan-Meier survival curve for violent re-offending during or after tbs-order (N=119)


The overall interrater reliability of the HCR-20 was good. The Historical scale, Clinical scale and total

score showed excellent reliability (ICCs = .89, .76, and .83, respectively), the final risk judgment good

reliability (ICC = .73), and the Risk management scale moderate reliability (ICC = .58). Two items -

both from the Risk management scale - demonstrated poor interrater reliability: ‘Lack of personal

support’ and ‘Stress’ (ICCs = .33 and .31, respectively).
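The single-measure, consistency-type ICC and the critical values named in the statistical analysis can be sketched as follows. This is a minimal illustration under the standard two-way ANOVA formula, (MS_rows − MS_error) / (MS_rows + (k − 1) MS_error), and not the SPSS routine used in the study; the function names and the example ratings are hypothetical.

```python
def icc_consistency_single(ratings):
    """Single-measure, consistency-type ICC from a two-way layout.

    ratings: list of rows, one row per case, one column per rater.
    """
    n = len(ratings)        # number of cases
    k = len(ratings[0])     # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


def icc_label(icc):
    """Classify an ICC with the critical values cited in the text (Fleiss, 1986)."""
    if icc >= 0.75:
        return "excellent"
    if icc >= 0.60:
        return "good"
    if icc >= 0.40:
        return "moderate"
    return "poor"


# Hypothetical ratings: three raters who differ only by a constant offset,
# so the consistency-type ICC is (numerically) 1.
offset_only = [[1, 2, 3], [2, 3, 4], [4, 5, 6]]
perfect = icc_consistency_single(offset_only)

# Labels for the coefficients reported in this chapter
labels = {v: icc_label(v) for v in (0.89, 0.76, 0.83, 0.73, 0.58, 0.33, 0.31)}
```

Because the consistency type ignores systematic differences in level between raters, a rater who scores every case one point higher than a colleague does not lower the coefficient.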

Risk judgments and diagnosis of psychopathy

Table 2 shows the mean HCR-20 and PCL-R scores of the four groups. The Readmission group

compared to the other three groups had significantly higher scores on the Historical scale of the HCR-

20 and the PCL-R total score. Furthermore, the Contrary group obtained higher HCR-20 (subscales)

scores and PCL-R total scores compared to the Transmural and Conform groups, but lower scores on

the Historical scale and PCL-R total than the Readmission group. The final risk judgments and

diagnosis of psychopathy (PCL-R ≥ 26) are also presented in Table 2. Again, the Readmission group

compared to the Transmural and Conform group had more unfavorable judgments; almost all

readmitted patients were judged to pose a high risk and half of them fulfilled the criteria for

psychopathy. The Contrary group had more high risk judgments compared to the Transmural and

Conform group, but less compared to the Readmission group.

[Figure 1 plot: cumulative survival rate (y-axis, 0.0 to 1.0) against follow-up period in years (x-axis, 0 to 7), with one curve each for the Transmural, Conform, Contrary and Readmission groups.]

Table 2. Mean HCR-20 and PCL-R scores (with SD in brackets), final risk judgments and diagnosis of psychopathy (PCL-R ≥ 26)

| | Transmural (n = 30) | Conform (n = 30) | Contrary (n = 30) | Readmission (n = 30) | Total (N = 120) |
|---|---|---|---|---|---|
| HCR-20 | | | | | |
| Historical scale | 12.6 (2.7)^a | 12.8 (2.9)^a | 14.6 (2.6)^b,c | 16.0 (2.6)^b,d | 14.0 (3.0) |
| Clinical scale | 3.7 (1.6)^a | 4.3 (2.1)^c | 5.4 (1.9)^b,d | 7.0 (1.3)^b,d | 5.1 (2.1) |
| Risk management scale | 6.5 (2.1)^c | 5.6 (1.7)^c | 7.6 (1.7)^d | 9.1 (1.1)^d | 7.2 (2.1) |
| Total score | 22.8 (5.3)^a | 22.8 (5.6)^a | 27.6 (5.4)^b | 32.0 (4.3)^b | 26.3 (6.4) |
| Risk judgment: low | 7 (23%)^d | 6 (20%)^d | 1 (3%)^c | 0 (0%)^c | 14 (12%) |
| Risk judgment: moderate | 15 (50%)^b | 17 (57%)^b | 13 (43%)^b | 2 (7%)^a | 47 (39%) |
| Risk judgment: high | 8 (27%)^a,c | 7 (23%)^a,c | 16 (53%)^a,d | 28 (93%)^b | 59 (49%) |
| PCL-R | | | | | |
| 1st ed. Factor 1 | 6.1 (2.7)^a | 6.8 (3.5) | 7.9 (4.2) | 8.5 (3.8)^b | 7.3 (3.7) |
| 1st ed. Factor 2 | 7.9 (4.3)^a | 8.8 (4.2)^a | 10.2 (4.8)^c | 14.5 (3.3)^b,d | 10.4 (4.8) |
| 2nd ed. Interpersonal | 1.8 (1.5)^c | 2.4 (2.0) | 2.8 (2.4) | 2.9 (2.4)^d | 2.5 (2.1) |
| 2nd ed. Affective | 4.1 (1.4)^a,c | 4.6 (1.9) | 5.0 (2.0)^d | 5.4 (1.7)^b | 4.8 (1.8) |
| 2nd ed. Lifestyle | 4.7 (2.3)^a | 5.0 (2.3)^a | 5.6 (2.8)^a | 8.1 (1.8)^b | 5.8 (2.7) |
| 2nd ed. Antisocial | 3.6 (2.5)^a,c | 3.9 (2.4)^a | 5.2 (2.7)^a,d | 7.0 (1.9)^b | 4.9 (1.9) |
| Total score | 15.4 (5.7)^a | 17.0 (6.7)^a | 20.2 (8.3)^b,c | 25.3 (6.3)^b,d | 19.5 (7.7) |
| PCL-R ≥ 26 | 2 (7%)^a,c | 3 (10%)^a | 8 (27%)^d | 15 (50%)^b | 28 (23%) |

Note. a < b, p < .01. c < d, p < .05 (two-tailed). 1st ed. = Hare’s PCL-R (Hare, 1991). 2nd ed. = Hare’s PCL-R Second edition (Hare, 2003).

Predictive validity

Table 3 shows the AUC values and Pearson correlations of the HCR-20, PCL-R and unstructured

clinical judgment³ for violent and general recidivism. Figure 2 presents the ROC curves for the HCR-20,
PCL-R and unstructured clinical judgment for violent re-offending. The AUC values for violent
offenses were significantly above .50 for the (subscales of the) HCR-20, the (factors of the) PCL-R
and the unstructured clinical judgment. However, the AUC values for the three measures differed
significantly. The HCR-20 (Historical and Risk management scale, total score and final risk judgment)
was significantly more accurate in predicting violent recidivism than the unstructured clinical
judgment (χ² (1, 119) = 4.4, 4.2, 7.4 and 4.5, respectively, p < .05). In addition, the HCR-20 total score
predicted significantly better than the PCL-R total score (χ² (1, 119) = 4.5, p < .05). When the item

‘Psychopathy’ was removed from the HCR-20 total score, the AUC value of the HCR-20 total score

changed minimally from .822 (HCR-20 including item ‘Psychopathy’) to .817 (HCR-20 excluding

item ‘Psychopathy’). Although this change was very small, the difference in predictive validity

3 For the analyses on predictive validity, the four types of discharge were considered as a 4-point scale: transmural = 1 (lowest risk), conform = 2, contrary = 3, and readmission = 4 (highest risk).


between the HCR-20 total score and PCL-R total score was no longer significant (χ² (1, 119) = 3.2, p = .08).

Table 3. Predictive validity of the HCR-20 and PCL-R (N=119)

| | Violent offending: AUC | Violent offending: SE | Violent offending: r | General offending: AUC | General offending: SE | General offending: r |
|---|---|---|---|---|---|---|
| HCR-20 | | | | | | |
| Historical scale | .80*** | .04 | .47** | .70*** | .05 | .34** |
| Clinical scale | .77*** | .04 | .46** | .67** | .05 | .30** |
| Risk management scale | .79*** | .04 | .47** | .67** | .05 | .30** |
| Total score | .82*** | .04 | .52** | .70*** | .05 | .35** |
| Final risk judgment | .79*** | .04 | .51** | .66** | .05 | .30** |
| PCL-R | | | | | | |
| 1st ed. Factor 1 | .63** | .05 | .23* | .63* | .05 | .20* |
| 1st ed. Factor 2 | .79*** | .04 | .47** | .70*** | .05 | .33** |
| 2nd ed. Interpersonal | .55 | .06 | .12 | .58 | .05 | .13 |
| 2nd ed. Affective | .67** | .05 | .29** | .62* | .05 | .22* |
| 2nd ed. Lifestyle | .77*** | .05 | .45** | .71*** | .05 | .36** |
| 2nd ed. Antisocial | .77*** | .04 | .45** | .66** | .05 | .28** |
| Total score | .75*** | .05 | .43** | .68** | .05 | .43** |
| PCL-R ≥ 26 | .65** | .06 | .34** | .58 | .05 | .20* |
| Unstructured clinical judgment | .68** | .05 | .32** | .63* | .05 | .22* |

Note. *p < .05, ** p < .01, *** p < .001 (two-tailed). AUC = Area under the curve. SE = Standard error. r = Pearson point-biserial correlation. Violent offending = including sexual and homicide offenses. When item 7 ‘Psychopathy’ was removed from the HCR-20, the AUC value of the H scale was .779 and that of the HCR-20 total score .817. 1st ed. = Hare’s PCL-R (Hare, 1991). 2nd ed. = Hare’s PCL-R Second edition (Hare, 2003).
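To make the AUC values in Table 3 more concrete, the sketch below computes an AUC as the probability that a randomly chosen recidivist has a higher score than a randomly chosen non-recidivist, with ties counted as one half. It also shows how the four types of discharge can be coded as the 4-point scale described in footnote 3. The study itself used SPSS and AccuROC; the function `auc`, the dictionary `DISCHARGE_SCALE` and all data below are hypothetical illustrations.

```python
def auc(scores, outcomes):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    positives = [s for s, y in zip(scores, outcomes) if y]
    negatives = [s for s, y in zip(scores, outcomes) if not y]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Coding of the unstructured clinical judgment as described in footnote 3
DISCHARGE_SCALE = {"transmural": 1, "conform": 2, "contrary": 3, "readmission": 4}

# Hypothetical toy data for six patients, NOT taken from the study
violent_recidivism = [0, 0, 1, 0, 1, 1]
total_scores = [18, 22, 25, 27, 30, 33]
discharge_types = ["transmural", "contrary", "conform",
                   "conform", "readmission", "readmission"]

auc_scores = auc(total_scores, violent_recidivism)
auc_discharge = auc([DISCHARGE_SCALE[t] for t in discharge_types],
                    violent_recidivism)
```

An AUC of .50 corresponds to chance-level prediction and 1.0 to perfect separation of recidivists and non-recidivists. A coarse 4-point scale produces many tied pairs, which is one reason such a measure yields a stepwise ROC curve.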



Figure 2. ROC curves for HCR-20, PCL-R and unstructured clinical judgment for violent re-offending (N=119)

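Note. The HCR-20 total score includes the item ‘Psychopathy’. The plot shows sensitivity (y-axis) against 1-specificity (x-axis) for the unstructured clinical judgment, the PCL-R total score and the HCR-20 total score, together with a reference line.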

Next, we conducted Cox regression analyses. The unstructured clinical judgment was entered in block 1. The HCR-20 total score and PCL-R total score were entered in block 2 and the HCR-20 final risk judgment was entered in block 3, using the forward conditional method. The unstructured clinical judgment produced a significant model fit (χ² (1, 119) = 14.5, p < .001). The HCR-20 total score produced a significant improvement to the model’s fit (χ² change (1, 119) = 23.6, p < .001). Finally, the HCR-20 final risk judgment produced a significant improvement to the model’s fit (χ² change (1, 119) = 5.3, p < .05). In the final model, the HCR-20 total score (e^B = 1.1, 95% CI = 1.0-1.2) and final risk judgment (e^B = 3.1, 95% CI = 1.2-8.4) were significant predictors of violent recidivism.
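The hazard ratios reported above are per one-unit increase in the predictor. As a rough illustration of what this implies, the sketch below compounds the per-point hazard ratio of the HCR-20 total score over a larger score difference. It assumes the proportional hazards form of the Cox model, with the other predictors held constant, and it uses the rounded value of 1.1 from the text, so the result is indicative only; the function name `relative_hazard` is hypothetical.

```python
import math

# Reported (rounded) hazard ratio e^B for a one-point increase in HCR-20 total score
HAZARD_RATIO_PER_POINT = 1.1
B = math.log(HAZARD_RATIO_PER_POINT)  # regression coefficient implied by e^B


def relative_hazard(score_difference):
    """Relative hazard for a given difference in total score: exp(B * difference)."""
    return math.exp(B * score_difference)


one_point = relative_hazard(1)
ten_points = relative_hazard(10)
```

Under these assumptions, a difference of ten points on the total score corresponds to a relative hazard of 1.1 raised to the tenth power, so a modest per-point ratio can still imply a substantial difference between low-scoring and high-scoring patients.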

Discussion

In this chapter, the relation between type of discharge and (risk of) recidivism was examined in a

group of treated forensic psychiatric patients. To our knowledge, this study is the first to compare

results of different risk assessment methods: unstructured clinical judgment (operationalized as type of

discharge), actuarial judgment (HCR-20 subscales and total scores) and structured professional

judgment (HCR-20 final risk judgment). The HCR-20 – both the actuarial scores and the final risk


judgment – was the best predictor of violent recidivism. The interrater reliability and predictive

validity of the HCR-20 we obtained are in line with previous findings (Douglas, 2001). In addition, the

PCL-R showed good predictive validity for violent re-offending; however, the HCR-20 total score

predicted significantly better than the PCL-R total score. Yet, the difference in AUC values was only

marginally significant after the item ‘Psychopathy’ was removed from the HCR-20 total score. This

trend is compatible with the results of a study in 193 civilly committed patients by Douglas, Ogloff,

and colleagues (1999) who compared AUC values of the HCR-20 and the Psychopathy

Checklist: Screening Version (PCL:SV; Hart et al., 1995) and found the HCR-20 to be a significantly

better predictor of violent behavior. The categorical diagnosis of psychopathy (PCL-R score ≥ 26)

showed significant, but only moderate predictive validity (AUC = .65). Most studies into the

predictive validity do not report AUC values for the categorical diagnosis of psychopathy, because

ROC analyses are less suitable to apply with dichotomous or trichotomous variables. However, the

odds ratio and correlation coefficient we found between the diagnosis of psychopathy and violent
re-offending resemble previous results (see Hare et al., 2000). In the two-factor model of the
first edition of the PCL-R, Factor 2 in particular predicted violent re-offending, whereas Factor 1
showed below moderate predictive validity. Our finding is in line with previous research (Grann,
Långström, Tengström, & Kullgren, 1999). With the four-factor model of the PCL-R second edition, we
found significant predictive validity for the factors Affective, Lifestyle and Antisocial, but not for the
factor Interpersonal. To our knowledge, this is the first study that applies the new four-factor model.

Although the unstructured clinical judgment predicted significantly better than chance, the predictive

accuracy for violent recidivism was weak and significantly worse compared to the HCR-20 actuarial

scores and structured final risk judgments. Our finding is in line with previous research that found

actuarial risk assessment to be superior to unstructured clinical judgment in predictive accuracy

(Gardner et al., 1996; Grove & Meehl, 1996).

Furthermore, the final risk judgment was found to add significant incremental validity to the HCR-20

used in a numerical sense. This is in line with studies that also found that the structured final risk

judgments added incremental validity to the HCR-20 numerical scores (Douglas et al., 2003; de Vogel

& de Ruiter, submitted for publication, see Chapter 6). The same pattern was found for two other SPJ

instruments: the SVR-20 (Boer et al., 1997; Dempster, 1998; de Vogel et al., 2004, see Chapter 3) and

the SARA (Kropp et al., 1999; Kropp & Hart, 2000). In conclusion, our findings provide strong

support for the SPJ model of risk assessment.

The present study demonstrated several significant differences between the four discharge groups. On

the whole, the Readmission group compared to the other three groups showed more unfavorable

demographic, criminal and treatment characteristics, as well as recidivism rates, HCR-20 scores and

final risk judgments, and PCL-R scores/diagnosis. This finding alone attests to the validity of the

HCR-20 as a predictor of violent recidivism. The recidivism rates of the Readmission group are

worrying, especially considering the fact that a significant proportion of this group was in residential



treatment or detention while recidivating. Most notable are the high HCR-20 and PCL-R scores of the

readmitted patients: almost all were judged as high-risk and half of them were diagnosed as

psychopaths.

Next, it was striking that almost no significant differences could be detected between the Transmural

group and the Conform and Contrary groups. Indeed, we found no significant differences at all

between the Transmural and the Conform group. Our hypothesis that patients who have passed a

transmural treatment phase compared to patients who were discharged by the court in line with the

hospital staff’s advice, but without a preceding transmural phase, pose a lower risk for violent re-

offending because of a solid preparation before return to society was not confirmed. It must be noted,

however, that seven patients of the Conform group were voluntarily transferred to a general psychiatric

institution or sheltered living upon termination of the tbs-order, compared to three patients of the

Transmural group. Possibly, these patients recidivated less because they were still in care of an (albeit

non-secure) institution and receiving treatment. This suggestion is supported by the results of the

MacArthur Risk Assessment study that showed significantly less violent recidivism in psychiatric

patients receiving seven or more treatment sessions during a follow-up period compared to patients

receiving six or fewer treatment sessions after a short-term admission to a closed psychiatric ward

(Monahan et al., 2001).

Between the Transmural and Contrary group, we found significant differences in risk of recidivism as

rated with the HCR-20. The actual recidivism rate of the Contrary group is higher than that of the

Transmural group, but this difference was not significant. A possible explanation is that eleven

patients of the Contrary group were in the transmural phase when the tbs-order was terminated by the

court. Perhaps, these patients had already benefited from the transmural phase although the hospital

did not believe they were ready to be discharged yet. This hypothesis was confirmed when we

included the eleven patients who had a termination of the tbs-order against the hospital staff’s advice

whilst in the transmural phase in the Transmural group instead of the Contrary group. In this case, the

difference in violent recidivism between the Transmural and Contrary group was indeed significant. It

should also be noted that half of the patients in both the Conform and Contrary group have

experienced a period of probationary leave under supervision of probation officers, some successful

and some not because they returned to the hospital. Thus, these patients have had a period of

practicing living outside the hospital. However, in The Netherlands, this probationary supervision is

more limited than the much more intensive transmural phase in which the patient is still supervised by

hospital staff.

To summarize, our hypothesis that patients who have passed a successful transmural phase and who

were discharged by the court in line with the hospital staff’s advice pose a lower risk of violent

recidivism compared to the other groups could not be confirmed. We did not find a more favorable

recidivism outcome compared to patients who were discharged by the court in line with the hospital

staff’s advice but without a transmural treatment phase. However, we did discover a reasonable outcome


compared to patients who were readmitted or were discharged by the court against the hospital staff’s

advice, especially when considering the findings from the alternative post hoc analyses in which those

patients from the contrary group whose treatment had been terminated during the transmural phase

were added to the transmural group. Thus, a transmural phase seems to have a preventive effect in

terms of violent recidivism.

A number of limitations to the present study should be mentioned. The first limitation relates to the

retrospective design of the study. We could only use file information to code the HCR-20 and PCL-R

and the quality of these files differed, which may have influenced the codings. Moreover, in spite of

the fact that a research assistant had deleted the hospital staff’s recommendations to the court, in some

files we could discover clues regarding the hospital staff’s advice to the court. A second limitation is

that our sample was small, so differences between the groups may not be valid. Moreover, the sample

was derived only from one forensic psychiatric hospital, thereby limiting generalizability.

Nevertheless, we consider this group to be representative for Dutch offenders with a tbs-order, because

they are largely similar in demographic, psychiatric and criminal characteristics to the tbs-population

in general (see van Emmerik & Brouwers, 2001). The question of the generalizability of our findings

to patients in other jurisdictions deserves special attention. As already mentioned in the Method

section, our patient sample consists largely of individuals with comorbid personality disorder and

substance use disorder. Psychotic and other Axis I disorders are present in a minority of cases. We are

aware that our forensic psychiatric sample is more similar to a general offender sample in North

America than to a North American forensic psychiatric sample, and this should be taken into

consideration when comparing our findings to those of other studies. A third limitation is that

recidivism data were retrieved from only one source, the Judicial Documentation register of the

Ministry of Justice. As a consequence, the reconviction rate is inevitably an underestimation of the

actual recidivism rate, because not all offenders are reported, apprehended and arrested.

Large-scale prospective studies across different settings and contexts, for instance, in the prison

system and outpatient forensic clinics, are needed to confirm the predictive validity of the (Dutch)

HCR-20 found in this study. On the other hand, a number of problems might be encountered with

prospective research. The most important problem is that prospective predictive research will be

hampered by the clinical goals of risk assessment, i.e., risk management and prevention. Hart (1998a)

stated that predictions of violence are not passive assessments, but decisions that influence services

delivered to individuals: “Clinicians are bound - morally, ethically, and legally - to try to prove

themselves wrong when they predict violence and take every reasonable action to prevent violence” (p.

365). Thus, when clinicians perform HCR-20 risk assessments it is very likely that the outcome

influences decisions concerning leave, entry into a transmural treatment phase, or termination of

(mandatory) treatment and that high-risk patients will not be released from the hospital. Therefore,

retrospective studies such as the present study are particularly suitable to examine the predictive

validity of risk assessment instruments.



In conclusion, we propose two recommendations regarding the use of structured risk assessment in

forensic clinical practice. First, any accurate systematic risk assessment must provide information

regarding useful risk management strategies. We believe that high-risk cases, such as the patients in

the Readmission group, can be identified in an early phase of treatment. With the PCL-R, psychopathic

traits can be explored and with repeated measures of the HCR-20, clinicians and researchers can

monitor treatment progress. An early identification of high-risk patients makes it possible to adjust

treatment plans and design adequate risk management strategies to prevent violent recidivism both in

and outside the hospital. In addition, an early identification of high-risk patients could possibly prevent

the necessity of readmitting patients which usually causes feelings of failure in both hospital staff and

patient. Second, we want to stress that collaboration between researchers and clinicians is necessary

for optimizing risk assessment accuracy and prevention of violent re-offending. Intensive collaboration

between forensic mental health professionals from intramural settings, outpatient settings and sheltered

housing organizations can help to further develop aftercare methods.

The HCR-20 in personality disordered female offenders: A comparison with a matched sample of males

This chapter is a slightly revised version of Vogel, V. de, & Ruiter, C. de (in press). The HCR-20 in personality disordered female offenders: A comparison with a matched sample of males. Clinical Psychology and Psychotherapy, 12.


CHAPTER 5

The HCR-20 in personality disordered female offenders: A comparison with a matched sample of males

Gender is one of the most significant predictors of violence; regardless of age, ethnicity, culture and

socioeconomic status, men are significantly more often convicted for violent offenses than women

(Archer & McDaniel, 1995; Monahan et al., 2001). However, research also suggests that mental

disorder reduces the gender gap in violence, especially for inpatient aggression. Among psychiatric

patients, the base rate for (inpatient) violence is not significantly different for male and female patients

(Lidz et al., 1993; McNiel & Binder, 1990; Newhill et al., 1995; Nicholls, 2001; Tardiff, Marzuk,

Leon, Portera, & Weiner, 1997). Ross and colleagues (1998) found no sex differences between a

sample of 82 male and 49 female psychiatric patients in the occurrence of inpatient aggression.

However, regarding violence in the community after treatment, male patients were found to be four

times more likely than female patients to express any aggression.

Research has demonstrated that unstructured clinical judgment of violence risk is sensitive to sex-

based biases; clinicians tend to underestimate the risk of violence in female psychiatric patients (Lidz

et al., 1993; McNiel & Binder, 1995). More generally, research has revealed some important

limitations of unstructured clinical judgment, such as poor reliability and validity (Monahan, 1981;

Quinsey et al., 1998). Use of structured risk assessment instruments is recommended to avoid these

types of biases and to optimize reliability and validity of violence risk assessment (Borum, 1996). A

problem, however, is that existing structured risk assessment instruments were developed on the basis of

violence risk research conducted primarily in male samples. Thus, the question arises whether the risk factors for

violence found in male samples are also valid for females and, consequently, whether the existing structured

This study examined the predictive validity of the HCR-20 in a sample of 42 female patients admitted to the Dr. Henri van der Hoeven Kliniek. The findings are compared to a matched sample of 42 male forensic psychiatric patients. The interrater reliability of the HCR-20 was good for both female and male patients. There were significant differences between female and male patients in mean HCR-20 item scores, but the mean H, C, and R-subscale scores and total score were comparable. The base rate for incidents of violence during treatment was similar for female (30%) and male patients (29%), but the base rate for violent recidivism after discharge was significantly higher for the male sample (43%) compared to the female sample (13%). For male patients, the HCR-20 demonstrated good to excellent predictive validity for violent outcome (violent recidivism and incidents of violence during treatment); however, predictive accuracy for female patients was much lower. In females, only the HCR-20 final risk judgment, but not the HCR-20 total score, demonstrated significant predictive validity for violent outcome.


risk assessment instruments are suitable for use with female patients. Several authors have argued that

risk factors for violence in female samples are generally the same as in male samples and that existing

risk assessment instruments are likely valid for use with females (Blanchette, 1994; Harer & Langan,

2001; Simourd & Andrews, 1994; Strand & Belfrage, 2001). Loucks and Zamble (1999) compared the

characteristics of 100 female offenders to a sample of male offenders¹, and although they found some

differences in the occurrence of important life experiences, these differences were not predictive of

criminal behavior. In contrast, others have argued that assessing risk for violence is different for

women compared to men because risk factors for women are closely linked to their unique experiences

as a woman, for instance, victimization (Chesney-Lind, 1989; Scarth & McLean, 1994) or to the fact

that social bonds are of greater importance to women than to men and that women are thus more

sensitive to disruptions in close relationships (see Funk, 1999; Odgers & Moretti, 2002). Funk (1999)

tested risk factors for reoffending in 388 male and 112 female juvenile delinquents on probation and

found several risk factors (e.g., child abuse or neglect, running away from home) that were

significantly predictive for females but not for males. Therefore, she concluded that risk factors for

females differ substantially from those of their male counterparts, that risk assessment instruments fail

to identify most female risk factors, and that separate risk assessment instruments for males and

females should improve classifications for risk of reoffending. To our knowledge, only one structured

risk assessment instrument has been developed especially for the assessment of risk in females: the

EARL-21G (Levene et al., 2001). Vitale and Newman (2001) stated that existing risk assessment

instruments have not yet been adequately tested to determine their generalizability to women.

A structured risk assessment instrument that has drawn considerable international attention is the

HCR-20 (Webster et al., 1997b). The HCR-20 is a checklist that follows the structured professional

judgment (SPJ) approach. In the SPJ approach, the risk assessment is performed by a forensic clinician

by means of a standardized checklist, containing empirically derived risk factors for violence,

historical as well as dynamic factors. The HCR-20 consists of 20 items representing risk factors for

violence in the past (Historical scale), present (Clinical scale) and future (Risk management scale).

Research among various psychiatric and forensic samples in different countries has demonstrated good

interrater reliability and predictive validity for the HCR-20 (see Douglas et al., 2005). The HCR-20

was primarily developed on the basis of research in male samples and most research into the

psychometric properties of the HCR-20 has been done in male samples. Therefore, the question

whether the HCR-20 is also suitable for use with females seems important. Nicholls (2001) conducted

a retrospective study to evaluate the validity of the HCR-20 and the PCL:SV (Hart et al., 1995) for

assessing female patients’ risk for inpatient and community violence. She compared the results of 47

female patients to a matched sample of 47 male patients admitted to a forensic psychiatric hospital and

found the distribution of the mean HCR-20 and PCL:SV scores to be comparable. The HCR-20

¹ Loucks and Zamble (1999) do not mention the number of males or whether the males were matched to the females.



showed good predictive accuracy for inpatient aggression for both male and female patients. The

predictive accuracy of the HCR-20 for community aggression was modest for both samples. Strand

and Belfrage (2001) conducted a retrospective study to investigate the utility of the HCR-20 in a

female forensic psychiatric sample. They compared the HCR-20 scores of 63 female and 85 male

patients admitted to two forensic psychiatric hospitals in Sweden and found some significant

differences in mean individual item scores; however, the mean subscale scores and total score did not

differ significantly. The authors thus concluded that the HCR-20 is suitable for use in female forensic

psychiatric patients, particularly to assess inpatient violence. A limitation of this research is that the

authors did not examine the predictive validity of the HCR-20 scores for violent outcome, which is the

most important aspect in deciding whether the HCR-20 is adequate for female patients.

An important issue to keep in mind when assessing risk for future violence is that violence is a

multifaceted construct. Risk assessment should not only be directed at predicting the likelihood of

violence, but also take into account the severity, nature, frequency, and imminence of violence (Hart,

1998a). Research has shown that in general, the nature, severity and victims of violent offenses

committed by women are different from those committed by men. Female violence is less often sexual

in nature, less often characterized as instrumental and more often as reactive, less often resulting in

injury, more often relational and more often occurring in the residence (Monahan et al., 2001;

Nicholls, 2001; Odgers & Moretti, 2002). In summary, the above suggests that the risk factors for and

assessment of violence differ at least to a certain extent between female and male patients, and

that the utility of the existing structured risk assessment instruments for women has yet to be

convincingly proven.

In this chapter, we will present findings on the interrater reliability and predictive validity of the

HCR-20 in a sample of 42 female patients who have been admitted to the Dr. Henri van der Hoeven Kliniek.

The findings are compared to a matched sample of 42 male forensic psychiatric patients from the same

hospital. The aim of the present study was to examine if there are differences between female and male

forensic psychiatric patients regarding mean HCR-20 scores, interrater reliability and predictive

validity for violent outcome. In addition, we coded the PCL-R (Hare, 1991, 2003) and compared the

mean scores and predictive validity for violent outcome between female and male patients. Several

studies have been conducted into the use of the PCL-R in female samples. In general, a lower

prevalence of psychopathy among females compared to males was found (Grann, 2000; Salekin et al.,

1997; Vitale et al., 2002; Warren et al., 2003). Vitale and Newman (2001) reviewed the literature

regarding the PCL-R in female samples and found good support for the reliability, but modest support

for the predictive validity. They concluded that whereas the PCL-R might be able to postdict violent

behavior in the past, there is no evidence that the PCL-R can predict future violence in women. The

issue whether the PCL-R is suitable for the assessment of psychopathy in women is not settled. Some

have argued that the PCL-R is adequate for assessing psychopathy in women, since they found a

considerable degree of similarity to the construct of psychopathy in male offenders (Salekin et al.,


1997; Warren et al., 2003). On the contrary, Vitale and colleagues (2002) believe that the findings thus

far are not sufficiently convincing to conclude that the PCL-R structure is similar across

gender. They express concern that some PCL-R items are not adequately assessing the construct of

psychopathy as it is expressed in women.

Method

Setting



Procedure

First, we collected archival data from the hospital records for 42 female patients admitted to the

hospital between 1985 and 2003. The Dutch versions of the HCR-20 (Philipse et al., 2000) and PCL-R

(Vertommen et al., 2002) were coded for all 42 women on the basis of all available file information.

There were three categories. (1) 15 women whose HCR-20 had already been coded in a recently

conducted retrospective study into the predictive validity of the HCR-20 (see de Vogel et al., 2004,

Chapter 4). In this study, the rating procedure was performed while all raters were blind to

reconviction data. (2) 23 females from an ongoing prospective study in which the HCR-20 is coded

independently by a researcher, a treatment supervisor and a group leader. During a case conference, the

raters discuss their scores and agree upon a consensus score that was used for the analyses in the

present study (see de Vogel & de Ruiter, 2004, Chapter 2 and 6). (3) Four female patients who were

admitted to the hospital at the time of the current study, but had not been included in the prospective

study mentioned above. For these four cases, two raters independently and prospectively coded the

HCR-20 and agreed upon a consensus score that was used in the analyses. In order to establish the

interrater reliability, we used all codings performed by three independent raters, i.e., the 23 codings

from the prospective study and four cases from the retrospective study. The mean follow up period of

female patients from the retrospective study was 74.6 months (SD = 23.9, range = 26.7 – 109.6), and

from the prospective study 10.2 months (SD = 7.8, range = 0.2 – 26.3).

Second, we matched the women to 42 male patients on year of birth, type of index offense, ethnicity,

and type of psychopathology (i.e., Axis I, Axis II or comorbid Axis I and II, according to the DSM-IV;

APA, 1994). Regarding the index offenses, there were two women with a property offense without

violence, and only one male with the same index-offense, so we decided to match one of these two

women to a male with a property offense in combination with violence (see Table 1). The 42 men

were identified from a total sample of 205 male patients admitted between 1985 and 2003 and

obtained from two sources. (1) 21 cases from the recently conducted retrospective study into the

predictive validity of the HCR-20 (see above). (2) 21 cases from the ongoing prospective study (see



above). Interrater reliability was established for all codings performed by three independent raters, i.e.,

21 cases from the prospective study and seven of the 21 cases from the retrospective study. For male

patients from the retrospective study, the mean follow up period was 81.1 months (SD = 23.8, range =

46.1 – 114.9), and for the males from the prospective study 18.7 months (SD = 6.6, range = 4.7 –

26.3). The mean follow up period of men from the prospective study was significantly longer

compared to the mean follow up period of women from the prospective study (t (48) = 4.0, p < .01).

For the retrospective study, the mean follow up period of men and women did not differ significantly

(t (36) = .81, p = .42).
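As an illustrative check, the reported t values for the follow-up periods can be approximately reproduced from the summary statistics given above. The following Python sketch assumes a pooled-variance (Student's) t-test on the reported means, standard deviations and group sizes; the function name `pooled_t` is hypothetical and not part of the original analysis, and because the inputs are rounded the results can only match approximately.

```python
from math import sqrt


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t statistic for two independent groups, from summary statistics."""
    # The pooled variance weights each group's variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error


# Follow-up in months (mean, SD, n) as reported in the text: men first, women second.
t_prospective = pooled_t(18.7, 6.6, 21, 10.2, 7.8, 27)
t_retrospective = pooled_t(81.1, 23.8, 21, 74.6, 23.9, 15)

print(f"prospective:   t = {t_prospective:.2f}")
print(f"retrospective: t = {t_retrospective:.2f}")
```

Under these assumptions, both values come out close to the reported t = 4.0 and t = .81.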

Instruments

HCR-20






Subjects

Table 1 presents demographic, psychiatric and criminal characteristics for the female and male sample.

There were a number of significant differences between female and male patients. Compared to male

patients, female patients were more often involved in an intimate relationship at the time of the

index offense, had less often abused substances, were more often diagnosed with borderline

personality disorder (BPD) and less often with narcissistic personality disorder (NPD), obtained higher

scores on intelligence scales, particularly on verbal intelligence, and were older at the time of their first

conviction. Antisocial personality disorder (ASPD) was less prevalent in women than in men, although

the difference was marginally significant (χ² (1, 61) = 3.6, p = .06). Only the antisocial, borderline and

narcissistic personality disorders (according to the DSM-IV), are reported in Table 1 because these

disorders are the most prevalent in both men and women in forensic psychiatric settings (see Coid et

al., 1999; Hildebrand & de Ruiter, 2004; de Ruiter & Greeven, 2000; Warren et al., 2002). There was a

trend that women more often had a relative or (ex) partner as victim (χ² (1, 84) = 2.9, p = .08).



Table 1. Demographic, psychiatric and criminal characteristics of the female and male sample

                                                    Female patients   Male patients
                                                    n = 42            n = 42
Demographic
  Mean age upon admission                           33.2              30.7
  Dutch nationality                                 38 (91%)          40 (95%)
  Upbringing in foster or children’s home           13 (31%)          17 (41%)
  Single (at the time of the index offense)         24 (57%)          36 (86%)**
  No education after primary school                 19 (45%)          21 (50%)
  Unemployed (at the time of the index offense)     38 (91%)          33 (79%)
Psychiatric
  Prior out-patient treatment(s)                    22 (52%)          15 (36%)
  Prior inpatient admission(s)                      23 (55%)          24 (57%)
  Substance(s) abuse                                27 (64%)          35 (83%)*
  Antisocial personality disorder✫                  8 (25%)           14 (48%)
  Borderline personality disorder✫                  24 (75%)          7 (24%)**
  Narcissistic personality disorder✫                3 (9%)            10 (35%)*
  Mean intelligence scores: total✫✫                 111.3             105.1
  Mean intelligence scores: verbal✫✫                115.4             97.4**
  Mean intelligence scores: performance✫✫           114.8             111.8
Offenses
  (attempted) Homicide                              26 (62%)          26 (62%)
  Sexual                                            1 (2%)            1 (2%)
  Violent                                           4 (10%)           5 (12%)
  Arson                                             9 (21%)           9 (21%)
  Property                                          2 (5%)            1 (2%)
  Victim was not a stranger                         31 (74%)          26 (62%)
  Victim was (ex)partner or relative                15 (36%)          8 (19%)
  Mean duration of imprisonment in months           19.5              28.1
  Mean number of previous convictions               1.9               6.4
  Mean age at first conviction                      27.2              21.1**

Note. ** p < .01. * p < .05 (two-tailed). ✫ Personality disorders were diagnosed with the SIDP-IV (Pfohl et al., 1995) and available for 32 females and 29 males. ✫✫ Mean intelligence scores were available for 18 females and 21 males.

Violent outcome data

Violent outcome data were obtained from two sources. First, data on violent recidivism of the patients

from the retrospective study were retrieved from the Judicial Documentation register of the Ministry of

Justice. For the identification of violent offenses, we adopted the HCR-20 definition of violence:

“violence is actual, attempted, or threatened harm to a person or persons” (Webster et al., 1997b, p.

24). Second, data on incidents of violence during treatment were obtained from information bulletins

that are published daily in the hospital to inform patients and staff. In these bulletins, the most

important events of the day are reported, such as disruptive incidents that occurred during the last 24

hours, or positive results on urine analysis to detect if a patient has taken drugs. Disruptive incidents



are registered and assigned to one of four categories: verbal violence, verbal threat, physical violence,

and violation of hospital rules (for details, see Hildebrand et al., 2004). Because the HCR-20 is

designed to assess risk for violence to others, we only used the category physical violence, and only

those incidents of physical violence directed towards other persons (e.g., staff or patients). For

instance, property damage was not included, unless the property damage occurred in the presence of

someone with the goal to frighten or threaten that person (e.g., smashing a cup of hot coffee against

the wall while someone is standing close by). HCR-20 scores and final risk judgments were related to

incidents of physical violence during treatment that occurred after the date of the risk assessment.

Incidents of violence during treatment and violent recidivism after discharge were collapsed into one

violent outcome variable.


Student’s t-test was used to examine differences between men and women in HCR-20 and PCL-R

mean scores. For differences in HCR-20 final risk judgments and psychopathy diagnoses (PCL-R ≥

26) we used chi-square analysis. The interrater reliability of the HCR-20 was examined by means of

the intraclass correlation coefficient (ICC), using the two-way random effect variance model and

consistency type (McGraw & Wong, 1996). Critical values used for single measure ICCs are: ICC ≥

.75 = excellent; .60 ≤ ICC < .75 = good; .40 ≤ ICC < .60 = moderate; ICC < .40 = poor (Fleiss, 1986).

The predictive validity was established with receiver operating characteristic (ROC) analyses (for

a description of this method, see Douglas et al., 2005; Mossman, 1994; Rice & Harris, 1995; see also

Chapter 3, p. 71). To compare the obtained AUC values for men and women, we used AccuROC

version 2.5 (Vida, 1997), which applies the non-parametric method described by DeLong et al. (1988).

Pearson point-biserial correlations were computed for comparative purposes.
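To illustrate the two indices of predictive validity, the Python sketch below computes the AUC as the probability that a randomly chosen patient with a violent outcome has a higher score than a randomly chosen patient without one (ties counted as one half), together with a Pearson point-biserial correlation. The scores and outcomes are invented illustration data, not data from this study, and the function names are hypothetical; the AUC comparison reported here was carried out in AccuROC, not with this code.

```python
from itertools import product
from math import sqrt


def auc(scores, outcomes):
    """Area under the ROC curve via pairwise comparison of cases and non-cases."""
    violent = [s for s, y in zip(scores, outcomes) if y == 1]
    non_violent = [s for s, y in zip(scores, outcomes) if y == 0]
    # A pair counts 1 if the violent case scores higher, 0.5 if tied, 0 otherwise.
    wins = sum(
        1.0 if v > n else 0.5 if v == n else 0.0
        for v, n in product(violent, non_violent)
    )
    return wins / (len(violent) * len(non_violent))


def point_biserial(scores, outcomes):
    """Pearson correlation between a continuous score and a 0/1 outcome."""
    n = len(scores)
    mean_s = sum(scores) / n
    mean_y = sum(outcomes) / n
    cov = sum((s - mean_s) * (y - mean_y) for s, y in zip(scores, outcomes))
    var_s = sum((s - mean_s) ** 2 for s in scores)
    var_y = sum((y - mean_y) ** 2 for y in outcomes)
    return cov / sqrt(var_s * var_y)


# Hypothetical total scores and violent outcome (1 = violent, 0 = not violent).
example_scores = [18, 22, 25, 27, 29, 31, 33, 35]
example_outcomes = [0, 0, 0, 1, 0, 1, 1, 1]

print("AUC:", auc(example_scores, example_outcomes))
print("r_pb:", round(point_biserial(example_scores, example_outcomes), 2))
```

An AUC of .50 corresponds to chance-level prediction and 1.0 to perfect separation of patients with and without a violent outcome.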

Results


The interrater reliability for female patients was good for the Historical scale, total score, and final risk

judgment (n = 27; ICC = .82, .75, and .74, respectively), and moderate for the Clinical scale and Risk

management scale (ICC = .55, and .51, respectively). Furthermore, for male patients we found good

interrater reliability for the Historical scale, Clinical scale, total score, and final risk judgment (n = 28;

ICC = .82, .70, .77, and .69, respectively), and moderate interrater reliability for the Risk management

scale (ICC = .49).

Risk judgments and psychopathy

Table 2 presents the mean scores and standard deviations for the HCR-20 items, subscales, and total

score for both female and male patients. As can be seen from this table, the mean HCR-20 subscales


and total scores did not differ significantly between the female and male sample. However, there were

significant differences on some individual HCR-20 item scores. Female patients received significantly

lower scores on the items ‘Young age at first violent incident’, ‘Psychopathy’, and ‘Negative

attitudes’. On the contrary, female patients compared to male patients received significantly higher

scores on the items ‘Relationship instability’ and ‘Impulsivity’. Regarding the HCR-20 final risk

judgments, women were significantly more often judged as moderate risk, while men were

significantly more often judged as high-risk. The mean HCR-20 total score per final risk judgment

category for female patients was: low: 21.6 (range = 10 – 29); moderate: 26.2 (range = 19 – 32); high:

30.2 (range = 23 – 37). For male patients, the mean HCR-20 total score per final risk judgment

category was: low: 20.5 (range = 12 – 29); moderate: 26.1 (range = 19 – 33); high: 31.8 (range = 21 –

37). For both men and women the mean HCR-20 total scores differed significantly between the low,

moderate and high risk cases (women: F (2, 42) = 7.8, p < .01; men: F (2, 42) = 17.2, p < .001). There

were no significant differences between men and women in the mean HCR-20 total scores per final

risk judgment (low: t (22) = -.40, p = .67; moderate: t (34) = -.08, p = .94; high: t (28) = .99, p = .33).

Frequently coded other considerations differed somewhat for female and male patients. The three most

frequently coded other considerations for male patients were financial problems (6), lack of prospects

for the future (5), and violent fantasies (4) (number of codings for females: 2, 3 and 2, respectively).

The three most frequently coded other considerations for female patients were forming a new intimate

relationship (e.g., problematic partner choice) (18), care for children (5), and prostitution (4) (number

of codings for males: 2, 1 and 1, respectively).

The mean PCL-R Factor 1, Factor 2, and total score and the categorical diagnosis of psychopathy

(PCL-R ≥ 26) are shown in Table 2. Female patients compared to males received significantly lower

mean scores on Factor 1 (t (84) = 2.3, p < .05); however, the differences in the mean Factor 2 and total

scores were not significant or only marginally significant (t (84) = 1.6, p = .11; t (84) = 1.8, p = .08, respectively).

Male patients compared to females were more often diagnosed as psychopathic (PCL-R ≥ 26),

although this difference was only marginally significant (χ² (1, 84) = 3.1, p = .08).



Table 2. Mean HCR-20 and PCL-R scores (standard deviations in brackets), final risk judgments, and psychopathy diagnosis

                                                    Female patients   Male patients
                                                    n = 42            n = 42
Historical items
  1. Previous violence                              2.0 (.22)         1.9 (.26)
  2. Young age at first violent incident            1.2 (.65)         1.5 (.55)*
  3. Relationship instability                       1.9 (.30)         1.7 (.51)*
  4. Employment problems                            1.4 (.80)         1.5 (.67)
  5. Substance use problems                         1.3 (.87)         1.5 (.80)
  6. Major mental illness                           .83 (.85)         .90 (.85)
  7. Psychopathy                                    .38 (.54)         .71 (.71)*
  8. Early maladjustment                            1.7 (.52)         1.8 (.40)
  9. Personality disorder                           1.9 (.45)         1.9 (.33)
  10. Prior supervision failure                     1.4 (.80)         1.4 (.86)
Clinical items
  1. Lack of insight                                1.4 (.54)         1.5 (.63)
  2. Negative attitudes                             .98 (.78)         1.3 (.75)*
  3. Active symptoms of major mental illness        .26 (.59)         .19 (.45)
  4. Impulsivity                                    1.7 (.51)         1.3 (.74)**
  5. Unresponsive to treatment                      1.1 (.70)         1.1 (.59)
Risk management items
  1. Plans lack feasibility                         1.1 (.68)         1.2 (.74)
  2. Exposure to destabilizers                      1.4 (.54)         1.4 (.62)
  3. Lack of personal support                       1.3 (.66)         1.2 (.70)
  4. Noncompliance with remediation attempts        1.0 (.63)         1.1 (.65)
  5. Stress                                         1.8 (.38)         1.9 (.26)
Historical scale                                    14.0 (2.9)        14.9 (3.0)
Clinical scale                                      5.4 (2.0)         5.4 (2.3)
Risk management scale                               6.6 (1.9)         6.8 (2.1)
Total score                                         25.9 (5.5)        27.1 (6.5)
PCL-R
  Factor 1                                          6.1 (2.9)         7.8 (3.9)*
  Factor 2                                          8.5 (4.2)         10.0 (4.6)
  Total score                                       16.5 (6.2)        19.4 (8.5)
Final risk judgments and diagnosis of psychopathy   N (%)             N (%)
  HCR-20: Low                                       11 (26%)          11 (26%)
  HCR-20: Moderate                                  21 (50%)          13 (31%)*
  HCR-20: High                                      10 (24%)          18 (43%)*
  PCL-R ≥ 26                                        4 (10%)           10 (24%)

Note. * p < .05. ** p < .01 (two-tailed).


Violent outcome

First, violent reconvictions after discharge from the hospital were calculated for the 15 female and 21

male patients from the retrospective study. Significantly more male ex-patients compared to female

ex-patients were convicted for a violent reoffense: nine (43%) of 21 males versus two (13%) of 15

females (χ2 (1, 36) = 3.6, p < .05; Odds ratio = 4.9, 95% CI = 1.1 – 32.9). Next, we computed the

number of patients who had been physically violent towards others during their stay in the hospital for

the 27 female and 21 male patients from the prospective study. There was no significant difference

between women and men: eight (30%) of 27 female patients were registered to have been physically

violent during their hospital stay, versus six (29%) of 21 male patients. Examples of physical violence during treatment were throwing hot coffee at staff, hitting staff or fellow patients, and seizing someone by the throat.
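The reconviction comparison above can be checked from the counts given in the text. The following is a minimal sketch, assuming the reported statistic is an uncorrected Pearson chi-square on the 2 × 2 table (the text does not name the test); the function and variable names are illustrative only.

```python
def two_by_two_stats(a, b, c, d):
    """Odds ratio and uncorrected Pearson chi-square for a 2 x 2 table.

    a, b: group 1 with / without the outcome
    c, d: group 2 with / without the outcome
    """
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    chi_square = n * (a * d - b * c) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )
    return odds_ratio, chi_square


# Counts from the retrospective sample: 9 of 21 men and 2 of 15 women
# were reconvicted for a violent offense after discharge.
odds_ratio, chi_square = two_by_two_stats(9, 12, 2, 13)
print(f"odds ratio = {odds_ratio:.1f}, chi-square = {chi_square:.1f}")
```

Under that assumption, the sketch reproduces the reported odds ratio of 4.9 and χ2 of 3.6. The confidence interval is not computed here because the text does not say how it was derived.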

Predictive validity

Table 3 shows the AUC values and Pearson correlations of the HCR-20 subscales and total scores for

both female and male patients regarding violent outcome, and Figures 1 and 2 present the ROC curves

for the HCR-20 for violent outcome. For female patients, only the AUC value of the HCR-20 final risk

judgment was significantly above .50. Similarly, only the correlation between the HCR-20 final risk

judgment and violent outcome was significant. The difference in violent outcome between female

patients who were judged to pose a low, moderate or high risk was significant (χ2 (2, 42) = 16.2, p <

.001, violent outcome respectively 0%, 14% and 77%). Female patients who scored above the median

(HCR-20 total score = 26.6) compared to those who scored below did not have significantly more

violent outcome (29% versus 19%).

For male patients, the AUC values for violent outcome were significantly above .50 for all HCR-20

subscales, the total score and final risk judgment. Also, Pearson correlations between the HCR-20

subscale scores, total score, final risk judgment and violent outcome were significantly positive. The

difference in violent outcome between male patients who were judged to pose a low, moderate or high

risk was significant (χ2 (2, 42) = 24.4, p < .001, violent outcome respectively 0%, 8% and 78%). Male

patients who scored above the median (HCR-20 total score = 28.5) compared to those who scored

below had significantly more violent outcome (χ2 (1, 42) = 12.5, p < .001, 62% versus 10%). When we

compared the AUC values for violent recidivism after discharge to AUC values for incidents of violence during treatment, we found no substantial differences in predictive accuracy in either the female or the male sample. The HCR-20 Risk management scale and total score were significantly more

accurate in predicting violent recidivism in men than in women (Z-statistic = 2.9 and 2.5, respectively,

p < .01, two-tailed).
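To make the AUC values concrete: the AUC can be read as the probability that a randomly chosen violent patient received a higher rating than a randomly chosen non-violent patient, with ties counted as one half. The sketch below illustrates this definition on hypothetical data; the ratings and outcomes are invented for illustration and are not taken from the study.

```python
def auc(scores, outcomes):
    """Area under the ROC curve as the probability that a randomly chosen
    violent patient scores higher than a randomly chosen non-violent
    patient, counting ties as one half."""
    positives = [s for s, y in zip(scores, outcomes) if y == 1]
    negatives = [s for s, y in zip(scores, outcomes) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one violent and one non-violent case")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical data: final risk judgments coded 0 = low, 1 = moderate,
# 2 = high; outcome coded 1 = violent, 0 = not violent.
judgments = [0, 0, 1, 1, 1, 2, 2, 2]
violent = [0, 0, 0, 0, 1, 0, 1, 1]
print(auc(judgments, violent))
```

For these invented ratings the function returns 0.8. A value of .50 corresponds to chance-level discrimination, which is why the AUC values in Table 3 are tested against .50.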



Table 3. Predictive validity of the HCR-20 and PCL-R for female and male patients

                           Violent outcome by females    Violent outcome by males
                           (n = 42)                      (n = 42)
                           AUC      SE     r             AUC      SE     r

HCR-20
  Historical scale         .63      .11    .22           .83**    .06    .54*
  Clinical scale           .61      .10    .17           .75**    .08    .42*
  Risk management scale    .52      .11    .07           .88**    .05    .62*
  Total score              .59      .11    .20           .88**    .05    .59*
  Final risk judgment      .86**    .07    .57*          .91**    .05    .70*

PCL-R
  Factor 1                 .36      .08    -.21          .64      .09    .24
  Factor 2                 .41      .11    -.10          .84**    .06    .58*
  Total score              .34      .10    -.21          .74*     .08    .42*
  PCL-R ≥ 26               .50      .11    .01           .63      .09    .28

Note. * p < .01. ** p < .001 (two-tailed). AUC = Area Under the Curve. SE = Standard Error. r = Pearson point-biserial correlation. Violent outcome = incidents of violence during treatment or violent recidivism after discharge.

Table 3 also presents the AUC values and Pearson correlations of the PCL-R factors and total scores

for both female and male patients regarding violent outcome. For the female patients, none of the AUC

values or Pearson correlations were significant. For the male patients, AUC values for violent outcome

were significantly above .50 for the PCL-R Factor 2 and total score, and correlations between violent

outcome and PCL-R Factor 2 score and total score were significant. The PCL-R Factor 1, Factor 2 and

total score were significantly more accurate in predicting violent recidivism in men than in women (Z-statistic = 2.3, 3.4, and 3.1, respectively, p < .01, two-tailed).
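The text does not state how the Z-statistics for the gender comparison were computed. A minimal sketch follows, assuming the common z-test for two AUCs from independent samples, z = (AUC1 − AUC2) / √(SE1² + SE2²), applied to the rounded AUC and SE values printed in Table 3. The function name and the dictionary layout are illustrative only.

```python
import math


def z_independent_aucs(auc_1, se_1, auc_2, se_2):
    """z-statistic for the difference between two AUCs that were
    estimated in independent samples (assumed method, see lead-in)."""
    return (auc_1 - auc_2) / math.sqrt(se_1 ** 2 + se_2 ** 2)


# (AUC, SE) for men and for women, copied from Table 3.
table_3 = {
    "HCR-20 Risk management scale": ((0.88, 0.05), (0.52, 0.11)),
    "HCR-20 Total score": ((0.88, 0.05), (0.59, 0.11)),
    "PCL-R Factor 1": ((0.64, 0.09), (0.36, 0.08)),
    "PCL-R Factor 2": ((0.84, 0.06), (0.41, 0.11)),
    "PCL-R Total score": ((0.74, 0.08), (0.34, 0.10)),
}

z_values = {
    label: z_independent_aucs(*men, *women)
    for label, (men, women) in table_3.items()
}
for label, z in z_values.items():
    print(f"{label}: z = {z:.2f}")
```

With the rounded table entries this gives z values of roughly 2.3, 3.4 and 3.1 for the three PCL-R scores, which agrees with the reported statistics. For the HCR-20 Risk management scale and total score it gives about 3.0 and 2.4, close to the reported 2.9 and 2.5; the small gaps may reflect rounding of the table entries or a different procedure.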


Figure 1. ROC curves of HCR-20 total score and final risk judgment for violent outcome in a sample of 42 female patients (sensitivity plotted against 1-specificity; curves shown: reference line, HCR-20 final risk judgment, HCR-20 total score)

Figure 2. ROC curves of HCR-20 total score and final risk judgment for violent outcome in a sample of 42 male patients (sensitivity plotted against 1-specificity; curves shown: reference line, HCR-20 final risk judgment, HCR-20 total score)



Discussion

In this study, a sample of 42 female forensic psychiatric patients was compared to a matched sample of

42 male forensic psychiatric patients on base rates of violent outcome, HCR-20 and PCL-R scores and

predictive validity of the latter instruments. We found several significant differences between women

and men, most importantly in the predictive validity of the HCR-20 and PCL-R, but also in mean

HCR-20 individual item scores, base rates for violence after discharge from the hospital and sample

characteristics.

First, we found some significant differences in sample characteristics, despite our matching procedure.

Female patients more often had a diagnosis of borderline personality disorder and less often a

narcissistic personality disorder or antisocial personality disorder. This is in line with the study by

Strand and Belfrage (2001) and also with research that suggests that borderline personality disorder is

much more common among women (Weisman, 1993). We believe the large proportion of female

borderline patients has a considerable impact on the interpretation of our results, for instance, on

differences in HCR-20 item scores (see below). Furthermore, women obtained higher scores on intelligence scales, especially on verbal intelligence, were significantly older at the time of their first conviction, and (albeit not significantly) had fewer previous convictions than men.

Second, there were no significant differences in mean HCR-20 subscale and total scores for male and

female patients and this finding resembles those of Strand and Belfrage (2001) and Nicholls (2001).

Our finding that female patients had significantly lower mean scores on ‘Young age at first violent

incident’, ‘Negative attitudes’ and significantly higher mean scores on ‘Impulsivity’ is in line with

Strand and Belfrage (2001). In the present study, we also found significantly lower mean scores on

‘Psychopathy’ and significantly higher mean scores on ‘Relationship instability’ for women. An

explanation for the higher mean scores on ‘Relationship instability’ and ‘Impulsivity’ could be that

both factors are criteria for the borderline personality disorder diagnosis which was highly prevalent in

our female sample. The lower mean score on ‘Young age at first violent incident’ is in accordance

with previous research that showed a later onset of criminal behavior in girls as compared to boys

(Silverthorn & Frick, 1999). The lower score on ‘Negative attitudes’ could be explained by the fact

that in general, women have different motives for their violent offenses compared to men, more often

reactive and relational, and less instrumental or resulting from criminogenic needs (see Crick &

Grotpeter, 1995). This hypothesis is supported by the lower prevalence of antisocial personality disorder we found in our female patients. The lower mean score on the item ‘Psychopathy’ is in line

with the lower mean scores on the PCL-R, although the differences in mean Factor 2 scores and total

scores were not or only marginally significant. The lower prevalence of psychopathy among female

patients is in line with previous research into psychopathy in females (Grann, 2000; Salekin et al.,

1997; Vitale et al., 2002; Warren et al., 2003).


Third, regarding violent outcome we found the base rate for incidents of violence during treatment to

be similar for female and male patients. This was also demonstrated in other studies (Lidz et al., 1993;

Nicholls, 2001). The base rates for incidents of violence during treatment in our study (women: 30%;

men: 29%) are similar to the base rates for physical inpatient aggression (women: 30%; men: 27.5%)

found in Nicholls’ study (2001). The odds of being convicted for a violent reoffense after discharge from the hospital were about five times higher for male patients than for female patients (odds ratio = 4.9). This finding resembles

the finding of Ross et al. (1998) who found male patients to be four times more likely than female

patients to express any aggression. Thus, although there was no gender difference in the base rate of

incidents of violence during treatment, base rates for community violence were significantly different.

A possible explanation for this difference is that female violence in the community is often less visible

and more subtle or manipulative, for instance, in domestic violence or child abuse. Research has

demonstrated that the prevalence rate of domestic violence by women is comparable or even higher

than the prevalence rate of domestic violence by men (Magdol et al., 1997). Domestic violence is less

likely to come to the attention of the criminal justice system than violence committed in the public

environment, which is much more commonly committed by men. Moreover, the police often respond

differently to violence when it is committed by a female perpetrator versus a male perpetrator. Pajer

(1998) has described this gender bias in the justice system, i.e., the reluctance to arrest women coupled

with a tendency toward psychiatric referrals.

Fourth, the interrater reliability of the HCR-20 in the present study was in line with previous studies

(see Douglas et al., 2005). We found no substantial differences in interrater reliability between men

and women.

Finally, we found poor predictive validity for the HCR-20 scores for women compared to good to

excellent predictive validity for men. Notable, however, was the good predictive validity of the HCR-

20 final risk judgment for both female (AUC = .86) and male patients (AUC = .91). Thus, while a

simple addition of individual HCR-20 risk factors was not adequate in predicting violence risk in our

female patients, the SPJ method based on the HCR-20 seemed to perform well. For our male sample,

the structured final risk judgment yielded the highest AUC value and this is in line with previous

research that demonstrated the structured final risk judgment to add incremental validity to the HCR-

20 total score used in an actuarial sense (Douglas et al., 2003; de Vogel & de Ruiter, submitted for

publication, see Chapter 6). The same was found for the SARA (Kropp et al., 1999), a SPJ guideline

for the assessment of relational violence (Kropp & Hart, 2000) and the SVR-20 (Dempster, 1998; de

Vogel et al., 2004, see Chapter 3). The poor predictive validity found for the HCR-20 total score in

female patients is in contrast with the results of Nicholls (2001). A possible explanation is the

difference in the samples that were studied. The patients in Nicholls’ sample were mainly suffering

from Axis I disorders (87%), and only 4% received a diagnosis of borderline personality disorder. This

is in sharp contrast to our sample where three quarters of the women suffered from borderline

personality disorder and Axis I disorders were usually not the primary diagnosis. Furthermore,



Nicholls used a different definition of violence; for instance, she also considered property damage and verbal aggression, while in our study, we limited ourselves to physical violence towards others. In our study, the PCL-R proved to be a good predictor of violence for male patients, but not for female patients. This finding is in line with previous studies that found good predictive validity for future violence in (mainly) male samples (Hemphill et al., 1998; Salekin et al., 1996), but modest predictive validity for future violence in female samples (see Vitale & Newman, 2001). Thus, the results of our study suggest that the PCL-R is not a valid assessment of the psychopathy construct in Dutch female forensic psychiatric patients.

A number of limitations to the present study should be mentioned. First of all, the design of the study

was mixed, because we combined patients and violent outcome data from a retrospective study and a

prospective study. The reason to mix the patients from a retrospective and a prospective design was to

obtain a large enough sample. In The Netherlands, women make up only 5% of the tbs-population.

Second, the violent outcome data may have been an underestimate of actual violence. The violent

recidivism data were retrieved from only one source, the Judicial Documentation register of the

Ministry of Justice. As a consequence, the reconviction rate is inevitably an underestimation of the

actual recidivism rate, because not all offenders are reported, apprehended and arrested. With regard to

the prospective outcome data, incidents of physical violence are not always reported on the

information bulletins. For example, it is possible that incidents of physical violence between patients are not observed by staff or reported to staff by patients. Third, the sample sizes were relatively small and only derived from one site. Larger samples would have resulted in increased power. However, given

that there is such a paucity of research on female forensic psychiatric patients, we believe even

matched samples of limited size such as ours can make a contribution to the knowledge base.

Our findings demonstrate that the method of SPJ, i.e., systematically rating risk factors and integrating and weighing information, is effective in both male and female patients. For research purposes, we recommend that researchers who conduct studies in mixed gender samples report the results on predictive validity of risk assessment instruments separately for men and women, because reporting the results jointly could lead to distorted conclusions. Perhaps in different patients, the HCR-20 will show good predictive validity, as in Nicholls’ study with primarily Axis I disordered women or in

civil psychiatric samples (see Nicholls, Ogloff, Douglas, & Grant, manuscript under review). Risk

assessment research in female forensic psychiatric patients is still a relatively unexplored area.

Although women are only a minority in forensic psychiatry, it seems that in the past two decades female aggression has been on the rise, especially among young girls (English, 1993; Mertens, Grapendaal, &

Docter-Schamhardt, 1998; Odgers & Moretti, 2002). More knowledge on specific violence risk factors

in women and the risk management strategies needed to prevent repeated violence in women is

desirable. This is also important from a public mental health perspective because research has

demonstrated an intergenerational transfer of risk of aggression between mothers and children;


mothers with a history of violent offense(s) more often have disruptive, aggressive children (Serbin et

al., 1998).

Structured Professional Judgment of violence risk in forensic clinical practice: A prospective study into the predictive validity of the Dutch HCR-20

This chapter is a slightly revised version of Vogel, V. de, & Ruiter, C. de (submitted for publication). Structured Professional Judgment of violence risk in forensic clinical practice: A prospective study into the predictive validity of the Dutch HCR-20. The authors wish to thank all clinicians and researchers who participated in this study. Special thanks go to Cécile Vandeputte-van de Vijver who functioned as workshop trainer together with the first author and also participated as a researcher in the study.


CHAPTER 6

Structured professional judgment of violence risk in forensic clinical practice: A prospective study into the predictive validity of the Dutch HCR-20

During the last two decades, research into risk factors for violence, the development of risk assessment

instruments and research into the psychometric properties of these instruments has expanded

enormously. To date, numerous structured risk assessment instruments are available for mental health

professionals working in forensic or general psychiatry or in the penitentiary system. Risk assessment

instruments can be divided into strictly actuarial and structured professional judgment (SPJ)

instruments. Actuarial instruments are developed solely based on risk factors that are empirically

related to (sexually) violent behavior. These instruments are relatively simple to code - according to

fixed rules and not necessarily by a forensic expert - and contain predominantly static, historical

factors that are added up according to an algorithm to reach a conclusion regarding the risk of

recidivism. Examples of actuarial instruments are the Violence Risk Appraisal Guide (VRAG; Harris

& Rice, 1997) for violent behavior, and the Static-2002 (Hanson & Thornton, 2002) for sexual

violence. In the SPJ approach, the risk assessment is performed by a forensic clinician by means of a

standardized checklist, containing empirically derived risk factors for (sexual) violence, static as well

as dynamic factors. The essential difference between the actuarial and the SPJ approach is in how the

final risk judgments are arrived at; in actuarial instruments by a fixed algorithm and in SPJ guidelines

by (structured) human decision making.1

1 See Douglas, Cox, and Webster, 1999 and Otto, 2000, for a more detailed overview of risk assessment approaches.

In this prospective study, the Dutch version of the HCR-20 was coded independently by three rater groups (researchers, treatment supervisors and group leaders) for 127 male mentally disordered offenders admitted to the Dr. Henri van der Hoeven Kliniek. During case conferences, the three raters discussed their ratings and reached consensus on their ratings and final risk judgment. HCR-20 ratings were related to incidents of physical violence during treatment. Overall, the predictive validity of the HCR-20 was good. We found no differences between researchers and treatment supervisors in predictive accuracy. Group leaders performed worse compared to the other two rater groups. The consensus rating was the best predictor. Implications for structured violence risk assessment in clinical practice are discussed.

A SPJ risk assessment instrument that is internationally well known and the subject of numerous studies is the HCR-20 (Webster et al., 1997b). This instrument consists of 20 items representing risk

factors for violence in the past, present and future. Research in various psychiatric and forensic

settings in different countries has demonstrated good interrater reliability and predictive validity for

the HCR-20 (see Douglas et al., 2005 for a review). For instance, Douglas and colleagues (2003)

found good predictive validity for the HCR-20 in a sample of 100 forensic psychiatric patients.

Moreover, they demonstrated that the HCR-20 structured final risk judgment added incremental

validity to the HCR-20 used in an actuarial sense, i.e. a simple addition of the scores on the 20 items.
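The "actuarial" use of the HCR-20 mentioned above amounts to summing item ratings. The sketch below illustrates this, assuming the standard HCR-20 coding of each item as 0, 1 or 2 and the division into 10 Historical, 5 Clinical and 5 Risk management items; the function name and the example ratings are hypothetical and do not come from the study.

```python
def hcr20_actuarial_scores(historical, clinical, risk_management):
    """Sum item ratings (each assumed to be 0, 1 or 2) into the three
    subscale scores and the total score."""
    expected_lengths = (
        ("historical", historical, 10),
        ("clinical", clinical, 5),
        ("risk management", risk_management, 5),
    )
    for name, items, expected in expected_lengths:
        if len(items) != expected:
            raise ValueError(f"{name} scale needs {expected} item ratings")
        if any(rating not in (0, 1, 2) for rating in items):
            raise ValueError(f"{name} item ratings must be 0, 1 or 2")
    h, c, r = sum(historical), sum(clinical), sum(risk_management)
    return {"H": h, "C": c, "R": r, "total": h + c + r}


# Hypothetical ratings for one patient.
scores = hcr20_actuarial_scores(
    historical=[2, 1, 2, 1, 1, 0, 0, 2, 2, 1],
    clinical=[1, 1, 0, 2, 1],
    risk_management=[1, 1, 1, 1, 2],
)
print(scores)
```

The structured final risk judgment (low, moderate or high) is deliberately not derived from this sum; in the SPJ approach it is made by the rater after weighing the items, which is why the two can differ in predictive accuracy.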

An important limitation of many studies into the HCR-20, which is designed for the prediction and management of future violence, is their retrospective design (see also Dernevik, 2004). Only a few

prospective studies into the predictive validity of the HCR-20 have been published so far (e.g.,

Belfrage et al., 2000; Dernevik, Grann, & Johansson, 2002; Dolan & Khawala, 2004). Another

limitation of most studies into the HCR-20 concerns their ecological validity, i.e., their relevance to

actual clinical risk assessment practice. In most published studies, the HCR-20 is coded by

independent researchers, not by practicing clinicians. Generally, these researchers did not know the

patients personally and coded the HCR-20 exclusively based on file information. Recently, Webster et

al. (2002) referred to this problem and argued that “much more in situ research needs to be

accomplished with instruments like the HCR-20” (p. 189).

When the HCR-20 is employed in clinical practice for the assessment of risk of future violence and in

leave decision-making, ratings by experienced clinicians are required (Webster et al., 1997b).

Furthermore, in clinical practice it is customary that the treatment staff is responsible for leave

decisions (Dernevik et al., 2001). However, there is also some doubt about the objectivity of clinicians,

especially clinicians who are closely involved in the treatment of the patient (for a more detailed

discussion see: Dernevik et al., 2001; Litwack & Schlesinger, 1999; de Vogel & de Ruiter, 2004,

Chapter 2). Thus, the question arises who is most likely to conduct accurate risk assessments: the

objective, more distant researcher-assessor or the experienced clinician who knows the patient

personally. Possibly, the consensus between the researcher and clinician will be the most accurate in

predicting violence. To our knowledge, no studies have yet been published that examine differences in

predictive accuracy of structured violence risk assessment instruments, such as the HCR-20, between

clinicians and researchers or between individual ratings and consensus ratings by a group of raters.

However, a few studies addressed the issue of multiple raters. McNiel and colleagues (2000) examined

whether the predictive accuracy of clinical assessments of violence risk improves when there is

agreement between multiple clinicians (physicians and nurses). They found that when two clinicians

reached similar conclusions these were more accurate than the conclusions of either clinician alone

when their assessments disagreed. Huss and Zeiss (2004) found that individual clinicians demonstrated

poor ability to predict violence among general psychiatric patients, but that the accuracy of the risk

assessments improved much when they were aggregated as “group” decisions. It should be noted that

in both of these studies, the clinicians did not actually meet and discuss; their ratings were aggregated



by the research group. In conclusion, it is important to examine if there are differences between

researchers and clinicians in the accuracy of their risk assessments and to compare this accuracy with

the consensus between researchers and clinicians.

In this chapter, results of a prospective study are presented which started in January 2001. The

authorized Dutch version of the HCR-20 (Philipse et al., 2000) was coded for 127 male patients

admitted to the Dr. Henri van der Hoeven Kliniek by both clinicians (group leaders and treatment

supervisors) and independent researchers. In a previous study, we have examined the interrater

reliability of the Dutch HCR-20 and differences between researchers and clinicians in coding the

HCR-20 in 60 patients from this hospital, a subgroup of the present sample (de Vogel & de Ruiter,

2004, see Chapter 2).2 Overall, the interrater reliability of the HCR-20 was good. The group leaders

gave significantly lower HCR-20 scores than the researchers. There were no significant differences

between the mean HCR-20 scores of treatment supervisors and researchers, but there was a significant

difference in the interpretation of the scores: treatment supervisors had more ‘low risk’ final judgments

than researchers. The goals of the present study were to establish the predictive validity of the Dutch

HCR-20 and to gain insight into differences in risk assessment accuracy between (1) researchers,

treatment supervisors and group leaders, and (2) individual ratings and consensus ratings. Also,

following Douglas and colleagues (2003), we wanted to examine if the HCR-20 structured final risk

judgment adds incremental validity to the HCR-20 actuarial score.

Method

Setting



Subjects

The current sample included 127 men. The mean age at admission was 32.9 (SD = 9.6, range = 17-66).

The majority of the patients was of Dutch nationality (80%). About half of the patients were

unemployed (49%) and 60% were single at the time of the index offense. The majority of the patients

had been convicted before their tbs-order (77%) with an average number of 5.0 (SD = 6.1, range = 0-

30) prior convictions. The index offenses were: 44% (attempted) homicide, 33% sexual offenses, 16%

other violent offenses (e.g., robbery) and 7% arson. The mean length of stay in the hospital was 3.7

years (SD = 2.4, range = 0-12). More than half of the patients had abused substances in the past (8%

alcohol, 15% drugs, and 44% multiple substances). In 5% of the patients, only an Axis I disorder

(according to the DSM-IV; APA, 1994) was diagnosed; 66% met the criteria for one or more Axis II disorders, particularly cluster B disorders3 and in 28% there was comorbidity of Axis I and II disorders.4 The majority of the patients had a history of psychiatric treatment; 49% had been admitted to a psychiatric institution and 24% had received outpatient treatment. The mean Psychopathy Checklist-Revised (PCL-R; Hare, 1991, 2003) total score of the patients was 21.5 (SD = 8.4, range = 2-38).

2 This sample comprised 53 men and 7 women. In the present study, the women were excluded from the analyses (see Procedure).

Instruments

HCR-20



Procedure

All raters were trained in coding the HCR-20 during a one-day workshop given by a senior clinical

psychologist and a research psychologist. In this workshop, the relevant empirical literature was

discussed and the HCR-20 coding procedure was practiced using file information and videotapes of

actual cases. Raters were instructed to use the HCR-20 manual and all available file information for all

cases.

In the period of this study - January 1, 2001 until June 1, 2004 - the HCR-20 was coded for 127

patients who can be divided into different groups according to their phase in treatment. In the course of

treatment in the hospital, a number of specific phases can be distinguished in which the liberties of a

patient can be increased and at that time the risk of violence needs to be (re-)evaluated. These phases

are: when a patient has his first unsupervised leave from the hospital and when a patient is to enter the

transmural treatment phase. The HCR-20 was coded for patients who were entering the above two

phases (n = 9 and 28, respectively), and for patients who were already in the transmural treatment

phase (n = 24). For all of these cases, the Risk management items were coded for the context outside.

The HCR-20 was also coded for patients who were newly admitted to the hospital (n = 49) and for

inpatients at the request of their treatment team (n = 17), for instance, when they had questions about

treatment progress. For these two types of patients, the Risk management items were coded for the

context inside (risk of inpatient violence).

A researcher, group leader and treatment supervisor independently coded the HCR-20 for each case.

When the patient was a sex offender, the Dutch version of the SVR-20 (Boer et al., 1997; authorized

Dutch version: Hildebrand et al., 2001) was coded in addition to the HCR-20.5 All raters had access to file information that consisted of psychological reports, reports to the court regarding treatment progress and recommendations for termination or prolongation of the tbs-order, treatment plans and evaluations. Raters agreed upon a consensus score and a final risk judgment during a case conference. In these case conferences, raters also discussed the possibility of additional risk factors, protective factors and risk management strategies. The case conferences lasted on average about one hour and were considered useful by both researchers and clinicians. The results of the consensus meetings were used by staff to develop risk management strategies or for decision-making regarding leave or entry into the transmural treatment phase. In this sense, the HCR-20 judgments were used in the way that they are intended by the original developers of the instrument.

3 In Dutch forensic psychiatry, cluster B personality disorders are the most prevalent (see Hildebrand & de Ruiter, 2004; de Ruiter & Greeven, 2000).

4 Axis I diagnoses were lifetime clinical diagnoses based on consensus between four raters (see Hildebrand & de Ruiter, 2004), Axis II disorders were diagnosed with the Structured Interview for DSM-IV Personality (SIDP-IV; Pfohl et al., 1995).

5 Results of the SVR-20 are not included in this study but can be expected within one or two years.

Initially, there were HCR-20 ratings of 149 patients; 127 men and 22 women. In a previous study,

which included all women from the present study, we examined differences between 42 male and 42

matched female patients in mean scores and predictive validity of the HCR-20 (see de Vogel & de

Ruiter, in press, Chapter 5).6 Besides several differences in sample characteristics and mean individual

HCR-20 item scores, we found that, except for the final risk judgment, the HCR-20 did not

significantly predict violent recidivism in women as it did in men. Therefore, we decided to exclude

the female patients from the present study. During the time course of this study, three (2%) patients

had died (two by suicide, one by natural death). We decided not to exclude these patients because they

all had a reasonably long follow up period (19, 17 and 13 months, respectively). Also, during the time

course of this study, 20 (16%) patients were discharged from the hospital; 19 because their tbs-order

had been terminated by the court and one patient was readmitted to another forensic psychiatric

hospital. We did not possess information on violent recidivism after termination of the tbs-order. The mean follow up period in treatment for this group of 20 patients was 29.8 months (SD = 8.3, range = 9-37). We considered this follow up period reasonably long and decided not to exclude these

patients from the analyses. Furthermore, it should be noted that 19 (15%) patients were assessed more

than once, because their leave situation had changed, for instance, they started with transmural

treatment. The most recent risk assessment was used.

HCR-20 scores and final risk judgments were related to incidents of physical violence during

treatment that occurred after the date of the most recent risk assessment (see Violent outcome data).

Violent outcome data were collected until June 1, 2004. The mean follow up period of the 127 patients

was 21.5 months (SD = 10.9, range = 1-37).

6 The HCR-20 was developed based on research in predominantly male samples.

Raters

The researchers (N = 9) were all Master’s level clinical psychologists of the Research department, who are responsible for psychological assessment and empirical research in the hospital. The researchers are not in a treatment relationship with patients and do not have intensive contact with them, but they all know the patients to some extent. The treatment supervisors (N = 8) have a supervising and

planning role in the treatment of around 20 patients; they were all senior clinicians, mostly clinical

psychologists or psychotherapists. The professional background of the group leaders (N = 59) varied,

but most of them had relevant higher vocational or academic training (e.g., nursing, social work,

psychology). Group leaders conduct the daily and practical supervision and spend most of their time

with the patients.

Violent outcome data

To identify incidents of physical violence, we adopted the HCR-20 definition of violence: “violence is

actual, attempted, or threatened harm to a person or persons” (Webster et al., 1997b, p. 24). Violent

outcome data were obtained from information bulletins that are published daily in the hospital to

inform patients and staff. In these bulletins, the most important events of the last 24 hours are reported,

such as disruptive incidents or positive results on urine analysis to detect if a patient has taken drugs.

Incidents could have occurred inside the hospital (inpatient violence) or outside the hospital, for

instance, for patients who were in the transmural treatment phase. We did not obtain data on violent

recidivism after termination of the tbs-order from the Ministry of Justice, because it was a rather small

group whose tbs-order had been terminated by the court (N = 20) and their mean follow up period after

discharge was quite brief (15 months, SD = 8.8, range = 4-34) compared to their mean follow up

period in (transmural) treatment (29.8 months, SD = 8.3, range = 9-37). Disruptive incidents were

registered by the research psychologist and assigned to one of four categories: verbal abuse, verbal

threat, physical violence, and violation of hospital rules (see Hildebrand, de Ruiter, & Nijman, 2004

for details of the coding system). In this study, we focused on physical violence, more specifically on

incidents of physical violence directed towards other persons, because the HCR-20 is designed to

assess risk of violence to others. For instance, property damage alone was not included, unless the

property damage occurred with the goal to frighten or threaten another person (e.g., smashing a cup of

hot coffee against the wall while someone is standing close by). In order to examine if the HCR-20 is

able to predict different types of violence, we also considered verbal abuse and verbal threat.


The F-test was used to examine differences between researchers, group leaders and treatment

supervisors on HCR-20 subscales and total scores. For differences in HCR-20 final risk judgments,

chi-square tests were used. The predictive validity was established with receiver operating

characteristics (ROC) analyses (for a description of this method, see Douglas et al., 2005; Mossman,

1994; Rice & Harris, 1995; see also Chapter 3, p. 71). To compare the AUC values for the HCR-20

ratings of the three rater groups and the consensus, we used AccuROC version 2.5 (Vida, 1997), which

applies the non-parametric method as described by DeLong et al. (1988). Pearson point-biserial

correlations and survival analyses, i.e., Cox regression (event history analyses) and Kaplan-Meier (see



Tabachnick & Fidell, 2001) were conducted for comparative purposes. Survival analyses control for

unequal follow up periods between patients. Cox regression analyses, which result in the hazard ratio

(e^B) that can be interpreted as the relative risk, were conducted to determine which HCR-20 items were

significant predictors. Cox regression analyses were also conducted to evaluate whether the HCR-20

final risk judgment added incremental validity to the HCR-20 actuarial scores. All analyses were

conducted using SPSS version 11.
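As a minimal illustration of what an AUC value expresses, the following Python sketch computes the area under the ROC curve as the probability that a randomly chosen violent patient has a higher score than a randomly chosen non-violent patient, with ties counted as one half. The function name and the scores are hypothetical and serve only as an example; they are not data from this study, and the analyses reported here were run in SPSS and AccuROC.

```python
def auc(scores_violent, scores_nonviolent):
    """Area under the ROC curve via pairwise comparison.

    Equals the proportion of (violent, non-violent) pairs in which the
    violent patient has the higher score; ties count as one half.
    """
    wins = 0.0
    for v in scores_violent:
        for n in scores_nonviolent:
            if v > n:
                wins += 1.0
            elif v == n:
                wins += 0.5
    return wins / (len(scores_violent) * len(scores_nonviolent))


# Hypothetical HCR-20 total scores, for illustration only (not study data).
violent = [31, 28, 34, 26]
nonviolent = [22, 27, 19, 25, 30, 24]
print(round(auc(violent, nonviolent), 3))
```

An AUC of .50 corresponds to chance-level discrimination and 1.0 to perfect discrimination.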

Results

Violent outcome

Nineteen patients (15%) committed a total of 27 incidents of physical violence during the period of

this study: fourteen patients committed one incident, three committed two, one committed three, and one committed four.

Accounting for time the patients had been at risk and using survival analysis, the failure rate was 23%.

Examples of violent incidents were hitting another patient, attacking a staff member and throwing a

table towards a window behind which staff members were standing. Two incidents occurred outside

the hospital, the rest inside the hospital.7 Most of the incidents of physical violence (82%) were

classified as mildly serious and 18% as serious. In 63% of the incidents, staff members were the

victim, in 30% other patients and in 7% the patient’s girlfriend. Furthermore, 47 (37%) patients were

registered for incidents of verbal abuse and 24 (20%) for incidents of verbal threat.
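The difference between the crude percentage of violent patients and the failure rate from survival analysis can be made concrete with a small Kaplan-Meier sketch. The code below is an illustration under simplifying assumptions: the function name and the follow up data are hypothetical, and it returns only the estimated failure rate at the end of follow up.

```python
def kaplan_meier_failure(follow_up):
    """Kaplan-Meier failure rate (1 - survival) at the end of follow up.

    follow_up: list of (months, had_incident) pairs, one per patient.
    Patients without an incident are treated as censored at their
    last month of follow up.
    """
    survival = 1.0
    event_times = sorted({t for t, event in follow_up if event})
    for t in event_times:
        # Patients still under observation just before time t.
        at_risk = sum(1 for time, _ in follow_up if time >= t)
        # Patients with a first incident exactly at time t.
        events = sum(1 for time, event in follow_up if time == t and event)
        survival *= 1.0 - events / at_risk
    return 1.0 - survival


# Hypothetical follow up data, for illustration only (not study data).
patients = [(1, True), (2, False), (3, True), (4, False)]
crude_rate = sum(1 for _, event in patients if event) / len(patients)
print(crude_rate, kaplan_meier_failure(patients))
```

Because censored patients drop out of the risk set, the Kaplan-Meier failure rate is typically higher than the crude proportion when follow up periods are unequal.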

Risk judgments

Table 1 presents mean HCR-20 scores of the patients as coded by the three different rater groups, as

well as mean HCR-20 scores as agreed upon in case conferences. Group leaders, compared to

researchers and treatment supervisors, gave significantly lower scores on the Risk management items

and HCR-20 total score. There were no significant differences in mean HCR-20 scores between

researchers and treatment supervisors. The mean HCR-20 consensus scores were higher - although not

significantly - than the mean HCR-20 scores of the three individual rater groups. Table 1 also presents

the HCR-20 final risk judgments. There were no significant differences between the rater groups in

final risk judgments.

7 Five of seven patients that were assessed for the context outside had an incident of physical violence inside the hospital. Four of these patients were in the transmural phase. Although these patients live outside the hospital, they frequently visit the hospital, for instance, to attend work or psychotherapy.


Table 1. Risk assessments (N = 127)

                        HCR-20 mean scores (SD)                             HCR-20 final risk judgments
                        H scale      C scale     R scale       Total          Low    Moderate   High
Researchers             14.5 (3.1)   5.3 (2.1)   6.3 (2.2) a   26.1 (6.1) a   24%    45%        31%
Treatment supervisors   14.3 (3.4)   5.3 (2.2)   6.2 (2.2) a   25.8 (6.1) a   30%    46%        24%
Group leaders           14.0 (3.4)   5.0 (2.0)   5.3 (2.2) b   24.1 (5.8) b   21%    43%        35%
Consensus               14.8 (3.1)   5.5 (2.1)   6.4 (1.9) a   26.8 (5.6) a   28%    48%        24%

Note. a > b p < .05 (two-tailed). H scale = Historical scale. C scale = Clinical scale. R scale = Risk management scale. SD = standard deviation.

Predictive validity of the HCR-20 consensus ratings

AUC values and Pearson correlations for the HCR-20 subscales, total scores and final risk judgments

as agreed upon by the three raters in case conferences were highly significant for incidents of physical

violence during treatment (see Table 2 and Figure 1).
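The Pearson point-biserial correlation reported alongside the AUC values is an ordinary Pearson correlation between a continuous score and a dichotomous (0/1) outcome. The sketch below shows the computation; the function name and the data are hypothetical and are not taken from this study.

```python
from math import sqrt


def point_biserial(scores, outcomes):
    """Pearson correlation between a continuous score and a 0/1 outcome."""
    n = len(scores)
    mean_x = sum(scores) / n
    mean_y = sum(outcomes) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(scores, outcomes))
    var_x = sum((x - mean_x) ** 2 for x in scores)
    var_y = sum((y - mean_y) ** 2 for y in outcomes)
    return cov / sqrt(var_x * var_y)


# Hypothetical scores and outcomes (1 = physical violence), not study data.
scores = [31, 28, 34, 26, 22, 27, 19, 25, 30, 24]
outcomes = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
print(round(point_biserial(scores, outcomes), 2))
```

Unlike the AUC, this correlation depends on the base rate of the outcome, which is one reason both statistics are reported.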

Patients who scored above the HCR-20 median of 27 compared to those who scored below had

significantly more incidents of physical violence (failure rates as computed with Kaplan-Meier

analysis: 2 versus 43, log rank (1, 127) = 15.8, p < .001; Odds ratio = 21.6, 95% CI = 2.8-167.2). The

difference in failure rates between patients who were judged to be low, moderate or high risk was also

significant (respectively 0, 8, and 64, log rank (2, 127) = 34.9, p < .001). Next, we conducted Cox

regression analyses. The HCR-20 subscale scores were entered on block 1. The HCR-20 final risk

judgment was entered on block 2 using the forward conditional method. In block 1, the HCR-20

subscale scores produced a significant model fit (χ² (3, 127) = 22.9, p < .001). In block 2, the HCR-20

final risk judgment produced a significant improvement to the model’s fit (χ² change (1, 127) = 6.8, p

< .01).
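The significance of a chi-square change with one degree of freedom, such as the value of 6.8 reported for block 2, can be checked with a few lines of Python. The sketch relies on the fact that a chi-square variable with 1 df is the square of a standard normal variable; the function name is hypothetical, and the shortcut does not apply to tests with more than one degree of freedom.

```python
from math import erfc, sqrt


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    For 1 df, P(X > x) equals the two-sided tail of a standard normal
    at sqrt(x), which can be written as erfc(sqrt(x / 2)).
    """
    return erfc(sqrt(x / 2.0))


# Chi-square change for the final risk judgment as reported in the text.
print(chi2_sf_df1(6.8) < 0.01)
```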



Figure 1. ROC curve HCR-20 consensus ratings for physical violence (N = 127)

[Figure: ROC curves plotting sensitivity against 1-specificity for the Historical scale, Clinical scale, Risk management scale, total score, and final risk judgment, with a reference line.]

Table 2. Predictive validity of the HCR-20 for physical violence (N = 127)

                        Consensus             Researchers           Treatment supervisors   Group leaders
                        AUC     SE   r        AUC     SE   r        AUC     SE   r          AUC     SE   r
Historical scale        .77***  .05  .32**    .73***  .06  .27**    .74***  .06  .28**      .75***  .06  .29**
Clinical scale          .80***  .05  .36**    .76***  .06  .31**    .75***  .05  .31**      .66*    .06  .19*
Risk management scale   .79***  .05  .35**    .74***  .06  .29**    .71***  .05  .27**      .63     .07  .16
Total score             .85***  .04  .43**    .79***  .05  .35**    .81***  .05  .36**      .75***  .05  .30**
Final risk judgment     .86***  .04  .49**    .77***  .06  .35**    .75***  .05  .33**      .64*    .07  .19*

Note. * p < .05. ** p < .01. *** p < .001 (two-tailed). AUC = Area under the curve. SE = Standard error. r = Pearson point-biserial correlation.



Furthermore, we wanted to examine how the individual HCR-20 items perform in predicting physical

violence in our sample. Table 3 shows the AUC values1 and Pearson correlations for the consensus

HCR-20 item scores and violent incidents. Items 2, 4, 5, 7, 11, 12, 14, 15, 16, 17, and 19 had

significant AUC values and correlations. When Cox regression analyses were conducted, the full

model with all HCR-20 items was found to be significant (χ² (20, 127) = 43.7, p < .01). Next, the

forward conditional method was used to determine which HCR-20 items were significant predictors of

incidents of physical violence. In the final model, items 2 (e^B = 6.4, 95% CI = 1.5-28.0), 15 (e^B = 3.4,

95% CI = 1.5-8.1), and 17 (e^B = 3.4, 95% CI = 1.2-10.0) were significant predictors of incidents of

physical violence.
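To clarify how a hazard ratio and its confidence interval relate to the underlying Cox regression coefficient, the sketch below converts between the two. It assumes the usual Wald-type interval, e^(B ± 1.96 × SE); the function names are hypothetical, and because the published values are rounded, the recovered B and SE are approximations only.

```python
from math import exp, log

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def hazard_ratio_ci(b, se):
    """Hazard ratio e^B with a Wald-type 95% confidence interval."""
    return exp(b), exp(b - Z_95 * se), exp(b + Z_95 * se)


def back_out_b_and_se(hr, lower, upper):
    """Approximate B and SE from a reported hazard ratio and 95% CI."""
    b = log(hr)
    se = (log(upper) - log(lower)) / (2 * Z_95)
    return b, se


# Values for item 2 as reported in the text (rounded in the source).
b, se = back_out_b_and_se(6.4, 1.5, 28.0)
print(round(b, 2), round(se, 2))
```

The wide interval for item 2 reflects a large standard error, which is to be expected with only 19 patients who committed physical violence.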

Although not displayed in the tables and figure, we also computed AUC values and Pearson

correlations for the HCR-20 consensus ratings with respect to incidents of verbal abuse and verbal

threat. We found significant predictive accuracy of the HCR-20 for both verbal abuse (total score:

AUC = .72, SE = .05, r = .36, p < .01; final risk judgment: AUC = .65, SE = .05, r = .28, p < .01) and

verbal threat (total score: AUC = .79, SE = .05, r = .36, p < .01; final risk judgment: AUC = .71, SE =

.05, r = .31, p < .01).

1 It should be noted that ROC analyses are less appropriate for dichotomous or trichotomous variables. Still, we believed it was important to examine the predictive accuracy per item, and AUC values are easy to understand and provide comparison values with other similar studies.


Table 3. Predictive validity of the HCR-20 consensus items for physical violence (N = 127)

                                                 AUC      SE    r
Historical items
 1. Previous violence                            .48      .07   -.12
 2. Young age at first violent incident          .72**    .06   .32**
 3. Relationship instability                     .60      .06   .16
 4. Employment problems                          .65*     .06   .21*
 5. Substance use problems                       .67*     .06   .24**
 6. Major mental illness                         .52      .06   .01
 7. Psychopathy                                  .71**    .06   .29**
 8. Early maladjustment                          .57      .07   .11
 9. Personality disorder                         .53      .07   .08
10. Prior supervision failure                    .58      .06   .15
Clinical items
11. Lack of insight                              .70**    .06   .27**
12. Negative attitudes                           .71**    .07   .27**
13. Active symptoms of major mental illness      .47      .05   -.05
14. Impulsivity                                  .72**    .06   .29**
15. Unresponsive to treatment                    .73***   .07   .32**
Risk management items
16. Plans lack feasibility                       .69**    .07   .26**
17. Exposure to destabilizers                    .74***   .06   .33**
18. Lack of personal support                     .61      .07   .16
19. Noncompliance with remediation attempts      .67*     .06   .25**
20. Stress                                       .57      .07   .13

Note. * p < .05. ** p < .01. *** p < .001 (two-tailed). AUC = Area Under the Curve. SE = Standard Error. r = Pearson point-biserial correlation.

Differences between raters in accuracy of risk assessments

AUC values and Pearson correlations for the HCR-20 ratings of the three rater groups were significant

for incidents of physical violence (see Table 2). One exception is the AUC value and correlation for

the Risk management scale coded by the group leaders. With AccuROC we tested whether there were

significant differences in AUC values between the three rater groups and between individual group

and consensus ratings. Group leaders compared to researchers had a significantly lower AUC value for

the final risk judgment (χ² (1, 127) = 6.3, p < .01). Group leaders’ ratings compared to consensus

ratings also had significantly lower AUC values for the Clinical and Risk management scales, total

score and final risk judgment (χ² (1, 127) = respectively 6.8, 4.9, 4.6 and 20.1, p < .05). The AUC

value for the HCR-20 consensus final risk judgment was significantly higher than the individual final

risk judgment of researchers, treatment supervisors and group leaders (χ² (1, 127) = respectively 6.9,

5.3, and 20.1, p < .01).
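The comparison of two AUC values obtained on the same patients can be sketched with the non-parametric method of DeLong et al. (1988). The code below is a simplified illustration of that method for two raters, not a reproduction of AccuROC; all function names are hypothetical, and the scores of both raters must be listed in the same patient order.

```python
def _psi(x, y):
    """Pairwise kernel: 1 if the violent case scores higher, 0.5 for ties."""
    return 1.0 if x > y else 0.5 if x == y else 0.0


def _components(pos, neg):
    """Structural components per violent (v10) and non-violent (v01) patient."""
    v10 = [sum(_psi(x, y) for y in neg) / len(neg) for x in pos]
    v01 = [sum(_psi(x, y) for x in pos) / len(pos) for y in neg]
    return v10, v01


def _cov(a, b):
    """Sample covariance of two equally long lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (len(a) - 1)


def delong_chi2(pos_a, neg_a, pos_b, neg_b):
    """AUCs of raters A and B and the chi-square (1 df) for their difference.

    pos_a and pos_b hold both raters' scores for the same violent patients;
    neg_a and neg_b hold their scores for the same non-violent patients.
    """
    v10_a, v01_a = _components(pos_a, neg_a)
    v10_b, v01_b = _components(pos_b, neg_b)
    auc_a = sum(v10_a) / len(v10_a)
    auc_b = sum(v10_b) / len(v10_b)
    m, n = len(pos_a), len(neg_a)
    var_a = _cov(v10_a, v10_a) / m + _cov(v01_a, v01_a) / n
    var_b = _cov(v10_b, v10_b) / m + _cov(v01_b, v01_b) / n
    cov_ab = _cov(v10_a, v10_b) / m + _cov(v01_a, v01_b) / n
    var_diff = var_a + var_b - 2.0 * cov_ab
    chi2 = (auc_a - auc_b) ** 2 / var_diff if var_diff > 0 else 0.0
    return auc_a, auc_b, chi2


# Hypothetical ratings by two raters of the same patients (not study data).
print(delong_chi2([5, 6, 7, 8], [1, 2, 3, 4, 5.5],
                  [3, 6, 2, 8], [1, 4, 3, 5, 2]))
```

The covariance term accounts for the fact that both raters assessed the same patients, which is why the two AUC values cannot be compared as if they came from independent samples.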



Discussion

This is the first prospective study into the predictive validity of the Dutch HCR-20 in forensic clinical

practice with multiple raters, including treating clinicians. The results of this study, which explored

differences in predictive accuracy between researchers and clinicians and between individual and

consensus ratings, provide strong support for the SPJ model of risk assessment.

The base rate of incidents of physical violence during treatment in this study was rather low compared

to other studies (Dernevik et al., 2002; Nicholls, 2001; Ross et al., 1998). However, comparison of base

rates of (inpatient) violence from different studies is complicated because of differences in samples,

settings, length of follow up periods, and definitions of violence. The low base rate of physical

violence during treatment is possibly due to the structured, restrictive environment in which patients

live inside a secure hospital. Another explanation might be the use of adequate risk management

strategies by staff (e.g., isolating a patient when violence is expected).

We found good predictive validity of the HCR-20 for incidents of physical violence during treatment.

This resembles findings from previous studies (Belfrage et al., 2000; Dernevik et al., 2002; Gray et al.,

2003; Ross et al., 1998). The AUC values for the HCR-20 consensus total scores and final risk

judgment are quite high (AUC = .85 and .86, respectively) compared to those found in other studies in

forensic psychiatric samples (AUC values for HCR-20 total scores ranged from .57 to .84; see Douglas

et al., 2005). A possible explanation is that our study was prospective and conducted in actual clinical

practice implying that all raters personally knew the patients, had access to comprehensive file

information and also had the opportunity to observe and monitor patients. This is in contrast to file

based studies in which researchers retrospectively coded the HCR-20 from file information and did not

know the subjects. The final risk judgment added significant incremental validity to the HCR-20

subscale scores, a finding similar to that of Douglas et al. (2003). The same was found for the SARA

(Kropp et al., 1999), an SPJ guideline for the assessment of relational violence (Kropp & Hart, 2000)

and the SVR-20 (Dempster, 1998; de Vogel et al., 2004, see Chapter 3). Furthermore, we found that in

our sample the consensus HCR-20 ratings were also predictive of verbal abuse and verbal threat. This

finding is in accordance with a recently conducted study in 34 mentally disordered offenders that

found the HCR-20 to be predictive of both verbal and physical aggression, but not of self harm (Gray

et al., 2003). With respect to the predictive accuracy of the HCR-20 items and subscales as agreed

upon by the three raters in case conferences, the three subscales were found to have comparable

predictive accuracy. However, although the differences between the AUC values for the subscales

were small, the AUC values for the individual items show that several of the Historical items were not

predictive in our sample, whereas most of the dynamic items were. The same pattern was found in two

Swedish studies with high-risk samples (Belfrage et al., 2000; Strand et al., 1999). A possible

explanation is that most of the Historical risk factors are highly prevalent in high-risk samples and thus

do not discriminate between cases. In our sample, 125 (98%) patients had a score of 2 on ‘Previous


violence’ and 114 (90%) had a score of 2 on ‘Personality disorder’. ‘Psychopathy’ was one of the

Historical items that did demonstrate significant predictive accuracy, however. We want to emphasize

this, because in our experience some mental health professionals decide to omit the ‘Psychopathy’

item because the administration of the PCL-R is time-consuming and requires trained raters (see also

Webster et al., 2002). Our finding underlines the statement of Hart (1998a) that “psychopathy is a

factor that should be considered in any assessment of violence risk” (p. 368). Most of the dynamic

items were significant in predicting incidents of physical violence during treatment. In a recently

conducted study in 100 psychiatric patients, the Clinical subscale was found to be specifically

predictive of inpatient violence in the short term, whereas the Historical subscale was not (McNiel,

Gregory, Lam, Binder, & Sullivan, 2003). Dynamic items that were not predictive in our sample were

‘Active symptoms of major mental illness’ (almost absent: 106 (84%) patients had a score of 0 on this

item), ‘Lack of personal support’ and ‘Stress’ (highly prevalent: 105 (83%) patients had a score of 2

on this item). Items that were most predictive (i.e., remained significant predictors in stepwise Cox

regression analyses) were ‘Young age at first violent incident’, ‘Unresponsive to treatment’, and

‘Exposure to destabilizers’.

Next, differences were explored between the three rater groups in HCR-20 ratings and predictive

accuracy. The group leaders compared to the other two groups gave significantly lower scores on the

Risk management items. Regarding the HCR-20 mean scores and final risk judgments, no differences

were found between researchers and treatment supervisors. In our previous study with 60 patients, we

found that treatment supervisors compared to researchers significantly more often judged patients as

low risk. In the present study, this difference was no longer significant. It should be noted that the

results of these first 60 risk assessments were presented to treatment staff. Possibly, the treatment

supervisors were influenced by the results and changed their way of rating the HCR-20. It is worth

noting that the consensus scores by the three raters were higher - although not significantly - than the

scores of the three individual rater groups. This did not seem to affect the final risk judgment,

however. The trend that group ratings lead to higher scores than individual ratings has been found

before. For instance, Logan and Watt (2001) found that group ratings of the SVR-20 in 32 sex

offenders were higher than individual ratings.

Regarding the predictive accuracy, not many differences between researchers and clinicians were

found. When we started this research project, we expected differences between the researchers and the

clinicians (group leaders and treatment supervisors), because of their different roles in the forensic

setting. There was no difference, however, between researchers and treatment supervisors in accuracy

of their risk assessments. In the previous study, feelings of clinicians towards their patients were found

to be associated with the risk assessments, for instance, the feeling of being controlled and

manipulated by the patient was related to higher HCR-20 scores (de Vogel & de Ruiter, 2004, see

Chapter 2). The present findings suggest that treatment supervisors’ feelings towards their patients did

not interfere with the accuracy of their risk assessments. The finding that experienced clinicians were



as accurate in using the HCR-20 as researchers, who were much more used to using structured

instruments, is important because the HCR-20 is intended to be used by clinicians in their daily

practice. The group leaders compared to the other two groups performed worse in predicting violence

with the HCR-20. We offer three possible explanations. First, there was a large number of group

leaders (N = 59) who participated in this study. Many group leaders conducted only one (n = 22), two

(n = 13), or three (n = 11) risk assessments. Thus, group leaders compared to researchers and treatment

supervisors gained less experience in coding the HCR-20. Second, group leaders compared to

treatment supervisors and researchers were younger and less experienced in the forensic field. A third

explanation is related to their role in treatment and their proximity to patients. Possibly, the group

leaders’ feelings towards their patients did interfere with their ability to objectively assess the risk of

violence. For instance, several group leaders indicated that they found it difficult to be objective about

a patient when they had just experienced an emotional outburst of this patient (see also Chakhssi &

Hilterman, 2004).

Interestingly, the consensus risk assessments performed better than the risk assessments of the

individual rater groups. This is especially true for the consensus final risk judgment which was

significantly better than the judgment of the three rater groups individually. Thus, ratings based on

elaborate discussion with colleagues are superior to individual ratings. To our knowledge, no studies

have been published before that compare HCR-20 consensus ratings to individual ratings. However,

our finding is in line with previous studies that found higher predictive validity of clinical violence risk

assessments when there was agreement between clinicians (McNiel et al., 2000) or when clinicians’

ratings were aggregated as “group” ratings (Huss & Zeiss, 2004). In conclusion, the findings

demonstrate that the method of SPJ, i.e., systematically rating risk factors, integrating and weighing

information to arrive at a final risk judgment, and discussion with colleagues, is effective in

predicting future violence.

There are several limitations to this study. First, prospective predictive research is hampered by the

clinical goals of risk assessment, i.e., risk management and prevention (Dernevik et al., 2002; Hart,

1998a). Hart (1998a) stated that predictions of violence are not passive assessments, but decisions that

influence services delivered to individuals: “Clinicians are bound - morally, ethically, and legally - to

try to prove themselves wrong when they predict violence and take every reasonable action to prevent

violence” (p. 365). In our study, clinicians were able to use the results of the HCR-20 ratings, for

instance, for decisions concerning leave. Thus, it is very likely that risk management was influenced

by the results of the risk assessment, for instance, high-risk patients were not released from the

hospital, or were separated if the risk of inpatient violence was judged to be high. So, the AUC values

we obtained were already high, but might have been even higher if the results had not been used to

manage risk. Second, the sample was derived from only one Dutch forensic psychiatric hospital,

thereby limiting generalization. Nevertheless, we consider this group to be representative of Dutch

offenders with a tbs-order, because they are largely similar in demographic, psychiatric and criminal


characteristics to the total population of tbs-offenders (see van Emmerik & Brouwers, 2001). Third,

the mean follow up period of this study was rather limited; some patients had a very short follow up

period of only one or two months. On the other hand, we conducted survival analyses that take into

account time at risk. Fourth, we found a rather low base rate of violence. Although we conducted ROC

analyses that are insensitive to base rates, the low base rate might have had an effect on the Cox

regression analyses. A final limitation is that data regarding violent outcome were not always reliable.

Incidents of physical violence are not always reported in the information bulletins. For example, it is

possible that incidents of physical violence between patients are not observed by staff or told to staff.

This is the case for inpatients, but even more so for patients who are in the transmural treatment phase

or who can go outside the hospital without supervision. It should be noted, however, that most of these

limitations would have had a negative effect on the predictive accuracy of the HCR-20, thus, the

findings might have been even stronger without these limitations.
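The statement that ROC analyses are insensitive to base rates can be illustrated with a small sketch: when the non-violent group is made several times larger without changing its score distribution, the base rate of violence drops but the AUC stays the same. The function name and the scores below are hypothetical and are not data from this study.

```python
def auc(scores_violent, scores_nonviolent):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    wins = 0.0
    for v in scores_violent:
        for n in scores_nonviolent:
            if v > n:
                wins += 1.0
            elif v == n:
                wins += 0.5
    return wins / (len(scores_violent) * len(scores_nonviolent))


# Hypothetical scores, for illustration only (not study data).
violent = [31, 28, 34, 26]
nonviolent = [22, 27, 19, 25, 30, 24]

# Replicating the non-violent group lowers the base rate from 40% to about 12%.
diluted = nonviolent * 5

print(auc(violent, nonviolent), auc(violent, diluted))
```

Statistics such as correlations, and the number of events available to a Cox regression model, do depend on the base rate, which is why the low base rate remains a limitation for those analyses.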

Based on our findings and experiences we would like to conclude with some recommendations for the

use of the HCR-20 in forensic clinical practice. Although it is clearly stated in the HCR-20 manual

(Webster et al., 1997b) and still more recently pointed out by Webster and colleagues (2002), we want

to emphasize again that raters should be trained and experienced in performing risk assessments with

the HCR-20. In addition, they should keep their skills up to date by advanced training, keeping up with

the literature on violence risk assessment, and performing risk assessments on a regular basis. In this

study, there was a requirement for group leaders to conduct at least one risk assessment every 6

months or they had to repeat the training. Furthermore, we strongly recommend having more than one

rater code the HCR-20, preferably raters with different roles in treatment, for instance, an objective,

more distant person like a researcher or diagnostician, and an experienced clinician who knows the

patient well. Structured discussion about the risk factors in a case conference is very useful and can

improve the accuracy of the risk assessment. Moreover, it can help to design risk management

strategies. The identification of possible protective factors is important because the aim is to minimize

violence risk. Finally, it is important to repeat the violence risk assessment every time the context

changes, for instance, when the liberties of patients are expanded.


CHAPTER 7

General discussion

Main research findings

Overall, the studies in this thesis provide strong empirical support for the method of structured

professional judgment (SPJ) of (sexual) violence risk. The retrospective studies demonstrate that both

the HCR-20 and SVR-20 have good interrater reliability and predictive validity for respectively

violent and sexual recidivism in violent and sexual offenders admitted to a forensic psychiatric

hospital. Moreover, the SVR-20 is significantly more accurate in predicting sexual recidivism than an

actuarial instrument, the Static-99. The SVR-20 also predicts non-sexual violent recidivism in sexual

offenders. The HCR-20 is significantly more accurate in predicting violent recidivism than the

unstructured clinical judgment as reflected in the hospital staff’s advice to the court. The prospective

studies show good interrater reliability among three raters and significant validity for the HCR-20

scores and final risk judgment in predicting violent incidents during treatment. Furthermore, it was

found that treatment supervisors are as accurate as researchers in predicting violent incidents during

treatment with the HCR-20. The HCR-20 ratings as agreed upon by the researchers, treatment

supervisors and group leaders in the consensus yield the highest predictive accuracy for violent

incidents during treatment. Although the HCR-20 subscales and total scores demonstrate high

accuracy in predicting violence – both during and after treatment - in male forensic psychiatric

patients, we did not find significant accuracy for the HCR-20 scores in predicting violence in female

patients. However, the HCR-20 final risk judgment demonstrates good predictive validity in both male

and female patients, implying that violence risk factors are different for women compared to men, but

that the method of SPJ is suitable for offenders of both sexes.

In all predictive validity studies in this thesis, the final risk judgment was found to add significantly to

the HCR-20 or SVR-20 used in an actuarial way (i.e., summing the individual item scores). Similar

findings are reported by Douglas and colleagues (2003) for the HCR-20, by Dempster (1998) for the

SVR-20 and by Kropp and Hart (2000) for the SARA. Thus, risk assessment is not just the summing

of scores on risk factors; the surplus value of the SPJ method is in systematically collecting, reviewing,

combining, weighing, and integrating information, and discussion with colleagues - preferably from

different disciplines - to reach a final risk judgment on (sexual) violence risk.

In this chapter, the most important findings of the research presented in this thesis are summarized and several limitations and strengths are discussed. Furthermore, implications for clinical practice and suggestions for future research are provided.


Comparison of the main research findings to those of previous studies

Our findings regarding the interrater reliability and predictive validity of the HCR-20 concur with

results obtained in North-American studies (see Douglas et al., 2005) and Swedish studies (see

Dernevik, 2004). Also, the findings related to interrater reliability and predictive validity of the SVR-

20 are in line with those from a previous study into the SVR-20 (see Dempster, 1998).

This thesis contains the first prospective study into the psychometric properties of the Dutch HCR-20

and the first study into the psychometric properties of the Dutch SVR-20. To date, only one other

study into the Dutch HCR-20 was published to which we can compare our results. In a retrospective

study into the predictive validity of the Dutch HCR-20 in a group of 69 forensic psychiatric patients,

Philipse and colleagues (2002) also found that particularly the final risk judgment predicted relapse in

violent offending.

Next to these studies into the HCR-20 in The Netherlands, Dutch researchers have performed studies

into other Dutch structured risk assessment instruments. Recently, Philipse (2005) finished his

dissertation research into the interrater reliability and predictive validity of a checklist he developed

containing 48 dynamic risk factors identified by twelve clinicians from different forensic psychiatric

hospitals in The Netherlands. He found interrater reliability levels similar to those found for the HCR-

20 but poor predictive validity for violent recidivism. The total score on the clinically derived dynamic

risk factors and the overall clinical judgment of risk of recidivism (item 48 in the checklist) were not

significantly related to violent or sexual recidivism (AUC = .42 and .43, respectively). Philipse

concluded that clinical judgment of dynamic risk factors needs to be improved, for instance, by

multidisciplinary discussion and elaborate operationalisation of dynamic risk factors. The poor

predictive validity Philipse found for clinical judgment and his conclusion stated above are in line with

our findings that clinical judgment can be improved by using structured instruments (see Chapter 4)

and by multidisciplinary discussion (Chapter 6). Hilterman (2001, 2004) developed the Leave Risk

Assessment, an actuarial risk assessment scale containing both static and dynamic risk factors

specifically designed for the assessment of recidivism during leave by Dutch offenders with the tbs-

order. He found good interrater reliability for this instrument and high correlation coefficients with the

HCR-20 subscales. The total score on the Leave Risk Assessment scale seemed to perform better than

the total score of the HCR-20 for the specific context of recidivism during leave (AUC = .81 and .70,

respectively). Finally, a group of researchers performed a large-scale retrospective study into the

interrater reliability and predictive validity of the HKT-30, a Dutch structured risk assessment

instrument for general recidivism that was developed on the basis of several national and international

instruments like the HCR-20 and SVR-20 (Canton, van der Veer, van Panhuis, Verheul, & van den

Brink, 2004a, 2004b). The HKT-30 was coded by three raters on the basis of ‘pro Justitia’ reports.1 In

1 ‘Pro justitia’ reports are psychiatric / psychological reports to the court containing information on the defendant’s responsibility for the alleged offense and the defendant’s future violence risk (see de Ruiter and Hildebrand (2003) and the Introduction for more information on the Dutch tbs-order).

a group of 74 defendants, they found good interrater reliability for the subscale scores and total score

of the HKT-30. In a group of 123 defendants, they found significant predictive accuracy for serious

(violent) recidivism (AUC = .72) and moderate predictive accuracy for minor (non-violent) recidivism

(AUC = .62). It should be noted that the researchers in this study used the first version of the HKT-30.

In 2002, the HKT-30 was thoroughly revised and a second version of the HKT-30 became available.

When comparing the results of the first version of the HKT-30 to the results of the HCR-20 in the

present thesis (Chapter 4 and 6), we draw the preliminary conclusion that the Dutch HCR-20 is more

accurate in predicting violent recidivism in offenders with a violent history than the first version of the

HKT-30. However, the results are not easy to compare because of important differences in samples

and study designs. Currently, there is a large-scale national retrospective study underway comparing

the predictive validity of the second version of the HKT-30, the HCR-20 and the SVR-20. We will

have to await the results (and the results of prospective studies comparing the instruments) before we

can definitely conclude which instrument is best suited for the prediction of violence in Dutch forensic


Strengths of this thesis

The main strength of this thesis is that the studies were conducted in a forensic clinical setting with

multiple raters, both independent researchers and treating clinicians. Since the HCR-20 was developed

for use in daily clinical practice, it is surprising that in most studies to date, the checklist was coded by

researchers who did not personally know the persons to be judged (see also Dernevik, 2004). In the

present thesis, the HCR-20 was used in the way it was intended to be used, i.e., by clinicians in their daily

work. Another strength of this thesis is that it contains both prospective and retrospective studies.

Finally, this thesis contains the first prospective study into the Dutch HCR-20 and the first study into

the Dutch SVR-20.

Limitations of this thesis

Several limitations of (the research presented in) this thesis should be mentioned. First, in this thesis

we did not prospectively examine predictive validity of the HCR-20 for violent recidivism. We

retrospectively examined the predictive validity of the HCR-20 for violent recidivism (Chapter 4) and

prospectively for inpatient violence or violence during transmural treatment (Chapter 6). The reason

for this was that the group of patients with HCR-20 ratings in the prospective study who were

discharged was too small (N = 19) and their mean follow-up period in the community was too brief

(15.4 months, SD = 8.8, range 4-34). The predictive validity of the HCR-20 for violent recidivism

after termination of the tbs-order needs to be established in future prospective research. The same is

true for the SVR-20. In this thesis, we only provided retrospective results on the predictive validity of

the SVR-20 (Chapter 3). The group of sexual offenders with SVR-20 ratings who were discharged was

also too small (N = 4). More prospective research is needed to confirm the predictive validity of the

SVR-20 for future sexual recidivism. With respect to the predictive validity for incidents of sexual

violence during treatment, we considered the group of sexual offenders whose sexual violence risk had

been assessed prospectively too small (N = 37). Moreover, incidents of sexual violence are very rarely

observed in the hospital, possibly because these types of incidents are in fact rare, but it is also possible

that these types of incidents are not reported to staff. Also, child molesters do not have the opportunity

to offend, simply because there are no children present in the hospital.

A second limitation relates to the samples in the studies. All studies were conducted in one forensic

psychiatric hospital in The Netherlands, the Dr. Henri van der Hoeven Kliniek. Thus, caution is

warranted regarding the generalizability of the findings in this thesis. However, we believe that the

patients in this hospital are representative of Dutch forensic psychiatric patients, because they do not

differ in demographic, psychiatric and criminal history factors from patients in other Dutch forensic

psychiatric hospitals (see van Emmerik & Brouwers, 2001). We do not know, however, if the

researchers and clinicians that participated in this research are comparable to raters in other

institutions. Furthermore, the sample sizes in the studies were relatively small which might have

affected the results of some of the statistical analyses. This is especially true for the subsamples with

different types of discharge in Chapter 4 and the female versus male sample in Chapter 5. Larger

samples would have resulted in increased power. However, given the paucity of research on female

forensic psychiatric patients and patients with different types of discharge, we believe even samples of

limited size such as ours can make a contribution to the knowledge base. Another limitation with

respect to the samples in this thesis is that in Chapter 2 and 4, the samples comprised both men and

women. Later, in the study described in Chapter 5, we found several important differences between

men and women in predictive validity of the HCR-20. Based on these results, we recommended that

researchers report the results on predictive validity of risk assessment instruments separately for

men and women, because reporting the results jointly could lead to distorted conclusions.

A final limitation concerns the reliability of (sexual) violent outcome data in both the prospective and

retrospective studies. The rates of (sexual) violent recidivism or violence during treatment found in the

studies can be taken as an underestimation of the actual base rates, because not all offenders are

arrested or convicted and, in the hospital, not all violent incidents are observed by or reported to staff.

Implications for clinical practice

Based on the results of and the experiences during the studies reported in this thesis, we would like to

provide recommendations with respect to three issues: 1) the use of SPJ checklists; 2) the link between

risk assessment and risk management; and 3) risk communication. Furthermore, we want to make

some cautionary notes with respect to the use of the HCR-20 and SVR-20 in forensic clinical practice.

1. The use of SPJ checklists

First, the most important implication for clinical practice to be derived from this thesis is that the use

of SPJ checklists is highly recommended to accurately assess violence risk. The use of SPJ checklists

is particularly recommended as a way for clinicians to structure their thinking. This means that

violence risk assessment not only concerns the scores on the risk factors; the true relevance of the SPJ

method is in systematically collecting, reviewing, weighing, integrating, and combining information

needed to code the items. Recently, the importance of the use of structured risk assessment instruments

was recognized by Dutch policy makers. As of July 2005, the Ministry of Justice mandates all forensic

psychiatric institutions admitting patients with the tbs-order in The Netherlands to perform a structured

risk assessment at least once a year (e.g., before an unsupervised leave or start of transmural treatment)

and to base their judgment on these structured methods. The instruments to be used are the HCR-20

(including the PCL-R) or the HKT-30 (including the PCL-R) and, in the case of a sexual offender, the

SVR-20 (including the PCL-R). In addition, the Ministry of Justice strongly advises multidisciplinary

consensus to arrive at the risk judgment. We believe this is a valuable new policy for Dutch forensic

practice; however, we want to make two comments. First, for the time being, we advise forensic mental health

professionals to use the HCR-20 (including the PCL-R) instead of the HKT-30, because the

psychometric properties of the HCR-20 have been demonstrated to be excellent in numerous

international studies and, for Dutch forensic psychiatry, in the present thesis and by Philipse et al.

(2002). By contrast, the predictive validity of the second version of the HKT-30 has yet to be demonstrated.

Second, we believe that not only mental health professionals working in forensic psychiatric

institutions with tbs-offenders should perform structured risk assessments, but also mental health

professionals working in, for instance, forensic youth institutions and outpatient forensic settings. We

also believe that mental health professionals who conduct ‘pro Justitia’ reports (see Introduction and p.

141) should use the same structured risk assessment instruments. The task of these mental health

professionals is not only to report on the offender’s responsibility for the alleged crime, but also on the

offender’s violence risk since the court can only decide to impose the tbs-order in case of high risk to

society. In conclusion, we want to underline how important it is that all mental health professionals in

forensic practice apply the same risk assessment methods because it results in greater transparency and

improved communication, and because in this way forensic psychiatric patients can be adequately

monitored throughout their stay in the criminal justice system.

The recommendation to use SPJ checklists should be accompanied by several guidelines on how to use

the checklists in daily practice. We want to stress the importance of the guidelines that are provided in

the manuals of the checklists. Although it has been stated before (see Webster et al., 2002; Chapter 1,

2 and 6), the importance of these guidelines cannot be emphasized enough. For example, the rater

should meet several qualifications, including appropriate training and expertise in interviewing and in the

administration and interpretation of standardized tests. Furthermore, the rater is strongly advised to use

the manual at every risk assessment, to consult colleagues frequently, and to keep up with new research

findings and developments (Webster et al., 1997b). Next to the guidelines that were already provided

by others, we want to make two additional recommendations. For clinical practice, we recommend

coding by multiple raters from different disciplines and arriving at consensus judgment during case

conferences. This thesis clearly demonstrated significantly higher predictive accuracy for the

consensus ratings compared to the individual ratings. During case conferences, possible effects of rater

bias can be ruled out, raters can sharpen their understanding of the items, correct each other, share

information that is not available to everyone, discuss the meaning of the items, and discuss possible

additional risk factors or protective factors and risk management strategies. Furthermore, following the

plea of Webster and colleagues (2002) for proper use of the HCR-20, we recommend developing

quality standards to guarantee high quality risk assessments and to avoid misuse of the HCR-20 by

untrained or inexperienced raters. Mental health professionals who perform risk assessments with SPJ

checklists and report their results to, for instance, the court should be registered and certified. To

obtain a certificate, the rater should have completed official training (i.e., approved by the authors of

the checklist) in coding the checklists. Such training should include a presentation of up-to-date

empirical knowledge as well as practice with real-life cases. In addition, we advise that relatively

inexperienced raters (such as the group leaders in our setting) should be supervised by experienced

raters for at least ten risk assessments. In general, forensic mental health professionals should always

be cognizant of their professional responsibilities and work according to prevailing ethical standards

and the most recent empirical knowledge when they judge violence risk and report the results to the

person assessed, colleagues or the court.

2. The link between risk assessment and risk management

A second implication for clinical practice is that risk assessment should lead to risk management; in

other words, mental health professionals should act upon the results of the risk assessment. Any

adequate violence risk assessment should provide guidelines for treatment aimed at reducing violence

risk. Raters may, for instance, use the HCR-20 Companion guide (Douglas et al., 2001) to translate the

risk factors into risk management strategies. Also, violence risk assessment used in clinical practice

should have consequences for decisions regarding needed level of security, implying that patients who

are judged in the consensus as high-risk are not to be released. Table 1 shows that in the studies in this

thesis, half or more of the patients who were judged as high-risk recidivated with violent behavior

(50%-69%) whereas none of the low-risk cases did. We believe that the rates of reoccurring violent

behavior by the high-risk cases are unacceptably high considering the main goal of the tbs-order to

protect society from high-risk offenders. Realistic and well-thought-out risk management programs

could lower the risk to a moderate or low level; only then might these former high-risk patients be

released under restricted and controlled conditions. A more complicated issue arises when mental

health professionals have to decide upon moderate-risk cases. In the studies in this thesis, a substantial

proportion of the patients were judged as moderate-risk and, as can be seen in Table 1, only a small

minority of them (6%-8%) committed a new (sexual) violent offense. In these cases, we recommend

intensive monitoring and repeated risk assessments at regular intervals. Especially the Clinical and

Risk management items of the HCR-20 can aid in monitoring treatment progress and the needed level

of security.

Table 1. Final risk judgments, mean HCR-20 / SVR-20 scores and violent outcome in three studies (Chapters 2, 4, and 6)

                      n     Mean HCR-20 score    Range of HCR-20 scores    Violent outcome
Study in Chapter 4
  Low                 14    15.7                 12-21                     0 (0%)
  Moderate            47    23.5                 15-31                     7 (6%)
  High                58    31.0                 18-37                     36 (62%)
Study in Chapter 6
  Low                 36    21.8                 11-33                     0 (0%)
  Moderate            61    26.9                 14-36                     4 (7%)
  High                30    32.0                 22-37                     15 (50%)

                      n     Mean SVR-20 score    Range of SVR-20 scores    Sexual recidivism
Study in Chapter 2
  Low                 20    15.9                 10-25                     0 (0%)
  Moderate            36    19.8                 11-27                     4 (8%)
  High                65    28.3                 18-40                     45 (69%)

3. Risk communication

Although this thesis did not specifically examine violence risk communications, we want to make

some recommendations for mental health professionals in clinical practice concerning this topic.

Mental health professionals are advised not to use numbers (e.g., the HCR-20 total score) in their risk

communications to the court or to colleagues, but to be descriptive, for example, describe the most

important (combinations of) risk factors and provide risk management strategies. Table 1 shows that

although the mean total score increases per risk judgment category, the total score is not consistently

related to the final risk judgment level. For instance, in the study described in Chapter 6, a patient with

an HCR-20 total score of 33 was judged to pose a low risk and did not show violent behavior, whereas

in the research in Chapter 4, a patient obtained an HCR-20 total score of 18, but was judged to be a high-

risk case and did recidivate violently. Important issues to be reported in the risk communication are

not only the likelihood of (sexual) violence, but also the nature, severity, frequency and imminence of

risk (see also Appendixes I and II). The formulation of the risk assessment should be clear and

unambiguous to avoid misunderstanding.
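
As an illustration of this point only, the overlap in total scores between the risk judgment categories can be made explicit with a short Python sketch. The score ranges are copied from the HCR-20 rows of Table 1; the function names are hypothetical helpers introduced here, and the sketch only checks whether a score falls within the observed range of a category, not whether a patient with that exact score was observed in every category.

```python
# Observed HCR-20 total score ranges per final risk judgment (from Table 1).
RANGES = {
    "Chapter 4": {"Low": (12, 21), "Moderate": (15, 31), "High": (18, 37)},
    "Chapter 6": {"Low": (11, 33), "Moderate": (14, 36), "High": (22, 37)},
}


def judgments_compatible_with(score, ranges):
    """Return the risk judgments whose observed score range contains `score`."""
    return [label for label, (lo, hi) in ranges.items() if lo <= score <= hi]


def shared_score_band(ranges):
    """Return the (lo, hi) band lying inside every category's range, or None."""
    lo = max(low for low, _ in ranges.values())
    hi = min(high for _, high in ranges.values())
    return (lo, hi) if lo <= hi else None


if __name__ == "__main__":
    for study, ranges in RANGES.items():
        print(study, "shared band:", shared_score_band(ranges))
    # The two cases mentioned in the text: 33 (judged low) and 18 (judged high).
    print(judgments_compatible_with(33, RANGES["Chapter 6"]))
    print(judgments_compatible_with(18, RANGES["Chapter 4"]))
```

Under these assumptions, a total score of 18 to 21 in the Chapter 4 study and of 22 to 33 in the Chapter 6 study falls within the observed range of all three judgment categories, which is one reason why a total score on its own says little about the final risk judgment.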

Cautionary notes regarding the use of the HCR-20 and SVR-20 in clinical practice

Although the present thesis demonstrated convincing evidence for the validity of the HCR-20 and

SVR-20, two cautionary notes should be made here for use of these instruments in clinical practice.

The first note is that caution is warranted regarding the use of the HCR-20 in female forensic

psychiatric patients. The HCR-20 subscales and total scores were not predictive for violence in female

patients. Thus, several of the risk factors in the HCR-20 do not seem to be relevant for the prediction

of violence in women. However, the final risk judgment was significantly predictive, implying that the

SPJ method of risk assessment is useful for female patients. Raters working with female forensic

psychiatric patients should be very careful when interpreting the HCR-20 ratings, keep up with

literature on female violence and always consult colleagues who also rated the HCR-20 when they

judge the risk of violence in women. The same caution is warranted for the use of the SVR-20 in

female forensic psychiatric patients. To our knowledge, the SVR-20, which was developed from

research in male only populations, has never been validated in female samples. In the period of this

dissertation research in our hospital (2001-2004), we had only one female sexual offender for whom

the SVR-20 was coded.

The second note is that caution is warranted regarding the use of the SVR-20 for the prediction of

future sexual violence. Although the retrospective study in this thesis demonstrates good predictive

validity for the SVR-20 for sexual recidivism, we do not have prospective data to confirm this

predictive accuracy. Large-scale, longitudinal studies are needed to establish the predictive validity in

different types of sexual offenders. Furthermore, we want to mention some limitations of the SVR-20

we encountered in our research, especially in comparison with the HCR-20.2 An important limitation

of the SVR-20, compared to the HCR-20, is that the checklist comprises few dynamic risk factors,

which limits the usefulness of this checklist in clinical practice, for instance, with respect to risk

management. Furthermore, the empirical basis of the HCR-20 is much stronger than that of the SVR-

20. Notable in the manual of the SVR-20 is the lack of empirical evidence for the inclusion of items

such as ‘Suicidal / homicidal ideation’, and ‘Lacks realistic plans’. In these items, the inclusion is

justified with the statement “This factor is likely a risk marker” (e.g., Boer et al., 1997, p. 52). Future

research should provide evidence for the inclusion of these items. A risk factor that was often

mentioned by raters to be lacking in the SVR-20 is ‘Problems with sexual self-regulation’. Hanson and

2 The HCR-20 (Webster et al., 1997b) is an improved, revised version of the first HCR-20 (Webster et al., 1995), while the SVR-20 is still the first version. Furthermore, it should be noted that two of the authors of the SVR-20 recognized several limitations of the SVR-20 and developed in collaboration with several colleagues the Risk for Sexual Violence Protocol (RSVP; Hart, Kropp, Laws, Klaver, Logan, & Watt, 2003), which can be seen as an “evolved form of the SVR-20” (Laws, 2004). The RSVP incorporates some new features, such as expanded consideration of risk factors related to risk management that will probably increase the utility of the RSVP in risk management and forensic decision-making.

Harris (2000) stated that problems with sexual self-regulation (strong sexual urges that sexual

offenders feel entitled to act out) are among the most distinctive risk factors for sexual offending (see

also Hanson et al., 1994).

Suggestions for future research

The results of the studies in this thesis call for continued research in our setting. Most importantly,

prospective research into the predictive validity of the HCR-20 and SVR-20 for violent recidivism and

sexual recidivism is needed. Furthermore, it would be interesting to systematically observe what

happens during the case conferences in order to examine why the consensus ratings yield higher

predictive accuracy than individual ratings. What is the contribution of each individual rater and how

do the raters reach consensus? More generally, research is needed into the interrater reliability and

predictive validity of SPJ checklists in different samples and settings in The Netherlands, for instance,

penitentiary and ambulatory settings. Next to these suggestions for future research in The Netherlands,

we offer four general topics of research: 1) violence risk management; 2) violence risk

communication; 3) protective factors for violence risk; and 4) risk factors in specific groups of

forensic psychiatric patients.

1. Violence risk management

First, one of the most important research areas in violence risk assessment research is violence risk

management. How do clinicians act upon the results of the violence risk assessment? Can adequate

violence risk assessment actually provide guidelines for effective, risk-reducing treatment? Not many

studies have addressed the issue of linking risk assessment and risk management. A major difficulty in this

type of research is the lack of adequate control groups. Douglas and Kropp (2002) have provided

directions for research with respect to violence risk management. For example, they advise

prospective, repeated measures studies using survival analyses.

2. Violence risk communication

The second suggestion for future research regards violence risk communication and the relationship

between violence risk communication and decision-making. How do decision-makers (both mental

health professionals and administrators / judges) evaluate violence risk communications? Do decision-

makers understand violence risk communications and how do they act upon violence risk

communications? Large-scale studies are needed that examine how decision-makers judge violence

risk communications and if they subsequently base their actual decisions upon the violence risk

communications. Vignette studies can be useful, but more importantly, studies that address real-life

risk communications are desirable to better understand and ultimately improve risk communications.

In the Netherlands, one can think of studies examining how the court judges the quality and

comprehensibility of the forensic psychiatric hospital’s advice for or against termination of the tbs-

order. It might be interesting to examine whether the court follows this advice and, if not, what the

arguments are for not following it. This is also important with respect to prevention of

recidivism because research has demonstrated that patients with a termination of the tbs-order against

the hospital’s advice have significantly higher rates of violent recidivism compared to patients with a

termination of the tbs-order in line with the hospital’s advice (see Chapter 4).

3. Protective factors

A third topic that deserves much more attention in empirical research is protective factors (see also

Rogers, 2000). Studies that address protective factors, i.e., factors that diminish risk of (sexual) violent

recidivism, particularly in adult samples, are scarce. Research is needed into theoretical models of

protective factors, the identification of protective factors, effects of these protective factors in reducing

violence risk in different samples and the interaction between risk factors and protective factors.

Based on the clinical experiences of the raters in our setting (Chapter 2 and 6), the available empirical

knowledge on factors that reduce recidivism and knowledge on situational risk factors for recidivism,

we developed a preliminary checklist for protective factors, the Structured Assessment of Protective

Factors (SAPROF; de Vogel, de Ruiter, & Bouman, in preparation). This checklist was developed to be

used in conjunction with the HCR-20 or SVR-20. The goals of the SAPROF are to provide a more

balanced risk assessment, with both risk and protective factors and to provide guidelines for risk

management. Future research into the interrater reliability and predictive validity of this checklist and

its relationship with the HCR-20 or SVR-20 should demonstrate if this checklist holds promise for

forensic practice.

4. Risk factors in specific groups of forensic psychiatric patients

A fourth suggested area of future research is on specific (combinations of) violence risk factors in

subgroups of forensic psychiatric patients, for instance, women and subtypes of sexual offenders. For

example, risk assessment instruments for sexual violence do not differentiate between different types

of sexual offenders, most importantly rapists versus child molesters. There are suggestions, however,

that the two groups differ with respect to risk factors for sexual recidivism. For example, Hanson and

Morton (2004) stated that sexual deviation is more important in child molesters than in rapists.

Possibly, the knowledge on risk factors in specific subgroups can lead to the development of more

refined versions of SPJ checklists.

Some concluding remarks regarding future research

More generally, future research into risk factors could lead to new, improved versions of the SPJ

checklists. The checklists are to be seen as ‘work in progress’ (Webster et al., 1997b)3; adding new

3 Currently, the authors are working on revisions to the HCR-20 (Douglas, personal communication, January 28, 2005).

empirically derived risk factors and / or omitting non-relevant factors is encouraged. Our study has, for

example, directed attention to a number of other risk factors, some of which might be valuable

additions to the HCR-20 (e.g., financial problems, lack of prospects for the future) and SVR-20

(problems with sexual self-regulation, social isolation). Future research can provide evidence for the

relevance of these kinds of factors. In sum, our knowledge on violence risk assessment has increased

tremendously during the past few decades; however, future research continues to be needed to show the

ultimate effectiveness of violence risk assessment, i.e., more effective risk management and reduced

rates of (sexual) violent recidivism and thus, a safer society.

Summary

Background and goal of this thesis

The assessment of risk of future (sexual) violent behavior is one of the most important tasks of mental

health professionals in forensic psychiatry. A carefully conducted risk assessment before a

probationary leave or termination of (mandatory) treatment can help appraise and manage the risk of

recidivism in an adequate way and thereby prevent serious (sexual) violent offenses. Until recently, the

best known and most widely used method in practice, at least in the Netherlands, was the unstructured

clinical judgment approach that is exclusively based on the professional expertise of the mental health

professional. However, research has revealed some important limitations of this unstructured clinical

judgment, such as poor reliability and validity (e.g., Monahan, 1981). Therefore, several authors have

recommended the use of more structured risk assessment procedures in order to optimize the accuracy

of violence risk assessments (Borum, 1996).

Over the past two decades, research into risk factors for (sexual) violence, the development of structured

risk assessment instruments and research into the psychometric properties of these instruments have

expanded enormously. An important distinction among structured risk assessment instruments can be

made between the actuarial and the structured professional judgment (SPJ) approach. Actuarial

instruments are developed on the basis of risk factors that are empirically related to (sexual) violent

behavior. These instruments are relatively simple to code (according to fixed rules and not necessarily

by a forensic expert) and contain predominantly static, non-changeable factors. The scores on the

factors are added up according to a fixed algorithm to reach a conclusion on the risk of recidivism. In

the SPJ model, the risk assessment is performed by a forensic clinician by means of a standardized

checklist containing empirically derived risk factors for (sexual) violence, historical as well as

dynamic factors. Essential in the SPJ method is that the forensic clinician not only rates and sums the

items, but also uses his / her expertise and knowledge to interpret, integrate, combine and weigh the

risk factors. The final risk is judged in terms of low, moderate or high. By coding the checklist, the

forensic clinician gains insight into the most relevant risk factors of the patient and can use this

information for formulating risk management strategies. The Historical, Clinical, Risk management-20

(HCR-20; Webster et al., 1997b) is a well-known and extensively studied SPJ checklist for the

assessment of future violence risk. The Sexual Violence Risk-20 (SVR-20; Boer et al., 1997) is an SPJ

checklist for sexual violence.

The main goal of this thesis is to examine if the Dutch versions of the HCR-20 and SVR-20 are

suitable for the prediction of future (sexual) violence in Dutch forensic practice.

General conclusion of this thesis

The studies in this thesis provide strong support for the method of SPJ of violence risk in Dutch

forensic clinical practice. The method of SPJ (reviewing, integrating, weighing and discussing risk

factors and then arriving at the final risk judgment) is effective. The HCR-20 demonstrates good

interrater reliability and predictive validity for violent recidivism and incidents of violence during

treatment and predicted significantly better than the unstructured clinical judgment. The SVR-20 has

good interrater reliability and predictive validity for sexual recidivism and predicted significantly

better than the actuarial judgment.

Summary of the chapters

In Chapter 1, the state of the art with respect to violence risk assessment is discussed. The different

approaches to violence risk assessment – unstructured clinical judgment, actuarial risk assessment and

structured professional judgment – are reviewed. Furthermore, specific risk factors and the

implications for violence risk assessment are set out for four different forensic psychiatric patient

groups: patients with a major mental disorder, patients with a personality disorder, patients who

committed sexual offenses, and female forensic psychiatric patients. Finally, two topics are discussed

that have not yet received much empirical attention: violence risk communication and violence risk

management.

In Chapters 2 through 6, the empirical studies of this thesis are discussed. The aim of the prospective

study described in Chapter 2 was to establish the interrater reliability of the HCR-20 and to gain

insight into differences between researchers and clinicians in coding the HCR-20. The HCR-20 was

coded by two independent researchers and two independent clinicians (treatment supervisor and group

leader) for 60 patients admitted to the Dr. Henri van der Hoeven Kliniek. Overall, the interrater

reliability of the HCR-20 was good. The group leaders gave significantly lower HCR-20 scores than

the researchers. There were no significant differences between the mean HCR-20 scores of treatment

supervisors and researchers, but there was a significant difference in the interpretation of the scores:

treatment supervisors had more ‘low risk’ judgments than researchers. Furthermore, it was found that

feelings of clinicians towards their patients were associated with their risk judgment. Feelings of being

controlled and manipulated by the patient were related to higher HCR-20 scores, whereas positive

feelings (helpful, happy, relaxed) were related to lower risk judgments.

In Chapter 3, a retrospective study is presented into the interrater reliability and predictive validity of

two risk assessment instruments for sexual violence. The SVR-20 and the Static-99 (an actuarial risk

assessment instrument) were coded from file information of 122 sexual offenders who were admitted

to the Dr. Henri van der Hoeven Kliniek between 1974 and 1996. The interrater reliability of the SVR-20

was good, and that of the Static-99 excellent. After the files had been coded, recidivism data (reconvictions)

were retrieved from the Ministry of Justice and related to the risk assessments. The predictive validity

of the SVR-20 for sexual recidivism after discharge was good, and that of the Static-99 moderate. The SVR-20

final risk judgment was a significantly better predictor of sexual recidivism than the Static-99 risk

category. Furthermore, the SVR-20 final risk judgment added significant incremental validity to the

SVR-20 subscale scores.

The retrospective study in Chapter 4 examined the predictive validity of the HCR-20 and the

Psychopathy Checklist-Revised (PCL-R; Hare, 1991, 2003) for violent recidivism after discharge.

Both instruments were coded on the basis of file information of 120 patients discharged from the Dr.

Henri van der Hoeven Kliniek between 1993 and 1999. The patients were divided into four groups

according to their type of discharge:

1) discharge by the court in line with the hospital staff’s advice and after a transmural phase;

2) discharge by the court in line with the hospital staff’s advice, but without a preceding transmural

phase;

3) discharge by the court against the hospital staff’s advice;

4) readmission to another institution.

These types of discharge reflect different unstructured clinical judgments. Discharge in line with the

hospital staff’s advice after a transmural phase reflects the lowest judgment of risk; readmission to

another secure institution is considered to represent the highest level of risk. After the files had been

coded, recidivism data (reconvictions) were retrieved from the Ministry of Justice and related to the

risk assessments. The HCR-20 and PCL-R total scores demonstrated good predictive validity for

violent recidivism. The HCR-20 was a significantly better predictor of violent recidivism than the

unstructured clinical judgment. In addition, the HCR-20 total score predicted significantly better than

the PCL-R total score, although the difference in AUC values was no longer significant when the item

‘Psychopathy’ was removed from the HCR-20 total score.
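
The AUC values mentioned above can be read as the probability that a randomly chosen recidivist received a higher score than a randomly chosen non-recidivist. The following is a minimal sketch of that calculation, not the analysis used in the thesis. The function name `auc` and the example scores are hypothetical and were invented purely for illustration; they are not data from the study.

```python
def auc(scores_recidivists, scores_nonrecidivists):
    """Area under the ROC curve, computed as the share of
    (recidivist, non-recidivist) pairs in which the recidivist
    has the higher score. Tied pairs count as half."""
    wins = 0.0
    for r in scores_recidivists:
        for n in scores_nonrecidivists:
            if r > n:
                wins += 1.0
            elif r == n:
                wins += 0.5
    return wins / (len(scores_recidivists) * len(scores_nonrecidivists))


# Hypothetical total scores (illustration only, not thesis data).
recidivists = [30, 27, 25, 22]
nonrecidivists = [24, 20, 18, 15, 12]

print(auc(recidivists, nonrecidivists))
```

A value of 0.5 corresponds to chance-level prediction and 1.0 to perfect separation of the two groups. Testing whether two AUC values from the same sample differ significantly requires a dedicated procedure for correlated ROC curves, which this sketch does not cover.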

The study in Chapter 5 examined the interrater reliability and predictive validity of the HCR-20 in a

sample of 42 female patients admitted to the Dr. Henri van der Hoeven Kliniek. The findings are

compared to a matched sample of 42 male forensic psychiatric patients. The interrater reliability of the

HCR-20 was good for both female and male patients. The mean H-, C-, and R-subscale scores and total

score were comparable; however, there were significant differences between female and male patients

in mean scores on a number of HCR-20 items. Female patients compared to male patients received

significantly higher scores on the items ‘Relationship instability’ and ‘Impulsivity’ and significantly

lower scores on the items ‘Young age at first violent incident’, ‘Psychopathy’, and ‘Negative

attitudes’. For male patients, the HCR-20 total score demonstrated good to excellent predictive validity

for violent outcome (violent recidivism and inpatient violence); however, predictive accuracy for

female patients was much lower. In females, only the HCR-20 final risk judgment, but not the HCR-20

total score, demonstrated significant predictive validity for violent outcome.

Chapter 6 describes the continuation of the study in Chapter 2. The aim of this prospective study was

to establish the predictive validity of the HCR-20 for incidents of violence during treatment. In this

study, the HCR-20 was coded independently by three rater groups (researchers, treatment supervisors

and group leaders) for 127 male mentally disordered offenders admitted to the Dr. Henri van der

Hoeven Kliniek. During case conferences, the three raters discussed their HCR-20 item ratings and

reached consensus on their ratings and final risk judgment. HCR-20 ratings were related to incidents of

physical violence during treatment. Overall, the predictive validity of the HCR-20 as coded by

consensus was good. The final risk judgment added significant incremental validity to the HCR-20

subscale scores. We found no differences between researchers and treatment supervisors in predictive

accuracy. Group leaders performed worse than the other two rater groups. A possible

explanation is that there were many group leaders (N = 59) who participated in this research and that

most of them had little experience in coding the HCR-20. The consensus between researchers,

treatment supervisors and group leaders yielded the highest predictive accuracy. This is especially true

for the consensus final risk judgment, which was significantly better than the judgments of the three

rater groups individually. Items that were most predictive (i.e., remained significant predictors in

stepwise Cox regression analyses) were ‘Young age at first violent incident’, ‘Unresponsive to

treatment’, and ‘Exposure to destabilizers’.

Chapter 7 summarizes the most important findings of the studies presented in this thesis.

Furthermore, some strengths and limitations of the studies are discussed, and implications for clinical

practice and training, as well as suggestions for future research, are provided.

Samenvatting (Dutch summary, in English translation)

Background and aim of this thesis

Assessing the risk of (sexually) violent behavior is one of the most important tasks of behavioral
experts in forensic psychiatry. An accurate risk assessment prior to, for example, a (probationary)
leave, or when advising on the extension of a tbs order, is essential and can help prevent violent
behavior. Not so long ago, most Dutch forensic psychiatric institutions assessed the risk of violence
by means of so-called unstructured clinical judgment, that is, on the basis of the knowledge and
experience of clinicians. Research has shown that this unstructured clinical assessment has a number
of drawbacks, such as low reliability and validity (e.g., Monahan, 1981). More standardized,
systematic risk assessment is therefore strongly recommended (Borum, 1996).

Over the past twenty years, research (mainly North American) into risk factors for future violent
behavior has increased strongly. On this basis, various risk assessment instruments have been
developed. Two types of risk assessment instruments can be distinguished: actuarial instruments and
instruments based on structured professional judgment (SPJ). Actuarial instruments have been
developed on the basis of factors that empirical research has found to be associated with violent
behavior. These instruments are relatively easy to score, not necessarily by an expert, according to
a fixed coding system, and they contain mainly static factors that cannot be changed (for the
better). The scores are added up according to a fixed algorithm to arrive at a risk assessment. In
structured professional judgment, the risk assessment is carried out by an expert with the aid of a
checklist. This checklist contains both static and dynamic risk factors that empirical research has
shown to be associated with violence. The essence of this method is that the expert does not merely
score the factors and add them up to reach a conclusion about the risk, but also uses his or her
knowledge and experience to interpret, integrate, combine, and weigh the factors. The expert rates
the final judgment of the risk of violence as low, moderate, or high. Scoring the checklist gives
the expert more insight into the patient's risk factors, on the basis of which treatment strategies
aimed at reducing the risk of recidivism can be drawn up. The Historical, Clinical, Risk
management-20 (HCR-20; Webster et al., 1997b) is a checklist for predicting violent behavior and is
one of the best-known and most researched instruments based on structured professional judgment. The
Sexual Violence Risk-20 (SVR-20; Boer et al., 1997) is a comparable checklist for assessing sexually
violent behavior.

The general aim of this thesis is to examine whether the Dutch versions of the HCR-20 and SVR-20 are
suitable for predicting (sexually) violent behavior in Dutch forensic clinical practice.

General conclusion of this thesis

This thesis found that the method of structured professional judgment is of great value for
predicting the risk of future (sexual) violence in Dutch forensic clinical practice. The method of
structured professional judgment – critically reviewing, integrating, weighing, and discussing risk
factors and arriving at a final judgment on that basis – is effective. The HCR-20 has good
reliability and predictive validity for violent incidents during treatment and for violent
recidivism after treatment, and it predicts the risk of violence significantly better than
unstructured clinical assessment. The SVR-20 has good reliability and predictive validity for sexual
recidivism after treatment and predicts significantly better than the purely actuarial assessment of
the risk of sexual violence.

Summary of the chapters

Chapter 1 discusses the state of the art in the risk assessment of future (sexually) violent
behavior. The different methods of risk assessment – unstructured clinical judgment, actuarial risk
assessment, and structured professional judgment – are discussed and compared. In addition, specific
risk factors and points of attention are identified for risk assessment in four subgroups of
forensic psychiatric patients: patients with a DSM-IV Axis I disorder, patients with a DSM-IV Axis
II disorder, patients who have committed a sexual offense, and female forensic psychiatric patients.
Finally, this chapter discusses two topics on which relatively little empirical research has been
carried out so far: risk communication and risk management.

Chapters 2 through 6 discuss the empirical studies of this dissertation. The aim of the prospective
study in Chapter 2 was to determine the interrater reliability of the HCR-20 and to examine whether
researchers and clinicians differ in scoring the HCR-20. The HCR-20 was scored by two independent
researchers and two independent clinicians (a group leader and a treatment supervisor) for 60
patients of the Dr. Henri van der Hoeven Kliniek. Overall, the interrater reliability of the HCR-20
scores and the final judgment was good. Group leaders gave significantly lower scores on the HCR-20
items than researchers. The mean HCR-20 subscale and total scores given by researchers and treatment
supervisors did not differ significantly. Treatment supervisors, however, interpreted the scores
differently from researchers: they gave the final judgment ‘low’ significantly more often. This
study also showed that the feelings of group leaders and treatment supervisors toward their patients
influenced the risk assessment. When clinicians felt dominated or manipulated by the patient, they
gave higher risk ratings on the HCR-20, whereas positive feelings (helpful, happy, relaxed) were
related to lower risk ratings.

Chapter 3 discusses a retrospective study of the interrater reliability and predictive validity of
two risk assessment instruments for sexual violence. The SVR-20 and the Static-99 (an actuarial risk
assessment instrument) were scored on the basis of file information for 122 former patients who had
committed a sexual offense and had been admitted to the Dr. Henri van der Hoeven Kliniek between
1974 and 1996. The interrater reliability of the SVR-20 was good, and that of the Static-99
excellent. After all files had been scored, the patients' recidivism data were requested from the
Ministry of Justice and related to the SVR-20 and Static-99 codings. The predictive validity of the
SVR-20 for sexual recidivism after treatment was good, and that of the Static-99 moderate. The
SVR-20 final judgment predicted sexual recidivism significantly better than the Static-99 risk
category. Furthermore, the SVR-20 final judgment had significant added value over the SVR-20 total
score.

The retrospective study described in Chapter 4 examined the predictive validity of the HCR-20 and
the Psychopathy Checklist-Revised (PCL-R; Hare, 1991, 2003) for violent recidivism. Both instruments
were scored on the basis of file information for 120 former patients who left the Dr. Henri van der
Hoeven Kliniek between 1993 and 1999. The 120 patients were divided into four groups of 30 according
to the way in which they had left the hospital:

1) termination of the tbs order by the court in line with the hospital's advice and after a
transmural phase;

2) termination of the tbs order by the court in line with the hospital's advice, without a preceding
transmural phase;

3) termination of the tbs order by the court against the hospital's advice;

4) referral for reselection to another hospital.

The type of discharge was regarded as an unstructured clinical judgment of the risk of violence.
Termination of the tbs order by the court in line with the hospital's advice and after a transmural
phase was seen as the lowest level of risk, and referral for reselection by the hospital as the
highest. After all files had been scored, the patients' recidivism data were requested from the
Ministry of Justice and related to the HCR-20 and PCL-R codings. The HCR-20 and PCL-R total scores
had good predictive validity for violent recidivism after treatment. The HCR-20 predicted
significantly better than unstructured clinical judgment. In addition, the HCR-20 total score proved
to be a significantly better predictor than the PCL-R total score, although the difference in AUC
values was no longer significant when the item ‘Psychopathy’ was left out of the HCR-20 total score.

The study described in Chapter 5 examined the interrater reliability and predictive validity of the
HCR-20 in 42 female patients of the Dr. Henri van der Hoeven Kliniek. The results were compared with
those of a matched group of 42 male patients. The interrater reliability of the HCR-20 was good for
both male and female patients. No significant differences were found in HCR-20 subscale scores and
total scores. There were, however, differences on individual HCR-20 items. Women scored
significantly higher on ‘Relationship instability’ and ‘Impulsivity’ and significantly lower on
‘Young age at first violent incident’, ‘Psychopathy’, and ‘Negative attitudes’. The predictive
validity of the HCR-20 subscale and total scores and of the final judgment for violence was good in
the group of male patients. For female patients, only the HCR-20 final judgment proved to be a
significant predictor. The HCR-20 subscale and total scores were not significant predictors of
violence in the female patients.

Chapter 6 describes the follow-up to the study in Chapter 2. The aim of this prospective study was
to determine the value of the HCR-20 for predicting violent incidents during treatment. In this
study, the HCR-20 was scored by three independent raters (researchers, group leaders, and treatment
supervisors) in a group of 127 male patients admitted to the Dr. Henri van der Hoeven Kliniek.
During consensus meetings, the raters discussed their codings and reached agreement on the item
codings and the final judgment. Overall, the predictive validity of the HCR-20 as coded in the
consensus was good. The consensus final judgment had significant added value over the summed HCR-20
scores. Researchers and treatment supervisors were almost equally accurate in predicting violent
incidents with the HCR-20. The group leaders predicted somewhat less well. A possible explanation is
that many different group leaders (N = 59) participated in this study and that most of them had
little experience in scoring the HCR-20. The consensus between group leaders, researchers, and
treatment supervisors had the highest predictive validity. The consensus final judgment was
significantly better than that of the three rater groups separately. It was also found that the
dynamic risk factors in particular (Clinical and Risk management factors) predicted well, especially
the factors ‘Unresponsive to treatment’ and ‘Exposure to destabilizers’. Historical factors that
predicted well were ‘Young age at first violent incident’ and ‘Psychopathy’.

Chapter 7 summarizes the most important findings of this thesis. Limitations and strengths of the
studies in the thesis are discussed, as are the implications for training and clinical practice.
Finally, suggestions are given for future research in this area.

References

Abel, G.G., Becker, J.V., Cunningham-Rathner, J., Mittelman, M., & Rouleau, J.L. (1988). Multiple

paraphilic diagnoses among sex offenders. Bulletin of the Academy of Psychiatry and the Law, 16,

153-168.

Abram, K.M., & Teplin, L.A. (1991). Co-occurring disorders among mentally ill jail detainees:

Implications for public policy. American Psychologist, 46, 1036-1045.

Ackerman, M.J. (1999). Essentials of forensic psychological assessment. New York: Wiley.

Alexander, M.A. (1999). Sexual offender treatment efficacy revisited. Sexual Abuse: A Journal of

Research and Treatment, 11, 101-116.

American Psychiatric Association (1987). Diagnostic and Statistical Manual of Mental Disorders (3rd

ed., rev.). Washington, DC: Author.

American Psychiatric Association (1994). Diagnostic and Statistical Manual of Mental Disorders (4th

ed.). Washington, DC: Author.

Andrews, D.A., & Bonta, J. (2000). The Level of Service-Inventory-Revised. Toronto: Multi-Health

Systems.

Andrews, D.A., Bonta, J., & Hoge, R.D. (1990). Classification for effective rehabilitation.

Rediscovering psychology. Criminal Justice and Behavior, 17, 19-52.

Appelbaum, P., Robbins, P., & Monahan, J. (2000). Violence and delusions: Data from the MacArthur

violence risk assessment study. American Journal of Psychiatry, 157, 566-572.

Arbisi, P.A. (2004). Review of the HCR-20: Assessing risk for violence. From B.S. Plake, J.C. Impara,

& R.A. Spies (Eds.), The fifteenth mental measurements yearbook [Electronic version]. Retrieved

September 1, 2004, from the Buros Institute’s Test Reviews Online website:

http://www.unl.edu/buros.

Arboleda-Flórez, J., & Stuart, H.L. (2000). The future for risk research. The Journal of Forensic

Psychiatry, 11, 506-509.

Archer, D., & McDaniel, P. (1995). Violence and gender. In R.B. Ruback, & N.A. Weiner (Eds.),

Interpersonal violent behaviors: Social and cultural aspects (pp. 63-87). New York, NY: Springer.

Arseneault, L., Moffitt, T., Caspi, A., Taylor, P., & Silva, P. (2000). Mental disorders and violence in a

total birth cohort: Results from the Dunedin Study. Archives of General Psychiatry, 57, 979-986.

Augimeri, L.K., Koegl, C.J., Webster, C.D., & Levene, K.S. (2001). Early Assessment Risk List for

Boys (EARL-20B): Version 2. Toronto: Earlscourt Child and Family Centre.

Barbaree, H.E., & Marshall, W.L. (1989). Erectile responses among heterosexual child molesters,

father-daughter incest offenders, and matched non-offenders: Five distinct age preferences profiles.

Canadian Journal of Behavioral Science, 21, 70-82.

Barbaree, H.E., & Seto, M.C. (1997). Pedophilia: Assessment and treatment. In R. Laws, & E.

O’Donohue (Eds.), Sexual deviance: Theory, assessment, and treatment (pp. 175-193). New York:

Guilford Press.

Barsetti, I., Earls, C.M., Lalumière, M.L., & Bélanger, N. (1998). The differentiation of intrafamilial

and extrafamilial heterosexual child molesters. Journal of Interpersonal Violence, 13, 275-286.

Barratt, E.S., & Slaughter, L. (1996). Mental illness and violence. Current Opinion in Psychiatry, 9,

393-397.

Beck, J.C., & Wencel, H. (1998). Violent crime and Axis I psychopathology. In A.E. Skodol (Ed.),

Psychopathology and violent crime (pp. 1-27). Washington, DC: American Psychiatric Press.

Beek, D.J. van, Doncker, D. de, & Ruiter, C. de (2001). Static-99. Inschatten van het risico van

seksueel gewelddadige recidive bij volwassen seksuele delinquenten [Static-99. Assessment

of the risk of sexually violent recidivism in adult sex offenders]. Utrecht, The Netherlands: Forum

Educatief.

Belfrage, H. (1998a). Implementing the HCR-20 scheme for risk assessment in a forensic psychiatric

hospital: Integrating research and clinical practice. The Journal of Forensic Psychiatry, 9, 328-338.

Belfrage, H. (1998b). Making risk predictions without an instrument. International Journal of Law

and Psychiatry, 21, 59-64.

Belfrage, H., & Douglas, K.S. (2002). Treatment effects on forensic psychiatric patients measured

with the HCR-20 violence risk assessment scheme. International Journal of Forensic Mental

Health, 1, 25-36.

Belfrage, H., Fransson, G., & Strand, S. (2000). Prediction of violence using the HCR-20: A

prospective study in two maximum-security correctional institutions. The Journal of Forensic


Benezech, M., Bourgeois, M.L., & Yesavage, J. (1980). Violence in the mentally ill: A study of 547

patients at a French hospital for the criminally insane. Journal of Nervous and Mental Disease,

168, 698-700.

Berman, M.E., Fallon, A.E., & Coccaro, E.F. (1998). The relationship between personality

psychopathology and aggressive behavior in research volunteers. Journal of Abnormal Psychology,

107, 651-658.

Bernstein, P.L. (1996). Against the gods: The remarkable story of risk. New York: Wiley.

Binder, R.L. (1999). Are the mentally ill dangerous? Journal of the American Academy of Psychiatry

and Law, 27, 189-201.

Blackburn, R., & Coid, J.W. (1999). Empirical clusters of DSM-III personality disorders in violent

offenders. Journal of Personality Disorders, 13, 18-34.

Blanchette, K. (1994). Classifying female offenders for correctional interventions. Forum on

Corrections Research, 6, 36-41.

Bland, R., & Orn, H. (1986). Family violence and psychiatric disorder. Canadian Journal of


Blumenthal, S., & Lavender, T. (2000). Violence and mental disorder. A critical aid to the assessment

and management of risk. London: Kingsley Publishers.

Boer, D.P., Hart, S.D., Kropp, P.R., & Webster, C.D. (1997). Manual for the Sexual Violence Risk-20.

Professional guidelines for assessing risk of sexual violence. Vancouver, BC: British Columbia

Institute against Family Violence.

Bonta, J., Law, M., & Hanson, K. (1998). The prediction of criminal and violent recidivism among

mentally disordered offenders: A meta-analysis. Psychological Bulletin, 123, 123-142.

Borum, R. (1996). Improving the clinical practice of violence risk assessment: Technology, guidelines

and training. American Psychologist, 51, 945-956.

Borum, R., Bartel, P., & Forth, A. (2002). SAVRY: Manual for the Structured Assessment of Violence

Risk in Youth (Version 1: Consultation edition). Tampa, FL: Florida Mental Health Institute,

University of South Florida.

Bourgeois, M.L., & Benezech, M. (2001). Criminal dangerousness, psychopathology and psychiatric

comorbidity. Annales Medico Psychologiques, 159, 475-486.

Brown, G.R., & Anderson, B. (1991). Psychiatric morbidity in adult inpatients with childhood

histories of sexual and physical abuse. American Journal of Psychiatry, 148, 55-61.

Buchanan, A. (1997). The investigation of acting on delusions as a tool for risk assessment in the

mentally disordered. British Journal of Psychiatry, 170, 12-16.

Buchanan, A. (2001). Book review. HCR-20. Assessing risk for violence, Version 2. Criminal

Behavior and Mental Health, 11, supplement, 77-78.

Buchanan, A., Reed, A., Wessely, S., Garety, P., Taylor, P.J., Grubin, D., & Dunn, G. (1993). Acting

on delusions. II: The phenomenological correlates of acting on delusions. British Journal of


Buss, A.H., & Durkee, A. (1957). An inventory for assessing different kinds of hostility. Journal of

Consulting Psychology, 21, 343-349.

Canton, W.J., Veer, T.S. van der, Panhuis, P.J.A. van, Verheul, R., & Brink, W. van den (2004a). De

betrouwbaarheid van risicotaxatie in de pro Justitia rapportage. Een onderzoek met behulp van de

HKT-30 [The reliability of risk assessment in ‘pro Justitia’ reports. An investigation on the basis of

the HKT-30]. Tijdschrift voor Psychiatrie, 46, 537-542.

Canton, W.J., Veer, T.S. van der, Panhuis, P.J.A. van, Verheul, R., & Brink, W. van den (2004b). De

voorspellende waarde van risicotaxatie bij de rapportage pro Justitia. Onderzoek naar de HKT-30

en de klinische inschatting [The predictive validity of risk assessment in ‘pro Justitia’ reports. A

study of the HKT-30 and clinical assessment]. Tijdschrift voor Psychiatrie, 46, 525-535.

Chakhssi, F., & Hilterman, E. (2004). Psychiatric staff’s perceptions and judgment of violence risk.

Paper presented at the Fourth Annual Conference of the International Association of Forensic

Mental Health Services, Stockholm, Sweden.

Chesney-Lind, M. (1989). Girls’ crime and woman’s place: Toward a feminist model of female

delinquency. Crime and Delinquency, 35, 5-29.

Cleckley, H. (1976). The mask of sanity (5th ed.). St. Louis, MO: Mosby.

Coccaro, E.F. (1989). Central serotonin and impulsive aggression. British Journal of Psychiatry, 155,

52-62.

Coccaro, E.F., Berman, M.E., & Kavoussi, R.J. (1997). Assessment of life history of aggression:

Development and psychometric characteristics. Psychiatry Research, 73, 147-157.

Coid, J.W. (2000). Axis II disorders and motivation for serious criminal behavior. In A.E. Skodol

(Ed.), Psychopathology and violent crime (pp. 53-97). Washington, DC: American Psychiatric

Press.

Coid, J.W. (2003). The co-morbidity of personality disorder and lifetime clinical syndromes in

dangerous offenders. The Journal of Forensic Psychiatry and Psychology, 14, 341-366.

Coid, J., Kahtan, N., Gault, S., & Jarman, B. (1999). Patients with personality disorder admitted to

secure forensic psychiatric services. British Journal of Psychiatry, 175, 528-536.

Cooke, D., & Michie, C. (2001). Refining the construct of psychopathy: Towards a hierarchical

model. Psychological Assessment, 13, 171-188.

Cooper, C. (2004). Review of the HCR-20: Assessing risk for violence. From B.S. Plake, J.C. Impara,

& R.A. Spies (Eds.), The fifteenth mental measurements yearbook [Electronic version]. Retrieved

September 1, 2004, from the Buros Institute’s Test Reviews Online website:

http://www.unl.edu/buros.

Costello, E.J., Edelbrock, C.S., Duncan, M.K., & Kalas, R. (1984). Testing of the NIMH Diagnostic

Interview Schedule for Children (DISC) in a clinical population: Final report to the Center for

Epidemiological studies, NIMH. Pittsburgh: University of Pittsburgh.

Cote, G., & Hodgins, S. (1992). The prevalence of major mental disorders among homicide offenders.

International Journal of Law and Psychiatry, 15, 89-99.

Crick, N.R., & Grotpeter, J.K. (1995). Relational aggression, gender, and social-psychological

adjustment. Child Development, 66, 710-722.

Dawes, R.M., Faust, D., & Meehl, P.E. (1989). Clinical versus actuarial judgment. Science, 243, 1668-

1674.

DeLong, E.R., DeLong, D.M., & Clarke-Pearson, D.L. (1988). Comparing the Areas Under two or

more correlated Receiver Operating Characteristics Curves: A nonparametric approach. Biometrics,

44, 837-854.

Dempster, R.J. (1998). Prediction of sexually violent recidivism: A comparison of risk assessment

instruments. Unpublished master’s thesis, Simon Fraser University, Vancouver, British Columbia,

Canada.

Dempster, R.J., & Hart, S.D. (2002). The relative utility of fixed and variable risk factors in

discriminating sexual recidivists and nonrecidivists. Sexual Abuse: A Journal of Research

and Treatment, 14, 121-138.

Dernevik, M., & Douglas, K.S. (2002). The role of context and training in the accuracy of risk factors.

Paper presented at the 12th European Conference on Psychology and Law. Leuven, Belgium:

September 14-17.

Dernevik, M., Falkheim, M., Holmqvist, R., & Sandell, R. (2001). Implementing risk assessment

procedures in a forensic psychiatric setting: Clinical judgement revisited. In D.P. Farrington,

C.R. Hollin, & M. McMurran (Eds.), Sex and violence: The psychology of crime and risk

assessment (pp. 83-101). London: Routledge.

Dernevik, M., Grann, M., & Johansson, S. (2002). Violent behaviour in forensic psychiatric patients:

Risk assessment and different risk-management levels using the HCR-20. Psychology, Crime and

Law, 8, 93-111.

Doren, D.M. (1998). Recidivism base rates, predictions of sex offender recidivism, and the ‘sexual

predator’ commitment laws. Behavioral Sciences and the Law, 16, 97-114.

Dougherty, D.M., Bjork, J.M., Huckabee, H.C.G., Moeller, F.G., & Swann, A.C. (1999). Laboratory

measures of aggression and impulsivity in women with borderline personality disorder. Psychiatry

Research, 85, 315-326.

Douglas, K.S., & Belfrage, H. (2001). Use of the HCR-20 in violence risk management:

Implementation and clinical practice. In K.S. Douglas, C.D. Webster, S.D. Hart, D. Eaves, &

J.R.P. Ogloff (Eds.), HCR-20 violence risk management companion guide (pp. 41-58). Vancouver:

Mental Health, Law, and Policy Institute.

Douglas, K.S., Cox, D.N., & Webster, C.D. (1999). Violence risk assessment: Science and practice.

Legal and Criminological Psychology, 4, 149-184.

Douglas, K.S., & Kropp, P.R. (2002). A prevention-based paradigm for violence risk assessment.

Criminal Justice and Behavior, 29, 617-658.

Douglas, K.S., Ogloff, J.R.P., & Hart, S.D. (2003). Evaluation of a model of violence risk assessment

among forensic psychiatric patients. Psychiatric Services, 54, 1372-1379.

Douglas, K.S., Ogloff, J.R.P., Nicholls, T.L., & Grant, I. (1999). Assessing risk for violence among

psychiatric patients: The HCR-20 violence risk assessment scheme and the Psychopathy

Checklist: Screening Version. Journal of Consulting and Clinical Psychology, 67, 917-930.

Douglas, K.S., & Webster, C.D. (1999a). Predicting violence in mentally and personality disordered

individuals. In R. Roesch, S.D. Hart, & J.R.P. Ogloff (Eds.), Psychology and law: The state of

the discipline (pp. 175-239). New York: Kluwer Academic.

Douglas, K.S., & Webster, C.D. (1999b). The HCR-20 violence risk assessment scheme. Concurrent

validity in a sample of incarcerated offenders. Criminal Justice and Behavior, 26, 3-19.

Douglas, K.S., Webster, C.D., Hart, S.D., Eaves, D., & Ogloff, J.R.P. (Eds.). (2001). HCR-20 violence

risk management companion guide. Vancouver, British Columbia, Canada: Mental Health, Law,

and Policy Institute, Simon Fraser University.

Douglas, K.S., Guy, L.S., & Weir, J. (2005). HCR-20 violence risk assessment scheme: Overview and

annotated bibliography. Available: http://www.sfu.ca/psychology/groups/faculty/hart/violink.htm.

Douglas, K.S., Yeomans, M., & Boer, D.P. (in press). Comparative validity analyses of multiple

measures of violence risk in a sample of criminal offenders. Criminal Justice and Behavior.

Dr. Henri van der Hoeven Stichting (2003). Jaarverslag 2003 [Year report 2003]. Utrecht, The

Netherlands: Dr. Henri van der Hoeven Stichting.

Dvoskin, J.A., & Heilbrun, K. (2001). Risk assessment and release decision making: Toward resolving

the great debate. The Journal of the American Academy of Psychiatry and the Law, 29, 6-10.

Edens, J.F. (2001). Misuses of the Hare Psychopathy Checklist-Revised in court. Two case examples.

Journal of Interpersonal Violence, 16, 1082-1093.

Elbogen, E.B., Calkins Mercado, C., Scalora, M.J., & Tomkins, A.J. (2002). Perceived relevance of

factors for violence risk assessment: A survey of clinicians. International Journal of Forensic

Mental Health, 1, 37-47.

Else, L.T., Wonderlich, S.A., Beatty, W.W., Christie, D.W., & Staton, R.D. (1993). Personality

characteristics of men who physically abuse women. Hospital and Community Psychiatry, 44,

54-58.

Emmerik, J.L. van, & Brouwers, M. (2001). De terbeschikkingstelling in maat en getal: Een

beschrijving van de tbs-populatie in de periode 1995-2000 [The tbs-order in numbers and figures.

A description of the tbs population in the period 1995-2000]. Ministerie van Justitie, The Hague,

The Netherlands: Dienst Justitiële Inrichtingen.

English, K. (1993). Self-reported crime rates of women prisoners. Journal of Quantitative

Criminology, 9, 357-381.

Erkel, A.R. van, & Pattynama, P.M. (1998). Receiver Operating Characteristic (ROC) analysis: Basic

principles and applications in radiology. European Journal of Radiology, 27, 88-94.

Eronen, M., Angermeyer, M.C., & Schulze, B. (1998). The psychiatric epidemiology of violent

behavior. Social Psychiatry and Psychiatric Epidemiology, 33, supplement 1, 13-23.

Eronen, M., Hakola, P., & Tiihonen, J. (1996). Mental disorders and homicidal behavior in Finland.

Archives of General Psychiatry, 53, 497-501.

Estroff, S.E., Swanson, J.W., Lachiotte, W.S., Swartz, M., & Bolduc, M. (1998). Risk reconsidered:

Targets of violence in the social networks of people with serious psychiatric disorders. Social

Psychiatry and Psychiatric Epidemiology, 33, supplement 1, 95-101.

Estroff, S.E., & Zimmer, C. (1994). Social networks, social support and violence among persons with

severe, persistent mental illness. In J. Monahan, & H.J. Steadman (Eds.), Violence and mental

disorder (pp. 259-295). Chicago: The University of Chicago Press.

Estroff, S.E., Zimmer, C., Lachiotte, W.S., & Benoit, J. (1994). The influence of social networks and

social support on violence by persons with serious mental illness. Hospital and Community


Faust, D., & Ziskin, J. (1988). The expert witness in psychology and psychiatry. Science, 241, 31-35.

Firestone, P., Bradford, J.M., McCoy, M., Greenberg, D.M., Curry, S., & Larose, M.R. (2000).

Prediction of recidivism in extrafamilial child molesters based on court-related assessments. Sexual

Abuse: A Journal of Research and Treatment, 12, 203-221.

Firestone, P., Bradford, J.M., McCoy, M., Greenberg, D.M., Larose, M.R., & Curry, S. (1999).

Prediction of recidivism in incest offenders. Journal of Interpersonal Violence, 14, 511-531.

Fleiss, J.L. (1986). The design and analysis of clinical experiments. New York: Wiley.

Fuller, J., & Cowan, J. (1999). Risk assessment in a multi-disciplinary forensic setting:

Clinical judgement revisited. The Journal of Forensic Psychiatry, 10, 276-289.

Funk, S.J. (1999). Risk assessment for juveniles on probation. A focus on gender. Criminal

Justice and Behavior, 26, 44-68.

Furby, L., Weinrott, M.R., & Blackshaw, L. (1989). Sex offender recidivism: A review. Psychological

Bulletin, 105, 3-30.

Garb, H.N. (1998). Studying the clinician. Washington, DC: American Psychological Association.

Gardner, D.L., Leibenluft, E., O’Leary, K.M., & Cowdry, R.W. (1991). Self-ratings of anger and

hostility in borderline personality disorder. Journal of Nervous and Mental Disease, 179, 157-161.

Gardner, W., Lidz, C.W., Mulvey, E.P., & Shaw, E.C. (1996). Clinical versus actuarial predictions of

violence in patients with mental illnesses. Journal of Consulting and Clinical Psychology, 64, 1-8.

Gendreau, P., & Andrews, D.A. (1990). What the meta-analyses of the offender treatment literature

tells us about “what works”. Canadian Journal of Criminology, 32, 173-184.

Gandhi, N., Tyrer, P., Evans, K., McGee, A., Lamont, A., & Harrison-Read, P. (2001). A randomized

controlled trial of community-oriented and hospital-oriented care for discharged psychiatric

patients: Influence of personality disorder on police contacts. Journal of Personality Disorders, 15,

94-102.

Grann, M. (2000). The PCL-R and gender. European Journal of Psychological Assessment, 16,

147-149.

Grann, M., Långström, N., Tengström, A., & Kullgren, G. (1999). Psychopathy (PCL-R) predicts

violent recidivism among criminal offenders with personality disorders in Sweden. Law and

Human Behavior, 23, 205-216.

Grann, M., Långström, N., Tengström, A., & Stålenheim, E.G. (1998). The reliability of file-based

retrospective ratings of psychopathy with the PCL-R. Journal of Personality Assessment, 70,

416-426.

Grann, M., & Pallvik, A. (2002). An empirical investigation of written risk communication in

forensic psychiatric evaluations. Psychology, Crime and Law, 8, 113-130.

Gray, N.S., Hill, C., McGleish, A., Timmons, D., MacCulloch, M.J., & Snowden, R.J. (2003).

Prediction of violence and self-harm in mentally disordered offenders: A prospective study of the

efficacy of HCR-20, PCL-R, and psychiatric symptomatology. Journal of Consulting and Clinical

Psychology, 71, 443-451.

Green, B., Pedley, R., & Whittingham, D. (2004). A structured clinical model for violence risk

intervention. International Journal of Law and Psychiatry, 27, 349-359.

Greenberg, D.M. (1998). Sexual recidivism in sex offenders. Canadian Journal of Psychiatry, 43,

459-465.

Greenberg, S., & Shuman, D. (1997). Irreconcilable conflict between therapeutic and forensic roles.

Professional Psychology: Research and Practice, 28, 50-57.

Grisso, T., & Appelbaum, P. (1991). Is it unethical to offer predictions of future violence? Law and


Groth, A.N., Longo, R.E., & McFadin, J.B. (1982). Undetected recidivism among rapists and child

molesters. Crime and Delinquency, 28, 450-458.

Grove, W.M., & Meehl, P.E. (1996). Comparative efficiency of informal (subjective, impressionistic)

and formal (mechanical, algorithmic) prediction procedures: The clinical-statistical controversy.

Psychology, Public Policy, and Law, 2, 293-323.

Grove, W.M., Zald, D.H., Lebow, B.S., Snitz, B.E., & Nelson, C. (2000). Clinical versus mechanical

prediction: A meta-analysis. Psychological Assessment, 12, 19-30.

Grubin, D. (1997a). Inferring predictors of risk: Sex offenders. International Review of Psychiatry, 9,

225-231.

Grubin, D. (1997b). Predictors of risk in serious offenders. British Journal of Psychiatry, 170, 17-21.

Grubin, D. (1998). Sex offending against children: Understanding the risk. Police research series

paper 99. London: Home Office.

Grubin, D., & Wingate, S. (1996). Sexual offence recidivism: Prediction versus understanding.

Criminal Behaviour and Mental Health, 6, 349-359.

Hagan, M.P., & Gust-Brey, K.L. (1999). A ten-year longitudinal study of adolescent rapists upon

return to the community. International Journal of Offender Therapy and Comparative

Criminology, 43, 448-458.

Hanson, R.K. (1997). The development of a brief actuarial risk scale for sexual offense recidivism.

(User Report No. 97-04). Ottawa: Department of the Solicitor General of Canada.

Hanson, R.K., & Bussière, M.T. (1998). Predicting relapse: A meta-analysis of sexual offender

recidivism studies. Journal of Consulting and Clinical Psychology, 66, 348-362.

Hanson, R.K., Gizzarelli, R., & Scott, H. (1994). The attitudes of incest offenders: Sexual entitlement

and acceptance of sex with children. Criminal Justice and Behavior, 21, 187-202.

Hanson, R.K., & Harris, A.J.R. (2000). Where should we intervene? Dynamic predictors of sexual

offense recidivism. Criminal Justice and Behavior, 27, 6-35.

Hanson, R.K., & Morton-Bourgon, K. (2004). Predictors of sexual recidivism: An updated

meta-analysis. Ottawa: Public Works and Government Services Canada.

Hanson, R.K., Steffy, R.A., & Gauthier, R. (1993). Long-term recidivism of child molesters. Journal

of Consulting and Clinical Psychology, 61, 646-652.

Hanson, R.K., & Thornton, D. (1999). Static-99: Improving actuarial risk assessments for sex

offenders (User Report No. 1999-02). Ottawa: Department of the Solicitor General of Canada.

Hanson, R.K., & Thornton, D. (2000). Improving risk assessment for sex offenders: A comparison of

three actuarial scales. Law and Human Behavior, 24, 119-136.

Hanson, R.K., & Thornton, D. (2002). Notes on the development of the Static-2002. Ottawa: Public

Works and Government Services Canada.

Hare, R.D. (1980). A research scale for the assessment of psychopathy in criminal populations.

Personality and Individual Differences, 1, 111-119.

Hare, R.D. (1991). Manual for the Hare Psychopathy Checklist-Revised. Toronto, Ontario:

Multi-Health Systems.

Hare, R.D. (1998a). The Hare PCL-R: Some issues concerning its use and misuse. Legal and

Criminological Psychology, 3, 99-119.

Hare, R.D. (1998b). Psychopathy, affect and behavior. In D.J. Cooke, A.E. Forth, & R.D. Hare

(Eds.), Psychopathy: Theory, research and implications for society (pp. 105-137). Dordrecht:

Kluwer.

Hare, R.D. (2003). Hare Psychopathy Checklist-Revised Second Edition. Technical Manual.

Toronto, Ontario: Multi-Health Systems.

Hare, R.D., Clark, D., Grann, M., & Thornton, D. (2000). Psychopathy and the predictive validity of

the PCL-R: An international perspective. Behavioral Sciences and the Law, 18, 623-645.

Harer, M.D., & Langan, N.P. (2001). Gender differences in predictors of prison violence: Assessing

the predictive validity of a classification system. Crime and Delinquency, 47, 513-536.

Harris, A.J.R., & Hanson, R.K. (2004). Sex offender recidivism: A simple question. Ottawa: Public

Works and Government Services Canada.

Harris, A.J.R., Phenix, A., Hanson, R.K., & Thornton, D. (2003). Static-99 coding rules Revised –

2003. Available: http://www.sgc.gc.ca

Harris, G.T., & Rice, M.E. (1996). The science in phallometric measurement of male sexual interest.

Current Directions in Psychological Science, 5, 156-160.

Harris, G.T., & Rice, M.E. (1997). Risk appraisal and management of violent behavior. Psychiatric

Services, 48, 1166-1176.

Harris, G.T., Rice, M.E., & Cormier, C.A. (1993). Violent recidivism of mentally disordered

offenders: The development of a statistical prediction instrument. Criminal Justice and Behavior,

20, 315-335.

Harris, G.T., Rice, M.E., & Cormier, C.A. (2002). Prospective replication of the Violence Risk

Appraisal Guide in predicting violent recidivism among forensic patients. Law and Human

Behavior, 26, 377-394.

Harris, G.T., Rice, M.E., & Quinsey, V.L. (1998). Appraisal and management of risk in sexual

aggressors: Implications for criminal justice policy. Psychology, Public Policy, and Law, 4, 73-115.

Hart, S.D. (1998a). Psychopathy and the risk for violence. In D.J. Cooke, A.E. Forth, & R.D. Hare

(Eds.), Psychopathy: Theory, research and implications for society. Dordrecht:

Kluwer.

Hart, S.D. (1998b). The role of psychopathy in assessing risk for violence: Conceptual and

methodological issues. Legal and Criminological Psychology, 3, 121-137.

Hart, S.D. (2001a). Assessing and managing violence risk. In K.S. Douglas, C.D. Webster, S.D. Hart,

D. Eaves, & J.R.P. Ogloff (Eds.), HCR-20 violence risk management companion guide

(pp. 13-25). Vancouver, British Columbia, Canada: Mental Health, Law, and Policy Institute, Simon

Fraser University.

Hart, S.D. (2001b). Risk assessment: Possibilities and impossibilities. Retrieved December 8, 2003,

from Simon Fraser University Website: http://www.sfu.ca/psychology/groups/faculty/hart/violink.htm.

Hart, S.D. (2001c). HCR-20 work sheet. Vancouver, BC, Canada: Simon Fraser University, Mental

Health, Law, and Policy Institute.

Hart, S.D. (Ed.). (2002). Swedish studies on Psychology, Crime and Law [Special issue]. Psychology,

Crime and Law, 8.

Hart, S.D., Cox, D., & Hare, R.D. (1995). The Hare Psychopathy Checklist: Screening Version

(PCL:SV). Toronto, Ontario, Canada: Multi-Health Systems.

Hart, S.D., Dutton, D.G., & Newlove, T. (1993). The prevalence of personality disorder among wife

assaulters. Journal of Personality Disorders, 7, 329-341.

Hart, S.D., Kropp, P.R., Laws, D.R., Klaver, J., Logan, C., & Watt, K.A. (2003). The risk for sexual

violence protocol. Burnaby, BC: Simon Fraser University.

Hart, S.D., Laws, D.R., & Kropp, P.R. (2003). The promise and peril of sex offender risk assessment.

In T. Ward, D.R. Laws, & S.M. Hudson (Eds.), Sexual deviance. Issues and controversies

(pp. 207-225). London: Sage Publications.

Hart, S.D., Webster, C.D., & Douglas, K.S. (2001). Risk management using the HCR-20: A general

overview focusing on historical factors. In K.S. Douglas, C.D. Webster, S.D. Hart, D. Eaves, &

J.R.P. Ogloff (Eds.), HCR-20 violence risk management companion guide (pp. 13-25). Vancouver,

British Columbia, Canada: Mental Health, Law, and Policy Institute, Simon Fraser University.

Heilbrun, K. (1997). Prediction versus management models relevant to risk assessment: The

importance of legal decision-making context. Law and Human Behavior, 21, 347-359.

Heilbrun, K., Dvoskin, J., Hart, S.D., & McNiel, D.E. (1999). Violence risk communication:

Implications for research, policy, and practice. Health, Risk and Society, 1, 91-106.

Heilbrun, K., O’Neill, M., Strohman, L., Bowman, Q., & Philipson, J. (2000). Expert approaches to

communicating violence risk. Law and Human Behavior, 24, 137-148.

Heilbrun, K., Philipson, J., Berman, L., & Warren, J. (1999). Risk communication: Clinicians'

reported approaches and perceived values. Journal of the American Academy of Psychiatry

and the Law, 27, 397-406.

Hemphill, J.F., Templeman, R., Wong, S., & Hare, R.D. (1998). Psychopathy and crime: Recidivism

and criminal careers. In D.J. Cooke, A.E. Forth, & R.D. Hare (Eds.), Psychopathy: Theory,

research and implications for society (pp. 375-399). Dordrecht: Kluwer.

Hersh, K., & Borum, R. (1998). Command hallucinations, compliance, and risk assessment. Journal of

the American Academy of Psychiatry and the Law, 26, 353-359.

Hiday, V.A. (1995). The social context of mental illness and violence. Journal of Health and Social

Behavior, 36, 122-137.

Hiday, V.A., Swanson, J.W., Swartz, M.S., Borum, R., & Wagner, H.R. (2001). Victimization: A link

between mental illness and violence? International Journal of Law and Psychiatry, 24, 559-572.

Hildebrand, M. (2004). Psychopathy in the treatment of forensic psychiatric patients. Assessment,

prevalence, predictive validity, and clinical implications. Amsterdam: Dutch University Press.

Hildebrand, M., & Ruiter, C. de (2004). PCL-R psychopathy and its relation to DSM-IV Axis I and

Axis II disorders in a sample of male forensic psychiatric patients in the Netherlands. International

Journal of Law and Psychiatry, 27, 233-248.

Hildebrand, M., Ruiter, C. de, & Beek, D. van (2001). SVR-20. Richtlijnen voor het beoordelen

van het risico van seksueel gewelddadig gedrag [SVR-20. Guidelines for the assessment of

risk of sexual violence]. Utrecht, The Netherlands: Forum Educatief.

Hildebrand, M., Ruiter, C. de, & Nijman, H. (2004). PCL-R psychopathy predicts disruptive behavior

among male offenders in a Dutch forensic psychiatric hospital. Journal of Interpersonal Violence,

19, 13-29.

Hildebrand, M., Ruiter, C. de, & Vogel, V. de (2004). Psychopathy and sexual deviance in treated

rapists: Association with (sexual) recidivism. Sexual Abuse: A Journal of Research and Treatment,

16, 1-24.

Hildebrand, M., Ruiter, C. de, Vogel, V. de, & Wolf, P. van der (2002). Reliability and factor structure

of the Dutch language version of Hare’s Psychopathy Checklist-Revised. International Journal of

Forensic Mental Health, 1, 139-154.

Hilterman, E. (2001). Statische vergelijking van tijdens verlof recidiverende en niet-recidiverende

tbs-gestelden [Static comparison of during leave recidivating and non-recidivating tbs patients].

Proces, 80, 121-127.

Hilterman, E. (2004). Back to square one revisited: Assessment of recidivism during leave by Dutch

forensic psychiatric patients. Paper presented at the fourth annual conference of the International

Association of Forensic Mental Health Services, Stockholm, Sweden.

Hilton, N.Z., & Simmons, J.L. (2001). The influence of actuarial risk assessment in clinical judgments

and tribunal decisions about mentally disordered offenders in maximum security. Law and Human

Behavior, 25, 393-408.

Hochstedler Steury, E., & Choinski, M. (1995). "Normal" crimes and mental disorder: A two-group

comparison of deadly and dangerous felonies. International Journal of Law and Psychiatry, 18,

183-207.

Hodgins, S. (1992). Mental disorder, intellectual deficiency and crime: Evidence from a birth cohort.

Archives of General Psychiatry, 49, 476-483.

Hodgins, S. (1997). An overview of research on the prediction of dangerousness. Nordic Journal of


Hodgins, S. (1998). Epidemiological investigations of the associations between major mental disorders

and crime: Methodological limitations and validity of the conclusions. Social Psychiatry and

Psychiatric Epidemiology, 33, supplement 1, 29-37.

Hodgins, S. (2001). The major mental disorders and crime: Stop debating and start treating and

preventing. International Journal of Law and Psychiatry, 24, 427-446.

Hodgins, S. (2002). Research priorities in forensic mental health. International Journal of Forensic


Hodgins, S., & Müller-Isberner, R. (Eds.). (2000). Violence, crime and mentally disordered offenders.

Concept and methods for effective treatment and prevention. Chichester: Wiley.

Hodgins, S., Mednick, S.A., Brennan, P.A., Schulsinger, F., & Engberg, M. (1996). Mental disorder

and crime: Evidence from a Danish birth cohort. Archives of General Psychiatry, 53, 489-496.

Hollin, C. (1999). Treatment programs for offenders. Meta-analysis, “what works” and beyond.


Holmqvist, R., & Armelius, B.A. (1994). Emotional reactions to psychiatric patients. Acta

Psychiatrica Scandinavica, 90, 204-209.

Holmqvist, R., & Fogelstam, H. (1996). Psychological climate and countertransference in psychiatric

treatment homes. Acta Psychiatrica Scandinavica, 93, 288-295.

Hoptman, M.J., Yates, K.F., Patalinjug, M.B., Wack, R.C., & Convit, A. (1999). Clinical prediction of

assaultive behavior among male psychiatric patients at a maximum-security forensic facility.

Psychiatric Services, 50, 1461-1466.

Howells, K., Watt, B., Hall, G., & Baldwin, S. (1997). Developing programmes for violent offenders.

Legal and Criminological Psychology, 2, 117-128.

Huss, M.T., & Zeiss, R.A. (2004). Clinical assessment of violence from inpatient records: A

comparison of individual and aggregate decision making across risk strategies. International

Journal of Forensic Mental Health, 3, 139-147.

Hyler, S.E., Rieder, R.O., Williams, J.B.W., Spitzer, R.L., Hendler, J., & Lyons, M. (1988). The

Personality Diagnostic Questionnaire: Development and preliminary results. Journal of Personality

Disorders, 2, 229-237.

Johnson, H., & Sacco, V.F. (1995). Researching violence against women: Statistics Canada’s national

survey. Canadian Journal of Criminology, 37, 281-304.

Johnson, J.G., Cohen, P., Smailes, E., Kasen, S., Oldham, J.M., Skodol, A.E., & Brook, J.S. (2000).

Adolescent personality disorders associated with violence and criminal behavior during

adolescence and early adulthood. American Journal of Psychiatry, 157, 1406-1412.

Junginger, J. (1990). Predicting compliance with command hallucinations. American Journal of


Junginger, J. (1995). Command hallucinations and the prediction of dangerousness. Psychiatric

Services, 46, 911-914.

Kausch, O., & Resnick, P.J. (1999). Psychiatric assessment of the violent offender. In V.B. van

Hasselt, & H. Hersen (Eds.), Handbook of psychological approaches with violent offenders:

Contemporary strategies and issues (pp. 439-457). New York: Kluwer Academic / Plenum.

Klassen, D., & O’Connor, W. (1988). Crime, inpatient admission and violence among male mental

patients. International Journal of Law and Psychiatry, 11, 305-312.

Klassen, D., & O’Connor, W. (1990). Assessing the risk of violence in released mental patients: A

cross validation study. Psychological Assessment: A Journal of Consulting and Clinical

Psychology, 1, 75-81.

Krakowski, M., Volavka, J., & Brizer, D. (1986). Psychopathology and violence: A review of

literature. Comprehensive Psychiatry, 27, 131-148.

Krauss, D.A., Sales, B.D., Becker, J.V., & Figuered, A.J. (2000). Beyond prediction to explanation in

risk assessment research. A comparison of two explanatory theories of criminality and recidivism.


Kropp, P.R., & Hart, S.D. (1997). Assessing risk of violence in wife assaulters: The Spousal Assault

Risk Assessment (SARA) guide. In C.D. Webster, & A. Jackson (Eds.), Impulsivity. Theory,

assessment and treatment (pp. 302-325). New York: The Guilford Press.

Kropp, P.R., & Hart, S.D. (2000). The Spousal Assault Risk Assessment (SARA) guide: Reliability

and validity in adult male offenders. Law and Human Behavior, 24, 101-118.

Kropp, P.R., Hart, S.D., & Lyon, D.R. (2002). Risk assessment of stalkers. Some problems

and possible solutions. Criminal Justice and Behavior, 29, 590-616.

Kropp, P.R., Hart, S.D., Webster, C.D., & Eaves, D. (1999). Manual for the Spousal Assault Risk

Assessment Guide (Version 3). Vancouver: British Columbia Institute against Family Violence.

Langton, C.M. (2003). Contrasting approaches to risk assessment with adult male sexual offenders:

An evaluation of recidivism prediction schemes and the utility of supplementary clinical

information for enhancing predictive accuracy. Unpublished master’s thesis, University of

Toronto, Toronto, Canada.

Laws, D.R. (2003). Penile plethysmography: Will we ever get it right? In T. Ward, D.R. Laws, & S.M.

Hudson (Eds.), Sexual deviance. Issues and controversies (pp. 82-102). London: Sage Publications.

Laws, D.R. (2004). The Risk for Sexual Violence Protocol (RSVP): An introduction. Keynote

presentation at the Tools to Take Home Conference, Birmingham, UK.

Laws, D.R., Hudson, S.M., & Ward, T. (2000). Remaking relapse prevention: A sourcebook. London:

Sage Publications.

Layde, J.B. (2004). General instruments for risk assessment. Current Opinion in Psychiatry, 17,

401-405.

Levene, K.S., Augimeri, L.K., Pepler, D.J., Walsh, M.M., Webster, C.D., & Koegl, C.J. (2001).

Early Assessment Risk List for Girls: EARL-21G. Version 1 - Consultation version. Toronto,

Ontario, Canada: Earlscourt Child and Family Centre.

Lidz, C.W., Mulvey, E.P., & Gardner, W. (1993). The accuracy of predictions of violence to others.

Journal of the American Medical Association, 269, 1007-1011.

Lindqvist, P., & Allebeck, P. (1990). Schizophrenia and crime. A longitudinal follow-up of 644

schizophrenics in Stockholm. British Journal of Psychiatry, 157, 345-350.

Link, B.G., Andrews, A., & Cullen, F. (1992). The violent and illegal behavior of mental patients

reconsidered. American Sociological Review, 57, 275-292.

Link, B.G., & Stueve, A. (1994). Psychotic symptoms and the violent / illegal behavior of mental

patients compared to community controls. In J. Monahan, & H.J. Steadman (Eds.), Violence and

mental disorder (pp. 137-159). Chicago: The University of Chicago Press.

Link, B.G., Stueve, A., & Phelan, J. (1998). Psychotic symptoms and violent behaviors: Probing the

components of 'threat / control-override' symptoms. Social Psychiatry and Psychiatric

Epidemiology, 33, supplement 1, 55-60.

Links, P.S., Steiner, M., Offord, D.R., & Eppel, A. (1988). Characteristics of borderline personality

disorder: A Canadian study. Canadian Journal of Psychiatry, 33, 336-354.

Litwack, T.R. (2001). Actuarial versus clinical assessments of dangerousness. Psychology, Public

Policy, and Law, 7, 409-443.

Litwack, T.R. (2002). Some questions for the field of violence risk assessment and forensic mental

health: Or, ‘Back to basics’ revisited. International Journal of Forensic Mental Health, 1, 171-178.

Litwack, T.R., & Schlesinger, L.B. (1999). Dangerousness risk assessments: Research, legal, and

clinical considerations. In A.K. Hess, & I.B. Weiner (Eds.), The handbook of forensic psychology

(pp. 171-217). New York: Wiley.

Logan, C., & Watt, K. (2001). Structured professional guidelines approaches to risk assessment:

Single practitioner vs. multidisciplinary team administration. Paper presented at the International

Conference Violence Risk Assessment and Management: Bridging science and practice closing

together, Sundsvall, Sweden.

Logan, C. (2004). Les femmes fatales: Treating psychopathic women. Paper presented at the Bergen

International Conference on the Treatment of Psychopathy, Bergen, Norway.

Lösel, F. (1998). Treatment and management of psychopaths. In D.J. Cooke, A.E. Forth, & R.D. Hare

(Eds.), Psychopathy: Theory, research and implications for society. Dordrecht:

Kluwer.

Loucks, A.D., & Zamble, E. (1999). Canada searches for predictors common to both men and

women. Corrections Today, 61, 26-32.

Magdol, L., Moffitt, T.E., Caspi, A., Newman, D.L., Fagan, J., & Silva, P.A. (1997). Gender

differences in partner violence in a birth cohort of 21-year-olds: Bridging the gap between clinical

and epidemiological approaches. Journal of Consulting and Clinical Psychology, 65, 68-78.

Marshall, W.L., & Fernandez, Y.M. (2000). Phallometric testing with sexual offenders: Limits to its

value. Clinical Psychology Review, 20, 807-822.

McGrath, R.J. (1991). Sex offender risk assessment and disposition planning: A review of empirical

and clinical findings. International Journal of Offender Therapy and Comparative Criminology, 35,

328-350.

McGraw, K.O., & Wong, S.P. (1996). Forming inferences about some intraclass correlation

coefficients. Psychological Methods, 1, 30-46.

McNiel, D.E. (1994). Hallucinations and violence. In J. Monahan, & H.J. Steadman (Eds.), Violence

and mental disorder (pp. 183-202). Chicago: The University of Chicago Press.

McNiel, D.E., & Binder, R.L. (1990). The relationship of gender to violent behavior in acutely

disturbed psychiatric patients. Journal of Clinical Psychiatry, 51, 110-114.

McNiel, D.E., & Binder, R.L. (1995). Correlates of accuracy in the assessment of psychiatric

inpatients' risk of violence. American Journal of Psychiatry, 152, 901-906.

McNiel, D.E., Gregory, A., Lam, J., Binder, R.L., & Sullivan, G. (2003). Utility of decision support

tools for assessing acute risk of violence. Journal of Consulting and Clinical Psychology, 71,

945-953.

McNiel, D.E., Lam, J., & Binder, R.L. (2000). Relevance of interrater agreement to violence risk

assessment. Journal of Consulting and Clinical Psychology, 68, 1111-1115.

McNiel, D.E., Sandberg, D.A., & Binder, R.L. (1998). The relationship between confidence and

accuracy in clinical assessment of psychiatric patients’ potential for violence. Law and


Meehl, P.E. (1954). Clinical versus statistical prediction. Minneapolis, MN: University of Minnesota

Press.

Meehl, P.E. (1996). Clinical versus statistical prediction: A theoretical analysis and a review of the

literature. Northvale, NJ: Jason Aronson. (Original work published in 1954).

Melton, G., Petrila, J., Poythress, N., & Slobogin, C. (1997). Psychological evaluations for the courts:

A handbook for mental health professionals and lawyers. Second edition. New York: Guilford.

Mertens, N.M., Grapendaal, M., & Docter-Schamhardt, B.J.W. (1998). Delinquency in girls in

The Netherlands (report No. 169). The Hague, The Netherlands: Ministry of Justice, Scientific

Research and Documentation Centre.

Milton, J., Amin, S., Singh, S.P., Harrison, G., Jones, P., Croudace, T., Medley, I., & Brewin, J.

(2001). Aggressive incidents in first-episode psychosis. British Journal of Psychiatry, 178, 433-440.

Modestin, J., & Ammann, R. (1995). Mental disorders and criminal behavior. British Journal

of Psychiatry, 166, 667-675.

Monahan, J. (1981). The clinical prediction of violent behavior. Rockville, MD: National Institute of

Mental Health.

Monahan, J. (1984). The prediction of violent behavior: Toward a second generation of theory and

policy. American Journal of Psychiatry, 141, 10-15.

Monahan, J. (1988). Risk assessment of violence among the mentally disordered: Generating

useful knowledge. International Journal of Law and Psychiatry, 11, 249-257.

Monahan, J. (1992). Mental disorder and violent behavior. American Psychologist, 47, 511-521.

Monahan, J. (1996). Violence prediction. The past twenty and the next twenty years. Criminal

Justice and Behavior, 23, 107-120.

Monahan, J. (2001). Major mental disorder and violence: Epidemiology and risk assessment.

In G.F. Pinard, & L. Pagani (Eds.), Clinical assessment of dangerousness. Empirical contributions

(pp. 89-102). Cambridge: Cambridge University Press.

Monahan, J., Heilbrun, K., Silver, E., Nabors, E., Bone, J., & Slovic, P. (2002). Communicating

violence risk: Frequency formats, vivid outcomes, and forensic settings. International Journal of

Forensic Mental Health, 1, 121-126.

Monahan, J., & Steadman, H.J. (1983). Crime and mental illness: An epidemiological approach. In

N. Morris, & M. Tonry (Eds.), Crime and justice: An annual review of research (pp. 145-189).

Chicago: University of Chicago Press.

Monahan, J., & Steadman, H.J. (1996). Violent storms and violent people. How meteorology can

inform risk communication in mental health law. American Psychologist, 51, 931-938.

Monahan, J., Steadman, H.J., Silver, E., Appelbaum, P.S., Robbins, P.C., Mulvey, E.P., Roth,

L.H., Grisso, T., & Banks, S. (2001). Rethinking risk assessment: The MacArthur study of

mental disorder and violence. Oxford: Oxford University Press.

Mossman, D. (1994). Assessing predictions of violence: Being accurate about accuracy. Journal

of Consulting and Clinical Psychology, 62, 783-792.

Mossman, D. (2000). Book review evaluating violence risk ‘by the book’: A review of

HCR-20: Assessing risk for violence, version 2, and the manual for the Sexual Violence Risk-20.

Behavioral Sciences and the Law, 18, 781-789.

Mullen, P.E. (1997). A reassessment of the link between mental disorder and violent behavior and its

implications for clinical practice. Australian and New Zealand Journal of Psychiatry, 31, 3-11.

Müller-Isberner, R., & Hodgins, S. (2000). Evidence-based treatment for mentally disordered

offenders. In S. Hodgins, & R. Müller-Isberner (Eds.), Violence, crime and mentally disordered

offenders. Concept and methods for effective treatment and prevention (pp. 7-38). Chichester:

Wiley.

Murphy, D. (2002). Risk assessment as collective clinical judgement. Criminal Behavior and


Newhill, C.E., Mulvey, E.P., & Lidz, C.W. (1995). Characteristics of violence in the community

by female patients seen in a psychiatric emergency service. Psychiatric Services, 46, 785-789.

Nicholls, T.L. (2001). Violence risk assessment with female NCRMD acquittees: Validity of the

HCR-20 and PCL:SV. Unpublished master’s thesis, Simon Fraser University, Vancouver,

British Columbia, Canada.

Nicholls, T.L., Ogloff, J.R.P., & Douglas, K.S. (2004). Assessing risk for violence among male and

female civil psychiatric patients: The HCR-20, PCL:SV, and VSC. Behavioral Sciences and the

Law, 22, 127-158.

Niemantsverdriet, J.R. (1993). Achteraf bezien: Over het evalueren van terbeschikkingstellingen

[Seen in retrospect: About the evaluation of tbs-orders]. Academisch proefschrift, Katholieke

Universiteit Nijmegen. Utrecht, The Netherlands: Elinkwijk.

Novaco, R. (1997). Remediating anger and aggression with violent offenders. Legal and


Odgers, C.L., & Moretti, M.M. (2002). Aggressive and antisocial girls: Research update and

challenges. International Journal of Forensic Mental Health, 1, 103-119.

Otto, R.K. (1992). Prediction of dangerous behavior: A review and analysis of 'second generation'

research. Forensic Reports, 5, 103-133.

Otto, R.K. (2000). Assessing and managing violence risk in outpatient settings. Journal of Clinical

Psychology, 56, 1239-1262.

Pagani, L., & Pinard, G.F. (2001). Clinical assessment of dangerousness: An overview of the

literature. In G.F. Pinard, & L. Pagani (Eds.), Clinical assessment of dangerousness. Empirical

contributions (pp. 1-22). Cambridge: Cambridge University Press.

Pajer, K.A. (1998). What happens to ‘bad’ girls? A review of the adult outcomes of antisocial

adolescent girls. American Journal of Psychiatry, 155, 862-870.

Panhuis, P.J.A. van, & Dingemans, P.M. (2000). Geweld en psychotische ziekte [Violence and

psychotic disorder]. Tijdschrift voor Psychiatrie, 42, 793-802.

Perry, J.C., & Herman, J.L. (1993). Trauma and defense in the etiology of borderline personality

disorder. In J. Paris (Ed.), Borderline personality disorder. Etiology and treatment (pp. 123-139).

Washington, DC: American Psychiatric Press, Inc.

Pescosolido, B., Monahan, J., Link, B., Stueve, A., & Kikuzawa, S. (1999). The public’s view

of the competence, dangerousness and need for legal coercion among persons with mental

illness. American Journal of Public Health, 89, 1339-1345.

Pfohl, B., Blum, N., & Zimmerman, M. (1995). Structured Interview for DSM-IV Personality,

SIDP-IV. Iowa: University of Iowa.

Phelan, J.C., & Link, B.G. (1998). The growing belief that people with mental illnesses are violent:

The role of dangerousness criterion for civil commitment. Social Psychiatry and Psychiatric

Epidemiology, 33, supplement 1, 7-12.

Philipse, M., Erven, T. van, & Peters, J. (2002). Risicotaxatie in de tbs. Van geloof naar empirie

[Risk assessment in the tbs. From belief to empiricism]. Justitiële Verkenningen, 28, 77-93.

Philipse, M. (2005). Predicting criminal recidivism. Empirical studies and clinical practice in forensic

psychiatry. Nijmegen, The Netherlands: Radboud Universiteit.

Philipse, M., Ruiter, C. de, Hildebrand, M., & Bouman, Y. (2000). HCR-20. Beoordelen

van het risico van gewelddadig gedrag. Versie 2 [HCR-20. Assessing the risk of violence. Version

2]. Nijmegen / Utrecht, The Netherlands: Prof. Mr. W.P.J. Pompestichting / Dr. Henri van der

Hoeven Stichting.

Pinard, G.F., & Pagani, L. (2001). Discussion and clinical commentary on issues in the assessment and

prediction of dangerousness. In G.F. Pinard, & L. Pagani (Eds.), Clinical assessment of

dangerousness. Empirical contributions (pp. 258-277). Cambridge: Cambridge University Press.

Prentky, R.A., Knight, R.A., Lee, A.F.S., & Cerce, D. (1995). Predictive validity of lifestyle

impulsivity for rapists. Criminal Justice and Behavior, 22, 106-128.

Prentky, R.A., Lee, A.F.S., Knight, R.A., & Cerce, D. (1997). Recidivism rates among child molesters

and rapists: A methodological analysis. Law and Human Behavior, 21, 635-658.

Price, R. (1997). On the risks of risk prediction. The Journal of Forensic Psychiatry, 8, 1-4.

Prins, H. (1996). Risk assessment and management in criminal justice and psychiatry. The Journal of

Forensic Psychiatry, 7, 42-62.

Quinsey, V.L., Harris, G.T., Rice, M.E., & Cormier, C.A. (1998). Violent offenders: Appraising

and managing risk. Washington, DC: American Psychological Association.

Quinsey, V.L., Lalumière, M.L., Rice, M.E., & Harris, G.T. (1995). Predicting sexual offenses.

In J.C. Campbell (Ed.), Assessing dangerousness: Violence by sex offenders, batterers, and child

abusers (pp. 114-137). Thousand Oaks, CA: Sage.

Quinsey, V.L., Rice, M.E., & Harris, G.T. (1995). Actuarial prediction of sexual recidivism.

Journal of Interpersonal Violence, 10, 85-105.

Reed, J. (1997). Risk assessment and clinical risk management: The lessons from recent inquiries.

British Journal of Psychiatry, 170, 1-3.

Regier, D.A., Farmer, M.E., Rae, D.S., Locke, B.Z., Keith, S.J., Judd, L.L., & Goodwin, F.K.

(1990). Comorbidity of mental disorders with alcohol and other drug abuse. Journal of the

American Medical Association, 264, 2511-2518.

Rice, M.E., & Harris, G.T. (1992). A comparison of criminal recidivism among schizophrenic and

nonschizophrenic offenders. International Journal of Law and Psychiatry, 15, 397-408.

Rice, M.E., & Harris, G.T. (1995). Violent recidivism: Assessing predictive validity. Journal of

Consulting and Clinical Psychology, 63, 737-748.

Rice, M.E., & Harris, G.T. (1997). Cross-validation and extension of the Violence Risk Appraisal

Guide for child molesters and rapists. Law and Human Behavior, 21, 231-241.

Rice, M.E., & Harris, G.T. (2002). Men who molest their immature daughters: Is a special explanation

required? Journal of Abnormal Psychology, 111, 329-339.

Rice, M.E., Harris, G.T., & Quinsey, V.L. (2002). The appraisal of violence risk. Current Opinion in


Rice, M.E., Quinsey, V.L, & Harris, G.T. (1991). Sexual recidivism among child molesters released

from a maximum security psychiatric institution. Journal of Consulting and Clinical Psychology,

59, 381-386.

Robins, L.N., Helzer, J.E., Croughan, J., & Ratcliff, K. (1981). National Institute of Mental Health

Diagnostic Interview Schedule: Its history, characteristics, and validity. Archives of General


Rogers, R. (2000). The uncritical acceptance of risk assessment in forensic practice. Law and


Ross, D.J., Hart, S.D., & Webster, C.D. (1998). Aggression in psychiatric patients. Using the HCR-20

to assess risk for violence in hospital and in the community. Port Coquitlam, BC: Riverview

Hospital.

Rudnick, A. (1999). Relation between command hallucinations and dangerous behavior. Journal of the

American Academy of Psychiatry and Law, 27, 253-257.

Ruiter, C. de, & Greeven, P.G.J. (2000). Personality disorders in a Dutch forensic psychiatric sample:

Convergence of interview and self-report measures. Journal of Personality Disorders, 14, 162-170.

Ruiter, C. de, & Hildebrand, M. (2003). The dual nature of forensic psychiatric practice: Risk

assessment and management under the Dutch TBS-order. In P.J. van Koppen & S.D. Penrod (Eds.),

Adversarial versus inquisitorial justice: Psychological perspectives on criminal justice systems (pp.

91-106). New York: Kluwer/Plenum.

Salekin, R.T., Rogers, R., & Sewell, K.W. (1996). A review and meta-analysis of the Psychopathy

Checklist and Psychopathy Checklist-Revised: Predictive validity of dangerousness. Clinical

Psychology: Science and Practice, 3, 203-215.

Salekin, R.T., Rogers, R., & Sewell, K.W. (1997). Construct validity of psychopathy in a female

offender sample: A multitrait-multimethod evaluation. Journal of Abnormal Psychology, 106,

576-585.

Scarth, K., & McLean, H. (1994). The psychological assessment of women in prison. Forum on

Corrections Research, 6, 32-35.

Schmidt, P., & Witte, A.D. (1988). Predicting recidivism using survival models. New York: Springer.

Schulte, H.M., Hall, M.J., & Crosby, R. (1994). Violence in patients with narcissistic personality

pathology: Observations of a clinical series. American Journal of Psychotherapy, 48, 610-623.

Serbin, L.A., Cooperman, J.M., Peters, P.L., Lehoux, P.M., Stack, D.M., & Schwartzman, A.E.

(1998). Intergenerational transfer of psychosocial risk in women with childhood histories of

aggression, withdrawal, or aggression and withdrawal. Developmental Psychology, 34, 1246-1262.

Serin, R.C., & Amos, N.L. (1995). The role of psychopathy in the assessment of dangerousness.


Serin, R.C., Mailloux, D.L., & Malcolm, P.B. (2001). Psychopathy, deviant sexual arousal, and

recidivism among sexual offenders. Journal of Interpersonal Violence, 16, 234-246.

Sheldrick, C. (1999). Practitioner review: The assessment and management of risk in adolescents.

Journal of Child Psychology and Psychiatry, 40, 507-518.

Silver, E., Mulvey, E.P., & Monahan, J. (1999). Assessing violence risk among discharged psychiatric

patients: Toward an ecological approach. Law and Human Behavior, 23, 237-255.

Silverthorn, P., & Frick, P.J. (1999). Developmental pathways to antisocial behavior: The

delayed-onset pathway in girls. Development and Psychopathology, 11, 101-126.

Simon, L.M.J. (2000). An examination of the assumptions of specialization, mental disorder, and

dangerousness in sex offenders. Behavioral Sciences and the Law, 18, 275-308.

Simourd, L., & Andrews, D.A. (1994). Correlates of delinquency: A look at gender differences.

Forum on Corrections Research, 6, 26-31.

Sjöstedt, G., & Långström, N. (2001). Actuarial assessment of sex offender recidivism risk: A

cross-validation of the RRASOR and the Static-99 in Sweden. Law and Human Behavior,

25, 629-645.

Sjöstedt, G., & Långström, N. (2002). Assessment of risk for criminal recidivism among rapists: A

comparison of four different measures. Psychology, Crime and Law, 8, 25-40.

Skeem, J.L., Mulvey, E.P., & Lidz, C. (2000). Building mental health professionals' decisional models

into tests of predictive validity: The accuracy of contextualized predictions of violence. Law and


Slovic, P., & Monahan, J. (1995). Probability, danger, and coercion: A study of risk perception and

decision making in mental health law. Law and Human Behavior, 19, 49-65.

Slovic, P., Monahan, J., & MacGregor, D.G. (2000). Violence risk assessment and risk

communication: The effects of using actual cases, providing instruction, and employing

probability versus frequency formats. Law and Human Behavior, 24, 271-298.

Smith, C., & Thornberry, T.P. (1995). The relationship between childhood maltreatment and

adolescent involvement in delinquency. Criminology, 33, 451-481.

Smith, J., & Hucker, S. (1994). Schizophrenia and substance abuse. British Journal of Psychiatry, 165,

13-21.

Snowden, P. (1997). Practical aspects of clinical risk assessment and management. British Journal of


Sreenivasan, S., Kirkish, P., Garrick, T., Weinberger, L.E., & Phenix, A. (2000). Actuarial risk

assessment models: A review of critical issues related to violence and sex offender recidivism

assessments. The Journal of the American Academy of Psychiatry and the Law, 28, 438-448.

Stålenheim, E.G., & Knorring, L. von (1996). Psychopathy and Axis I and Axis II psychiatric

disorders in a forensic psychiatric population in Sweden. Acta Psychiatrica Scandinavica,

94, 217-223.

Steadman, H.J., Monahan, J., Appelbaum, P.S., Mulvey, E.P., Roth, L.H., Robbins, P.C., &

Klassen, D. (1994). Designing a new generation of risk assessment research. In J. Monahan, & H.J.

Steadman (Eds.), Violence and mental disorder (pp. 297-318). Chicago: The University of Chicago

Press.

Steadman, H.J., Monahan, J., Robbins, P.C., Appelbaum, P., Grisso, T., Klassen, D., Mulvey, E.P., &

Roth, L. (1993). From dangerousness to risk assessment: Implications for appropriate research

strategies. In S. Hodgins (Ed.), Mental disorder and crime (pp. 39-62). Newbury Park: Sage.

Steadman, H.J., Mulvey, E.P., Monahan, J., Robbins, P.C., Appelbaum, P.S., Grisso, T., Roth, L.H., &

Silver, E. (1998). Violence by people discharged from acute psychiatric inpatient facilities and by

others in the same neighborhoods. Archives of General Psychiatry, 55, 393-401.

Steadman, H.J., Silver, E., Monahan, J., Appelbaum, P., Robbins, P.C., Mulvey, E.P., Grisso, T.,

Roth, L., & Banks, S. (2000). A classification tree approach to the development of actuarial

violence risk assessment tools. Law and Human Behavior, 24, 83-100.

Strand, S., & Belfrage, H. (2001). Comparison of HCR-20 scores in violent mentally disordered men

and women: Gender differences and similarities. Psychology, Crime and Law, 7, 71-79.

Strand, S., Belfrage, H., Fransson, G., & Levander, S. (1999). Clinical and risk management factors

in risk prediction of mentally disordered offenders: More important than actuarial data? Legal and


Straznickas, K.A., McNiel, D.E., & Binder, R.L. (1993). Violence toward family care givers by

mentally ill relatives. Hospital and Community Psychiatry, 44, 385-387.

Studer, L.H., Clelland, S.R., Aylwin, A.S., Reddon, J.R., & Monro, A. (2000). Rethinking risk

assessment for incest offenders. International Journal of Law and Psychiatry, 23, 15-22.

Sturidsson, K., Haggård-Grann, U., Lotterberg, M., Dernevik, M., & Grann, M. (2004). Clinicians’

perceptions of which factors increase or decrease the risk of violence among forensic out-patients.

International Journal of Forensic Mental Health, 3, 23-36.

Sugarman, P., Dumughn, C., Saad, K., Hinder, S., & Bluglass, R. (1994). Dangerousness in

exhibitionists. Journal of Forensic Psychiatry, 5, 187-296.

Swanson, J.W. (1994). Mental disorder, substance abuse, and community violence: An

epidemiological approach. In J. Monahan, & H.J. Steadman (Eds.), Violence and mental

disorder (pp. 101-136). Chicago: The University of Chicago Press.

Swanson, J.W., Borum, R., Swartz, M.S., & Monahan, J. (1996). Psychotic symptoms and disorders

and the risk of violent behavior in the community. Criminal Behavior and Mental Health, 6,

309-329.

Swanson, J.W., Holzer, C.E., Ganju, V.K., & Tsutomu Jono, R. (1990). Violence and psychiatric

disorder in the community: Evidence from the Epidemiologic Catchment Area surveys. Hospital

and Community Psychiatry, 41, 761-770.

Swartz, M.S., Swanson, J.W., Hiday, V.A., Borum, R., Wagner, R., & Burns, B.J. (1998). Taking the

wrong drugs: The role of substance abuse and medication noncompliance in violence among

severely mentally ill individuals. Social Psychiatry and Psychiatric Epidemiology, 33, supplement

1, 75-80.

Tabachnick, B.G., & Fidell, L.S. (2001). Using multivariate statistics. Boston: Allyn & Bacon.

Tardiff, K. (2001). Axis II disorders and dangerousness. In G.F. Pinard, & L. Pagani (Eds.), Clinical

assessment of dangerousness. Empirical contributions (pp. 103-120). Cambridge: Cambridge

University Press.

Tardiff, K., Marzuk, P.M., Leon, A.C., & Portera, L. (1997). A prospective study of violence by

psychiatric patients after hospital discharge. Psychiatric Services, 48, 678-681.

Tardiff, K., Marzuk, P.M., Leon, A.C., Portera, L., & Weiner, C. (1997). Violence by patients

admitted to a private psychiatric hospital. American Journal of Psychiatry, 154, 88-93.

Taylor, P.J., Leese, M., Williams, D., Butwell, M., Daly, R., & Larkin, E. (1998). Mental disorder and

violence. A special (high) security hospital study. British Journal of Psychiatry, 172, 218-226.

Teplin, L.A. (1990). The prevalence of severe mental disorder among male urban jail detainees:

Comparison with the Epidemiological Catchment Area Program. American Journal of Public


Teplin, L.A., Abram, K.M., & McClelland, G.M. (1994). Does psychiatric disorder predict violent

crime among released jail detainees? A six-year longitudinal study. American Psychologist, 49,

335-342.

Tiihonen, J. (2001). Recidivistic violent behavior and Axis I and Axis II disorders. In G.F.

Pinard, & L. Pagani (Eds.), Clinical assessment of dangerousness. Empirical contributions

(pp. 121-135). Cambridge: Cambridge University Press.

Tiihonen, J., Isohanni, M., Rasanen, P., Koiranen, M., & Moring, J. (1997). Specific major mental

disorders and criminality: A 26-year prospective study of the 1966 northern Finland birth cohort.

American Journal of Psychiatry, 154, 840-845.

Vertommen, H., Verheul, R., Ruiter, C. de, & Hildebrand, M. (2002). De herziene versie van Hare’s

Psychopathie Checklist [Revised version of Hare’s Psychopathy Checklist]. Lisse: Swets Test

Publishers.

Vida, S. (1997). AccuROC nonparametric receiver operating characteristic analysis (version 2.5)

[Computer software]. Montreal, Quebec, Canada: Accumetric Corporation.

Vitale, J.E., & Newman, J.P. (2001). Using the Psychopathy Checklist-Revised with female samples:

Reliability, validity and implications for clinical utility. Clinical Psychology: Science and Practice,

8, 117-132.

Vitale, J.E., Smith, S.S., Brinkley, C.A., & Newman, J.P. (2002). The reliability and validity of the

Psychopathy Checklist-Revised in a sample of female offenders. Criminal Justice and Behavior,

29, 202-231.

Vogel, V. de (2003). SVR-20 violence risk assessment scheme: Overview and annotated bibliography.

Retrieved from http://www.sfu.ca/psychology/groups/faculty/hart/violink.htm.

Vogel, V. de, & Ruiter, C. de (2004). Differences between clinicians and researchers in assessing risk

of violence in forensic psychiatric patients. The Journal of Forensic Psychiatry and Psychology,

15, 145-164.

Vogel, V. de, & Ruiter, C. de (in press). The HCR-20 in personality disordered female offenders: A

comparison with a matched sample of males. Clinical Psychology and Psychotherapy, 12.

Vogel, V. de, Ruiter, C. de, Hildebrand, M., Bos, B., & Ven, P. van de (2004). Type of discharge

and risk of recidivism measured by the HCR-20. A retrospective study in a Dutch sample of treated

forensic psychiatric patients. International Journal of Forensic Mental Health.

Wallace, C., Mullen, P., Burgess, P., Palmer, S., Ruschena, D., & Browne, C. (1998). Serious

criminal offending and mental disorder. Case linkage study. British Journal of Psychiatry, 172,

477-484.

Walsh, E., Buchanan, A., & Fahy, T. (2002). Violence and schizophrenia: Examining the evidence.

British Journal of Psychiatry, 180, 490-495.

Ward, T., & Eccleston, L. (Eds.) (2004). Offender rehabilitation [Special issue]. Psychology, Crime,

and Law, 10.

Warren, J.I., Burnette, M.L., South, S.C., Chauhan, P., Bale, R., & Friend, R. (2002). Personality

disorders and violence among female prison inmates. The Journal of the American Academy of

Psychiatry and the Law, 30, 502-509.

Warren, J.I., Burnette, M.L., South, S.C., Chauhan, P., Bale, R., Friend, R., & Patten, I. van (2003).

Psychopathy in women: Structural modeling and comorbidity. International Journal of Law and


Webster, C.D., & Cox, D. (1997). Integration of nomothetic and ideographic positions in risk

assessments: Implication for practice and the education of psychologists and other mental health

professionals. American Psychologist, 52, 1245-1246.

Webster, C.D., Douglas, K.S., Eaves, D., & Hart, S.D. (1997a). Assessing risk of violence to others.

In C.D. Webster, & A. Jackson (Eds.), Impulsivity. Theory, assessment and treatment

(pp. 251-272). New York: The Guilford Press.

Webster, C.D., Douglas, K.S., Eaves, D., & Hart, S.D. (1997b). HCR-20. Assessing the risk of

violence. Version 2. Vancouver, BC, Canada: Simon Fraser University and Forensic Psychiatric

Services Commission of British Columbia.

Webster, C.D., Eaves, D., Douglas, K.S., & Wintrup, A. (1995). The HCR-20 scheme. The assessment

of dangerousness and risk. Vancouver, BC, Canada: Simon Fraser University and Forensic

Psychiatric Services Commission of British Columbia.

Webster, C.D., Harris, G., Rice, M., Cormier, C., & Quinsey, V. (1994). The violence prediction

scheme: Assessing dangerousness in high risk men. Toronto: University of Toronto Press.

Webster, C.D., Hucker, S.J., & Bloom, H. (2002). Transcending the actuarial versus clinical polemic

in assessing risk for violence. Criminal Justice and Behavior, 29, 659-665.

Webster, C.D., Müller-Isberner, R., & Fransson, G. (2002). Violence risk assessment: Using structured

clinical guidelines professionally. International Journal of Forensic Mental Health, 1, 185-193.

Weinrott, M.R., & Saylor, M. (1991). Self report of crimes committed by sex offenders. Journal of

Interpersonal Violence, 5, 283-300.

Weissman, M.M. (1993). The epidemiology of personality disorders: A 1990 update. Journal of

Personality Disorders, 7, supplement, 44-62.

Wessely, S.C., Buchanan, A., Reed, A., Cutting, J., Everitt, B., Garety, P., & Taylor, P.J. (1993).

Acting on delusions. I: Prevalence. British Journal of Psychiatry, 163, 69-76.

Wessely, S.C., Castle, D., Douglas, A.J., & Taylor, P.J. (1994). The criminal careers of incident cases

of schizophrenia. Psychological Medicine, 24, 483-502.

Whyte, C.R., Constantopoulos, C., & Bevans, H.G. (1982). Types of countertransference identified by

Q-analysis. British Journal of Medical Psychology, 55, 187-201.

Widiger, T.A., & Trull, T.J. (1994). Personality disorders and violence. In J. Monahan, & H.J.

Steadman (Eds.), Violence and mental disorder (pp. 203-226). Chicago: The University of Chicago

Press.

Widom, C.S. (1989a). The cycle of violence. Science, 244, 160-166.

Widom, C.S. (1989b). Child abuse, neglect, and violent criminal behavior. Criminology, 27, 251-271.

Wilson, R.J. (1999). Emotional congruence in sexual offenders against children. Sexual Abuse: A

Journal of Research and Treatment, 11, 33-47.

Witt, P.H. (2000). Book review. A practitioner’s view of risk assessment: The HCR-20 and SVR-20.

Behavioral Sciences and the Law, 18, 791-798.

Zanarini, M.C., Gunderson, J.G., Marino, M.F., Schwartz, E.O., & Frankenburg, F.R. (1989).

Childhood experiences of borderline patients. Comprehensive Psychiatry, 30, 18-25.

Appendixes

181

Appendix I List of abbreviations

ASPD Antisocial personality disorder
BPD Borderline personality disorder
DSM Diagnostic and Statistical Manual of mental disorders
ECA Epidemiological Catchment Area study
MMD Major mental disorder
NPD Narcissistic personality disorder
PD Personality disorder
SPD Sadistic personality disorder
SPJ Structured Professional Judgment
Tbs Terbeschikkingstelling (disposal to be treated on behalf of the state)
TCO Threat control override symptoms

Abbreviations of statistics used in this thesis

AUC Area Under the Curve
CI Confidence interval
ICC Intraclass correlation coefficient
OR Odds ratio
ROC Receiver Operating Characteristic
SE Standard error

Abbreviations of instruments used in this thesis

FWC Feeling Word Checklist
HCR-20 Historical Clinical Risk management-20
PCL-R Psychopathy Checklist-Revised
SVR-20 Sexual Violence Risk-20

Abbreviations of instruments named in this thesis

BDHI Buss Durkee Hostility Inventory
DIS Diagnostic Interview Schedule
DISC Diagnostic Interview Schedule for Children
EARL-20B Early Assessment Risk List for Boys
EARL-21G Early Assessment Risk List for Girls
LHA Life History of Aggression scale
LSI-R Level of Service Inventory-Revised
PCL:SV Psychopathy Checklist: Screening Version
RRASOR Rapid Risk Assessment of Sexual Offense Recidivism
SACJ-Min Structured Anchored Clinical Judgment
SARA Spousal Assault Risk Assessment guide
SAVRY Structured Assessment of Violence Risk in Youth
SIDP-IV Structured Interview for DSM-IV Personality
SONAR Sex Offender Need Assessment Rating
SORAG Sex Offender Risk Assessment Guide
VORAS Violent Offender Risk Assessment Scale
VRAG Violence Risk Appraisal Guide

Appendix II Historical Clinical Risk management-20 (HCR-20) worksheet

Patient

Name:
Date of birth:
Patient number:

Code: no = 0, probably / partially = 1, yes = 2, don’t know = -

Historical items Present?

1. Previous violence

2. Young age at first violent incident

3. Relationship instability

4. Employment problems

5. Substance use problems

6. Major mental illness

7. Psychopathy Definite: coded from existing reports

Provisional: should be referred for evaluation

8. Early maladjustment

9. Personality disorder

10. Prior supervision failure

Other H factor:

Clinical items Present?

1. Lack of insight

2. Negative attitudes

3. Active symptoms of major mental illness

4. Impulsivity

5. Unresponsive to treatment

Other C factor:

Risk management items In Out

Present?

1. Plans lack feasibility

2. Exposure to destabilizers

3. Lack of personal support

4. Noncompliance with remediation attempts

5. Stress

Other R factor:

Risk facet Scenario 1 Scenario 2 Scenario 3

Nature
• What kinds of violence might this person commit?
• What is the likely motivation?
• Who are the likely victims?

Severity
• What would be the physical harm to victims?
• What would be the psychological harm to victims?
• Is there a chance that the violence might escalate to life threatening levels?

Imminence
• How soon might the violence occur?
• Are there any warning signs that might signal that violence risk is increasing or imminent?

Frequency / duration
• How often might the violence occur (once or several times)?
• Is the violence risk chronic or acute (i.e., time-limited)?

Likelihood
• In general, how frequent or common is this type of violence?
• How frequently has this person committed this type of violence?
• What is the probability that this person will commit this type of violence?

Other considerations
• Is there anything else that should be taken into consideration?

Risk-elevating factors
• What events, occurrences, or circumstances might increase this person’s violence risk?
• What factors might lead this person to consider or choose to act violently?

Risk-protective factors
• What events, occurrences, or circumstances might decrease this person’s violence risk?
• What factors might prevent this person from considering or choosing to act violently?

Risk monitoring
• What is the best way to monitor any warning signs that this person’s violence risk may be increasing?
• What events, occurrences, or circumstances should trigger a formal re-assessment of this person’s violence risk?

Risk management
• What treatment or rehabilitation strategies could be implemented to manage this patient’s violence risk?
• What supervision or surveillance strategies could be implemented to manage this patient’s violence risk?

Final risk judgment

Violence risk
• What level of effort, attention, and intervention is required to prevent this person from perpetrating violence?

☐ Low ☐ Moderate ☐ High

Evaluator: Date:

Signed:

Note: Adopted with permission from the Mental Health, Law, and Policy Institute, Simon Fraser University.
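The coding key on the worksheet above (no = 0, probably / partially = 1, yes = 2, don’t know = -) can be illustrated with a small tally. This is only a bookkeeping sketch: the function name `tally_codes` and the choice to leave "don’t know" items out of the sum are assumptions made for illustration, not rules stated on the worksheet. The final risk judgment on the worksheet is a separate low / moderate / high rating and is not derived from this tally.

```python
def tally_codes(codes):
    """Tally worksheet item codes.

    codes: a list whose entries are 0 (no), 1 (probably / partially),
    2 (yes), or "-" (don't know), following the worksheet's coding key.
    Assumption: "-" items are left out of the sum and counted separately.
    """
    valid = {0, 1, 2, "-"}
    for code in codes:
        if code not in valid:
            raise ValueError(f"invalid item code: {code!r}")
    scored = [code for code in codes if code != "-"]
    return {
        "sum": sum(scored),                      # sum over coded items only
        "n_scored": len(scored),                 # items coded 0, 1, or 2
        "n_unknown": len(codes) - len(scored),   # items coded "-"
    }


# Hypothetical codes for the five Clinical items
clinical = [2, 1, 0, "-", 2]
print(tally_codes(clinical))
```

Reporting the number of "don’t know" items alongside the sum keeps it visible how much of the worksheet could actually be coded.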

Appendix III Sexual Violence Risk-20 (SVR-20) coding work sheet

Patient

Name:
Date of birth:
Patient number:

Code: no = 0, probably / partially = 1, yes = 2, don’t know = -

Psychosocial adjustment Present?

1. Sexual deviance

2. Victim of child abuse

3. Psychopathy Definite: coded from existing reports Provisional: should be referred for evaluation

4. Major mental illness

5. Substance use problems

6. Suicidal / homicidal ideation

7. Relationship problems

8. Employment problems

9. Past nonsexual violent offenses

10. Past nonviolent offenses

11. Past supervision failure

Sexual offenses Present?

12. High density sex offenses

13. Multiple sex offense types

14. Physical harm to victim(s) in sex offenses

15. Uses weapons or threats of death in sex offenses

16. Escalation in frequency or severity of sex offenses

17. Extreme minimization or denial of sex offenses

18. Attitudes that support or condone sex offenses

Future plans Present?

19. Lacks realistic plans

20. Negative attitude toward intervention


Nature
• What kinds of sexual violence might this person commit?
• What is the likely motivation?
• Who are the likely victims?

Severity
• What would be the physical harm to victims?
• What would be the psychological harm to victims?
• Is there a chance that the sexual violence might escalate to life threatening levels?

Imminence
• How soon might the sexual violence occur?
• Are there any warning signs that might signal that risk is increasing or imminent?

Frequency / duration
• How often might the sexual violence occur (once or several times)?
• Is the risk chronic or acute (i.e., time-limited)?

Likelihood
• In general, how frequent or common is this type of sexual violence?
• How frequently has this person committed this type of sexual violence?
• What is the probability that this person will commit this type of sexual violence?

Other considerations
• Is there anything else that should be taken into consideration?

Risk-elevating factors
• What events, occurrences, or circumstances might increase this person’s risk?
• What factors might lead this person to consider or choose to commit sexual violence?

Risk-protective factors
• What events, occurrences, or circumstances might decrease this person’s risk?
• What factors might prevent this person from considering or choosing to commit sexual violence?

Risk monitoring
• What is the best way to monitor any warning signs that this person’s risk may be increasing?
• What events, occurrences, or circumstances should trigger a formal re-assessment of this person’s risk?

Risk management
• What treatment or rehabilitation strategies could be implemented to manage this patient’s risk?
• What supervision or surveillance strategies could be implemented to manage this patient’s risk?

Final risk judgment

Sexual violence risk
• What level of effort, attention, and intervention is required to prevent this person from perpetrating sexual violence?

☐ Low ☐ Moderate ☐ High

Evaluator: Date:

Signed:

Note: Adopted with permission from the Mental Health, Law, and Policy Institute, Simon Fraser University.

Appendix IV Feeling Word Checklist (FWC)

Feeling Word Checklist

Name patient: …………………………
Name evaluator: …………………………
Date: …………………………
Profession evaluator: …………………………

When I think about ……….. I feel:*

not at all somewhat fairly very

1. Helpful 0 1 2 3

2. Happy 0 1 2 3

3. Angry 0 1 2 3

4. Enthusiastic 0 1 2 3

5. Anxious 0 1 2 3

6. Strong 0 1 2 3

7. Manipulated 0 1 2 3

8. Relaxed 0 1 2 3

9. Cautious 0 1 2 3

10. Disappointed 0 1 2 3

11. Indifferent 0 1 2 3

12. Affectionate 0 1 2 3

13. Suspicious 0 1 2 3

14. Sympathetic 0 1 2 3

15. Disliked 0 1 2 3

16. Surprised 0 1 2 3

17. Tired 0 1 2 3

18. Threatened 0 1 2 3

19. Receptive 0 1 2 3

20. Objective 0 1 2 3

21. Overwhelmed 0 1 2 3

22. Bored 0 1 2 3

23. Motherly 0 1 2 3

24. Confused 0 1 2 3

25. Embarrassed 0 1 2 3

26. Interested 0 1 2 3

27. Aloof 0 1 2 3

28. Sad 0 1 2 3

29. Inadequate 0 1 2 3

30. Frustrated 0 1 2 3

* This list is about how you feel about the patient at this moment. Circle the answer of choice.

Note: Adopted from Whyte, Constantopoulos, & Bevans (1982). Subscale items: Helpful: 1, 2, 8; Unhelpful: 3, 10, 15, 18, 24; Close: 12, 16, 21, 23; Distant: 9, 11, 27; Accepting: 4, 14, 19, 26; Rejecting: 13, 17, 22, 30; Autonomy: 6, 20; Controlled: 5, 7, 25, 28, 29.
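The subscale item lists in the note above can be turned into a small scoring sketch. The item-to-subscale mapping is taken directly from the note; the function name `fwc_subscale_sums` and the choice to report plain sums (rather than, say, means) are assumptions made for illustration, since the note itself does not say how subscale scores are aggregated.

```python
# Item numbers per subscale, as listed in the note to the checklist.
FWC_SUBSCALES = {
    "Helpful": [1, 2, 8],
    "Unhelpful": [3, 10, 15, 18, 24],
    "Close": [12, 16, 21, 23],
    "Distant": [9, 11, 27],
    "Accepting": [4, 14, 19, 26],
    "Rejecting": [13, 17, 22, 30],
    "Autonomy": [6, 20],
    "Controlled": [5, 7, 25, 28, 29],
}


def fwc_subscale_sums(ratings):
    """Sum the 0-3 ratings per subscale.

    ratings: dict mapping each item number (1-30) to a rating of
    0 (not at all), 1 (somewhat), 2 (fairly), or 3 (very).
    Assumption: all 30 items are rated and subscales are plain sums.
    """
    if sorted(ratings) != list(range(1, 31)):
        raise ValueError("ratings must cover items 1-30 exactly once")
    for item, value in ratings.items():
        if value not in (0, 1, 2, 3):
            raise ValueError(f"item {item}: rating must be 0, 1, 2, or 3")
    return {
        name: sum(ratings[item] for item in items)
        for name, items in FWC_SUBSCALES.items()
    }


# Hypothetical checklist: every feeling rated "somewhat" (1)
example = {item: 1 for item in range(1, 31)}
print(fwc_subscale_sums(example))
```

Because the subscales differ in length (two to five items), raw sums are not directly comparable across subscales; dividing each sum by its number of items would be one way to put them on the same 0-3 scale.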


Appendix V Incidents registration

Category                   Label                           Example

Type of incident           1. Verbal abuse                 Cursing, name-calling
                           2. Verbal threat                Threatening to hurt or kill someone
                           3. Physical violence            Hitting someone
                           4. Violation of hospital rules  Use of drugs

Seriousness of incident    1. Mild                         Push someone
                           2. Serious                      Stab someone with a knife

Direction of incident      1. Objects                      Door, chair
                           2. Self                         Self mutilation
                           3. Fellow patients              Patient in living group
                           4. Staff members                Group leader
                           5. Others                       Family, visitors
                           6. Not relevant                 Possession of drugs or pornography
                           7. Unknown

Location of incident       1. In the hospital              In the living group
                           2. Outside the hospital         During a leave or the transmural phase

Sexual nature of incident  1. Yes                          Sexually assaulting someone
                           2. No                           Non-sexual incident
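For electronic registration, the coding scheme of Appendix V could be represented roughly as follows. This is only an illustrative sketch: the numeric codes and labels come from the table, while the class names, field names, and the `from_codes` helper are hypothetical and do not describe any software used in the study.

```python
from dataclasses import dataclass
from enum import IntEnum


class IncidentType(IntEnum):
    VERBAL_ABUSE = 1
    VERBAL_THREAT = 2
    PHYSICAL_VIOLENCE = 3
    RULE_VIOLATION = 4  # "Violation of hospital rules"


class Seriousness(IntEnum):
    MILD = 1
    SERIOUS = 2


class Direction(IntEnum):
    OBJECTS = 1
    SELF = 2
    FELLOW_PATIENTS = 3
    STAFF_MEMBERS = 4
    OTHERS = 5
    NOT_RELEVANT = 6
    UNKNOWN = 7


class Location(IntEnum):
    IN_HOSPITAL = 1
    OUTSIDE_HOSPITAL = 2


class SexualNature(IntEnum):
    YES = 1
    NO = 2


@dataclass(frozen=True)
class Incident:
    """One registered incident, coded on the five categories of the table."""
    incident_type: IncidentType
    seriousness: Seriousness
    direction: Direction
    location: Location
    sexual_nature: SexualNature

    @classmethod
    def from_codes(cls, incident_type, seriousness, direction, location, sexual_nature):
        # Enum lookup raises ValueError for a code outside the scheme.
        return cls(
            IncidentType(incident_type),
            Seriousness(seriousness),
            Direction(direction),
            Location(location),
            SexualNature(sexual_nature),
        )


# Made-up example: mild, non-sexual physical violence toward a staff member
# inside the hospital.
incident = Incident.from_codes(3, 1, 4, 1, 2)
print(incident.incident_type.name, incident.direction.name)
# prints: PHYSICAL_VIOLENCE STAFF_MEMBERS
```

Storing the codes as enumerations keeps the numeric values from the table while rejecting codes that fall outside it at the moment of entry.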


Curriculum Vitae

Vivienne de Vogel was born on January 8, 1973, in Dordrecht, The Netherlands. From 1985 until 1991, she attended high school in Dordrecht. In 1991, she began studying Psychology at the University of Leiden, graduating in 1997 (in Clinical and Health Psychology and in Personality Psychology). During her studies, she also studied at the Faculty of Law for two years in order to specialize in forensic psychology. In 1997, she became a research assistant on a longitudinal study of psychopathology at the children’s hospital Erasmus MC-Sophia in Rotterdam. In February 1998, she started working at the Dr. Henri van der Hoeven Kliniek, a forensic psychiatric hospital in Utrecht. For the first three years, she participated in several research projects and conducted personality assessments. In January 2001, she started her dissertation research on structured risk assessment in forensic clinical practice. Currently, she is a senior researcher at the Dr. Henri van der Hoeven Kliniek and a university teacher at the University of Amsterdam.

