
Susceptibility (risk and protective) factors for in-patient violence and self-harm: prospective study of structured professional judgement instruments START and SAPROF, DUNDRUM-3 and DUNDRUM-4 in forensic mental health services



The START and SAPROF are newly developed fourth generation structured professional judgement instruments assessing strengths and protective factors. The DUNDRUM-3 and DUNDRUM-4 also measure positive factors, programme completion and recovery in forensic settings.


We compared these instruments with other validated risk instruments (HCR-20, S-RAMM), a measure of psychopathology (PANSS) and global function (GAF). We prospectively tested whether any of these instruments predict violence or self harm in a secure hospital setting (n = 98) and whether they had true protective effects, interacting with and off-setting risk measures.


SAPROF and START-strengths had strong inverse (negative) correlations with the HCR-20 and S-RAMM. SAPROF correlated strongly with GAF (r = 0.745). In the prospective in-patient study, SAPROF predicted absence of violence, AUC = 0.847 and absence of self-harm AUC = 0.766. START-strengths predicted absence of violence AUC = 0.776, but did not predict absence of self-harm AUC = 0.644. The DUNDRUM-3 programme completion and DUNDRUM-4 recovery scales also predicted in-patient violence (AUC 0.832 and 0.728 respectively), and both predicted in-patient self-harm (AUC 0.750 and 0.713 respectively). When adjusted for the HCR-20 total score however, SAPROF, START-S, DUNDRUM-3 and DUNDRUM-4 scores were not significantly different for those who were violent or for those who self harmed. The SAPROF had a significant interactive effect with the HCR-dynamic score. Item to outcome studies often showed a range of strengths of association with outcomes, which may be specific to the in-patient setting and patient group studied.


The START and SAPROF, DUNDRUM-3 and DUNDRUM-4 can be used to assess both reduced and increased risk of violence and self-harm in mentally ill in-patients in a secure setting. They were not consistently better than the GAF, HCR-20, S-RAMM, or PANSS when predicting adverse events. Only the SAPROF had an interactive effect with the HCR-20 risk assessment, indicating a true protective effect. However, as structured professional judgement instruments, all have additional content (items) complementary to existing risk assessments, which is useful for planning treatment and risk management.



The assessment of risk of violence [1–4] has developed into ‘structured professional judgement’ approaches to risk assessment [1, 5, 6]. Identifying risk factors is held to be an aid to treatment planning [7] and perhaps for this reason risk assessment has come to pervade forensic mental health practice.

Doyle and Dolan [8] reviewed what they called ‘generational’ developments or phases in risk assessment. The ‘first generation’ - unstructured clinical or professional judgement [9] gave way to the second generation actuarial risk assessment tools [8]. However, the actuarial approach was criticised for focusing on a limited number of factors without taking into account potentially crucial case-specific idiosyncratic factors [8, 10]. A combination of both the clinical and actuarial approaches was required. This led to the development of the third generation risk assessment [8] described as empirically validated structured decision making [11] or structured professional judgement (SPJ) [12]. The leading structured professional judgement instrument for the assessment of risk of violence has been the Historical-Clinical-Risk Management-20 (HCR-20) [13]. This added the distinction between fixed historical risk factors and dynamic factors that are subject to change over time and in response to treatment. Although rated according to a set of defined risk items, the final judgement of risk level allows for clinical judgement rather than a simple actuarial score.

Forensic hospital patients and others like them are also at a greatly increased risk of suicide both in hospital and on returning to the community [14]. There has also been a recent interest in the assessment of risk of suicide and self-harm using structured professional judgement instruments [15].

Gaps in the structured professional judgement approach to risk assessment could be identified. Protective or resilience factors that might reduce risk of violence were first used in the structured clinical risk assessment instruments devised for children and adolescents [16, 17]. This reflects not only the importance of resilience as a developmental factor in young people, but also the reality that protective factors are taken into account by clinicians when making decisions about risk and treatment. The assessment system for adults should therefore allow for a broader assessment of susceptibility factors – negative risk or vulnerability factors that increase the probability of violence and self harm, and positive, protective or resilience factors that reduce the risk of violence and self-harm. Several new SPJ risk assessment instruments have appeared that are designed to assess protective factors or progress in treatment and recovery as part of the assessment and management of susceptibility (risk and protective factors) for violence and self-harm.

The Short-Term Assessment of Risk and Treatability (START) is a clinical guide for the dynamic assessment of risks, strengths and treatability which is relevant to everyday psychiatric clinical practice [18]. According to the authors and others, the START is intended to “stimulate discussion about strengths, vulnerabilities and appropriate interventions and management” [19, 20].

The SAPROF [21] is a recently-developed instrument for the assessment of factors protecting against violent acts. By specifically focusing on protective factors, the SAPROF aims to provide a more accurate and well-rounded assessment of risk for future violent behaviour [21].

The DUNDRUM-3 programme completion scale and DUNDRUM-4 recovery scale [22, 23] are two structured professional judgement instruments designed for use as measures of progress along the recovery pathway for those detained in secure forensic psychiatric services. These have been shown to predict moves from more secure to less secure places, along with measures of risk [24] and they have been shown to predict conditional discharge from hospital to the community [25]. The DUNDRUM-1 triage security instrument is a measure of the need for therapeutic security and is designed to be a static measure of a quality that is complementary to and distinct from risk of violence [26]. It has been shown to influence moves between levels of therapeutic security [24] and it is used also as a benchmark to enable comparisons between studies [26]. The SPJ instruments of the DUNDRUM toolkit are all designed to be complementary to measures of risk of violence or self-harm. The DUNDRUM-3 and DUNDRUM-4 are included here as they can be conceptualised as positive or protective factors likely to reduce the risk of violence and self-harm.

Rutter [27] pointed out that a protective or resilience factor should do more than simply predict the absence of harm or adverse outcomes, since predicting the absence of harm is merely the absence of risk. Risk or vulnerability factors (or their reciprocals, measuring the absence of risk) and protective or resilience factors can be validated as predictive or not using the receiver operating characteristic, as a means of taking into account base rate variations between samples [28]. The strength of association in the specific population and setting studied can be assessed with unadjusted odds ratios. According to Rutter [27] a truly protective factor would interact with risk factors to reduce the probability of an adverse event or outcome, even when risk factors were present. This requires a form of analysis of interactive effects additional to that normally used to validate risk factors.
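The distinction Rutter draws can be illustrated with a minimal sketch. It assumes a logistic model for the probability of an adverse event, which is not the analysis used in this study, and every coefficient and score value below is hypothetical and chosen only for illustration. Without an interaction term, a protective score shifts risk by the same amount for everyone (merely the reciprocal of risk); with a negative interaction term, the same rise in the risk score adds less to the log-odds of an event when the protective score is high.

```python
import math

def event_probability(risk, protective, b0=-4.0, b_risk=0.30,
                      b_prot=-0.05, b_interaction=0.0):
    """Logistic model with an optional risk x protective interaction term.
    All coefficients are hypothetical and serve only as an illustration."""
    logit = (b0 + b_risk * risk + b_prot * protective
             + b_interaction * risk * protective)
    return 1.0 / (1.0 + math.exp(-logit))

def log_odds(p):
    return math.log(p / (1.0 - p))

def risk_effect(protective, **coefs):
    """Change in log-odds of an event when the risk score rises from 5 to 15."""
    low = event_probability(5, protective, **coefs)
    high = event_probability(15, protective, **coefs)
    return log_odds(high) - log_odds(low)

# No interaction: the effect of risk is identical at low and high protection.
additive_low = risk_effect(protective=5)
additive_high = risk_effect(protective=25)

# Negative interaction: high protection buffers the effect of the same risk.
buffered_low = risk_effect(protective=5, b_interaction=-0.008)
buffered_high = risk_effect(protective=25, b_interaction=-0.008)

print(f"additive model: {additive_low:.2f} v {additive_high:.2f}")
print(f"buffered model: {buffered_low:.2f} v {buffered_high:.2f}")
```

In the additive case the two figures are equal; in the buffered case the second is smaller, which is the pattern a test for interactive effects is looking for.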


In this prospective study we set out to assess psychometric properties, concurrent validity and criterion outcome measures of the validity of the START and SAPROF. We prospectively tested whether START and SAPROF, DUNDRUM-3 and DUNDRUM-4 would predict adverse events (or the absence of adverse events), violence or self harm. We compared these to existing validated instruments for the assessment of risk of violence (HCR-20) and self-harm (S-RAMM) and examined whether they accounted for any element of statistical prediction over and above an existing ‘gold standard’ instrument for the assessment of risk of violence, the HCR-20. We also examined the predictive properties of measures of symptoms (PANSS [29, 30]) and global function (GAF [31]) treating these as another standard to be beaten – are specific risk assessment instruments and their constituent items better than assessing symptoms and function?


Study design

This is a naturalistic six month prospective cohort study of in-patients in a therapeutically secure forensic hospital. Data were gathered as part of the clinical audit of service delivery and the study was approved by the research ethics, audit and effectiveness committee of the National Forensic Mental Health Service. All patients gave informed consent to participate.


The Central Mental Hospital is a 94 bed forensic secure hospital providing high, medium and low security integrated on a single campus. The hospital is the only legally designated centre for forensic mental health treatments for a population of 4.6 million. At the time of the study the hospital was organised into a series of eight units from high secure admission and intensive care through medium secure and low secure to pre-discharge and community high support places so that the location at the start of the study period can be used as an index of the level of therapeutic security for the environment in which the patient is located [24].


All patients at the Central Mental Hospital during the period March to April 2010 (n = 100) with severe mental illness participated as part of routine assessments of risk and outcome measures.

Variables and data sources

The researchers who made ratings or collated them were each blind to the work of the others. One post membership psychiatric trainee (ZA) rated the START and SAPROF by interviewing patients, reviewing case notes and speaking to members of the multi-disciplinary team and ward-based nursing staff. The START takes a list of risk factors and treats each one as both a risk factor and a protective factor. The SAPROF includes items thought to be protective against violence such as intelligence, secure attachment in childhood and empathy that are not included in existing risk assessment instruments. Two post membership psychiatric trainees (LN and OG) carried out interviews using the PANSS and GAF.

The HCR-20 provides ratings for ten stable historical risk factors (HCR-H items) though we omitted item H7 ‘psychopathy’ as this was not in routine use, five current ‘clinical’ (HCR-C) and five future ‘risk management’ (HCR-R) items. Each item is scored 0 to 2 and the total scale is scored 0 to 38. The ‘C’ and ‘R’ items added together constitute a ‘dynamic’ or change sensitive score (HCR-dynamic, scored 0 to 20). Similarly, the Suicide Risk Assessment and Management Manual S-RAMM [15] is made up of 23 items each scored 0 to 2 with an overall scale score from 0 to 46 and generates a nine item stable, background score (S-RAMM-B) and change sensitive dynamic scales for eight current (S-RAMM-C) and five future (S-RAMM-F) risk items for self-harm or suicide, the latter two of which combine as a thirteen item dynamic score (S-RAMM-dynamic) rated 0 to 26. The HCR-20 and S-RAMM scales were collated from team assessments by an advanced nurse practitioner (AN) who ensured quality and fidelity to the handbook definitions.

Measures of need for therapeutic security (the DUNDRUM-1), treatment programme completion (DUNDRUM-3) and recovery (DUNDRUM-4) [22] were assessed by a forensic psychiatry lecturer / higher trainee (MD). These SPJ scales are composed of items rated 0 to 4 where ‘0’ indicates no need for therapeutic security, ‘1’ indicates a need for admission to an open ward or equivalent, ‘2’ for low security, ‘3’ for medium security and ‘4’ for high security. For the DUNDRUM-3 and DUNDRUM-4 ‘4’ indicates no readiness for a move to a less secure place, ‘3’ indicates a move from high to medium security, ‘2’ a move from medium to low security, ‘1’ a move from low security to open conditions and ‘0’ indicates no need for therapeutic security. The DUNDRUM-1 (eleven items rated 0 to 44) was used to provide a benchmark for comparative purposes so that other researchers replicating this study or carrying out meta-analyses can compare groups of patients according to their assessed need for therapeutic security. The scale includes items for seriousness of violence and self harm, immediacy of risk of violence and self harm, specialist forensic need, absconding, preventing access to contraband, victim sensitivity and public confidence, complex risk of violence, institutional behaviour and legal process. The DUNDRUM-3 (seven items rated 0 to 28) is a measure of programme completion in domains relevant to risk and harm reduction such as physical and mental health, substance misuse, problem behaviours, self-care and activities of daily living, education, occupation and creativity and family and social networks. The DUNDRUM-4 recovery items (six items rated 0 to 24) include stability, insight, therapeutic rapport, leave, dynamic risk and victim sensitivity.

Validity of measures

We first measured inter-rater reliability – although this was not necessary in the design of this study. Inter-rater reliability refers to the extent of convergence of judgements about individual items and overall scale scores of different assessors using the tool on the same patient.

We tested concurrent validity with the HCR-20 and S-RAMM because of the expected inverse relationship on the one hand between risk assessment scales HCR-20 and S-RAMM and the protective scales START-strength and SAPROF. The S-RAMM had been validated for the prediction of self-harm in this population and was known to overlap with assessment of risk of violence [32, 33]. We also expected positive correlations of HCR-20 and S-RAMM with the START-vulnerability score. We examined concurrent validity with the PANSS because of the known relationship between active symptoms and risk assessment measures [30]. We examined concurrent validity with the GAF because of the expected positive correlation with the protective scales START-strengths and SAPROF and because of the expected inverse relationship with the START-vulnerability scale. Finally we examined concurrent validity with the DUNDRUM-3 programme completion and DUNDRUM-4 recovery scales because they are measures of progress in treatments relevant to risk and increasing strength in domains related to recovery for forensic patients. Lower scores on these scales could be taken to represent ‘negative predictors’ or protective factors.

Outcome measures

The outcome measures were any adverse events. An adverse event was defined as in the START [19] handbook (page 9) where violence is defined as “any actual, attempted or threatened harm to self or others”. However in this study we have distinguished between violence and self-harm. The START handbook goes on to define self harm as “behaviours involving intentional injury of one’s own body without apparent suicide intent”. We have supplemented this by including any self-harming act whether it was thought to be with suicidal intent or not.

Adverse events were collated by one researcher (ZA) from routine incident report forms. These were supplemented by nurse management daily logs and statutory forms for seclusion and restraint over a 6 month period from March to April 2010 until 31st November 2010. These alternative sources of information acted as a cross check on the completeness of the record of adverse events.

Study size

All patients in the hospital during the period of baseline data gathering were included. START, SAPROF, HCR-20, S-RAMM, PANSS and GAF were obtained for 98 of 100. The DUNDRUM-1, DUNDRUM-3 and DUNDRUM-4 could not be completed for 6 patients who were discharged before they could be rated; these patients had a significantly shorter length of stay when assessed: 0.28 (SD 0.46) years v 7.67 (SD 10.09) years (t = −7.0, df = 98, p < 0.001). The follow-up period was complete to the date patients left the hospital or to the end of the study period.

Based on earlier studies [33, 34] we estimated that approximately 10 violent and 10 self-harming adverse events might be expected over a six month period and that these would be sufficient to yield an area under the curve (AUC) in the receiver operating characteristic (ROC) that was capable of being significantly different from the line of random information.

Quantitative variables

The patients were grouped according to their location in the hospital as this has been established as a proxy for risk levels [23, 24, 32, 33]. Adverse events were further subdivided into violence and self-harm, as outcome measures for the prospective study.

Statistical methods

All data were analysed in SPSS-20 [34]. Correlations were calculated using the non-parametric Spearman correlation coefficient. Adverse events as outcomes of the prospective study were analysed using the receiver operating characteristic (ROC) area under the curve (AUC). An association was deemed significant if the 95% confidence interval of the AUC was greater than 0.5, the line of random information. The strength of association between measures and outcomes was measured using unadjusted odds ratios (OR). Because the odds ratio is a measure of the increase in odds for each increase of one point in the measurement scale, the magnitude of the odds ratio differs according to the properties of item and scale scores. The odds ratio for a scale such as the GAF which is rated 0 to 100 will be inherently smaller when comparing like for like with the HCR-20, a scale rated 0 to 40. Likewise, the odds ratio for items rated 0 to 2 as in the HCR-20, S-RAMM, SAPROF and START will appear larger when comparing like for like with items from the DUNDRUM-1, DUNDRUM-3 and DUNDRUM-4 where items are rated 0 to 4. Confidence intervals for epidemiological rates were calculated using Confidence Interval Analysis [35].
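The significance criterion for the AUC can be sketched as follows. This is an illustration only, not the SPSS procedure used in the study: it assumes the Mann-Whitney form of the AUC and the Hanley and McNeil approximation for its standard error, and the scores are invented toy data. The function names are hypothetical.

```python
import math

def auc(event_scores, no_event_scores):
    """Probability that a randomly chosen patient with the event scores
    higher than one without it; ties count one half (Mann-Whitney form)."""
    wins = 0.0
    for e in event_scores:
        for n in no_event_scores:
            if e > n:
                wins += 1.0
            elif e == n:
                wins += 0.5
    return wins / (len(event_scores) * len(no_event_scores))

def hanley_mcneil_ci(a, n_event, n_no_event, z=1.96):
    """Approximate 95% CI for an AUC (assumed method, not confirmed
    as the one used in the study)."""
    q1 = a / (2.0 - a)
    q2 = 2.0 * a * a / (1.0 + a)
    var = (a * (1.0 - a)
           + (n_event - 1) * (q1 - a * a)
           + (n_no_event - 1) * (q2 - a * a)) / (n_event * n_no_event)
    se = math.sqrt(var)
    return max(0.0, a - z * se), min(1.0, a + z * se)

def is_significant(ci):
    """True when the interval excludes 0.5, the line of random information."""
    lower, upper = ci
    return lower > 0.5 or upper < 0.5

# Toy risk-scale scores (invented): patients with and without an event.
violent = [14, 17, 12, 19, 16]
non_violent = [5, 9, 12, 7, 10, 4, 11, 8]

toy_auc = auc(violent, non_violent)
toy_ci = hanley_mcneil_ci(toy_auc, len(violent), len(non_violent))
print(round(toy_auc, 4), is_significant(toy_ci))
```

For a protective scale the same calculation is run against the absence of the event, which is equivalent to taking one minus the AUC for the event itself.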

Cronbach’s alpha statistic was used to measure the extent to which each item fits into the subscale or overall scale to which it is allocated. This is a measure of content coherence – whether all items in the overall scale or subscale measure the same thing. High internal consistency also indicates multiple co-linearity for items within a scale.
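As a minimal sketch of the statistic described above, Cronbach's alpha can be computed from the item variances and the variance of the total score. The item scores below are invented toy data, and the function name is hypothetical; the study itself used SPSS.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """item_scores: one list per item, each holding one score per patient.
    alpha = k/(k-1) * (1 - sum of item variances / variance of totals)."""
    k = len(item_scores)
    totals = [sum(patient) for patient in zip(*item_scores)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    return (k / (k - 1)) * (1.0 - item_variance_sum / variance(totals))

# Three toy items rated 0 to 2 for five hypothetical patients.
toy_items = [
    [0, 1, 2, 1, 2],
    [0, 1, 2, 2, 2],
    [1, 1, 2, 1, 2],
]
toy_alpha = cronbach_alpha(toy_items)
print(round(toy_alpha, 3))
```

Items that rise and fall together across patients inflate the variance of the total relative to the sum of the item variances, which pushes alpha towards 1.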

To examine the extent to which the protective instruments SAPROF, START-S, and recovery instruments DUNDRUM-3 and DUNDRUM-4 were protective in the presence of risk factors, we first carried out an analysis of variance in SPSS-20 with SAPROF, START-S, DUNDRUM-3 and DUNDRUM-4 as dependent variables, violence to others or self-harm as fixed factors (in separate analyses) and the HCR-20-dynamic score as covariate. To examine for interactive effects, we then carried out univariate analysis of variance to examine for main effects and interactive effects.

For item to outcome analysis, the receiver operating characteristic (ROC) area under the curve (AUC) and 95% confidence interval was calculated for each item, with harm to others and self harm as outcome measures. The unadjusted odds ratio (OR) and 95% confidence interval was also calculated for each item and both outcomes, as a measure of the strength of association. Because the items of each scale were strongly inter-correlated, regression models for the items of each scale were not attempted.
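An unadjusted odds ratio per one-point increase in an item score can be obtained from a logistic regression with that item as the only predictor. The sketch below assumes this approach (the exact SPSS procedure is not stated in the text), fits the model by Newton-Raphson in plain Python, and uses invented toy data. It also shows why odds ratios from items with different ranges are not directly comparable: the odds ratio across two points is the square of the per-point odds ratio.

```python
import math

def fit_logistic(x, y, iterations=25):
    """Single-predictor logistic regression by Newton-Raphson.
    Returns (intercept, slope); exp(slope) is the odds ratio per point."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p              # gradient, intercept
            g1 += (yi - p) * xi       # gradient, slope
            h00 += w                  # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Toy data (invented): item scored 0 or 1, outcome 1 = adverse event.
# Score 0: 2 events in 10 patients; score 1: 6 events in 10 patients.
item = [0] * 10 + [1] * 10
event = [1] * 2 + [0] * 8 + [1] * 6 + [0] * 4

_, slope = fit_logistic(item, event)
or_per_point = math.exp(slope)
or_per_two_points = or_per_point ** 2   # same association over a 2-point span
print(round(or_per_point, 3), round(or_per_two_points, 3))
```

With a binary item the fitted odds ratio equals the familiar cross-product ratio from the 2 x 2 table, here (6/4)/(2/8) = 6.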


Participants and descriptive data

The 100 eligible patients included six women and 94 men. Mean age was 40.45 years (SD 12.8, range 21.1 to 69.3). The average length of stay at the time of the baseline measures was 7.3 years (SD 9.9, range 0.03 to 44.6 years). Primary diagnosis according to ICD-10 criteria [36] was schizophrenia (ICD-10 F20) 69%, schizoaffective disorder (ICD-10 F25) 16%, bi-polar affective disorder (ICD-10 F31) 7%, recurrent depressive disorder, severe with psychotic symptoms (ICD-10 F33.3) 5%, intellectual disability (moderate mental retardation with significant impairment of behaviour ICD-10 F71.1) 3%.

Mean follow-up time was 181.9 days (SD 70.3, range 0 to 265). The number of patient-days at risk was 18,190.

Inter-rater reliability of new measures START and SAPROF

For 21 patients rated at different times by SM and ZA, the SAPROF total score correlated Spearman’s r = 0.829, p < 0.001. For the START-strength score, r = 0.694 p < 0.001 and for START-vulnerability score, r = 0.853, p < 0.001. The data subsequently analysed are the ratings made by one researcher ZA. These correlations are given only as an indication of the utility of the instruments.

Internal consistency

For the SAPROF 17 items Cronbach’s alpha = 0.880, START strengths n = 20 items Cronbach’s alpha = 0.949, START vulnerabilities 20 items Cronbach’s alpha = 0.945, all p < 0.001. No item, if deleted led to a substantial increase in alpha. Internal consistency for these scales is therefore good. Similarly, internal consistency for the PANSS alpha = 0.928 for the full scale, positive sub-scale 0.828, negative sub-scale 0.885, general sub-scale 0.822. The three item supplemental aggression risk (SAR) sub-scale which was not included in the total PANSS score had Cronbach’s alpha = 0.746. The HCR-20 full scale had Cronbach’s alpha = 0.866 with Historical sub-scale 0.672, Current (C) sub-scale 0.843, Risk (R) sub-scale 0.677, dynamic sub-scale (C + R) 0.872. The S-RAMM had a full scale alpha score = 0.672, Background (B) sub-scale 0.485, Current (C) sub-scale = 0.485, Future (F) sub-scale 0.453 and dynamic (C + F) 0.693. For the DUNDRUM-1 triage security scale, Cronbach’s alpha = 0.595, DUNDRUM-3 programme completion scale alpha = 0.912 and DUNDRUM-4 recovery scale alpha = 0.891.

Construct validity

The SAPROF and START were compared with each other. If the ‘strengths’ scales are valid, they should correlate positively with each other. If the concept of ‘strengths’ is distinct from risks or vulnerabilities, the strengths scales should not correlate strongly with risk or vulnerability scales. The START-S and SAPROF correlated strongly with each other (r = +0.810, p < 0.001; Table 1), indicating that they measure the same construct. If the START strength and START vulnerability scales measure different constructs, they should not correlate. The correlation between the two was very strong and inverse (r = −0.947, p < 0.001), indicating that they measure the same thing, one as the inverse of the other.

Table 1 Cross validation using Spearman's rank correlation coefficient

Concurrent validity

The SAPROF and START are said to measure dynamic factors and so they are not expected to correlate with established scales or sub-scales made up of historical or static risk factors. Table 1 shows that the START-S correlated moderately and inversely with HCR-H and weakly and inversely with S-RAMM-B. The SAPROF correlated moderately and inversely with the HCR-H and weakly with the S-RAMM-B.

If the SAPROF and START-S measure something different from risk, they should not correlate with the HCR-20 dynamic or S-RAMM dynamic risk assessment scales. Actually they correlated strongly but inversely with the HCR-20 dynamic scale. There was a moderate inverse correlation between the S-RAMM dynamic scale and the START-S. The S-RAMM dynamic score also had a moderate inverse correlation with the SAPROF (Table 1).

START and SAPROF were also compared with measures of global function and mental state. There was a strong positive correlation between GAF and START-S and strong inverse correlation between GAF and START-V. SAPROF and GAF correlated best (Table 1).

Table 1 shows that the PANSS-positive, PANSS-negative, PANSS-general, PANSS-total scores and the PANSS-SAR score all correlated strongly and inversely with START-S. PANSS-positive, PANSS-total and PANSS-SAR scores correlated positively with START-V but START-V correlated less well with PANSS-negative and PANSS-general scores (Table 1). The SAPROF correlated inversely with the PANSS scales.

The DUNDRUM-1 correlated weakly and inversely with the START-S, moderately with the START-V and had a weak inverse correlation with the SAPROF. The DUNDRUM-3 and DUNDRUM-4 both had strong inverse correlations with the START-S, strong positive correlations with the START-V and strong inverse correlations with the SAPROF.

Prospective study of violence and self harm

Thirteen individuals had adverse incidents concerning harm to others (broadly defined, as above) during the follow up period and 7 individuals had incidents involving self harm (broadly defined, as above). There was a significant overlap between self-harm and harm to others (χ² = 35.2, df = 1, p < 0.001, phi = 0.593, p < 0.001). The rate of events of harm to others (the base rate for violence) was 7.1 per 10,000 patient-days at risk (95% confidence interval 3.8 to 12.2/10,000) and the rate of self-harming events (the base rate for self-harm) was 3.8 per 10,000 patient-days at risk (95% CI 1.5 to 7.9/10,000).
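The base rates and the phi coefficient can be checked with a short sketch. It assumes exact Poisson confidence limits for the counts, found here by bisection on the Poisson distribution function; the method used by the Confidence Interval Analysis software is not stated in the text, so this is an assumption, although it gives limits that agree with the reported figures to one decimal place. It also assumes n = 100 for the phi coefficient. The function names are hypothetical.

```python
import math

def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu)."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return total

def find_root(f, lo, hi, iterations=200):
    """Bisection; assumes f changes sign exactly once between lo and hi."""
    f_lo = f(lo)
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def exact_poisson_ci(k, alpha=0.05):
    """Exact two-sided confidence limits for a Poisson count k."""
    search_limit = k + 10.0 * math.sqrt(k + 1.0) + 10.0
    lower = 0.0 if k == 0 else find_root(
        lambda mu: 1.0 - poisson_cdf(k - 1, mu) - alpha / 2.0, 0.0, float(k))
    upper = find_root(
        lambda mu: poisson_cdf(k, mu) - alpha / 2.0, float(k), search_limit)
    return lower, upper

def rate_per_10000(count, patient_days):
    """Rate and 95% limits per 10,000 patient-days at risk."""
    lower, upper = exact_poisson_ci(count)
    scale = 10000.0 / patient_days
    return count * scale, lower * scale, upper * scale

PATIENT_DAYS = 18190  # patient-days at risk, as reported above

violence = rate_per_10000(13, PATIENT_DAYS)
self_harm = rate_per_10000(7, PATIENT_DAYS)
phi = math.sqrt(35.2 / 100)  # phi = sqrt(chi-square / n), assuming n = 100

print([round(v, 1) for v in violence])
print([round(v, 1) for v in self_harm])
print(round(phi, 3))
```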

The location at baseline (for eight locations from the most to least secure) predicted harm to others (AUC = 0.812, 95% confidence interval 0.677 to 0.948, p < 0.001) as expected, since we have previously shown that location is a proxy for measures of risk [32] and recovery [23, 24]. Length of stay at the beginning of the observation period did not predict harm to others (AUC = 0.504, 95% CI 0.343-0.665, p = 0.963). Location at baseline also predicted self-harm (AUC = 0.838, 95% CI 0.689-0.987, p = 0.003) while length of stay did not predict self harm or the absence of it (AUC = 0.578, 95% CI = 0.383-0.722, p = 0.495).

Table 2 shows that the SAPROF score predicted both the absence of violence and self harm (absence of violence AUC = 0.847 and absence of self-harm AUC = 0.766). The START Strengths and START Vulnerabilities predicted violence (START-S and absence of violence AUC = 0.776; START-V and presence of violence AUC = 0.823) but not self harm.

Table 2 Scale scores at baseline and subsequent violence and self harm

By contrast, the HCR-20 predicted both violence (AUC = 0.872) and self harm (AUC = 0.881) as did all of its sub-scales. The S-RAMM predicted violence (AUC = 0.838) though not quite so well as the HCR-20 and the S-RAMM predicted self-harm (AUC = 0.818) as did the S-RAMM sub-scales, though the S-RAMM-background and future scales did not reach significance for self-harm in this study. It is interesting to note that the SAPROF did almost as well as the S-RAMM as a predictor of the absence of self harm.

The GAF score was a significant predictor of the absence of both violence and self harm, with high AUCs (absence of violence AUC = 0.813 and absence of self-harm AUC = 0.855).

PANSS positive, PANSS general and PANSS total scores each predicted violence and self harm though the PANSS negative symptom score was neither a positive nor a negative predictor for violence or self harm. The odds ratios for these sub-scale scores, though significant, were only modestly better than chance. The PANSS supplemental aggression risk (SAR) score (not included in the PANSS total score) was also a significant predictor of both harm to others and harm to self.

Contrary to expectations, the DUNDRUM-1 triage security score predicted violence (AUC = 0.743) and though the AUC for the prediction of self harm did not reach significance, the odds ratio did (OR = 1.226). The DUNDRUM-3 programme completion score predicted violence (AUC = 0.832) and self-harm (AUC = 0.750), while the DUNDRUM-4 recovery scale also predicted violence (AUC = 0.728) and self-harm (AUC = 0.713).

Interactive effects between risk factors and protective factors

Univariate analysis of variance was used to test for the presence of interactive effects between risk measures and protective measures.

Tables 3 and 4 show that the SAPROF, START-S, DUNDRUM-3 and DUNDRUM-4 were all significantly different for the 13 patients who were violent when compared to the non-violent. Likewise for the 7 who self harmed compared to those who did not self-harm. However when these results were adjusted for the HCR-20-dynamic score the differences were no longer significant.

Table 3 Risk and recovery measures for violent and non-violent individuals compared
Table 4 Risk and recovery measures for self-harmers and others compared

For harm to others, the SAPROF and HCR-20-dynamic had significant main effects (HCR-20-dynamic F = 3.97, df = 17, p = 0.003, SAPROF F = 4.67, df = 25, p < 0.001) and a significant interaction effect (F = 2.973, df = 38, p = 0.008) indicating that the SAPROF had a ‘true’ protective effect. The S-RAMM-dynamic score also had a significant interaction with the HCR-20-dynamic score (HCR-20-dynamic F = 3.828, df = 18, p = 0.001, S-RAMM-dynamic F = 3.909, df = 20, p < 0.001, interaction F = 2.794, df = 33, p = 0.003) apparently indicating a synergistic effect. The START-S, DUNDRUM-1, DUNDRUM-3, DUNDRUM-4, PANSS positive, PANSS negative, PANSS general and GAF scores did not have significant interactions with the HCR-20-dynamic score.

The SAPROF did not have significant interactive effects with the START-S, DUNDRUM-1, DUNDRUM-3, DUNDRUM-4, S-RAMM-dynamic, PANSS-positive, PANSS-negative or PANSS-general scores, but had a marginal interactive effect with the GAF (main effects SAPROF F = 4.05, df = 25, p < 0.001, GAF F = 5.78, df = 21, p < 0.001, interactive effect F = 1.98, df = 33, p = 0.059).

Item to outcome analysis

Table 5 shows the performance of each item of the SAPROF as predictors of violence and self-harm. Twelve of the 17 items predicted the absence of violence including factors such as empathy (OR = 0.231), coping ability (OR = 0.187), self-control (OR = 0.205), work and leisure activities (OR = 0.336), financial management (OR = 0.231), motivation for treatment (OR = 0.340) and attitudes towards authority (OR = 0.264). Five items in the SAPROF predicted the absence of self-harm including empathy (OR = 0.293), coping (OR = 0.192), self-control (OR = 0.260), leisure activities (OR = 0.203) and use of medication (OR = 0.314). Odds ratios could not be calculated for items 14 to 17 (intimate relationships, professional care, living circumstances, external control) because for in-patients in a secure hospital there was too little variation in these item scores.

Table 5 SAPROF items related to outcomes

Table 6 shows that 16 of the 20 START-strengths items predicted the absence of violence (strongest odds ratios impulse control OR = 0.244, external trigger OR = 0.249), though only one (mental state OR = 0.180) appeared to predict absence of self-harm. Table 7 shows that for START-vulnerabilities, 16 of 20 items predicted violence, not always the same items as for START-strengths (strongest associations ‘relationships’ OR = 5.5, ‘external triggers’ OR = 6.3, ‘conduct’ OR = 5.1), while ‘mental state’ was again the only item predicting self-harm (OR = 3.9).

Table 6 START strengths items related to outcomes
Table 7 START vulnerabilities items related to outcomes

Table 8 shows that five of the ten historical items of the HCR-20 predicted violence in this in-patient forensic group (strongest associations ‘early maladjustment’ OR = 1.2 and ‘prior supervision failure’ OR = 1.2) while four of the five ‘current’ or ‘C’ items and three of the five ‘risk management’ or ‘R’ items predicted violence (e.g. C5 ‘unresponsiveness to treatment’ OR = 3.9, R4 ‘non-compliance’ OR = 4.8). H1 ‘past violence’ was not a significant predictor in this population as all subjects scored positive.

Table 8 HCR-20 items related to outcomes

For the prediction of self-harm in this group, two of the ten HCR-20 ‘H’ items, four of the five ‘C’ items and one of the five ‘R’ items were better than chance. As before, odds ratios for individual items were an interesting guide to the relative importance of items, with the highest odds ratios for items C5 ‘unresponsiveness to treatment’ (OR = 6.1), C1 ‘lack of insight’ (OR = 5.8), and H2 ‘young age at first violent incident’ and H3 ‘relationship instability’ (both OR = 5.5).

Table 9 shows that for the S-RAMM, one of the nine background or ‘B’ items, two of the eight current or ‘C’ items and one of the five future or ‘F’ items predicted violence, while two of the nine ‘B’ items predicted self-harm (B1 ‘history of deliberate self harm’ OR = 2.5, B3 ‘previous hospitalisation’ OR = 6.4), as did three of the eight ‘C’ items (C3 ‘psychological symptoms’ OR = 3.8, C7 ‘psychosocial stress’ OR = 3.4 and C8 ‘problem solving deficits’ OR = 7.9).

Table 9 S-RAMM items related to outcomes

Table 10 shows that three of the eleven DUNDRUM-1 triage security items (scored 0 to 4) predicted violence (TS4 ‘immediacy of risk of suicide or self harm’ OR = 1.5, TS9 ‘complex risk of violence’ OR = 3.3, TS10 ‘institutional behaviour’ OR = 2.7) and three items predicted self-harm (TS2 ‘seriousness of self harm’ OR = 1.4, TS4 ‘immediacy of risk of suicide or self harm’ OR = 1.5, TS10 ‘institutional behaviour’ OR = 2.7). For the DUNDRUM-3 programme completion items, all seven predicted violence (odds ratios ranged from 1.9 to 4.9) and four predicted self-harm (odds ratios ranged from 1.9 to 4.9). Item PC2 ‘Mental health’ had the strongest odds ratio for both harm to others (OR = 4.9) and to self (OR = 5.2). For the DUNDRUM-4 recovery scale, four of the six items predicted violence and three predicted self-harm. Odds ratios were similar for harm to others and harm to self, except for the item derived from the HCR-20 dynamic risk scale, which had a stronger odds ratio for harm to others.

Table 10 DUNDRUM-1 triage security items, DUNDRUM-3 programme completion items and DUNDRUM-4 recovery items related to outcomes

For the PANSS, Table 11 shows that of the seven positive symptoms (scored 0 to 7), four were significantly associated with violence (‘conceptual disorganisation’ OR = 1.9, ‘hyperactivity’ OR = 3.6, ‘suspiciousness’ OR = 1.5 and ‘hostility’ OR = 2.2). Some of these were marginal (AUC > 0.5) and neither delusions nor hallucinations were associated with violence in this treated in-patient group of patients with severe mental illness, because of lack of variation in the population studied. Of the seven negative symptoms only one, ‘poor rapport’ (OR = 1.7) was associated with violence and of the 16 general symptoms, ‘tension’ (OR = 2.1), ‘uncooperativeness’ (OR = 1.7), ‘poor attention’ (OR = 2.3) and ‘poor impulse control’ (OR = 2.4) were associated with violence. Odds ratios could not be calculated for item G10 ‘disorientation’ as all subjects scored negative.

Table 11 PANSS items and AUC for harm to others and harm to self

For self-harm, the positive symptoms ‘conceptual disorganisation’ (OR = 2.3), ‘hyperactivity’ (OR = 2.8) and ‘hostility’ (OR = 1.8) were again associated with adverse outcomes. None of the negative symptoms was associated with self-harm. Of the general symptoms, only ‘tension’ (OR = 2.4), ‘poor attention’ (OR = 2.4), ‘lack of judgement and insight’ (OR = 1.5), ‘disturbance of volition’ (OR = 2.3), ‘poor impulse control’ (OR = 2.1) and ‘preoccupation’ (OR = 2.2) were associated with self-harm.

The three items of the PANSS supplemental aggression risk scale (SAR) predicted both harm to others (‘anger’ OR = 2.6, ‘difficulty in delaying gratification’ OR = 2.2, ‘affective lability’ OR = 2.2) and harm to self (‘anger’ OR = 3.3, ‘difficulty in delaying gratification’ OR = 1.6, ‘affective lability’ OR = 2.1).


Discussion

Main findings

This paper presents validation studies for ‘fourth generation’ risk assessment instruments. We have examined the utility of these instruments for assessing risk and protective factors for both violence and self-harm. We have identified both overlaps and differences in the risk factors that contribute to predictions of risk of violence and self-harm. We have included some methodological approaches intended to help future researchers who might replicate this work or include it in meta-analyses. These include stating the base rates for violence and self-harm and giving the DUNDRUM-1 triage security ratings as a means of benchmarking the background need for therapeutic security. We believe the most important finding is confirmation that true protective effects can be identified. The SAPROF, a protective scale, does more than assess the absence of risk: it also had an interactive effect with the HCR-20, offsetting risk.
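
One way to picture an interactive (offsetting) effect of this kind is a logistic model with a product term between a risk score and a protective score. The sketch below is illustrative only: the model form, the function names and every coefficient are assumptions chosen for the example and are not estimates from this study.

```python
import math

# Hypothetical coefficients (NOT estimated from the study data)
B0 = -3.0       # intercept
B_RISK = 0.30   # log-odds change per point of the risk score
B_PROT = -0.05  # log-odds change per point of the protective score
B_INTER = -0.01 # product term: protection dampens the effect of risk


def predicted_probability(risk, protective):
    """Predicted probability of an adverse event under the assumed model."""
    logit = (B0 + B_RISK * risk + B_PROT * protective
             + B_INTER * risk * protective)
    return 1.0 / (1.0 + math.exp(-logit))


def risk_slope(protective):
    """Log-odds change per one-point rise in risk at a given protective score."""
    return B_RISK + B_INTER * protective


p_unprotected = predicted_probability(risk=10, protective=0)
p_protected = predicted_probability(risk=10, protective=20)
print(f"{p_unprotected:.3f} vs {p_protected:.3f}")
```

Under these assumed values the slope for risk falls from 0.30 to 0.10 as the protective score rises from 0 to 20. A protective scale that only mirrored risk would shift the curve without changing that slope.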

The SAPROF and START achieved satisfactory levels of inter-rater reliability and have good internal consistency. The START strengths and START vulnerabilities scores were strongly inversely correlated, suggesting that the START strengths score is simply the risk measure repeated. However, there is sufficient difference in content between the START strengths and SAPROF on the one hand and the HCR-20 sub-scales on the other to explain the interactive effect between the HCR-20 and the SAPROF, so that the ‘strengths/protective’ paradigm is not merely the same risk factors in new clothes.

The DUNDRUM toolkit instruments were not designed as risk assessment instruments; they were designed to be complementary to risk assessments. The DUNDRUM-1 was included only as a benchmarking measure to enable future replication and meta-analysis. In spite of this, the DUNDRUM-1 triage security scale predicted violence and some of its items predicted violence and also self harm. This may be because items such as suicidal behaviour, complex needs and institutional behaviour are indicators of the seriousness of the behaviour that follows and such acts are easier to detect. DUNDRUM-3 programme completion and DUNDRUM-4 recovery scales were good predictors of violence, comparable to the HCR-20 sub-scales and total score.

The S-RAMM, an assessment of risk of suicide and self-harm, was a predictor of violence and the S-RAMM dynamic score had a synergistic interaction with the HCR-20-dynamic score. The GAF and PANSS scales (other than the PANSS negative score) also performed well as predictors of violence. Although designed as assessments of risk and protective factors for violence, most scales were also predictive of self-harm. The S-RAMM-C scale performed well but the S-RAMM-B (background or fixed historical risk factors for suicide), S-RAMM-F (‘future’ risk factors for suicide), START-S and START-V were notable for their lack of predictive capability for self harm in this study.

The overlap between risk factors for violence and self-harm, and the need to assess both has been established [32, 33, 37, 38]. An item analysis shows considerable overlap of the content of each of the scales examined here. Of these, the DUNDRUM-3 programme completion items appeared particularly strong predictors of self-harm or the absence of it, perhaps because of an underlying element of positive motivation that is inherent in the way each item is defined and rated. Of greatest relevance is that most scale scores were predictors of both violence and self-harm, though this was often because of different items within each scale. Much of this appears to be contextual. In a group made up of forensic patients admitted to a forensic hospital because of severe mental illness and violence, it is not surprising that items such as the first item of the HCR-20 ‘past violence’ should be poor discriminants for further violence in a group where all score positive. It is important to note that in this context items such as the first S-RAMM item ‘past self-harm’ are such good predictors of violence to others, though not self-harm, while items such as HCR-20 C5 ‘unresponsiveness to treatment’ and S-RAMM item C3 ‘psychological symptoms’ and C8 ‘problem solving deficits’ predicted both harm to others and self-harm.

For violence, only some items in each scale were predictive. The highest AUC results were obtained for lack of progress in treatment programmes such as education, occupation and creativity, a low GAF score, conduct problems, lack of progress in mental health programmes, impulse control, adverse institutional behaviour, leisure activities, external triggers, negative attitudes, poor attention, financial problems, hyperactivity and self-control, relationship problems, stability, empathy and hostility.

For self-harm a different selection of items predicted adverse events with the highest AUC results for the GAF, poor attention, conceptual disorganisation, lack of progress in mental health programmes, disturbance of volition, unresponsiveness to treatment, adverse institutional behaviour, preoccupation, leisure activities, hyperactivity, tension, problem solving deficits, stability, self-control and negative attitudes.

A notable feature emerges when unadjusted odds ratios are compared with AUC results. Scales and items with significant AUC statistics may have better than random sensitivity and specificity, but may still be weak predictors. While this reflects the reality of multicollinearity, any argument that a risk assessment scale made up of just a few of the strongest items would be sufficient is at odds with the clinical need to take notice of a much wider range of risk and protective factors when planning care and treatment [7, 8] and when making recommendations or decisions regarding discharge [25]. However, it is also the case that using structured professional judgement instruments to assess treatment needs would be invalid if many of the items were poor predictors on their own. We believe the poor performance of many scale items should lead to two forms of revision of these scales. The first would be to specify that some items are useful only for certain contexts, such as in-patient settings, out-patient community placements or prisons. The second would be to drop some items or refine their handbook definitions.
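
For readers less familiar with the AUC, the sketch below computes it directly as the probability that a randomly chosen patient with an adverse event scores higher than a randomly chosen patient without one, with ties counted as one half. This is the standard rank-based definition rather than the software routine used in this study. The function name auc and all scores are hypothetical.

```python
def auc(scores_event, scores_no_event):
    """Rank-based AUC: P(event score > non-event score), ties count 0.5."""
    wins = 0.0
    for s_event in scores_event:
        for s_none in scores_no_event:
            if s_event > s_none:
                wins += 1.0
            elif s_event == s_none:
                wins += 0.5
    return wins / (len(scores_event) * len(scores_no_event))


# Hypothetical risk-scale totals (NOT study data)
violent = [24, 27, 30, 22]
not_violent = [12, 18, 22, 15, 25, 10]

risk_auc = auc(violent, not_violent)

# Reversing the direction of scoring (as for a protective scale) mirrors
# the AUC around 0.5, so an AUC for the *absence* of an event is 1 - AUC.
inverted_auc = auc([-s for s in violent], [-s for s in not_violent])
print(round(risk_auc, 3), round(inverted_auc, 3))
```

An AUC of 0.5 corresponds to chance. A value only slightly above 0.5 can reach statistical significance in a large enough sample while still separating the two groups poorly, which is the distinction drawn in the paragraph above.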

Study limitations

This paper describes the predictive validity of the SAPROF, START, DUNDRUM-3 and DUNDRUM-4, PANSS and GAF (a range of risk assessment, symptom measurement and outcome measurement instruments) for a group of forensic in-patients, of whom only 6% were women. The outcomes found here may not generalise to other settings. Factors influencing in-patient violence may not generalise to violence in the community. Similarly, self-harm in hospital may not equate to self-harm in the community. Replication in other populations would be helpful. However, we found that length of stay did not predict violence or self harm whereas location along the continuum of care did. This demonstrates that in this hospital milieu, location was determined by risk and need for therapeutic security, not by a simple chronological waiting list or tariff for movement from more secure to less secure locations. The location was therefore a proxy for risk and an indication that placement and milieu were appropriate to individual need, to manage and reduce risk [39–41].

Many items included in validated risk assessment instruments such as the HCR-20 appeared not to be predictive in this study. The item H7 ‘psychopathy’ was omitted in keeping with modern practice and the latest revision of the HCR-20. The study size and length of follow-up may have been insufficient: a longer follow-up period would have generated a higher base rate for violence and self-harm. We have stated the exact base rates for these events to enable future research and meta-analysis to make valid comparisons. Future studies in more acute in-patient populations (acute psychiatric intensive care units) might be expected to observe higher base rates, with more incidents of violence and self-harm amongst fewer patients. Such a study might show stronger effects for more risk factors. Similarly, studies of community based samples might yield fewer adverse events and very different results for individual items. However, the recording of such events in the community might be less complete and less reliable.

There may be other predictive factors not included in the instruments studied here. The PANSS does not include specific items for sadness or hopelessness, though the item ‘psychological symptoms’ in the S-RAMM would include such symptoms.

According to Rutter [27], a proper analysis of protective effects or resilience would have to examine the effects of protective factors and the interactions they might have, not just with risk factors but also with adverse life-events and difficulties. We found some evidence for an interactive effect between the SAPROF and the HCR-20 dynamic score. However, a further analysis of interactions between items is required. It may be that some ‘strength’ items are protective against some ‘vulnerability’ risk items but not others. Analysing the many possible combinations would require very large numbers to allow correction for multiple testing, an analysis beyond the scope of this study. Including the location in an analysis of variance may go some way towards this. In a forensic hospital with high staff-to-patient ratios and professional, ‘low expressed emotion’ interactions, provocations may be so limited that even high personal risk factors are less likely to lead to violence or self harm than they would be in the community. In addition, high average levels of positive symptoms and some risk factors may overshadow other risk factors, making powerful risk factors appear to be poor discriminants in that setting [42].
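
The scale of the multiple-testing problem mentioned above can be sketched with simple arithmetic. The example assumes that every START strengths item is tested against every START vulnerabilities item (20 items each, as reported in the Results) and that a Bonferroni correction is used. The paper does not name a correction method, so Bonferroni is an assumption made here for illustration, as is the helper name bonferroni_alpha.

```python
def bonferroni_alpha(n_tests, alpha=0.05):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


n_strength_items = 20       # START strengths items (from the Results)
n_vulnerability_items = 20  # START vulnerabilities items (from the Results)

# One interaction test per strength-by-vulnerability item pair
n_pairs = n_strength_items * n_vulnerability_items
per_test_alpha = bonferroni_alpha(n_pairs)
print(n_pairs, per_test_alpha)
```

Under these assumptions there are 400 item pairs, and each interaction would need to reach p < 0.000125 to be declared significant. Thresholds of that order are what drive the need for very large samples.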

Finally, this prospective study covered a six month observation period, though a shorter observation period may have been more meaningful, at least for symptom ratings. A longer observation period might have improved statistical power with more adverse events emerging, but a longer observation period would also raise the problem of the validity of ratings of dynamic risk items which might have changed in the interim. Much remains to be learned about the time course over which dynamic and some static risk items might change.

Future validation studies

The definitive study would probably have to screen very large numbers of high-risk patients in order to demonstrate an effect, including interactive effects. Serial assessments to demonstrate change and assess the effect of change would be of interest. Mindful of the criteria suggested by the Risk Management Authority of Scotland for an evidence based risk assessment tool, it may be necessary to replicate studies such as this in different populations and cultures. Future studies should also consider the synergistic interactions between individual items. True ‘protective effects’ may only emerge from such studies [43, 44].

Advantages of fourth generation (START and SAPROF) structured professional judgement instruments in clinical practice

Measures of strengths and protective factors such as START-S and SAPROF, as well as measures of progress in treatment programmes and recovery such as the DUNDRUM-3 and DUNDRUM-4 performed well in this study of in-patient violence and self-harm in a forensic setting. While identifying items as ‘protective’ rather than ‘vulnerability’ factors may to some extent be a matter of semantics (the START-S and START-V appear to be simple reciprocals of each other), new content can be found in the SAPROF and in the DUNDRUM-3 programme completion and DUNDRUM-4 recovery instruments as well as some START items.

The S-RAMM item B1 “history of deliberate self harm” predicted harm to others as well as harm to self (AUC 95% CI greater than 0.5) in this population. In contrast, the HCR-20 H1 item “previous violence” was a poor discriminator in this forensic population, because almost all patients scored positive. This may be an unexpected benefit of using diverse assessments and may also be understandable in psychological terms. The S-RAMM is also valuable because so much of the item content is not replicated in violence risk assessment instruments.

The new content of instruments such as the SAPROF, START, DUNDRUM-3 and DUNDRUM-4 as well as the S-RAMM lends itself to the use of risk assessment as a form of needs assessment and a means for planning care and treatment. These positively connoted factors are more likely to be acceptable to patients or service users when working to engage them in recovery oriented programmes in which risk management is important.

The pairing of assessments of risk of violence to others with assessments of risk of self-harm and suicide is similarly a means of identifying the process of risk assessment and risk management with the patient or service user’s best interests rather than a process intended exclusively to serve the purposes of criminal justice and public protection.

The START has been shown to have good psychometric properties and to be predictive of violence when used as part of the assessment for mental health review boards in a forensic hospital [45]. De Vries Robbé et al. found that the use of the SAPROF can be helpful in formulating treatment goals, progressing through stages of treatment, planning the phasing of treatment and facilitating risk communication [46, 47]. The DUNDRUM-3 and DUNDRUM-4 are intended to serve the same processes of treatment planning and measurement of treatment outcome in domains relevant to risk reduction and risk management and to provide transparency when reporting to mental health tribunals and boards [22–26].

The GAF, a simple global assessment of function, performs as well as many of the more specific assessment instruments. It may be that global function is the most consistent underlying measure of risk and resilience. Alternatively this may be an example of the use of an ‘intuition based’ assessment. Carroll [48] has recently pointed out the advantages of such intuitive approaches while cautioning that a structured professional judgement instrument should always be used alongside intuitive assessments as a valid, transparent, deliberative and unbiased check on some of the problems that can arise with intuitive methods.


Conclusions

The START and SAPROF have good psychometric characteristics for use as clinical or research instruments in severe mental illness. The SAPROF predicted the absence of violence and self harm. The START-Strengths predicted absence of violence but not self harm. The DUNDRUM-3 programme completion and DUNDRUM-4 recovery instruments also predicted violence and self-harm. They were not consistently better than the HCR-20, S-RAMM, PANSS or GAF in predicting adverse events (violence or self harm). The SAPROF, START, DUNDRUM-3 and DUNDRUM-4, however, have the advantage of covering a wider content than existing risk assessment instruments and have different purposes. The GAF performs as well as the scale scores of most specific risk assessment instruments. Many individual items in the SPJ instruments studied were strongly associated with adverse outcomes in this setting, meriting further study of context and interactive effects.


References

  1. Doyle M, Dolan M: Predicting community violence from patients discharged from mental health services. Br J Psychiatr. 2006, 189: 520-526. 10.1192/bjp.bp.105.021204.

  2. Monahan J, Steadman HJ: Violence and mental disorder. Developments in risk assessment. 1984, Chicago & London: The University of Chicago Press.

  3. Douglas K, Cox D, Webster C: Violence risk assessment: science and practice. Leg Criminol Psychol. 1999, 4: 149-184.

  4. Dolan M, Doyle M: Violence risk prediction. Clinical and actuarial measures and the role of the Psychopathy Checklist. Br J Psychiatr. 2000, 177: 303-311. 10.1192/bjp.177.4.303.

  5. Webster CD, Muller-Isberner R, Franson G: Violence risk assessment: using structured clinical guidelines professionally. Int J Forensic Ment Health. 2002, 1: 185-193. 10.1080/14999013.2002.10471173.

  6. Douglas KS, Ogloff JR, Hart SD: Evaluation of a model of violence risk assessment among forensic psychiatric patients. Psychiatr Serv. 2003, 54: 1372-1379. 10.1176/

  7. Kennedy HG: Risk assessment is inseparable from risk management: comment on Szmuckler. Psychiatr Bull. 2001, 25: 208-211. 10.1192/pb.25.6.208.

  8. Doyle M, Dolan M: Violence risk assessment: combining actuarial and clinical information to structure clinical judgements for the formulation and management of risk. J Psychiatr Ment Health Nurs. 2002, 9: 649-657. 10.1046/j.1365-2850.2002.00535.x.

  9. Grove WM, Meehl PE: Comparative efficiency of informal (subjective, impressionistic) and formal (mechanical, algorithmic) prediction procedures. The clinical-statistical controversy. Psychol Publ Pol Law. 1996, 2: 293-323.

  10. Kraemer H, Kazdin A, Offord D, Kesler R, Jensen P, Kupfer D: Coming to terms with the terms of risk. Arch Gen Psychiatr. 1997, 54: 337-343. 10.1001/archpsyc.1997.01830160065009.

  11. Hart SD: The role of psychopathy in assessing risk for violence: conceptual and methodological issues. Leg Criminol Psychol. 1998, 3: 121-137. 10.1111/j.2044-8333.1998.tb00354.x.

  12. Douglas K, Cox D, Webster C: Violence risk assessment: science and practice. Leg Criminol Psychol. 1999, 4: 149-184.

  13. Webster CD, Douglas KS, Eaves D, et al: HCR-20: assessing risk for violence. 1997, Burnaby: Mental Health Law and Policy Institute, Simon Fraser University.

  14. Clarke M, Davies S, Hollin C, Duggan C: Long-term suicide risk in forensic psychiatric patients. Arch Suicide Res. 2011, 15: 16-28. 10.1080/13811118.2011.539951.

  15. Bouch J, Marshall JJ: S-RAMM: suicide risk assessment and management manual (research edition). 2003, Vale of Glamorgan: Cognitive Centre Foundation.

  16. Borum R, Bartel P, Forth A: Manual for the structured assessment of violence risk in youth (SAVRY) version 1.1. 2003, Florida: University of South Florida.

  17. Meyers JR: Predictive validity of the structured assessment for violence risk in youth (SAVRY) with juvenile offenders. Crim Justice Behav. 2008, 35: 344-355. 10.1177/0093854807311972.

  18. Webster CD, Martin ML, Brink J, Nicholls TL, Desmarais SL: Short-term assessment of risk and treatability (START) version 1.1. 2009, Coquitlam: British Columbia Mental Health and Addiction Services; Hamilton, Ontario: St Joseph’s Healthcare.

  19. Webster CD, Nicholls TL, Martin M-L, Desmarais MA, Brink J: Short-term assessment of risk and treatability (START): the case for a new structured professional judgment scheme. Behav Sci Law. 2006, 24: 747-766. 10.1002/bsl.737.

  20. Doyle M, Lewis G, Brisbane M: Implementing the short-term assessment of risk and treatability (START) in a forensic mental health service. Psychiatrist. 2008, 32: 406-408. 10.1192/pb.bp.108.019794.

  21. De Vogel V, De Ruiter C, Bouman Y, De Vries Robbe M: SAPROF. Structured assessment of PROtective factors for violence risk. Versie 1. 2007, Utrecht: Forum Educatief.

  22. Kennedy HG, O’Neill C, Flynn G, Gill P: Four structured professional judgment instruments for admission triage, urgency, treatment completion and recovery assessments. The dundrum toolkit. Dangerousness, understanding, recovery and urgency manual (the dundrum quartet) V1.0.21 (18/03/10). 2010, Dublin: Trinity College Dublin.

  23. O'Dwyer S, Davoren M, Abidin Z, Doyle E, McDonnell K, Kennedy HG: The DUNDRUM quartet: validation of structured professional judgement instruments DUNDRUM-3 assessment of programme completion and DUNDRUM-4 assessment of recovery in forensic mental health services. BMC Res Notes. 2011, 4: 229. 10.1186/1756-0500-4-229.

  24. Davoren M, O'Dwyer S, Abidin Z, Naughton L, Gibbons O, Doyle E, McDonnell K, Monks S, Kennedy HG: Prospective in-patient cohort study of moves between levels of therapeutic security: the DUNDRUM-1 triage security, DUNDRUM-3 programme completion and DUNDRUM-4 recovery scales and the HCR-20. BMC Psychiatr. 2012, 12: 80. 10.1186/1471-244X-12-80.

  25. Davoren M, O’Dwyer S, Abidin Z, Naughton L, Gibbons O, Doyle E, McDonnell K, Monks S, Kennedy HG: Prospective study of factors influencing conditional discharge from a forensic hospital: the DUNDRUM-3 programme completion and DUNDRUM-4 recovery structured professional judgement instruments and risk. BMC Psychiatr. 2013, 13: 185. 10.1186/1471-244X-13-185.

  26. Flynn G, O’Neill C, McInerney C, Kennedy HG: The DUNDRUM-1 structured professional judgment for triage to appropriate levels of therapeutic security: retrospective-cohort validation study. BMC Psychiatr. 2011, 11: 43. 10.1186/1471-244X-11-43.

  27. Rutter M: Resilience in the face of adversity: protective factors and resistance to psychiatric disorder. Br J Psychiatr. 1985, 147: 598-611. 10.1192/bjp.147.6.598.

  28. Mossman D: Assessing predictions of violence: being accurate about accuracy. J Consult Clin Psychol. 1994, 62: 783-792.

  29. Kay SR, Fiszbein A, Opler LA: The positive and negative syndrome scale (PANSS) for schizophrenia. Schizophr Bull. 1987, 13: 261-277. 10.1093/schbul/13.2.261.

  30. Mullen P: A reassessment of the link between mental disorder and violent behaviour, and its implications for clinical practice. Aust N Z J Psychiatr. 1997, 31: 3-11. 10.3109/00048679709073793.

  31. Hall RC: Global assessment of functioning (GAF). a modified scale. Psychosomatics. 1995, 36: 267-75. 10.1016/S0033-3182(95)71666-8.

  32. Ijaz A, Papaconstantinou A, O’Neill H, Kennedy HG: The suicide risk assessment and management manual (S-RAMM) validation study I. Ir J Psych Med. 2009, 26: 54-58.

  33. Fagan J, Papaconstantinou A, Ijaz A, Lynch A, O’Neill H, Kennedy HG: The suicide risk assessment and management manual (S-RAMM) validation study II. Ir J Psych Med. 2009, 26: 107-113.

  34. IBM Corp: IBM SPSS Statistics for Windows, Version 20.0. 2011, Armonk, NY: IBM Corp.

  35. Bryant T: Confidence interval analysis version 2.2.0 Build 57. 2000–2011, University of Southampton.

  36. World Health Organization: The ICD-10 classification of mental and behavioural disorders: clinical descriptions and diagnostic guidelines. 1992, Geneva: WHO.

  37. Hillbrand M: Aggression against self and aggression against others in violent psychiatric patients. J Consult Clin Psychol. 1995, 63: 668-671.

  38. Gray NS, Hill C, McGleish A, Timmons D, MacCulloch MJ, Snowden RJ: Prediction of violence and self-harm in mentally disordered offenders: a prospective study of the efficacy of HCR-20, PCL-R, and psychiatric symptomatology. J Consult Clin Psychol. 2003, 71: 443-451.

  39. Dernevik M, Grann M, Johansson S: Violent behaviour in forensic psychiatry patients: risk assessment and different risk-management levels using the HCR-20. Psychol Crime Law. 2002, 8: 93-102. 10.1080/10683160208401811.

  40. Muller-Isberner M, Webster CD, Gretenkord L: Measuring progress in hospital order treatment: relationship between levels of security and C and R scores of the HCR-20. Int J Forensic Ment Health. 2007, 6: 113-121. 10.1080/14999013.2007.10471256.

  41. Pillay SM, Oliver B, Butler L, Kennedy HG: Risk stratification and the care pathway. Irish J Psychol Med. 2008, 25: 123-127.

  42. Ehmann TS, Smith GN, Yamamoto A, McCarthy N, Ross D, Au T, et al: Violence in treatment resistant psychotic inpatients. J Nerv Ment Dis. 2001, 189: 716-721. 10.1097/00005053-200110000-00009.

  43. Fitzmaurice G: The meaning and interpretation of interaction. Nutrition. 2000, 16: 313-314. 10.1016/S0899-9007(99)00293-2.

  44. Cortina-Borja M, Smith AD, Combarros D, Lehmann DJ: The synergy factor: a statistic to measure interactions in complex diseases. BMC Res Notes. 2009, 2: 105. 10.1186/1756-0500-2-105.

  45. Nicholls TL, Brink J, Desmarais SL, Webster CD, Martin M-L: The short-term assessment of risk and treatability (START). a prospective validation study in a forensic psychiatric sample. Assessment. 2008, 13: 313-317.

  46. De Vogel V, De Vries RM, De Ruiter C, Bouman THA: Assessing protective factors in forensic population practice: introducing the SAPROF. Int J Forensic Ment Health. 2011, 10: 171-177. 10.1080/14999013.2011.600230.

  47. De Vries RM, De Vogel V, De Spa E: Protective factors for violence risk in forensic psychiatric patients, a retrospective validation study of the SAPROF. Int J Forensic Ment Health. 2011, 10: 178-186. 10.1080/14999013.2011.600232.

  48. Carroll A: Good (or bad) vibrations: clinical intuition in violence risk assessment. Adv Psychiatr Treat. 2012, 18: 447-456. 10.1192/apt.bp.111.010025.


Acknowledgements

The authors wish to acknowledge the service users who cooperated with the assessment process and the many colleagues who contributed to the rating process.

Author information


Corresponding author

Correspondence to Harry G Kennedy.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

ZA rated patients using the SAPROF and START and collated the adverse events. MD rated the DUNDRUM-1, DUNDRUM-3 and DUNDRUM-4. LN and OG both rated patients using the PANSS and GAF. AN coordinated and collated ratings of HCR-20 and S-RAMM. HGK wrote the first draft of the paper, designed the study and carried out the data analysis. All authors reviewed the drafts and agreed the final text. All authors read and approved the final manuscript.

Zareena Abidin, Mary Davoren, Leena Naughton, Olivia Gibbons, Andrea Nulty and Harry G Kennedy contributed equally to this work.

Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Abidin, Z., Davoren, M., Naughton, L. et al. Susceptibility (risk and protective) factors for in-patient violence and self-harm: prospective study of structured professional judgement instruments START and SAPROF, DUNDRUM-3 and DUNDRUM-4 in forensic mental health services. BMC Psychiatry 13, 197 (2013).


Keywords

  • Risk
  • Violence
  • Self-harm
  • DUNDRUM toolkit
  • HCR-20
  • S-RAMM
  • Forensic
  • Protective