Design
Two independent prospective cohort studies were conducted. The first served to develop the risk assessment instrument (derivation sample). The second patient sample tested the clinical application of the method (validation sample).
The study protocols were reviewed and approved by the research ethics boards of the Cantons Zurich (E-016/2001), Appenzell AR (10/01) and Berne (24.12.2001/IH/Hz/EW).
Setting and sample
Both studies were conducted in acute psychiatric wards in the German-speaking part of Switzerland. All participating wards were closed admission wards providing comprehensive psychiatric service to their respective catchment areas. The first sample (derivation dataset) consisted of 219 patients consecutively admitted to six wards within three hospitals during a two-month period. The number of beds in each ward ranged from 15 to 19. The second sample (validation dataset) consisted of 300 patients consecutively admitted to two wards during a six-month period. These two 12-bed wards were situated in two different hospitals in different cantons (one rural area, one urban area) to ensure independence from the derivation dataset.
Instrument development
During instrument development, the psychiatric nurses responsible for the care of the patient provided an assessment at admission and twice daily (10 a.m. and 6 p.m.) on the day of admission and during the next three days, or until discharge or transfer. The maximum number of ratings per patient was therefore 9, reached when the patient was admitted before the regular rating at 11 a.m. Lower numbers of ratings resulted from missing items and from patients being discharged from the ward before the third day after hospitalization. Assessment forms contained the German research version of the BVC and a Visual Analogue Scale (VAS) 10 cm in length. Nurses were asked to indicate the presence or absence of the six behaviors constituting the BVC. In addition, nurses marked their subjective perception of the risk of a physical attack within the next 12 hours on the VAS. The endpoints of the VAS were labeled "no risk" and "very high risk". The data collection form was also used to gather information about any preventive measures taken since the last rating. No guidance was given on how to interpret the BVC or the VAS. From these data, the final instrument (BVC-VAS) was developed as described in the statistical analysis section. The objective was to integrate the findings from the BVC and the VAS into a single summary score. Crafting an instrument compatible with routine use required graphic refinement of the BVC as well as a simple method to translate VAS readings into scoring points. The latter was achieved by constructing a slide rule that showed the VAS on the front and the corresponding score reading on the back. The final instrument was pre-tested in a different ward before application in the validation study.
Instrument validation
The new instrument (BVC-VAS) was integrated into clinical routine in two admission wards in two hospitals. To test the instrument in practical application, staff were aware of the interpretation of the obtained scores. As in the derivation sample, nurses assessed the risk of newly admitted patients twice daily on the day of admission and the following three days.
Outcome measurements
The main outcome measure was the occurrence of physical attacks on persons during the shift following the assessment. The severity of the aggressive event was recorded using the Staff Observation of Aggression Scale Revised (SOAS-R) [17–19]. Test accuracy was described as the area under the receiver operating characteristic curve [20]. A secondary outcome was the implementation of intense preventive measures such as seclusion or forced injection of psychotropic drugs. Although this outcome cannot be regarded as independent of the prediction, it allows false positive cases to be evaluated, i.e. it allows us to examine whether patients were unable to perpetrate violent attacks because of intense preventive measures. Thus, some of the false positive predictions may in fact be a consequence of effective prevention [13, 21].
Statistical analysis
The overall aim in developing the BVC-VAS was to arrive at a simple numeric scoring system that presents risk as natural frequencies (e.g. 1 out of 10 patients with this score will attack). Such a presentation of results is believed to provide a better basis for action than a simple categorization as low or high risk. The statistical analysis consisted of two steps. First, an optimized prediction score was derived from the derivation dataset with the aim of providing four distinct risk strata: high risk, moderate risk, low risk and very low risk. Second, the application of the scoring system was tested under realistic conditions in a validation sample.
During derivation we employed independent logistic regression analyses with attack, aggression and coercive measures as the binary outcome variables. To account for a possible non-linear relation between risk and individual BVC items, we performed additional analyses by entering each item as an individual variable and by recoding the number of BVC items present into dummy variables. Second, we explored the relation between the VAS distance measured in mm and the occurrence of physical attacks by independent logistic regression analyses. Within the constraints of the small dataset, these analyses did not suggest that coding symptoms individually was superior to simply adding them up. Next, several transformations of the raw VAS score (logarithmic, quadratic) were carried out, of which the logarithmic transformation yielded the highest discriminatory power. Because replacing the log-transformed VAS with the scoring points did not alter the predictive accuracy, we proceeded to add the BVC and the VAS into a common summary score. We checked which combination of BVC scores and VAS scores would yield the best performing model by testing different weights for the two scores. However, owing to the small number of observed events, logistic regression analyses could not establish with statistical significance whether unequal weighting would have improved diagnostic performance over giving equal weights to the subjective assessment and the BVC. Therefore, we proceeded with equal weights. Thus, the final scale consisted of 12 score points, of which up to 6 were contributed by the BVC and up to 6 by a logarithmic transformation of the VAS. Finally, we calculated multilevel likelihood ratios for ranges of the revised BVC score, so that risk could be expressed numerically rather than in ambiguous wording. For practicability we chose four risk segments, corresponding to very low risk, low risk, moderate risk and high risk.
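The scoring logic described above can be illustrated with a minimal sketch. It is an illustration under stated assumptions, not the published instrument: the text specifies only that up to 6 points come from the BVC and up to 6 from a logarithmic transformation of the VAS, so the particular log mapping in vas_points, the stratum cut-offs passed to multilevel_likelihood_ratios and all function names are hypothetical. The likelihood ratio of a stratum is computed in the usual way, as the proportion of patients with an attack who fall into the stratum divided by the proportion of patients without an attack who fall into it.

```python
import math

VAS_LENGTH_MM = 100   # the VAS was 10 cm long
MAX_VAS_POINTS = 6    # up to 6 points come from the VAS


def vas_points(vas_mm):
    """Map a VAS reading in mm to 0-6 points.

    ASSUMPTION: this particular log mapping is hypothetical; the source
    only states that a logarithmic transformation was used.
    """
    if not 0 <= vas_mm <= VAS_LENGTH_MM:
        raise ValueError("VAS reading must lie between 0 and 100 mm")
    scaled = math.log1p(vas_mm) / math.log1p(VAS_LENGTH_MM)
    return round(MAX_VAS_POINTS * scaled)


def summary_score(bvc_items, vas_mm):
    """Equal-weight sum of BVC behaviours present (0-6) and VAS points (0-6)."""
    if len(bvc_items) != 6:
        raise ValueError("the BVC consists of six behaviours")
    return sum(bool(item) for item in bvc_items) + vas_points(vas_mm)


def multilevel_likelihood_ratios(scores, attacked, strata):
    """Likelihood ratio per score range.

    strata maps a label to an inclusive (low, high) score range.
    ASSUMPTION: the ranges are supplied by the caller; the source does
    not report the cut-offs here. Returns None where the ratio is undefined.
    """
    n_events = sum(bool(a) for a in attacked)
    n_nonevents = len(attacked) - n_events
    ratios = {}
    for label, (low, high) in strata.items():
        in_stratum = [bool(a) for s, a in zip(scores, attacked) if low <= s <= high]
        events = sum(in_stratum)
        nonevents = len(in_stratum) - events
        if n_events == 0 or nonevents == 0:
            ratios[label] = None
        else:
            ratios[label] = (events / n_events) / (nonevents / n_nonevents)
    return ratios


def post_test_probability(prevalence, likelihood_ratio):
    """Convert a pre-test probability and a likelihood ratio into a post-test probability."""
    odds = prevalence / (1 - prevalence) * likelihood_ratio
    return odds / (1 + odds)
```

With an invented pre-test probability of 0.1 and a stratum likelihood ratio of 4, post_test_probability returns 4/13, i.e. roughly 3 out of 10 patients, which is the kind of natural-frequency statement the instrument aims to provide.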
In the validation dataset we assessed the discriminatory performance of the total score and of each subscore in independent analyses with the respective outcomes (attack; attack or intense preventive measures) as the outcome variables. To compare models, we used the area under the receiver operating characteristic curve (AUC-ROC). The AUC-ROC is determined by plotting sensitivity against 1-specificity for all possible cut-offs, which in the case of the combined BVC-VAS score range from 0 to 12. An area of 1 indicates perfect prediction; an area of 0.5 is a chance result. Few clinical scores achieve AUCs above 0.75, and tests with an AUC of 0.95 are considered excellent [20]. Analyses were carried out in SPSS version 10 (SPSS Inc., Chicago, Illinois) to obtain confidence intervals for the area under the receiver operating characteristic curve and in SAS version 8.2 (SAS Institute, Cary, North Carolina) for model development.
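The construction of the AUC-ROC described above can be sketched as follows. This is a minimal illustration rather than the SPSS or SAS procedure used in the study: the function names are hypothetical, the area is obtained with the trapezoidal rule, and no confidence intervals are computed. A patient is counted as test positive when the score is at or above the cut-off.

```python
def roc_points(scores, outcomes, max_score=12):
    """Return (1 - specificity, sensitivity) pairs for every cut-off.

    Cut-offs run from max_score + 1 (nobody test positive) down to 0
    (everybody test positive), so the points go from (0, 0) to (1, 1).
    """
    positives = sum(bool(o) for o in outcomes)
    negatives = len(outcomes) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both outcome classes are needed for a ROC curve")
    points = []
    for cutoff in range(max_score + 1, -1, -1):
        true_pos = sum(1 for s, o in zip(scores, outcomes) if s >= cutoff and o)
        false_pos = sum(1 for s, o in zip(scores, outcomes) if s >= cutoff and not o)
        points.append((false_pos / negatives, true_pos / positives))
    return points


def auc_trapezoid(points):
    """Area under the curve through the given points (trapezoidal rule)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Invented toy data: scores that separate the outcomes perfectly give an area of 1,
# identical scores for all patients give the chance result of 0.5.
perfect = auc_trapezoid(roc_points([1, 2, 10, 11], [0, 0, 1, 1]))
chance = auc_trapezoid(roc_points([5, 5, 5, 5], [0, 1, 0, 1]))
```

Because tied scores are joined by a straight segment, the trapezoidal area equals the probability that a randomly chosen patient with an attack scores higher than a randomly chosen patient without one, with ties counted as one half.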