This study is funded by the Brain Foundation Netherlands (grant number HA2017.01.04). The study has been approved by the medical ethical board of University Medical Centre Groningen, Groningen (NL66850.042.18), and is conducted in accordance with the Declaration of Helsinki. The study has been registered prospectively in the Netherlands Trial Register, trial number NL7758.
Patients who receive treatment in ambulatory mental health care for a psychotic disorder are eligible for the study. Patients are recruited from Dutch and Belgian mental health treatment centres. We recruit participants in the following ways: 1) through advertisement of the study, by distributing posters and flyers at the participating centres, allowing participants to enrol themselves; and 2) through clinicians who inform their patients and refer eligible and interested participants to the study.
Patients must meet all of the following criteria to be eligible to participate:
Patients who meet any of the following criteria will be excluded from participation:
An estimated IQ below 70;
Insufficient command of Dutch language.
Received CBTp for paranoid delusions in the past 12 months.
This study is a single-blind multicentre randomized controlled trial (RCT) with two conditions: 1) VRcbt for paranoid delusions as experimental condition, and 2) CBTp for paranoid delusions as active control condition. Participants in both conditions may receive other types of treatment as usual, including antipsychotic medication, with the exception of CBT. The two conditions are compared at baseline (T0), at post-treatment (T1), and at three-month follow-up (T2). A Consolidated Standards of Reporting Trials (CONSORT) inclusion flow diagram is shown in Fig. 1.
Randomization and allocation concealment
Randomization will occur after completion of the baseline assessment. Block randomization will be used, with a block of eight random assignments for each participating mental health centre. The allocator will hide the block size from the therapists and research assistants to prevent prediction of the next assignment. After a mental health centre has included eight patients, new blocks will be made available. Allocation to the two conditions will be in a 1:1 ratio. Randomization will be carried out with the online randomization program www.randomizer.org by an independent researcher of the University Medical Center Groningen who is not involved in the trial. After the baseline assessment, the first author will enter the patient's study ID into the online service and receive an email that details the patient's allocation. The first author will then contact the therapist to inform them of the patient's allocation. The therapist will then contact the patient about the allocation and to arrange the first therapy session.
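The block scheme described above can be illustrated with a minimal sketch. It is not the procedure used by www.randomizer.org; the class name `CentreAllocator`, the function `make_block` and the seed are hypothetical and serve only to show how a permuted block of eight yields a 1:1 ratio within each centre.

```python
import random


def make_block(block_size=8, arms=("VRcbt", "CBTp"), rng=None):
    """Return one permuted block containing each arm equally often (1:1)."""
    if rng is None:
        rng = random.Random()
    if block_size % len(arms) != 0:
        raise ValueError("block size must be a multiple of the number of arms")
    block = list(arms) * (block_size // len(arms))
    rng.shuffle(block)  # random order within the block
    return block


class CentreAllocator:
    """Hypothetical allocator for one centre: opens a new block when one is used up."""

    def __init__(self, block_size=8, seed=None):
        self._rng = random.Random(seed)
        self._block_size = block_size
        self._queue = []

    def next_assignment(self):
        if not self._queue:
            self._queue = make_block(self._block_size, rng=self._rng)
        return self._queue.pop()


# Illustrative use: the seed is arbitrary and only makes the sketch repeatable.
allocator = CentreAllocator(seed=2018)
first_block = [allocator.next_assignment() for _ in range(8)]
second_block = [allocator.next_assignment() for _ in range(8)]
```

Within every completed block of eight, four patients are assigned to each condition, which is why concealing the block size matters: someone who knew it could infer the last assignments of a block.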
Assessment and blinding
Assessments are carried out by independent research assistants blinded to treatment allocation. This is achieved by instructing coordinators, therapists and participants not to disclose group allocation to research assistants. Research assistants are instructed to stop the assessment in case of unblinding, and another research assistant will repeat the assessment. Other precautions include storing data revealing group allocation (e.g., therapist worksheets) in a separate location and using different assistants for each measurement as much as possible. Blinding is evaluated with a self-report form for research assistants at the end of the post-treatment and follow-up assessments. We will perform a sensitivity analysis by testing the treatment effect only for measurements where research assistants reported being completely blinded to group allocation.
Power and sample size calculation
In our previous RCT, comparing VRcbt with waiting list, the effect size on Ecological Momentary Assessment (EMA) paranoid delusions (see primary outcome measures) was 1.6. In the current study, the effect size is likely to be smaller with standard CBTp as active control condition. Assuming a much lower but still clinically relevant effect size of 0.6, a sample size of 122 (allowing 15% drop-out) will have a power of 80% to detect a statistically significant treatment effect, using an alpha of 0.05 and a standard deviation of 1.1 for the outcome measure. Taking the multilevel structure of the data into account, with a (conservative) intraclass correlation coefficient of 0.8 and a commonly applied multiplication factor, 106 participants are needed for a unilevel equivalent N = 122.
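The reported figures can be checked with a standard two-group normal-approximation formula. The sketch below assumes that 0.6 is the raw between-group difference on the outcome scale, that 1.1 is the common standard deviation, that the test is two-sided, and that drop-out is handled by multiplying by 1.15; none of these details is spelled out in the text, and the multilevel correction is not reproduced here.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Participants per group for a two-sided, two-sample comparison of means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    return ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2)


per_group = n_per_group(delta=0.6, sd=1.1)  # assumed raw difference and SD
total = 2 * per_group                       # two conditions, 1:1 allocation
with_dropout = ceil(total * 1.15)           # assumed 15% inflation for drop-out

print(per_group, total, with_dropout)  # 53 106 122
```

Under these assumptions the calculation reproduces both numbers mentioned above: 106 participants before and 122 after allowing for drop-out.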
VRcbt will be delivered by trained therapists, with at least a postgraduate qualification in CBT, and with a minimum of half a year of experience in psychosis treatment.
VRcbt consists of a maximum of 16 sessions within an 8–12-week time-frame. Sessions will last a maximum of 75 min, of which 40 min are spent in virtual social situations that trigger paranoid delusions and distress. The remaining time will be used to plan and reflect on exercises and to complete the session measurements. Throughout the first sessions, therapists construct individually tailored case-formulations based on CBTp, in close collaboration with the patient in order to create a shared understanding of the current paranoid ideas, related feelings and behaviour. Subsequently, patients are guided by therapists who help them drop safety behaviours and test their paranoid beliefs. The following animated virtual social environments can be used: café, shopping street, supermarket, bus ride, office and living room. The level of difficulty of the particular social environment can be modified by adjusting the number, gender and ethnic appearance of virtual characters (avatars) present in the situation. The level of hostility and suspicious behaviour can be modified as well. Personalized interactive scenarios can be role-played. The therapist talks via a microphone (with voice distortion) as an avatar and operates the avatars’ body movements. Patients wear an Oculus Rift head-mounted display and navigate through the virtual environments using a controller.
CBTp will be delivered by trained therapists, with at least a postgraduate qualification in CBT, and with a minimum of half a year of experience in psychosis treatment. CBTp also consists of a maximum of 16 sessions of up to 75 min each within an 8–12-week time-frame. CBTp emphasises cognitive techniques, such as cognitive restructuring, and behavioural interventions, including exposure and behavioural experiments. CBTp focuses on exercises aimed at reappraising the meaning of paranoid beliefs in order to reduce distress and improve coping in daily life. Throughout the first sessions, therapists construct individually tailored case-formulations based on CBTp, in close collaboration with the patient in order to create a shared understanding of the current paranoid ideas, related feelings and behaviour. In each session, time will be reserved for planning and reflecting on exercises and completing the session measurements. The Dutch CBT protocol for delusions will be applied (gedachtenuitpluizen.nl).
Treatment quality and fidelity
Both interventions will be delivered by the same therapists, who are trained in both protocols. All therapists are supervised by a highly skilled and experienced mental health care professional with a registration at the Dutch Association of Behavioural and Cognitive Therapy (VGCt). The VGCt is the scientific association for cognitive-behavioural therapists in the Netherlands. The VGCt is committed to high-quality and scientifically sound development and practice of cognitive-behavioural therapy. For each participant, therapists write an individually tailored case conceptualisation, guided and evaluated after session two by the supervising psychologist. The treatment can only continue after approval of the case conceptualisation. In addition, therapists participate every month in 2-h group supervision sessions, both for VRcbt and CBTp, during which ongoing treatments are presented and discussed. Supervisors meet in online sessions every six weeks. All treatment sessions are audio recorded. A selection of treatment sessions will be rated for treatment fidelity, using the Cognitive Therapy Rating Scale (CTRS). The CTRS is a reliable and valid instrument to measure treatment fidelity when following a CBTp protocol.
Materials and measurement instruments
Primary outcome measure
Momentary paranoia in daily life social situations
Level of momentary paranoia in daily life is measured with Ecological Momentary Assessment (EMA). EMA is a structured diary method in which patients are asked to report their momentary thoughts, feelings and symptoms, as well as the (appraisal of the) social context in daily life. Momentary paranoia is measured ten times a day for seven consecutive days before treatment, after treatment and three months after treatment. Items assessing momentary paranoia include “I feel that others might hurt me”, “I feel that others dislike me” and “I feel suspicious”. Items are scored on a 7-point Likert scale ranging from 1 (not at all) to 7 (very). Mean total scores are calculated based on all 70 measurements. The EMA allows investigation of experiences occurring in daily life environments instead of retrospective self-reflection on feelings and behaviour. Therefore, the EMA is less sensitive to recall bias and has a high ecological validity.
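How a mean score over one EMA week might be computed can be sketched as follows. The text does not specify how item ratings are combined per measurement or how missed measurements are treated; this sketch assumes that each measurement is scored as the mean of its item ratings and that missed measurements (represented as `None`) are skipped. The function names and the example ratings are invented for illustration.

```python
from statistics import mean

LIKERT_MIN, LIKERT_MAX = 1, 7   # 1 = not at all, 7 = very
BEEPS_PER_WEEK = 10 * 7         # ten measurements a day for seven days


def beep_score(item_ratings):
    """Score one measurement as the mean of its paranoia item ratings (assumption)."""
    for rating in item_ratings:
        if not LIKERT_MIN <= rating <= LIKERT_MAX:
            raise ValueError(f"rating {rating} is outside the 1-7 scale")
    return mean(item_ratings)


def week_mean(beeps):
    """Mean over all completed measurements; None marks a missed one (assumption)."""
    completed = [beep_score(b) for b in beeps if b is not None]
    return mean(completed) if completed else None


# Invented example: two completed measurements out of 70 scheduled ones.
week = [(2, 3, 1), None, (4, 4, 4)] + [None] * (BEEPS_PER_WEEK - 3)
score = week_mean(week)
```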
Secondary outcome measures
Social participation is measured by means of EMA. Social participation items assess level of social activities, and proportion of time spent in social company in a natural flow and setting of daily life.
The Green Paranoid Thoughts Scale (GPTS) is a self-report questionnaire that consists of two subscales each including sixteen items: part A measures paranoid delusions of social reference and part B measures social persecution, in the past month on a five-point Likert scale. Both scales and their dimensions have good internal consistency and validity.
Paranoid delusions and hallucinations
The Psychotic Symptom Rating Scales (PSYRATS) are semi-structured interviews designed to measure the subjective characteristics of hallucinations and delusions. The PSYRATS has good inter-rater and retest reliability and has good validity, as assessed by internal consistency, sensitivity to change, and in relation to the PANSS.
The Social Interaction Anxiety Scale (SIAS)  consists of nineteen items assessing fear of general social interaction, i.e. distress when meeting and talking with other people, on a five-point Likert scale. High levels of internal consistency and test-retest reliability were established for both scales [27, 28].
The Inventory of Depressive Symptomatology Self-Report (IDS-SR) is a 30-item self-report questionnaire assessing severity of depressive symptomatology in the past seven days. Psychometric properties of the IDS-SR are satisfactory and the instrument has been recommended in research [29, 30].
The Safety Behaviours Questionnaire – persecutory delusions (SBQ) is a semi-structured interview designed to measure safety behaviours, i.e. actions with the aim of reducing persecutory threat. If a safety behaviour is reported, the patient is asked to rate its frequency over the past month on a four-point scale. The SBQ has high inter-rater reliability, adequate test-retest reliability and adequate validity.
The Penn State Worry Questionnaire (PSWQ) is a 16-item self-report questionnaire which assesses trait pathological worry on a five-point scale. The PSWQ has proven to be a reliable and valid measure.
The Self Esteem Rating Scale – Short Form (SERS-SF) has been designed to measure self-esteem by means of a positive and a negative self-esteem subscale. The instrument is a self-report questionnaire that contains 20 items using a seven-point Likert scale. The SERS-SF has been shown to be a reliable and valid instrument.
The Interpersonal Sensitivity Measure (IPSM)  is a self-report questionnaire developed to measure hypersensitivity to interpersonal rejection. The IPSM yields a total score as well as five subscale scores: interpersonal awareness, need for approval, separation anxiety, timidity and fragile inner-self. The IPSM has good psychometric properties [34, 35].
The Brief Core Schema Scales (BCSS), a 24-item self-report questionnaire, assesses schemata concerning the self and others on four dimensions (negative-self, positive-self, negative-other and positive-other) on a five-point Likert scale. The BCSS has good psychometric properties including construct validity.
The Davos Assessment of Cognitive Biases (DACOBS) is a self-report questionnaire which assesses cognitive problems and biases on seven independent subscales (jumping to conclusions, belief inflexibility bias, attention for threat bias, external attribution bias, social cognition problems, subjective cognitive problems and safety behaviour) in the past two weeks on a seven-point Likert scale. The DACOBS has proven to be a reliable and valid measure for use in clinical practice and research.
Cost-effectiveness and cost-utility
Healthcare and productivity costs
The Trimbos Institute and Institute of Medical Technology Assessment questionnaire for Costs associated with Psychiatric illness (TiC-P) is a self-report questionnaire assessing direct medical costs and productivity costs due to absence from work or reduced efficiency during paid or unpaid work. The psychometric properties of the TiC-P are satisfactory.
Quality of life
The EuroQol Five Dimensions Five Levels (EQ-5D-5L) is a health-related self-report questionnaire which assesses quality of life on five dimensions: mobility, self-care, usual activities, pain/discomfort, and anxiety/depression. Each dimension is rated on a five-level scale that describes the extent of problems in that area. Participants also rate their overall health on the day of the assessment on a 0–100 visual analogue scale (EQ-VAS). The EQ-5D-5L has shown improved psychometric properties relative to the EQ-5D-3L.
Clinically meaningful change
In order to test differences between VRcbt and CBTp in the number of sessions needed to achieve clinically meaningful change, Visual Analogue Scales (VAS), the Sheehan Disability Scale (SDS) and the Clinical Global Impressions Scales (CGI) are used. Patient ratings on the VAS and the SDS will be collected at the beginning of each session, and clinician ratings on the CGI will be collected at the end of each session. The Visual Analogue Scales (VAS) consist of nine items extracted from the GPTS, which assess paranoid thoughts and delusions.
The Sheehan Disability Scale (SDS) is a patient-rated measurement on a ten-point scale designed to assess functional impairment in three inter-related domains: work/school, social and family life. The SDS is a psychometrically sound instrument. The Clinical Global Impression Scale (CGI) is a brief clinician-rated measurement on a 7-point scale assessing overall symptom severity (i.e. severity of complaints of paranoia, safety behaviour and social avoidance) and global improvement (i.e. an overall comparison of the patient's baseline condition with a rating of current therapeutic benefit). The CGI applied to schizophrenia has proven to be a valid and reliable instrument to evaluate severity and treatment response.
The Igroup Presence Questionnaire (IPQ) is a fourteen-item self-report questionnaire using a seven-point Likert scale, designed to measure the sense of presence experienced in a virtual environment. The IPQ has established good psychometric properties.
Social functioning
The Personal and Social Performance Scale (PSP) is an interview designed to assess the extent of disability (1 absent – 5 severe) in four components of social functioning (meaningful activities, personal and social relationships, self-care, and disturbing and aggressive behaviour). The ratings of each component are combined into one score from 0 to 100. The PSP has been shown to be a reliable and valid measure.
Demographic background
The questions regarding demographic information include level of education, ethnicity, age, gender, substance use (alcohol, tobacco, cannabis and illicit drugs) and use of medication (antipsychotics, benzodiazepines, other psychotropic drugs).
Patients who might be eligible for participation will be contacted by their treating clinician and asked if they are interested in participating in the study. Clinicians will ask patients for consent to share their contact information with the researchers, after which interested patients will be informed about the study and screened by the researchers. After receiving written information about the study, patients will be given a consideration period of one week. If patients decide to participate after one week of consideration, written informed consent will be obtained first, and subsequently patients will complete the GPTS. If the GPTS score is > 40, patients are eligible for the study and the baseline assessment (T0) will continue. Following the baseline assessment, the baseline period of EMA will take place. After a week, patients are randomized to either VRcbt or CBTp and start the allocated treatment. At the beginning of each session, participants complete the SDS and VAS. At the end of each session, therapists complete the CGI. After the treatment period, a post-treatment assessment (T1) takes place, followed by seven days of EMA. Finally, a follow-up assessment (T2) will be conducted six months after the start of treatment, followed by a final week of EMA. Patients who discontinue participation in the study at an early stage are requested to continue to participate in the measurements.
In both treatment conditions, early completion will be considered when paranoid ideation and avoidance on the CGI scale are rated as zero in two consecutive sessions, and the target behaviour has been achieved. This applies to all the (virtual) situations formulated in the case formulations in agreement with the supervisor.
Patient data will be coded using a study ID. Personal information and informed consent will be stored separately and safely to ensure privacy. Data will be collected using an electronic case report form and will be stored in Research Electronic Data Capture (REDCap). To evaluate the quality and integrity of the research, an independent study monitor will inspect the study annually.
Analysis will be performed according to the intention-to-treat principle. The groups will be compared at baseline. In case of baseline differences, variables will be added to further analyses as covariates. Potential covariates include age, sex, duration of illness and medication. The effect of VRcbt will be analysed by random intercept mixed effects regression models. The fixed effect of interaction between treatment group (VRcbt or CBTp) and time on momentary paranoia will be fitted as an estimate of the VRcbt treatment effect. Mean scores before and after treatment on each of the dependent variables will be compared between conditions. To determine the number of sessions needed for achieving clinically meaningful change, scores of patients on the VAS, the SDS and the CGI will be compared between conditions. The standard error of measurement (SEM) will be used to determine the proportion of participants with clinically meaningful change in each condition at each time point. Furthermore, an economic evaluation will be conducted using the TiC-P and the EQ-5D-5L questionnaire. It consists of a cost-effectiveness analysis (CEA) with improved momentary paranoia in daily life situations as outcome and a cost-utility analysis (CUA) with quality-adjusted life years (QALYs) gained as outcome. For both analyses, the incremental cost-effectiveness ratio (ICER) will be calculated as the between-group cost difference divided by the between-group effect difference. The ICER represents the additional costs needed (or saved) for establishing the VRcbt effect. The cost-utility ratio does the same per additional QALY gained. To handle uncertainty in the cost and effect data, nonparametric bootstrapping will be conducted to simulate 2500 ICERs.
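The ICER and its bootstrap can be sketched in a few lines. This is a minimal illustration, not the analysis code of the trial: the function names are hypothetical, the per-patient (cost, effect) pairs are invented toy values, and the sketch assumes that patients are resampled with replacement within each condition while keeping each patient's cost and effect together.

```python
import random
from statistics import mean


def differences(group_a, group_b):
    """Between-group difference in mean cost and mean effect (a minus b)."""
    d_cost = mean(c for c, _ in group_a) - mean(c for c, _ in group_b)
    d_effect = mean(e for _, e in group_a) - mean(e for _, e in group_b)
    return d_cost, d_effect


def icer(d_cost, d_effect):
    """Incremental cost-effectiveness ratio: cost difference per unit of effect."""
    if d_effect == 0:
        raise ZeroDivisionError("ICER is undefined when the effect difference is zero")
    return d_cost / d_effect


def bootstrap_differences(group_a, group_b, n_boot=2500, seed=0):
    """Nonparametric bootstrap: resample patients within each group, with replacement."""
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_boot):
        sample_a = rng.choices(group_a, k=len(group_a))
        sample_b = rng.choices(group_b, k=len(group_b))
        replicates.append(differences(sample_a, sample_b))
    return replicates


# Invented toy data: (cost in euros, QALYs) per patient. Not study results.
vrcbt = [(1200.0, 0.70), (1500.0, 0.65), (1100.0, 0.72), (1700.0, 0.60)]
cbtp = [(1000.0, 0.62), (1300.0, 0.58), (900.0, 0.66), (1400.0, 0.55)]

point_estimate = icer(*differences(vrcbt, cbtp))
replicates = bootstrap_differences(vrcbt, cbtp)
# ICERs for replicates where the effect difference is non-zero.
boot_icers = [icer(dc, de) for dc, de in replicates if de != 0]
```

Keeping the replicated cost and effect differences as pairs, rather than only their ratio, makes it possible to plot them on a cost-effectiveness plane and avoids dividing by a zero effect difference in a replicate.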