Countertransference feelings and personality disorders: A psychometric evaluation of a brief version of the Feeling Word Checklist (FWC-12)

Background The Feeling Word Checklist (FWC) is a self-report questionnaire designed to measure therapists’ countertransference (CT) feelings. The primary aim of the study was to evaluate the psychometric properties of a brief version of the Feeling Word Checklist comprising twelve feeling words (FWC-12). The second aim was to validate the factor structure by examining the associations between the FWC-12 factors, patients’ personality pathology and therapeutic alliance (TA). Methods Therapists at 13 different outpatient units within the Norwegian Network of Personality Disorders completed the FWC-12 every 6 months during the course of treating a patient with a personality disorder (PD), over a period of up to 2.5 years. A large sample of patients with personality pathology participated in the study. The data were analysed with exploratory (EFA) and confirmatory (CFA) factor analysis. Internal consistency was estimated using Cronbach’s alpha. The Structured Clinical Interview for DSM-IV – Axis II (SCID II) and Mini International Neuropsychiatric Interview (MINI) were used as diagnostic instruments, and patient-rated TA was assessed using the Working Alliance Inventory (WAI-SR). Results Factor analyses revealed three clinically meaningful factors: Inadequate, Idealised and Confident. These factors had acceptable psychometric properties. Most notably, a number of borderline PD criteria correlated positively with the factors Inadequate and Idealised, and negatively with the factor Confident. All the factors correlated significantly with at least one of the WAI-SR subscales. Conclusions The FWC-12 measures three clinically meaningful aspects of therapists’ CT feelings. This brief version of the FWC seems satisfactory for use in further research and in clinical contexts.

ways that capture much of the complexity of what the therapist is experiencing (5,8,9).
Two approaches have been used to measure CT empirically. One is to have therapists fill out self-report questionnaires (8,10). The other is to have an external observer evaluate recorded material from sessions (11-13). One of the benefits of questionnaires is their quantitative nature: they can be distributed to many therapists, yielding a large number of reports of experienced feelings that can be used to identify common patterns of feelings (18). The Feeling Word Checklist (FWC) (10,14,15), in various versions, is one of the most used questionnaires for research on CT (16).

The FWC
In the FWC, therapists are given a list of feeling words and must check off the feelings they have experienced in relation to a patient. The first version of the FWC was developed by Whyte et al. (10) and comprised 30 feeling words. Later, different versions of the questionnaire were developed that include between 24 and 58 feeling words. These different versions were created, in part, to include feeling words that experienced therapists found to be missing from the original list and to enhance the stability of the underlying factors in the FWC (17). Different research groups have identified different factors underlying the words in the FWC, and between three and seven factors are described in other studies. Variations in the number of factors used may be explained by the use of different versions of the FWC and dissimilar scale formats, such as multiple response-point Likert scales or dichotomous yes/no versions (15).
The statistical methods used to evaluate these versions have also varied; while most studies have used principal component analyses, some have used factor analyses (9). Furthermore, the studies conducted have involved therapists of diverse professions, different patient populations and large heterogeneous groups. The first studies using the FWC were performed in inpatient departments. More recent studies have examined the factor structure when the FWC is applied in individual therapy (9,17-19).
There is still no consensus about which FWC version best captures the CT phenomenon. Generally, all studies have found at least one factor reflecting positive feelings and at least one reflecting negative feelings (9). One goal in CT research is to recover the same feeling factors across studies, as this would support the view that important aspects of CT have been captured. So far, many of the factors identified in the FWC overlap.
In our study, we used a brief FWC version that features only 12 feeling words. A brief version is easier to implement in clinical contexts, thus facilitating psychotherapy process and outcome research. An instrument such as the FWC-12 can also help therapists become more aware of their feelings. Several studies have demonstrated that therapists' feelings are related to patient outcomes (7,20). For therapists to become aware of this phenomenon, it must be given attention. There is a need for research on the psychometric properties of instruments used in CT research to make future studies more robust (21). A brief questionnaire, such as the FWC-12, will obviously not capture all important CT feelings, but it can provide insight into meaningful aspects or patterns of feelings.

CT and Personality Pathology
Because of the relational strain often reported by therapists working with patients who have a personality disorder (PD), there is extensive clinical and theoretical literature on CT and patients with PD. In particular, many clinical articles concern borderline PD. Kernberg (22) described how these patients tend to elicit powerful CT reactions in therapists because of their intense, primitive and regressive transferences. Furthermore, it has been argued that specific CT feelings, such as the feeling of being idealised or devalued as a therapist, are the most reliable guide to diagnosing borderline PD. There are, however, few empirical studies on PD and CT (5,23-26). Two studies have explored therapists' feelings in relation to Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) diagnoses at the PD cluster level and found that patients with cluster A (i.e., paranoid, schizoid and schizotypal PD) and B (i.e., antisocial, borderline, histrionic and narcissistic PD) disorders elicited more negative CT feelings than did patients with cluster C disorders (i.e., avoidant, dependent and obsessive-compulsive PD) (5,23). Some studies (5,24)

CT and Therapeutic Alliance (TA)
The term therapeutic alliance (TA) refers to the working relationship between the therapist and patient. To date, only a few studies have investigated the relationship between TA and CT (13,17,21,27). Both the TA measurement instruments and the results of these studies vary widely. Existing studies also differ in terms of whether TA is patient-rated, therapist-rated or based on both perspectives. To summarise, two studies reported a negative correlation between negative aspects of CT feelings and TA (13,21), while others found both negative correlations between negative CT feelings and TA and positive correlations between positive CT feelings and TA (17,27).

Aims of the Present Study
The patient populations used in many past studies on CT have been heterogeneous, and many studies have included a small number of patients. Very few studies have been based on a selection of poorly functioning patients with PD. This study draws on a large sample of patients with significant PD pathology. In this way, it may be possible to examine the feelings that therapists can experience while working with a varied sample of PD patients.
The primary aim of the current study is to explore the factor structure and psychometric properties of the FWC-12, used in a clinical sample of patients with PD or PD traits. Our secondary aim is to validate these factors by examining their relationship with patients' personality pathology and TA.
More specifically, we wanted to answer the following research questions:
How many clinically meaningful factors do the items in the FWC-12 represent?
What is the relationship between therapists' CT feelings, assessed by FWC-12, and patients' personality pathology?
What is the relationship between therapists' CT feelings, assessed by FWC-12, and patient-rated TA?

The therapists filled out the FWC-12 at 6-month intervals about a patient they had in treatment (i.e., from 6 months up to 2.5 years), with a final assessment at the end of the patient's treatment.

Patients
A total of 2,425 adult patients participated in this study. The mean age was 33 years (standard deviation [SD] = 10 years), and 76% of the patients were female.
According to the guidelines given in DSM-IV (29), 71% of participants had one or more PD diagnoses and 94% had at least one symptom disorder, wherein 68% had mood disorders and 57% had anxiety disorders (see Table 1 for prevalence of PDs). In the current study, the mean GAF score was 49.77 (SD = 6.06) and the mean WSAS was 22.60 (SD = 8.56). In addition, the mean GSI was 1.54 (SD = 0.66) and the mean IIP was 1.65 (SD = 0.52). All these measures reflect a poorly functioning patient group with a high level of symptom and interpersonal distress.
TA was measured using the revised short form of the Working Alliance Inventory (WAI-SR). The patients filled out the WAI-SR at the same intervals as the therapists filled out the FWC-12 (i.e., every 6 months from 6 months up to 2.5 years in their treatment period, with a final assessment at the end of the treatment).

Assessment
The FWC
The FWC is a self-report measure in which therapists rate their emotional responses toward a patient in a five-point response format (0-4), ranging from 'No such feeling' (0) to 'Very much' (4). The present study uses a short version of the Feeling Word Checklist 58 (FWC-58) that includes 12 items (FWC-12). In the FWC-12, the prompt, 'During recent conversations with the patient I have felt…' is followed by the 12 feeling words: Disliked, Important, Threatened, Exalted, Bored, Confident, Inadequate, Admired, On Guard, Calm, Invaded and Overview. Each of the words is rated from 0 to 4 by therapists, based on how acutely they experience each feeling.
The FWC-12 is new and was constructed for this study with the aim of creating a more applicable and less time-consuming questionnaire, reflecting some positive and some negative feelings. A further aim was to determine whether these positive and negative feelings provide important cues for describing therapy processes and outcomes in future studies. The items were selected from the FWC-58 partly using a data-driven method based on former factor analyses of the FWC-58 (15,17) and partly as a result of clinical considerations.

Diagnostics
All patients were diagnosed according to DSM-IV (29) using the Mini International Neuropsychiatric Interview (MINI) (34) for symptom disorders and the Structured Clinical Interview for DSM-IV-Axis II (SCID-II) for PD (35). Diagnostic reliability was not investigated. However, diagnostic assessments were performed in each unit by clinical staff who had received systematic training in diagnostic interviews and principles of the Longitudinal, Expert, All-Data (LEAD) procedure (36,37). This means that diagnoses were based on all available information, including referral letters, self-reported history, complaints, overall clinical impression and the results of the two diagnostic interviews (i.e., the MINI and SCID-II). In DSM-IV, the classification of PD is polythetic; that is, the criteria within each disorder are neither necessary nor sufficient. The number of fulfilled PD criteria can thus be seen as a reflection of the dimensional strength or closeness to prototypic PD constructs.

TA
The patients filled in the WAI-SR (38,39) every six months during treatment and at discharge from treatment. The WAI-SR is a 12-item questionnaire representing three different aspects of the patient's relationship to the therapist: bond, task and goal.
Patients are asked to judge each question on a Likert scale from 'Never' (1) to 'Always' (7). The patients filled out two versions of the WAI-SR: one with reference to their group therapist (WAI-G) and one with reference to their individual therapist (WAI-I).

Statistics

Unbalanced Sample
The data of the current study are based on ordinary routine assessments, but it is important to note that these routines sometimes fail for one reason or another.
Sometimes therapists fail to fill out FWC-12 at the proper time, and sometimes administrative routines fail so that the patients do not get their six-month questionnaires. As such, the dataset in the current study is unbalanced. See Table 2 for an assessment of the FWC-12 and WAI-SR.

Factor Analysis
We decided to analyse the FWC-12 after 12 months of therapy, assuming that therapy is well underway by that point. There is usually also some delay from the initial assessment period to inclusion in the treatment programme, although all patients have some kind of individual clinical contact with the unit during this waiting time. As such, there is good reason to assume that the treatment process has stabilised one year after the initial assessment.
The total sample of 2,425 patients was first randomly divided into 2 separate subsamples. This was done to facilitate the exploratory (EFA) and confirmatory factor analysis (CFA). The first sub-sample (n = 1,219) was used to conduct explorative factor analyses and the second (n = 1,206) to cross-validate the suggested factor structure in a confirmatory factor analysis. After 1 year of therapy, the number of completed FWC-12 questionnaires gathered was 869. With respect to the initial factor analysis, sub-sample 1 comprised 439 FWC-12 questionnaires and sub-sample 2 comprised 430 FWC-12 questionnaires. All other analyses are based on the total sample of 2,425 questionnaires.
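The random halving described above can be illustrated with a minimal sketch. It is not the authors' procedure (the study used SPSS); it assumes that each case is assigned to the first sub-sample independently with probability 0.5, which yields two halves of roughly, but not exactly, equal size. The function name `split_sample`, the seed and the sequential case IDs are hypothetical.

```python
import random


def split_sample(case_ids, p=0.5, seed=0):
    """Assign each case independently to one of two sub-samples.

    With p = 0.5 the two halves are approximately, not exactly, equal in size.
    The seed is an arbitrary illustrative value that makes the split repeatable.
    """
    rng = random.Random(seed)
    first, second = [], []
    for case_id in case_ids:
        # One independent draw per case decides its sub-sample.
        if rng.random() < p:
            first.append(case_id)
        else:
            second.append(case_id)
    return first, second


# Hypothetical case IDs standing in for the 2,425 patients.
ids = list(range(1, 2426))
efa_ids, cfa_ids = split_sample(ids)
print(len(efa_ids), len(cfa_ids))
```

Every case lands in exactly one sub-sample, so the exploratory and confirmatory analyses never share data.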
Using IBM SPSS Statistics for Windows, Version 25 (2017), randomisation of the total sample was done with the Select function (approximately 50%). Group differences were analysed using an independent samples t-test (two-sided). Effect sizes of group differences were estimated using Hedges' g (40). Relationships between variables were estimated by multiple linear regression analysis, and scale reliability was estimated using Cronbach's alpha (41). EFA and CFA were conducted using

Moreover, one item of factor two (Threatened) stood out, with a considerable lack of variance (mean: 0.05; SD: 0.28; skewness: 6.24; kurtosis: 45.46); that is, it is hardly a strongly endorsed feeling. Based on these findings, Bored and Threatened were omitted from the item pool, and Inadequate was moved to factor two. The result was a three-factor solution based on 10 items/feelings, with the factors labelled Idealised, Inadequate and Confident. See Table 3 for the final operationalisation of the three-factor model and estimates of scale reliabilities. As shown in Table 3, the scale reliabilities are in the acceptable range (i.e., at or above 0.70), except for the factor Inadequate.
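As an illustration of the scale reliability estimate mentioned above, the following minimal sketch computes Cronbach's alpha from the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. It is not the authors' SPSS procedure, and the ratings in the example are invented purely for demonstration; they are not study data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item ratings.

    Uses sample variances (n - 1 denominator) for items and total scores.
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Variance of the summed scale score across respondents.
    total_var = variance([sum(row) for row in rows])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Invented 0-4 ratings on a hypothetical three-item scale (not study data).
example = [[3, 2, 3], [1, 1, 0], [4, 3, 4], [2, 2, 1], [0, 1, 1]]
print(round(cronbach_alpha(example), 2))
```

By the conventional rule of thumb cited in the text, values at or above 0.70 are treated as acceptable.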
From the CFA of the new three-factor model, it was reasonable to make two modifications. The first was to accept a negative cross-loading from the item Important to the factor Inadequate, and the second was to accept a negative residual covariance between the item Inadequate and the item Overview of the factor Confident. In Table 4

To further validate the CT factors, we explored the relationship between the factors, the number of PD criteria, and TA. As shown in Table 6

It is noteworthy that the therapists overall score their CT feelings as low in intensity. The mean scores are 0.47 (Inadequate), 1.12 (Idealised) and 2.74 (Confident). The scores are in a similar range to that seen in many other FWC studies; that is, quite low scores are consistently reported (9,14,17,18). However, the patients in the present study are mainly poorly functioning PD patients, and one might have expected stronger feelings to be reported. It is also somewhat surprising that Confident is the feeling assigned the highest score. This could be due to the therapists overall being highly experienced and the fact that regular group supervision, in which CT is a focus, is part of the therapists' work. This is in line with the report of Ulberg et al. (18), in which Confident was positively associated with more experience as well as with an increased level of supervision; that is, they found lower levels of Inadequate feelings with more supervision.
Another explanation might be that the questionnaires were only filled in every six months. Some therapists have reported that 'overall' they feel relatively confident in meeting with the patient when they look at the relationship over a long period of time. More frequent measurements could likely capture more varied and possibly more intense CT feelings. Alternatively, the result could also be due to 'defensive bias', which is a potential weakness that all self-report questionnaires share. Some therapists may find it difficult to report negative feelings, whether because they find it unprofessional or because they are simply not conscious of the negative feelings.
The three factors, Idealised, Inadequate and Confident, are consistent with aspects of feeling responses when working with PD patients described both in the clinical literature (54,55) and in the existing empirical literature (5,24). Specifically, we found that therapists feel more idealised and more inadequate, especially when meeting patients who fulfil many borderline PD criteria. We also found that the total number of PD criteria correlated positively with the Inadequate factor. This is in line with the report of Dahl et al. (17). One could object that it is a weakness that feelings evoked by the avoidant PD patient group are not better captured in the FWC-12. Avoidant PD and borderline PD constitute the two largest patient groups in this material. The avoidant PD criteria showed a weak, but not significant, positive correlation with a Confident response. Previous empirical studies have reported that Confident is the most significant response from the group in question (16). However, this result might also reflect that the avoidant patients are a more heterogeneous patient group. That is, it is possible that there is greater variation in what therapists feel when treating these patients. From a clinical perspective, there is reason to believe that therapists may experience more negative feelings than previous studies have reported.
Correlational analyses with patient-rated TA revealed several meaningful and significant associations. Patient-rated TA, measured using the WAI-SR, showed a positive correlation with Confident and a negative correlation with Inadequate. This is in line with the study by Dahl et al. (17), although they rated patients' TA with a different instrument, called the Help and Understanding Scale (HUS). The correlations between patient-rated TA and CT are of particular interest because of the non-overlapping perspectives. Idealised correlated positively with patients' evaluation of their individual therapist (WAI-I) but not with their evaluation of their group therapist (WAI-G). From clinical experience, it is not very surprising that the individual therapist is idealised more than the group therapist. In treatment programmes involving both individual and group therapy, patients must share a therapist's attention with up to seven others during group therapy, making way for more complicated feelings such as envy, feelings of exclusion and feelings of being alone among others; as such, the therapist's lack of omnipotence is more striking than it is in individual sessions. As far as is known, no other empirical studies have found this association. However, the results from the correlational analysis with TA should also take into account that the patients' reports are not blind to the therapists. Thus, there is a possibility that there is some bias with respect to the patients' self-report reflecting their satisfaction with their therapists.

Strengths and Limitations
A considerable strength of this study is that the sample was large enough to be divided in two, for EFA and subsequent CFA, and that each of the sub-samples was large enough to yield stability in the estimates. Another strength is that the data comprise a large and representative sample of patients with PD and PD traits, wherein PD not otherwise specified (NOS), borderline PD and avoidant PD are the most prevalent diagnoses. The patient sample is also well described, representing a functionally impaired and highly symptomatic patient group, a group known to evoke powerful CT reactions in therapists. No previous studies on PD and CT have investigated CT using such a large sample of poorly functioning patients. This study can thus contribute to illuminating feelings that are typically described in clinical literature but empirically investigated to a small extent. However, the poorly functioning patients in this study may also restrict the generalisation of the results to other clinical samples.
One of the main limitations is that we do not know the number of therapists participating in the study. Additionally, the therapists did not necessarily score FWCs for each patient at every required assessment time (i.e., every six months for that patient). However, data from 13 different treatment units in Norway were collected, and there is reason to believe that the number of therapists is relatively high due to the number of units assessed, which can be regarded as a strength.
Conclusion
In this study, we found that the FWC-12 comprised three factors, labelled
Goodness-of-fit statistics from CFA based on the three-factor model of FWC-10 questionnaires at different time points. Note. Chi-square statistics: p values < .0001. CFA = confirmatory factor analysis; CFI = comparative fit index; CI = confidence interval; RMSEA = root mean square error of approximation; SRMR = standardised root mean square residual; TLI = Tucker-Lewis Index. Table 5.

Evaluations
Descriptive statistics of FWC subscales. Note: FWC assessed at one year. PD NOS = PD not otherwise specified. a No correction for comorbidity among PDs; only PDs represented by more than ten cases were listed.  **Correlation is significant at the 0.01 level (2-tailed).