Reliability and cultural applicability of the Greek version of the International Personality Disorders Examination.

Background The International Personality Disorders Examination (IPDE) constitutes the proposal of the WHO for the reliable diagnosis of personality disorders (PD). The IPDE assesses pathological personality and is compatible with both DSM-IV and ICD-10 diagnosis. However, it is important to test the reliability and cultural applicability of different IPDE translations. Methods Thirty-one patients (12 male and 19 female), aged 35.25 ± 11.08 years, took part in the study. Three examiners applied the interview (23 interviews rated by two examiners and 8 rated by three, that is, 47 pairs of interviews and 70 single interviews). The phi coefficient was used to test agreement on categorical diagnosis and the Pearson product-moment correlation coefficient to test agreement on the number of criteria met. Results Translation and back-translation did not reveal specific problems. Results suggested that the reliability of the Greek translation is good. However, socio-cultural factors (family coherence, work environment etc.) could affect the application of some of the IPDE items in Greece. The diagnosis of any PD was highly reliable, with phi >0.92. However, the diagnosis of non-specific PD was not reliable at all (phi close to 0), suggesting that this is a true residual category. Diagnoses of specific PDs were highly reliable, with the exception of schizoid PD. Diagnoses of antisocial and borderline PDs were perfectly reliable, with phi equal to 1.00. Conclusions The Greek translation of the IPDE is a reliable instrument for the assessment of personality disorder, but cultural variation may limit its applicability in international comparisons.


Background
A major goal of WHO's mental health program is the development of a 'common language' [1]. In this frame, the joint program between the World Health Organization (WHO) and the US Alcohol Drug Abuse and Mental Health Administration (ADAMHA) helped not only in the production of the section dealing with mental disorders in ICD-10 [2] but also in the development of a set of instruments for the assessment of mental disorders [3]. These are the Schedules for Clinical Assessment in Neuropsychiatry (SCAN) [4], the Composite International Diagnostic Interview (CIDI) [2], and finally the International Personality Disorders Examination (IPDE) [5,6].
The IPDE assesses the most important areas of pathological personality and is compatible both with DSM-IV [7] and ICD-10 diagnosis. Each ICD-10 and DSM-IV criterion is precisely defined and guidelines and anchor points for scoring are provided for each IPDE question. The instrument is designed for use by clinicians (psychiatrists or clinical psychologists) experienced in the assessment of personality disorders.
It consists of 157 items arranged under the following 6 headings: work, self, interpersonal relationships, affects, reality testing and impulse control. The items are introduced by open-ended inquiries and offer the individual the opportunity to discuss the topic and supplement the answers with examples or anecdotes. Additionally, the instrument provides a set of probes to determine whether the individual has met the frequency, duration and age of onset requirements. Each item is scored 0 (absent or within normal range), 1 (present to an attenuated degree) or 2 (pathological, meets criterion standards). The results include both a categorical diagnosis of personality disorders in both classification systems and a dimensional score for each personality disorder. Both can be obtained by paper-and-pencil algorithms or by the available computer software, which processes the data and provides ICD-10 and DSM-IV categorical diagnoses, the number of criteria met and dimensional scores. The IPDE also includes a screening questionnaire, with untested reliability and validity [8].
In essence, the IPDE is the outgrowth of the Personality Disorder Examination (PDE) [9]. The examiner is expected to perform a detailed clinical examination of the patient so as to obtain anecdotes, examples and details supporting or disputing the diagnosis of individual criteria. The IPDE adopts the conservative and somewhat arbitrary convention that a behavior or trait should be present for at least 5 years before it is used for the diagnosis of personality disorder, and that at least one criterion should be present before the age of 25. This was judged necessary in order to ensure that the behaviors included are persistent rather than episodic and characterize adult life.
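The duration and age-of-onset convention described above can be illustrated with a minimal Python sketch. The class `RatedCriterion`, its field names and the example values are hypothetical and are not part of the IPDE materials; the sketch also assumes that the onset requirement is checked only among criteria that already satisfy the 5-year duration rule, which the text does not state explicitly.

```python
from dataclasses import dataclass

MIN_DURATION_YEARS = 5   # trait must have been present for at least 5 years
MAX_ONSET_AGE = 25       # at least one criterion must be present before age 25


@dataclass
class RatedCriterion:
    """Hypothetical record of one criterion rated as present."""
    name: str
    duration_years: float
    onset_age: float


def eligible_criteria(criteria):
    """Keep only criteria that satisfy the 5-year duration convention."""
    return [c for c in criteria if c.duration_years >= MIN_DURATION_YEARS]


def onset_requirement_met(criteria):
    """True if at least one duration-eligible criterion began before age 25."""
    return any(c.onset_age < MAX_ONSET_AGE for c in eligible_criteria(criteria))


# Invented example values, for illustration only.
example = [
    RatedCriterion("criterion A", duration_years=10, onset_age=20),
    RatedCriterion("criterion B", duration_years=2, onset_age=30),
]
print([c.name for c in eligible_criteria(example)])  # ['criterion A']
print(onset_requirement_met(example))                # True
```

In this sketch, criterion B is excluded because it has been present for less than 5 years, which corresponds to the aim of counting only persistent rather than episodic behaviors.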
The first paper on the feasibility of the IPDE was published by Loranger et al [10]. However, it is important to test the reliability and cultural applicability of different IPDE translations. This is true for all the instruments mentioned above, and several translations into different languages have already been produced and tested.
The aim of the current study was to test the reliability and the cultural applicability of the Greek translation of the IPDE in Greece. The study follows WHO field test protocols and is focused on testing cultural appropriateness and inter-rater reliability.
All patients were inpatients or outpatients of the 3rd Department of Psychiatry, Aristotle University of Thessaloniki, University Hospital AHEPA, Thessaloniki, Greece. All were physically healthy with normal clinical and laboratory findings, including electroencephalogram and thyroid function.
The initial reason for which patients had asked for help was anxious or depressive symptomatology. At the time of the interview, no patient fulfilled criteria for a DSM-IV Axis I disorder, so their clinical symptomatology was at least in partial remission. No patient had ever suffered from any psychotic or substance abuse disorder. No patient suffered from organic mental disorder, mental retardation or language and communication disorders.
The authors attempted to ensure that around half of the patients selected had personality disorder and the other half did not. This was done in order to give the full range of personality variability to support the aims of the study.
Translation from English into Greek was performed by one of the authors (KNF) and back translation by a second one (AI), who was unaware of the original English text. The final text of the Greek version of the IPDE was produced by consensus of these two authors.
Three examiners (KNF, CI and FB) applied the interview. All had been trained by another author (AD), who had taken part in the original 1994 validation study. The data consisted of 23 interviews rated by two examiners and 8 interviews rated by three examiners, that is, 47 pairs of interviews and 70 single interviews. One examiner conducted the interview and the others were silent observers, who could nevertheless ask any necessary questions at the end of the interview to clarify areas they considered of interest.
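The counts of single and paired interviews follow directly from the number of examiners present at each session. The short Python sketch below reproduces this arithmetic; the function name `count_ratings` is an illustrative choice and not part of the study protocol.

```python
from math import comb


def count_ratings(sessions_by_rater_count):
    """Map {examiners per session: number of sessions} to (single ratings, rater pairs)."""
    singles = sum(k * n for k, n in sessions_by_rater_count.items())
    pairs = sum(comb(k, 2) * n for k, n in sessions_by_rater_count.items())
    return singles, pairs


# 23 sessions with two examiners and 8 sessions with three examiners.
singles, pairs = count_ratings({2: 23, 3: 8})
print(singles, pairs)  # 70 47
```

A session with two examiners yields one pair of ratings, whereas a session with three examiners yields three pairs, which gives 23 + 24 = 47 pairs and 46 + 24 = 70 single ratings.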
Since it is known that there is significant disagreement between patient-based and informant-based information [11], the rating of the IPDE items was based on the clinical judgment of the examiner, taking into account all available information. This approach is most useful for disorders in which insight is lacking (e.g. antisocial, narcissistic).

Clinical Diagnosis
The Schedules for Clinical Assessment in Neuropsychiatry version 2.0 (SCAN v 2.0) were used to assist clinical diagnosis.

Laboratory Testing
Laboratory testing included blood and biochemical testing, T3, T4 and TSH.

Statistical Analysis [12]
The phi coefficient was used to test agreement on categorical diagnosis. The Pearson product-moment correlation coefficient was used to test the agreement between raters concerning the number of criteria met. Both coefficients take values from -1 (total disagreement) to +1 (total agreement); a value of 0 means agreement just by chance.
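As an illustration of the two agreement statistics, the following Python sketch computes the phi coefficient from two raters' binary diagnoses and the Pearson coefficient from their counts of criteria met. The function names and the example ratings are hypothetical and are not taken from the study data; the formulas are the standard textbook definitions, not necessarily the exact procedure of the statistical package the authors used.

```python
from math import sqrt


def phi_coefficient(rater_a, rater_b):
    """Phi for two equal-length sequences of binary diagnoses (1 = present, 0 = absent)."""
    pairs = list(zip(rater_a, rater_b))
    n11 = sum(1 for a, b in pairs if a and b)          # both raters: present
    n10 = sum(1 for a, b in pairs if a and not b)      # only rater A: present
    n01 = sum(1 for a, b in pairs if not a and b)      # only rater B: present
    n00 = sum(1 for a, b in pairs if not a and not b)  # both raters: absent
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        return float("nan")  # undefined when a rater never varies
    return (n11 * n00 - n10 * n01) / denom


def pearson_r(x, y):
    """Pearson product-moment correlation for two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Invented ratings, for illustration only.
print(phi_coefficient([1, 1, 0, 0, 1, 0], [1, 1, 0, 0, 1, 0]))  # identical ratings give 1.0
print(phi_coefficient([1, 1, 0, 0], [1, 0, 1, 0]))              # chance-level agreement gives 0.0
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))                    # perfectly linear counts give 1.0
```

Note that phi is undefined when one rater assigns the same diagnosis to every interview, which can occur for rarely diagnosed disorders; the sketch returns NaN in that case.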

Results
Translation and back-translation from English into Greek did not reveal specific problems. The original IPDE is written in relatively simple language and difficult expressions have been avoided.
The results and the experience of the examiners from the application suggested that the general applicability of the Greek translation is good. However, sociocultural factors concerning Greece (family coherence, work environment etc) were found to affect the application of some of the IPDE items in Greece. These items mainly concerned work and family relationships.
The statistics for all 70 interviews are shown in Additional File 1: Table 1 and Additional File 2: Table 2. At least one personality disorder according to either classification system was present in 38 (54.29%) of these interviews. Not Otherwise Specified (NOS) personality disorder was diagnosed in 15 (21.43%) interviews according to DSM-IV and in 7 (10%) according to ICD-10. The maximum number of personality disorders diagnosed in a single interview was 3 for DSM-IV and 4 for ICD-10.
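The reported percentages can be checked against the total of 70 single interviews. The Python sketch below is only an arithmetic check on the counts given in the text; the helper name `percent` is an illustrative choice.

```python
TOTAL_INTERVIEWS = 70


def percent(count, total, digits=2):
    """Percentage of `count` out of `total`, rounded to `digits` decimals."""
    return round(100 * count / total, digits)


# Counts as reported in the text.
reported = {
    "at least one PD": 38,
    "NOS PD (DSM-IV)": 15,
    "NOS PD (ICD-10)": 7,
}
for label, n in reported.items():
    print(f"{label}: {n}/{TOTAL_INTERVIEWS} = {percent(n, TOTAL_INTERVIEWS)}%")
# at least one PD: 38/70 = 54.29%
# NOS PD (DSM-IV): 15/70 = 21.43%
# NOS PD (ICD-10): 7/70 = 10.0%
```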
Pearson product-moment correlation coefficients concerning the number of criteria met and the dimensional scores in the 47 pairs of interviews varied from 0.42 to 0.93 but generally were between 0.60 and 0.70 (Additional File 3: Table 3). Phi correlation coefficients ranged between 0.28 and 0.97. Only schizoid and NOS personality disorder according to ICD-10 had phi near 0 (Additional File 4: Table 4). Antisocial and Borderline PDs were perfectly reliable, with phi equal to 1.00.
The diagnosis of any PD was highly reliable, with phi >0.92 in both classification systems. In contrast, not otherwise specified PD was not reliable at all (phi close to 0), suggesting that this is a true residual category.

Discussion
The current study suggests that the Greek version of the IPDE is reliable and culturally applicable. However, further study is necessary to elucidate some particular areas of interest, and protocols specifically targeting these areas are therefore needed. Some of these areas are discussed below. The discussion that follows is divided into three parts: one concerning inter-rater reliability, a second on cultural applicability and a third on nosological issues.
The results of the current study should be considered bearing in mind that it is neither an epidemiological survey nor a study of personality disorders per se, but concerns only the investigation of the properties of the Greek translation and the application of a semi-structured interview with proven international applicability. The sampling did not involve consecutive admissions; on the contrary, the selection of subjects served the primary aim of the study.

a. Inter-rater reliability
The primary aim of the current study was to test the reliability of the Greek version of the IPDE. The term 'reliability' here means 'inter-rater agreement'. The assessment of such agreement has serious drawbacks. It is obvious that the style of the interviewer may provoke different behaviors on the part of the subject. In addition, the response of the subject is uncertain when repeated interviewing with the same set of questions is used. In the current study a more conservative approach was adopted, that of the silent observer-rater. Validity is of course another issue and was not the focus of the current study.
The introduction of semi-structured interviews made it clear that the problem lies with the reliability and validity of the diagnostic concepts of both classification systems rather than with the construction of the instruments. Self-report instruments manifest additional drawbacks [13]. An additional problem is that clinicians consider direct questioning to be of limited value in PD diagnosis, in contrast with the diagnosis of clinical syndromes. For PD diagnosis, information from more than one source is considered necessary.
However, although semi-structured interviews were created in order to improve reliability, it is evident that reliability is all too often achieved at the expense of clinical validity. Reducing the interview to simple 'yes' and 'no' answers (suitable for lay interviewers) may lead to perfect agreement between raters (either lay or professional), but it forgoes the documentation of behavior through convincing examples, anecdotes and details [14]. This may lead to over-diagnosis, because subjects tend to reply positively to questions even though the characteristic may not be present to the degree necessary for the criterion to be rated as met. On the other hand, this approach may miss the diagnosis of disorders with poor insight. To a large extent this is the price one pays in order to create reliable instruments for lay interviewers.
The results of the current study show both similarities to and differences from the original study of Loranger et al [10] and from other studies that report the reliability of various instruments designed for the diagnosis of personality disorders [15]. The study of Bronisch et al reported a high reliability of the diagnosis of any PD, but a low reliability of the diagnosis of specific PDs. This is rather in contrast with the findings of Loranger et al, which reported a more reliable specific diagnosis. However, the general attitude of the literature is in favor of Bronisch et al and our results [16] suggest that most measures manifest high agreement on whether a person has some personality disorder as well as lack of discriminant validity between specific disorders. Many authors suggest that this is in fact a consequence of overlapping criteria sets and definitions.
There is another issue on which research has focused: the dispute over whether there should be a categorical or a dimensional approach to PD diagnosis [17][18][19]. Both classification systems use a categorical approach, but the two approaches are not mutually exclusive. The IPDE provides both a categorical and a dimensional approach, and the results of the current study support this decision. For example, schizoid PD, which had the lowest phi values (0.28 and 0.09, Additional File 4: Table 4), had satisfactory or high Pearson coefficients (0.42-0.73, Additional File 3: Table 3). This is in accord with the international literature and the original report [10], supports the complementary function of both approaches, and is most useful for the assessment of sub-threshold traits.
Personality disorders are common. Estimates of overall lifetime prevalence suggest that the percentage is above 10% [20]. It is obvious that the introduction of strict diagnostic criteria improved the reliability of diagnosis; however, it is also obvious that most clinicians do not follow these strict criteria in everyday clinical practice [21], but rather prefer an idiosyncratic approach to diagnosis even when using DSM or ICD labeling.

b. Cultural applicability
The international application of the IPDE revealed some but impressively few problems with the applicability of the diagnostic criteria in diverse cultures. The most striking finding concerned monogamous relationships (antisocial) and harsh treatment of spouses and children (sadistic). One should be very cautious in drawing conclusions from these data. The application of the IPDE in Greece revealed that there are significant problems concerning the applicability of ICD-10 but mainly of DSM-IV diagnostic criteria. The problems concerned the criteria and not the instrument.
The most difficult culture-bound problem concerned the job and occupational environment of the subject. In Greece many young adults remain within the protection (both emotional and financial) of the parental family, sometimes until the age of 25 or 30. It is therefore very difficult to assess their level of functioning, because this period is like a 'prolonged adolescence'.
The occupational environment in Greece could be classified into three major types:
a. Working in the public sector, in which demands for quality and quantity of work are low.
b. Hard labour in small private factories and workshops. This is an exhausting environment with demands for high quantity and disregard for quality.
c. Modern private firms in the service sector, which have high standards of quality and quantity control but at the same time suffer from drawbacks similar to those of the previous two categories.
Because of the above, it is very difficult to judge reliably the level of functioning of the subject and whether his/her complaints reflect reality or are exaggerated demands.
The assessment of inner experience and moral preferences was clearly more difficult than the assessment of behaviours, but this is a universal problem. The authors thought that it would be less difficult to assess inner experience in Greek subjects because of their extraverted temperament; on the contrary, however, this temperament might have produced false-positive results. It is not possible to draw conclusions on this issue from the current study alone, and comparison studies are necessary. The same is true for interpersonal relationships, which are expected to be more emotional in Mediterranean cultures.
In contrast, there were no particular problems concerning reality testing and impulse control.

c. Nosological issues
Although the current study is not an epidemiological one, it is important to mention the percentages of each specific PD diagnosis, as this may influence the results. In all 70 interviews, there was no diagnosis of Schizotypal PD, and the rarest diagnoses were Histrionic PD according to both classification systems, and Narcissistic and Dependent PD according to DSM-IV.
A striking finding was the large discrepancy between the diagnosis of Dependent PD according to DSM-IV (N = 1) and the same diagnosis according to ICD-10 (N = 8). DSM-IV demands the presence of 5 out of 8 criteria and ICD-10 the presence of 4 out of 6 (5 of which are included in the 8 DSM criteria). The criterion that seems to have made the difference was 'subordination of one's own needs to those of others on whom one is dependent and undue compliance with their wishes'. The authors' opinion is that this specific criterion is the source of the disagreement, and this is evident from the similar Pearson coefficients of dimensional scores in both systems (Additional File 3: Table 3), which suggests that in essence disagreement is minimal.
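The way two different thresholds can produce discordant categorical diagnoses from similar dimensional information can be sketched in Python. The score vectors below are invented for illustration and do not come from the study data; the sketch assumes that a criterion counts as met when its score is 2 and that the dimensional score is the sum of the criterion scores, and the function names are hypothetical.

```python
# Required number of criteria for Dependent PD, as stated in the text.
REQUIRED = {"DSM-IV": 5, "ICD-10": 4}


def criteria_met(scores):
    """Number of criteria scored 2 (pathological, meets criterion standards)."""
    return sum(1 for s in scores if s == 2)


def dimensional_score(scores):
    """Assumed dimensional score: sum of the 0/1/2 criterion scores."""
    return sum(scores)


def categorical_diagnosis(scores, system):
    """True if the number of criteria met reaches the system's threshold."""
    return criteria_met(scores) >= REQUIRED[system]


# Invented scores for one hypothetical subject.
dsm_scores = [2, 2, 2, 2, 1, 1, 0, 0]  # 8 DSM-IV criteria
icd_scores = [2, 2, 2, 2, 1, 1]        # 6 ICD-10 criteria

for system, scores in (("DSM-IV", dsm_scores), ("ICD-10", icd_scores)):
    print(system, criteria_met(scores), categorical_diagnosis(scores, system),
          dimensional_score(scores))
# DSM-IV 4 False 10
# ICD-10 4 True 10
```

In this hypothetical case the dimensional scores are identical, yet only ICD-10 reaches a categorical diagnosis, which is the pattern of minimal underlying disagreement described above.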
The remaining specific diagnostic categories manifested similar results, but a modest degree of disagreement was evident. However, the issue of the agreement between the two classification systems, which adopt different nomenclature and criteria sets, is beyond the focus of the current study. A last but important observation concerning this matter is that more ICD-based interviews reached a PD diagnosis (N = 38, 54.29%) than DSM-based interviews (N = 30, 42.86%). In general, however, the IPDE tended to produce fewer PD diagnoses than unstructured interviewing (the patients' therapists considered almost all of them to suffer from PD), and this is in accord with Loranger et al [10].
The fact that Schizotypal PD was not diagnosed is not surprising since this type of PD is considered to genetically relate to schizophrenia and that is why ICD-10 does not consider schizotypal disorder to be a true personality disorder.
Conflicting opinions exist concerning Schizoid PD, which had the lowest phi values but adequate Pearson coefficients in both systems. According to Akhtar [22], these patients manifest a diffuse identity (they are not sure who they are and have conflicting thoughts, feelings, wishes and urges), and this leads them to problematic interpersonal relationships and finally to isolation and avoidance of relationships. However, this psychoanalytically oriented approach is not the one accepted by DSM-IV or ICD-10, which both consider schizoid PD to be characterized by emotional coldness (the deficit model). These patients do not express complaints and lack the ability to communicate their feelings. There is again a problem of definitions, and the results could be explained in different ways, but the above could well lead to a loss of reliability in diagnosis, because the core of this disorder lies in the assessment of inner experience rather than of externally observed behavior, which is much easier to assess.
Opposite Schizoid PD stands Avoidant PD, which is characterised by the avoidance of activities that may demand social interaction, but anxiety is not a prominent feature. In this sense it shares similarities with Schizoid PD, but in Avoidant PD social isolation is significant and arises from feelings of inadequacy, self-reproach and invalidity, while Schizoid PD patients are simply indifferent.
Specific mention should be made of the difference between Antisocial (according to DSM-IV) and Dissocial (according to ICD-10) PD, which should be attributed mainly to the large difference in the concept of psychopathic personality between the classification systems. Four out of 7 criteria of the A group of DSM criteria for antisocial PD largely coincide with 4 ICD criteria. The remaining DSM criteria focus on impulsivity, irresponsibility and deception, while the remaining ICD criteria focus on emotional cruelty and the incapacity to maintain interpersonal relationships, which according to DSM are more characteristic of narcissistic (which does not exist in ICD) and borderline PDs. DSM thus focuses largely on the legal and behavioural interpersonal aspects of dysfunction and, more importantly, demands an early onset of behaviours (before the age of 15). This is a narrow definition, which led to high agreement between raters both in categorical diagnosis and in dimensional scores (Additional File 3: Table 3 and Additional File 4: Table 4). In contrast, ICD focuses on features that the American system would consider as belonging to a mixed narcissistic-borderline-histrionic PD. This is a wider concept, which leads to lower reliability (Additional File 3: Table 3 and Additional File 4: Table 4).
Only one interview diagnosed a Narcissistic PD. These patients are very difficult to find [23] and the number of criteria met in the current study was too small to arrive at valid conclusions.
Obsessive-Compulsive PD was the most frequent PD in the current study. These patients are considered to have a hypertrophied superego, which is relentless in its demands for perfection. This harsh superego could lead to depression when its demands are impossible to meet, and it is therefore expected that this PD is over-represented in clinical populations like the sample of the current study. Another reason for this is the presence of insight in these patients, which makes diagnosis easier.
Since the current study included clinical cases in full or partial remission, it should be kept in mind that, as is well known and described in the international literature, the assessment of personality in the presence of anxious or depressive symptomatology (and, according to some, even in remission) is somewhat problematic and full of methodological pitfalls [24]. It seems that the assessment of longitudinal behaviors (lies, aggressive or antisocial behavior, or self-destructive acts) is less problematic than the assessment of inner experience (e.g. chronic feelings of emptiness) or interpersonal relationships (e.g. desperate efforts to avoid abandonment).
The IPDE has been used in research mainly concerning the relationship between different ways of approaching the diagnosis [15,25], the long term stability of personality disorders [26], and the relationship between PDs and clinical syndromes [27][28][29]. Limited data on the reliability and validity of the IPDE screening questionnaire also exist [30].

Conclusions
The Greek version of the IPDE is both reliable and culturally applicable. Most problems that have arisen to date from the Greek experience with the IPDE concern the application of internationally accepted diagnostic criteria in a specific culture rather than the structure of the instrument itself.