
Table 1 Sources of variation creating unreliable evaluations and procedures to reduce variation (modified from [51])

From: Use of a structured functional evaluation process for independent medical evaluations of claimants presenting with disabling mental illness: rationale and design for a multi-center reliability study

Source of variation

Definition

How source of variation was addressed in the study

Anticipated impact of the study approach on reliability

1. Information

Raters obtain different information as a result of asking different questions

Structured functional interview with 5 steps and typical questions

Supports experts to elicit similar information

Anticipated impact: ++

2. Observation

Raters differ in what they notice and remember when presented with the same information

Reporting instrument for documenting functional findings, with a five-item scale for rating limitations and anchor definitions

Detailed job description as currently used by the disability office, all items completed.

Indirect impact on observer variance: raters will elicit information during the interview that allows them to complete the reporting instrument.

Direct impact on observer variance: all raters have identical information on the workplace.

Anticipated impact: ++

3. Interpretation

Raters differ in the significance they attach to what is observed

Calibration during small group case-based learning

Calibration: some impact during training, when experts discuss the significance of various findings (intervision / calibration)

Anticipated impact: ++

Videotaping may increase interpretation variance when the interviewer fails to elicit relevant information that raters would need to get a clear picture.

Anticipated impact: − / −−

4. Criterion

Raters use different criteria to score the same information

Anchor definitions in the IFAP-instrument

Job descriptions for hypothetical alternative work

Training and calibration

Anchor definitions, explicit qualifiers, and joint training and calibration should exert a substantial impact

Anticipated impact: ++

In assessments of work (in)ability, the experts’ implicit criteria are often unknown

5. Subject

True differences exist in the subject between testing occasions, e.g., when the claimant tells different things to different raters

Videotaping of evaluation interview

Videotaped interviews reduce subject variance.

Anticipated impact: +++

6. Expert/Rater

Raters differ in their understanding of job demands and of the consequences of functional limitations for job performance

Differences in value frameworks affect judgments of claimants’ ability to work

Detailed job description as currently in use by the insurers, all items completed.

Job descriptions for hypothetical alternative work

Not addressed

Optimized real-life job descriptions (= all items completed) and the provision of job descriptions for hypothetical alternative work will give all experts the same reference / benchmark

Anticipated impact: ++

Legend: + small / ++ moderate / +++ large impact on enhancing reliability; − small / −− moderate / −−− large impact on reducing reliability