Patient-Reported Outcomes

7. The Special Case of Validity and PROs

In research studies using PROs, special attention must be paid to the assessment of validity. When no gold or criterion standard exists, PRO investigators have borrowed validation strategies from clinical and experimental psychologists, who have long faced the problem of deciding whether questionnaires examining intelligence, attitudes, and emotional function really measure what they are supposed to measure.


Validity examines whether the instrument is measuring what it is intended to measure.

Types of validity

  • Face validity examines whether an instrument appears to be measuring what it is intended to measure.
  • Content validity examines the extent to which the items, or questions, in the instrument comprehensively sample the domain of interest. Quantitative testing of face and content validity is rarely attempted.
  • Construct validity involves hypothesizing the logical relations that should exist between two concepts, then comparing measures of those concepts to see whether the data confirm the hypothesized relations.

Construct validation is the most rigorous approach to establishing validity. A construct is a theoretically derived notion of the domain(s) we want to measure. An understanding of the construct leads to expectations about how an instrument should behave if it is valid. The first step in construct validation is to establish a model or theoretical framework that represents an understanding of what investigators are trying to measure. That framework provides a basis for understanding the behavior of the system being studied and allows hypotheses or predictions about how the concepts and instruments being tested should relate to other concepts and their measures. Investigators then administer instruments measuring similar and dissimilar concepts to a population of interest and examine the data. Evidence for validity is strengthened when the hypotheses are confirmed and weakened when they are refuted.

For example, a PRO's ability to discriminate between known groups may be tested by comparing two groups of patients: those who received a toxic chemotherapeutic regimen and those who received a less toxic regimen. A valid PRO instrument should distinguish between these two groups; if it does not, something has gone wrong. Alternatively, correlations between symptoms and functional status can be examined: patients with a greater number and severity of symptoms should have lower functional status scores on a PRO instrument. Another example is the validation of an instrument that discriminates between people according to some aspect of emotional function; its results should correlate with existing measures of emotional function.
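The two checks described above, a known-groups comparison and a symptom-function correlation, can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed analysis: all scores below are invented, the function names are hypothetical, and the sketch assumes that higher PRO scores mean better functioning. A real validation study would use larger samples and appropriate significance testing.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def standardized_mean_difference(group_a, group_b):
    """Difference in group means divided by the pooled standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = (
        (na - 1) * stdev(group_a) ** 2 + (nb - 1) * stdev(group_b) ** 2
    ) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Invented PRO scores (higher = better functioning) for two known groups.
less_toxic_regimen = [70, 65, 80, 60, 75]
toxic_regimen = [40, 45, 30, 50, 35]

# Hypothesis 1: the less toxic group scores higher, so the difference is positive.
d = standardized_mean_difference(less_toxic_regimen, toxic_regimen)
known_groups_supported = d > 0

# Invented symptom-burden and functional-status scores for eight patients.
symptom_burden = [2, 5, 8, 3, 9, 6, 1, 7]
functional_status = [85, 60, 40, 80, 30, 55, 90, 45]

# Hypothesis 2: more symptoms go with lower functional status, so r is negative.
r = pearson_r(symptom_burden, functional_status)
correlation_supported = r < 0

print(f"standardized mean difference: {d:.2f}, supported: {known_groups_supported}")
print(f"symptom-function correlation: {r:.2f}, supported: {correlation_supported}")
```

The point of the sketch is the order of operations: the expected direction of each result is written down before the data are examined, and the data then either confirm or refute it.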

Example 4

A Detailed Example of Construct Validation

The Inflammatory Bowel Disease Questionnaire (IBDQ) was designed to measure disease-specific HRQL. It includes 30 items directed at 4 domains: bowel symptoms, systemic symptoms, emotional function, and social function. Investigators administered the IBDQ (along with global ratings of change in function, global ratings of change by the physician and a relative, a Disease Activity Index, and the emotional function domain of a generic HRQL measure) to 42 patients with inflammatory bowel disease on two occasions separated by 1 month. When the investigation was planned, the investigators made predictions about how change in the IBDQ score should relate to change in the other measures if the questionnaire were really measuring HRQL. Examples of the predictions and the results are as follows:

  • The patient's global rating of change in disease activity should correlate closely (correlation ~ 0.5) with change in the bowel-symptoms dimension of the Inflammatory Bowel Disease Questionnaire.
    Correlation observed was 0.42.
  • Some relation (correlation ~ 0.3) should exist between change in the Disease Activity Index and change in the bowel-symptoms dimension of the Inflammatory Bowel Disease Questionnaire.
    Correlation observed was 0.33.
  • Some relation (correlation ~ 0.3) should exist between change in the Disease Activity Index and change in the systemic-symptoms dimension of the Inflammatory Bowel Disease Questionnaire.
    Correlation observed was 0.04.
  • Change in the emotional-function dimension of the Inflammatory Bowel Disease Questionnaire should correlate closely (correlation ~ 0.5) with change in the emotional-function dimension of the generic questionnaire.
    Correlation observed was 0.76.
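One way to read these four results side by side is to tabulate each predicted correlation against the observed value. The sketch below uses only the figures listed above. The shortfall tolerance of 0.15 and the function name are assumptions made for illustration; the source does not state how the investigators judged whether a prediction was confirmed.

```python
# Predicted and observed correlations, taken from the list above.
predictions = [
    ("patient global rating vs. IBDQ bowel symptoms", 0.5, 0.42),
    ("Disease Activity Index vs. IBDQ bowel symptoms", 0.3, 0.33),
    ("Disease Activity Index vs. IBDQ systemic symptoms", 0.3, 0.04),
    ("generic emotional function vs. IBDQ emotional function", 0.5, 0.76),
]

# Hypothetical tolerance, not from the source: how far below the predicted
# value an observed correlation may fall before the prediction is flagged.
SHORTFALL_TOLERANCE = 0.15


def shortfalls(rows, tolerance=SHORTFALL_TOLERANCE):
    """Return labels whose observed correlation is below predicted by more than tolerance."""
    return [
        label
        for label, predicted, observed in rows
        if predicted - observed > tolerance
    ]


for label, predicted, observed in predictions:
    print(
        f"{label}: predicted ~{predicted}, observed {observed}, "
        f"difference {observed - predicted:+.2f}"
    )

unsupported = shortfalls(predictions)
print("Flagged as falling short:", unsupported)
```

Under this assumed tolerance, only the prediction relating the Disease Activity Index to the systemic-symptoms dimension is flagged. The emotional-function correlation differs from its prediction by a similar amount but in the opposite direction, exceeding it, which is why the sketch checks shortfalls only.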