Measurement of Quality of Life Outcomes
Jason E. Owen
Laura Boxley
Joshua C. Klapow
Introduction
For many years, survival was the primary outcome of clinical trials for those diagnosed with cancer. The introduction of formal assessments of Health-Related Quality of Life (HRQoL) has changed the way in which clinical trials have been conceptualized, administered, and interpreted. For those providing care for persons at the end of life, Quality of Life (QOL) concerns assume primary importance, and strong tools for measuring QOL provide both a valid characterization of QOL and an opportunity to detect and manage unaddressed clinical problems. In this chapter we hope to provide a brief conceptualization of HRQoL assessment in oncology populations, discuss the strengths and weaknesses of existing cancer-specific HRQoL assessment instruments and those specific to palliative oncology, and highlight several important methodologic issues of concern to researchers.
Overview of Health-Related Quality of Life Assessment
A Brief History of Quality of Life Assessment in Palliative Care
The rapid growth of the palliative care movement has paralleled, and perhaps in many respects fueled, the increasing attention given to QOL concerns among those seeking curative cancer treatments. The hospice Medicare benefit was created by the Tax Equity and Fiscal Responsibility Act of 1982 and made permanent by the U.S. Congress in 1986. Although preceded by the growth of hospice organizations in Europe, this act made possible the development of community-based hospice services across the United States from the early 1980s to the present time. The rise of modern hospice care represented a paradigm shift for health care in the United States. Survival had been the sole relevant outcome of most clinical trials, and QOL had received only casual treatment in the research literature. By its very definition, ‘the achievement of the best possible QOL for patients and families’ (1), palliative care challenged dominant ideas about the nature of outcomes assessment in clinical populations.
The field of QOL assessment perhaps owes a great deal to those who fought the political and medical/cultural battles that facilitated the modern hospice movement. Instruments designed to formally measure ‘QOL’ began to appear in the research literature shortly after the Medicare hospice benefit became law. Paralleling the growth of palliative care, the assessment of QOL as an endpoint in cancer clinical trials first emerged in Europe and slowly became integrated into clinical trials protocols in the United States. By 1995, three major clinical trials groups, the U.K. Medical Research Council (MRC), the European Organization for Research and Treatment of Cancer (EORTC), and the National Cancer Institute of Canada (NCIC), advocated consideration of QOL as an endpoint in all new clinical trials (2). Tracing this development historically, in 1981 the MRC became one of the first clinical trials groups to introduce self-report diaries to track QOL outcomes (3). In the early 1980s, the EORTC began to include endpoints related to QOL in its funded clinical trials, and the organization quickly established a study group to develop standardized, psychometrically evaluated tools for measuring QOL within cancer populations (4). A 1986 EORTC protocol titled ‘Long-term QOL of adult leukemia after BMT versus intensive consolidation in AML’ was among the first phase III controlled clinical trials to employ QOL as the primary trial endpoint. In 1993, the EORTC established a separate data-monitoring unit whose sole function was to manage and evaluate QOL data obtained from EORTC clinical trials. Beginning in 1989, the NCIC mandated that all phase III clinical trials consider the use of a QOL endpoint (5). Although the U.S. National Cancer Institute was slow to adopt QOL endpoints, in 1985 Food and Drug Administration guidelines on anticancer drugs specified benefits to QOL as one possible criterion for approval (6).
QOL is now widely recognized to be the most salient outcome for trials that involve patients with advanced disease or poor prognosis or that involve treatments with little expected difference in survival outcomes (7, 8, 9).
Models of Health-Related Quality of Life Assessment
In undertaking efforts to assess HRQoL, it is important to recognize that there are multiple ways in which to conceptualize the construct, and decisions about the choice of assessment instruments to use for any given purpose should be made with these considerations in mind. Wenger and Furberg have defined HRQoL as ‘those attributes valued by patients, including their resultant comfort or sense of well-being; the extent to which they were able to maintain reasonable physical, emotional, and intellectual function; and the degree to which they retain their ability to participate in valued activities within the family, in the workplace, and in the community (cited in 10).’ This definition reflects a growing consensus that HRQoL is a multidimensional construct that involves an individual’s perceived health status, life satisfaction, and physical, social, and psychological well-being (10, 11). Secondary dimensions of HRQoL are thought to include spirituality, relational intimacy, cognitive function, personal productivity, and symptom burden (12). Despite general agreement about the multidimensional nature of HRQoL, there are a number of ways in which HRQoL assessment has been operationalized. The most commonly employed methods include the assessment of health state utilities, general HRQoL, disease-specific HRQoL, and domain-specific HRQoL.
Health State Utilities
Preference-based HRQoL measures provide health state utilities by asking patients to indicate a preference between two choices under conditions of uncertainty. The most common applications of preference-based HRQoL measures include the Time Trade-off (TTO) and Standard Gamble techniques. The TTO method addresses an individual’s willingness to accept a shorter but healthier life (i.e., how much life expectancy one would trade in exchange for improved QOL). Similarly, the standard gamble assesses the probability of death that an individual would be willing to risk in order to regain perfect health. Utility-based measures of HRQoL, such as TTO or standard gamble techniques, have not been widely used in cancer populations. However, emerging evidence suggests that patient-based utilities for cancer-related health states are not strongly related to self-report QOL measures (13). Utility scores measured on a 0 to 1.0 scale are typically lower than self-report measures based on the same scale.
One method for measuring patient-based utilities for cancer health states has been proposed by Perez et al. (14). Using a TTO method, they asked patients with metastatic disease whether they would be willing to trade days in the upcoming month for a single month of perfect health. Patients who were willing to trade days proceeded through a brief nine-item questionnaire that assessed willingness to trade 3, 5, 10, 15, 20, 25, 27, 29, or 30 days using responses of ‘Yes,’ ‘No,’ or ‘Maybe.’ A ‘Maybe’ response is considered to be an equivalence point at which the patient is indifferent between ‘Yes’ and ‘No’ responses. The utility value for the patient’s current health state is then calculated as (30 − days traded at the equivalence point)/30. For example, a patient whose equivalence point is 10 days has a utility of (30 − 10)/30, or approximately 0.67.
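The TTO calculation described above can be sketched in a few lines of Python. This is only an illustration, not the scoring procedure published by Perez et al.: it assumes that the utility is (30 minus the days traded at the equivalence point) divided by 30, and the function names, the response dictionary, and the rule of taking the first ‘Maybe’ as the equivalence point are all hypothetical choices made for the example.

```python
from typing import Dict, Optional

# Numbers of days offered in the nine-item questionnaire described in the text.
TRADE_OPTIONS = (3, 5, 10, 15, 20, 25, 27, 29, 30)


def equivalence_point(responses: Dict[int, str]) -> Optional[int]:
    """Return the smallest number of days answered 'Maybe', or None.

    `responses` maps days offered to 'Yes', 'No', or 'Maybe'.
    Assumption: the first 'Maybe' marks the point of indifference.
    """
    for days in TRADE_OPTIONS:
        if responses.get(days, "").strip().lower() == "maybe":
            return days
    return None


def tto_utility(days_traded: int, horizon_days: int = 30) -> float:
    """Assumed formula: (horizon - days traded at equivalence) / horizon."""
    if not 0 <= days_traded <= horizon_days:
        raise ValueError("days_traded must lie between 0 and horizon_days")
    return (horizon_days - days_traded) / horizon_days


if __name__ == "__main__":
    answers = {3: "Yes", 5: "Yes", 10: "Maybe", 15: "No"}
    point = equivalence_point(answers)
    if point is not None:
        print(round(tto_utility(point), 2))
```

Under these assumptions, a patient who is indifferent at 15 days receives a utility of 0.5, and a patient who would trade all 30 days receives a utility of 0.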
In the standard gamble procedure, an individual is asked whether they would be willing to undergo a treatment with two possible outcomes of varying probability: cure or immediate death. The health state utility is identified as the probability of cure at which the individual is indifferent between undergoing and forgoing the hypothetical treatment. For example, an individual nearing the end of life would be likely to agree to undergo a hypothetical treatment with a 90% probability of immediate cure and only a 10% chance of death but might be indifferent at the point at which the probability of immediate cure drops to 50%. This person would have a utility for their current health state of 0.50. Another method for assessing preference-based utilities is through the use of the Health Utilities Index (HUI). The HUI is a 5-item interviewer-administered measure that has been used to estimate patient-derived utilities for their present health state.
Preference-based measures provide a number of benefits over the more traditional questionnaires. Utilities can be used to readily determine quality-adjusted survival [e.g., Quality-Adjusted Life-Years (QALYs)]. Additional benefits include ease of interpretation of a single numerical estimate of HRQoL relative to a profile of subscale scores and, in conjunction with a measure of costs, facilitation of cost-utility analyses (15). However, these advantages are strongly outweighed by the confusing nature of the assessment techniques and the resulting cognitive burden imposed on patients. Additionally, utility scores are of limited clinical value.
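To illustrate how utilities feed into quality-adjusted survival, the following minimal sketch computes QALYs as the sum of utility multiplied by time spent in each health state. The function name and the input format (a list of utility and duration pairs) are assumptions made for this example, and the sketch ignores refinements such as discounting.

```python
from typing import Iterable, Tuple


def qalys(periods: Iterable[Tuple[float, float]]) -> float:
    """Sum utility x duration over a sequence of health-state periods.

    Each period is (utility on a 0 to 1.0 scale, duration in years).
    """
    total = 0.0
    for utility, years in periods:
        if not 0.0 <= utility <= 1.0:
            raise ValueError("utility must lie between 0 and 1.0")
        if years < 0:
            raise ValueError("duration must be non-negative")
        total += utility * years
    return total


if __name__ == "__main__":
    # Two years at a utility of 0.5 followed by one year at 1.0.
    print(qalys([(0.5, 2.0), (1.0, 1.0)]))
```

In this example, three years of survival correspond to 2.0 QALYs, which shows how a single utility estimate allows survival time to be weighted by its quality.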
General Health-Related Quality of Life
Unlike utility-based measures, general HRQoL measures are usually derived from self-reports and in most cases do not provide a single point estimate of an individual’s overall HRQoL. Rather, such measures typically break HRQoL down into its constituent domains (e.g., physical, social, emotional, and functional well-being) and provide individual scores for each domain. General health status instruments assess the impact of disease and treatment on any medical population, thereby enabling administration to patients with any of a number of different medical conditions. General health status instruments include the Nottingham Health Profile, the Sickness Impact Profile, and the Medical Outcomes Study SF-36. A primary advantage of this approach to QOL assessment is the ease with which resulting QOL scores can be compared with norms derived from populations with other medical conditions. The SF-36 is perhaps the most widely used general measure of HRQoL, but its relevance for patients nearing the end of life is questionable. The instrument contains a number of items that are likely to be inappropriate or result in floor effects for a palliative care population (e.g., ‘does your health now limit you from engaging in vigorous physical activity,’ ‘have you had to cut down on the amount of time you spent on work or other activities as a result of your physical health,’ and so on) (16). Although general measures of HRQoL were designed to be broadly applicable across disease conditions, they appear to be of limited value for palliative care patients and have been used only sparingly in end-of-life populations.
Disease-Specific Measures of Quality of Life
Measures of cancer-specific HRQoL sample aspects of QOL which are particularly salient to cancer patients, such as treatment side-effects and pain, in addition to sampling the broader impact of disease on psychological, physical, social, and functional well-being. Cancer-specific measures assess a level of detail that cannot be achieved with generic or preference-based measures. A handful of cancer-specific HRQoL instruments are available to researchers, including the Functional Living Index for Cancer (FLIC), the Cancer Rehabilitation Evaluation System (CARES), the Functional Assessment of Cancer Therapy (FACT), and the EORTC Quality of Life Questionnaire (QLQ-C30) (17, 18, 19, 20, 21). The most widely used measures of cancer-specific HRQoL employ a ‘core-plus’ approach, which involves the use of a ‘core’ set of items that can be used with any cancer population ‘plus’ a cancer-specific set of items that evaluate symptoms and experiences that may be unique within a specific type of cancer (e.g., lung or colorectal). In palliative care settings, the QLQ-C30 and FACT (also known as FACIT) have been the most commonly used disease-specific HRQoL instruments, and both employ a ‘core-plus’ assessment methodology.
Utilizing a ‘core-plus’ approach is one way in which researchers and practitioners alike have sought to capture a comprehensive picture of QOL. With the core-plus approach, one uses generalized measures supplemented with disease-specific measures. This enables researchers to compare related groups based on common measures, while addressing the specific characteristics of a target population. For example, the Functional Assessment of Cancer Therapy-General Form (FACT-G) is a core instrument used in concert with disease-specific measures such as the FACT-B to assess QOL in women with breast cancer. Using this approach, one can compare QOL scores across different cancer types as well as obtain critical information about a specific population. One potential caveat to this approach in palliative care is the increase in response burden for the patient. It is important that the investigator weigh the costs and benefits of increasing the response burden inherent to the assessment procedure. If one’s research questions do not include plans for comparative analysis, including a core instrument with potentially irrelevant items may be unnecessarily burdensome.
The European Organization for Research and Treatment of Cancer Quality of Life Questionnaire (EORTC QLQ-C30; Aaronson et al. (17)) is a 30-item HRQoL measure with excellent reliability; its validity has been evaluated favorably in a number of cancer populations. Widely released in 1993, the EORTC QLQ-C30 version 3.0 was designed for international clinical trials and has been translated into many different languages. This instrument provides five functioning subscales: physical, role, social, emotional, and cognitive functioning. Symptom-related items, which pertain to dyspnea, fatigue, sleep disturbance, loss of appetite, nausea, vomiting, constipation, and diarrhea, are also assessed. The EORTC Study Group on QOL has developed a number of cancer type–specific modules for use with the QLQ-C30, including lung, breast, head and neck, esophageal, ovarian, gastric, prostate, multiple myeloma, brain, and colorectal cancer-specific items. For example, the ‘plus’ module for colorectal cancer is a 38-item module that covers aspects of QOL that are more specific to patients with colorectal cancer and includes scorable subscales related to body image, sexual function, future perspective, sexual enjoyment, micturition problems, gastrointestinal symptoms, male and female sexual problems, defecation problems, weight loss, and chemotherapy side-effects. The psychometric properties of the EORTC QLQ-C30 have been well studied across a number of patient populations internationally and demonstrate good reliability and validity.
Although used frequently in palliative care, the EORTC QLQ-C30 was not designed for primary use in this field. As a result, the measure has shortcomings when applied to the terminally ill. The EORTC QLQ-C30 employs a normative approach, basing comparisons on the average ability level of a healthy individual. This is an unrealistic comparison for terminally ill patients and may not be optimally informative. A more appropriate comparison for assessment may be to use a model of optimal QOL for an individual’s disease stage. One of the unique characteristics of palliative care is the need to address death and end-of-life issues. As such, just meeting the patient’s physical needs is not sufficient; an optimal palliative care measure should include a spiritual/existential component to assess patient needs. The QLQ-C30 does not address the specific spiritual needs of palliative care, rendering it ineffective in this domain. The QLQ-C30 questionnaire also includes items that are likely to be inappropriate for many palliative care patients. Examples of these questions include: ‘Do you have any trouble taking a long walk?’ and ‘Are you limited in any way in doing either your work or doing household jobs?’ Fatigue items such as these are likely to produce ceiling effects that will not contribute meaningful information to the QOL assessment. Given the probability of functional and cognitive impairments in study subjects, it is particularly important to avoid including unnecessarily long or insensitive lines of questioning. Some of the QLQ-C30 subscales have performed poorly in studies with patients nearing the end of life. For example, the fatigue subscale of the EORTC QLQ-C30 exhibits clear floor/ceiling effects when used in this population (22). However, a short form of the EORTC QLQ-C30 for use in palliative care is currently in development (see http://www.eortc.be/home/qol/modules.htm).
The FACT-G is a 27-item questionnaire that utilizes 5-point Likert scales to evaluate social well-being, physical well-being, emotional well-being, and functional well-being (18, 23). The FACT-G has been widely used in clinical trials in the United States (23). Sixteen cancer-specific modules are available for use with the core FACT-G measure, including breast, bladder, brain, cervical, colorectal, leukemia, lymphoma, and lung cancer–specific items, to name a few. For example, the FACT-C colorectal-specific instrument consists of the 27 items of the FACT-G plus an additional 9 items that pertain to specific symptoms associated with colorectal cancers (e.g., bowel control, ostomy care, and concern about appearance). Importantly, a palliative care specific module (FACIT-PAL) has been developed for use with patients nearing the end of life. This instrument will be discussed in greater detail in the following text.
Domain-Specific Measures of Quality of Life
The term ‘QOL’ has been used broadly in the medical literature, often without formal definition or explication. As a result, a number of clinical trials have reported ‘QOL’ endpoints using measures that are not consistent with multidimensional conceptualizations of HRQoL. Gill and Feinstein (24) report that published studies including QOL analyses have used more than 159 instruments to measure QOL, with little agreement as to which domains, scales, or items actually measure the construct. Accordingly, the reader should be cautious when reviewing studies that use the term ‘QOL’ or purport to have measured QOL. It is not uncommon to see a self-report measure of a single symptom domain referred to as a measure of HRQoL. We believe this approach is more accurately conceptualized as the assessment of single domains of HRQoL. It generally allows researchers to tailor an assessment battery to the population and study aims of interest and can be used to bolster a multidimensional measure of HRQoL. Examples of domain-specific constructs (and measures) relevant for palliative care populations include pain (often measured using the Brief Pain Inventory), fatigue (measured using the Multidimensional Fatigue Inventory or the Brief Fatigue Inventory, among others), spiritual well-being (assessed with the Spiritual Well-Being scale of the FACIT or the Spiritual Well-Being Scale), and depression (frequently measured using the Center for Epidemiological Studies-Depression Scale, Hospital Anxiety and Depression Scale, or Beck Depression Inventory).
Quality of Life Assessment in Palliative Care
General Considerations
Measurement of QOL in persons at the end of life is quite distinct from the assessment of individuals who are hoping for and perhaps expecting a complete recovery to their pre-diagnosis baseline. In describing the underpinnings of palliative care, Dame Cicely Saunders emphasizes the importance of the following basic principles: symptom control, maximizing the potential of the patient and family’s relationships, caring for the family unit, and spiritual needs (25). This concise description highlights potential distinctions between conceptualizations of QOL for those in curative care relative to those nearing the end of life. In the context of palliative care, several domains that are carefully measured in general, health-related, and disease-specific QOL measures are rendered either irrelevant or much less useful than in other settings. For example, a generally outstanding HRQoL measure often used in cancer research, the FACT, includes several items that ask patients to rate ‘worry about [their] condition getting worse’ or the extent to which they are ‘losing hope in the fight against … illness.’ Similar problems exist for the EORTC QLQ-C30. Other concerns typically measured in QOL instruments that are nonspecific to the end of life are simply decreased in relevance. Functional well-being, for example, is much less important to those who are no longer working or held responsible for instrumental activities of daily living (e.g., balancing checkbooks, doing household grocery shopping, etc.). Other aspects of QOL that are typically minimized in other QOL assessment instruments may be of maximal importance to those at the end of life. Notably, spiritual and existential concerns are among the most strongly endorsed concerns reported by patients in palliative care (26, 27). Therefore, an assessment of HRQoL that is based largely on physical symptoms and functional complaints, which are quite prevalent in palliative care, may underestimate actual QOL as perceived by the patient, particularly if other domains (e.g., spiritual well-being) become increasingly salient to the individual.
Search Strategy
A systematic search strategy was employed in order to identify all relevant palliative care-specific measures of HRQoL. The search was conducted in two phases. First, PubMed and PsycInfo were searched using the terms ‘QOL,’ ‘outcomes,’ ‘assessment or evaluation or measure or questionnaire,’ and ‘hospice or palliat*.’ Second, previously published reviews of QOL issues in palliative care were collected and reviewed for descriptions of appropriate measures (16, 28, 29, 30). Given the breadth of the field of QOL assessment and the large number of high-quality reviews of the general HRQoL literature, we chose to limit our discussion to QOL measures that were as follows:
Specifically developed for use in persons nearing the end of life
Psychometrically evaluated in palliative care populations
Based on self-report (rather than clinician) ratings of HRQoL
Used in more than one study
These criteria resulted in the identification of nine palliative care QOL measures: the McGill Quality of Life Questionnaire (MQOL), Life Evaluation Questionnaire (LEQ), Palliative Care Outcome Scale (POS), FACIT-PAL, Missoula-Vitas Quality of Life Index (MVQLI), Palliative Care Quality of Life Instrument (PQLI), Hospice Quality of Life Index (HQLI), Assessment of Quality of Life at the End of Life (AQEL), and the Quality of Life at the End-of-Life Instrument (QUAL-E). Once measures of palliative care QOL were identified, systematic searches were repeated in PubMed and PsycInfo to select all published articles in which the measure was administered to patients at the end of life. Each measure will be described, and for each measure we will discuss all available evidence for the instrument’s reliability, validity, acceptability, sensitivity to intervention, dissemination/use in clinical trials, and appropriateness for the intended population.
McGill Quality of Life Questionnaire
The MQOL is a 16-item, self-report instrument designed to measure physical symptoms (3 items), physical well-being (1 item), psychological symptoms (4 items), existential well-being (6 items), and social support (2 items). The instrument is scored to generate subscale scores associated with each of these five categories and a total QOL score derived from the average of the five subscale scores (31). Patients are asked to self-report their QOL concerns using an 11-point Likert-type scale, and each item is anchored with two verbal descriptors (e.g., 0 = ‘no problem’, 10 = ‘tremendous problem’). Respondents are asked to describe their QOL only for the previous 2 days. In addition, a single item on the instruction page of the MQOL can be used for those respondents who are unable to complete the questionnaire in its entirety. Item development was based on existing HRQoL and symptom-assessment measures, and item testing and psychometric evaluation of the MQOL were conducted in samples consisting almost entirely of patients in palliative care or otherwise nearing the end of life (27, 31, 32, 33, 34). The MQOL was developed in both English and French and has been successfully translated into Spanish, Hong Kong Chinese, Taiwan Chinese, and Malaysian (35).
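The scoring rule described above (five subscale scores, with the total taken as the average of the subscale scores) can be sketched as follows. This is a hedged illustration rather than the official MQOL scoring algorithm: only the item counts per subscale come from the text, whereas the assignment of item positions to subscales, the function name, and the assumption that all responses have already been oriented in a common direction are hypothetical.

```python
from statistics import mean
from typing import Dict, Sequence

# Hypothetical layout of the 16 items (0-based positions). Only the item
# counts per subscale (3, 1, 4, 6, 2) are taken from the text.
SUBSCALES = {
    "physical_symptoms": range(0, 3),
    "physical_well_being": range(3, 4),
    "psychological_symptoms": range(4, 8),
    "existential_well_being": range(8, 14),
    "social_support": range(14, 16),
}


def score_mqol(items: Sequence[float]) -> Dict[str, float]:
    """Return the five subscale means and their unweighted average.

    Assumption: items are on the 0 to 10 scale and already oriented so
    that all scores point in the same direction.
    """
    if len(items) != 16:
        raise ValueError("expected 16 item responses")
    if any(not 0 <= value <= 10 for value in items):
        raise ValueError("responses must lie between 0 and 10")
    scores = {
        name: mean(items[i] for i in positions)
        for name, positions in SUBSCALES.items()
    }
    # Total score: the average of the five subscale scores, not of the items.
    scores["total"] = mean(scores[name] for name in SUBSCALES)
    return scores


if __name__ == "__main__":
    print(score_mqol([0, 0, 0, 10, 2, 2, 2, 2, 4, 4, 4, 4, 4, 4, 6, 8]))
```

Averaging subscale scores rather than items gives each domain equal weight in the total, so the single physical well-being item counts as much as the six existential items.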
Item characteristics and reliability of the MQOL appear to be acceptable across studies. Two of the five MQOL subscales (existential well-being and social support) typically provide negatively skewed distributions, suggesting less than desirable item response characteristics for these items (33). However, it may be possible to transform these skewed distributions to more closely approximate normality when the data will be analyzed with parametric statistics whose assumptions involve univariate or multivariate normality. Internal consistency for the entire measure has been shown to be excellent (α’s = 0.81 to 0.91 in separate samples (34)). Internal consistency estimates for the subscales are generally in the acceptable range: physical symptoms (α’s = 0.65 to 0.88), psychological symptoms (α’s = 0.81 to 0.89), existential well-being (α’s = 0.75 to 0.86), and support (α’s = 0.73 to 0.79 (34, 36)). Test-retest reliability for individuals reporting stable QOL over a 2-day period is generally good (α = 0.75 (34)).
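For readers unfamiliar with how the internal consistency estimates above are obtained, the following sketch computes Cronbach’s α from a respondents-by-items table using the standard formula α = k/(k − 1) × (1 − sum of item variances / variance of total scores). The function name and data layout are assumptions for illustration, and the example data are invented rather than drawn from any MQOL study.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha; rows[r][i] is respondent r's score on item i."""
    if len(rows) < 2:
        raise ValueError("need at least two respondents")
    k = len(rows[0])
    if k < 2:
        raise ValueError("need at least two items")
    if any(len(row) != k for row in rows):
        raise ValueError("every respondent must answer every item")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each respondent's summed score.
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Invented data: three respondents answering two items.
    print(round(cronbach_alpha([[1, 2], [2, 1], [3, 3]]), 2))
```

Values approach 1.0 when items rise and fall together across respondents, which is why α is read as an index of how consistently a set of items measures a single domain.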