Measurement of Quality of Life Outcomes





Benjamin J. Miriovsky

Amy P. Abernethy



BACKGROUND

According to the American Academy of Hospice and Palliative Medicine, the goal of palliative care is “…to prevent and relieve suffering, and to support the best possible quality of life for patients and their families, regardless of their stage of disease or the need for other therapies, in accordance with their values and preferences” (1). Given the centrality of the concept of quality of life (QoL) to palliative and supportive oncology, assessment of QoL is essential, both as part of routine clinical practice and for research purposes (2). Practical uses of routine measurement of QoL include identifying and prioritizing problems, facilitating communication, screening for unidentified problems, encouraging shared decision making, and monitoring change and effectiveness of treatment (3). The last point in this list is particularly important because without tools to systematically assess QoL and evaluate the effectiveness of standard practices or novel interventions, the extent to which the ultimate goals of palliative medicine are realized (i.e., the prevention and relief of suffering) cannot be honestly assessed.

While there is broad consensus about the importance of routine measurement of QoL in palliative medicine (4), there is little consensus about how this is best achieved (5,6), though the recommendations are consistent that patients themselves, without interpretation by third parties, are the best source of information regarding QoL. Such reports taken directly from patients, without other censoring, are referred to as patient-reported outcomes (PROs) and form the basis for QoL measurement. There are numerous other types of PROs, especially symptoms, such as pain, nausea, breathlessness, and anxiety, which are often related to QoL, pertinent to palliative and supportive oncology, and best collected via PRO assessment instruments as well. This chapter will broadly discuss terminology in QoL research, the statistical basis, relevant available instruments for palliative medicine, and a framework for selecting among the available instruments.


TERMINOLOGY

To understand the basis for disagreement about best practices for measuring QoL, it is prudent to examine the definitions currently used for the numerous terms. An extensive, though not exhaustive, list of relevant terms and definitions is included for quick reference (Table 68.1). QoL, as defined by the World Health Organization (WHO) Quality of Life Group, is an individual’s “…perception of their position in life in the context of the culture and value systems in which they live and in relation to their goals, expectations, standards and concerns. It is a broad ranging concept affected in a complex way by the person’s physical health, psychological state, level of independence, social relationships, personal beliefs and their relationship to salient features of their environment” (13). Despite this definition, research in QoL measurement is complicated (11), with multiple authors having acknowledged that there is still a lack of agreement on the exact definition (5,14,15), and that imprecision and inconsistency persist (11). In fact, the term QoL is sometimes used to refer to the general construct related to overall satisfaction with all aspects of an individual’s life, while at other times it is used more specifically to reflect those experiences impacted by disease or its treatment (14). A number of terms, namely functional status, health status, QoL, and health-related quality of life (HRQoL), have been used interchangeably in the literature because their definitions overlap to some degree, but in fact they differ in important ways (11). Further discussion of QoL measurement depends upon clarification, to the extent possible, of these terms.

Functional status generally refers to the ability to physically perform tasks related to daily living, such as household activities, personal care, and eating (11), although, in some contexts, the term reflects the ability to perform in expected social roles. Functional status is traditionally assessed by clinicians (e.g., the Karnofsky Performance Status Scale (16), Eastern Cooperative Oncology Group Performance Status Scale (17), and Palliative Performance Scale (18)). Health status is a multidimensional concept that is broader than functional status, typically representing an individual’s perception about overall state of health, both physical and mental (11). Again, the term can be used for even more general concepts, such as how the perception of the overall state of health influences social roles and spirituality (14).

By definition, QoL is a highly subjective concept. It has intuitive meaning to almost everyone, but that meaning undoubtedly varies between individuals. It is this variability that complicates QoL research, as consistent implementation of a single, agreed-upon definition (even though one has been put forward by the WHO) across the spectrum of QoL research is impractical.

Because the term QoL extends beyond the scope of health (mental and physical) to include the influence of social, political, economic, and environmental factors on an individual’s experience, recent literature within the context of QoL research in medicine has focused on the concept of HRQoL. HRQoL is defined as “…the subjective assessment of the impact of disease and treatment across the physical, psychological, social and somatic domains of functioning and well-being” (11). HRQoL is intended to differentiate the effects of factors intrinsic to the individual from those of societal factors, such as political and societal norms (14). With respect to measuring HRQoL, there is some disagreement about which domains, among physical, psychological, and social, are necessary for inclusion. There is agreement, however, that a comprehensive approach to measurement of HRQoL is necessary because of the multidimensional nature of the concept (11).








TABLE 68.1 Definitions for commonly encountered terms in quality of life measurement

| Term | Definition |
|---|---|
| Ability to detect change | Evidence that a PRO instrument can identify differences in scores over time in individuals or groups who have changed with respect to the measurement concept (7) |
| Clinician-reported outcome (ClinRO) | Outcomes that are either observed by the physician (e.g., cure of infection and absence of lesions) or require physician interpretation (e.g., radiologic results and tumor response). In addition, ClinROs may include formal or informal scales completed by the physician using information about the patient (8) |
| Concept | The specific measurement goal or the thing that is measured by a PRO (7) |
| Conceptual framework | Explicitly defines the concepts measured by the instrument in a diagram that presents a description of the relationships between items, domain (subconcepts), and concepts measured and the scores produced by a PRO instrument (7) |
| Construct validity | The degree to which what was measured reflects the a priori conceptualization of what should be measured (9) |
| Content validity | The extent to which the instrument actually measures the concepts of interest (10) |
| Criterion validity | The extent to which the scores of a PRO measure reflect the gold standard measure of the same concept (7) |
| Domain | A subconcept represented by a score of an instrument that measures a larger concept comprised of multiple domains (7) |
| Health-related quality of life | The subjective assessment of the impact of disease and treatment across the physical, psychological, social, and somatic domains of functioning and well-being (11) |
| Instrument | A means to capture data (i.e., a questionnaire) plus all the information and documentation that supports its use. Generally, that includes clearly defined methods and instruction for administration or responding; a standard format for data collection; and well-documented methods for scoring, analysis, and interpretation of results in the target population (7) |
| Item | An individual question, statement, or task (and its standardized response options) that is evaluated by the patient to address a particular concept (7) |
| Metadata | Structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information source (12) |
| Patient-reported outcome (PRO) | A measurement based on a report that comes directly from the patient (i.e., study subject) about the status of a patient’s health condition without amendment or interpretation of the patient’s response by a clinician or anyone else (7) |
| Proxy-reported outcome | A measurement based on a report by someone other than the patient reporting as if he or she is the patient (7) |
| Quality of life | An individual’s perception of their position in life in the context of the culture and value systems in which they live and in relation to their goals, expectations, standards, and concerns. It is a broad-ranging concept affected in a complex way by the person’s physical health, psychological state, level of independence, social relationships, personal beliefs, and their relationship to salient features of their environment (13) |
| Recall period | The period of time patients are asked to consider in responding to a PRO item or question (7) |
| Reliability | The ability of an instrument to yield the same result on serial administrations when no change in the concept being measured is expected (10) |
| Scale | The system of numbers or verbal anchors by which a value or score is derived for an item. Examples include visual analog scales (VAS), Likert scales, and rating scales (7) |
| Score | A number derived from a patient’s response to items in a questionnaire. A score is computed based on a prespecified, validated scoring algorithm and is subsequently used in statistical analyses of clinical results (7) |



Acknowledging the preference for the term HRQoL in recent literature, and cogent arguments surrounding its use, we prefer QoL in the setting of palliative and supportive oncology, as attempting to compartmentalize HRQoL from QoL is exceedingly difficult and practically trivial in this setting. In the setting of life-threatening illness, health-related aspects touch nearly every aspect of life and are often all encompassing. Thus, further discussions will consistently utilize the term QoL, rather than HRQoL. Additionally, because PROs serve as the foundation for QoL measurement, the terms PRO and QoL instruments are used interchangeably, with PRO instruments reflecting a more general term that may or may not be specifically designed to measure QoL but rather may be specific to other PRO constructs such as symptom assessment.


METHODOLOGICAL BASIS FOR MEASURING QoL


The Importance of Direct Patient Reports

As opposed to traditional evaluations, such as laboratory and imaging data, or even functional status assessments by clinicians, patients themselves are the most appropriate source for QoL measurements, without interpretation by third parties; hence the emphasis on PROs within the QoL literature. While direct patient reports are considered the gold standard for QoL measurement, proxy reports of QoL are often necessary in palliative and supportive oncology, where patients’ ability to directly complete questionnaires, even in assisted forms, often becomes limited (19,20,21). In another variation on the theme of direct reporting, caregiver reports (either as proxy reports of patient experience or as reports of caregiver experience with respect to domains such as distress and satisfaction) are an important source of information in palliative medicine. However, such reports are not strictly PROs; they are surrogates and cannot serve as the “gold standard” for measuring QoL. Thus, proxy reports and caregiver reports will receive only brief discussion in a later section “Caregiver/Proxy Considerations.”


Developing Measurement Tools

There is an extensive body of literature regarding the development and psychometric assessment of PRO and QoL measurement instruments. The methodological basis for psychometric assessment is crucial for research that involves PRO assessments, as it provides important background to understanding and comparing the validity and reliability of various measurement tools. However, such detail is beyond the scope of this text and readers are referred to other excellent texts on these issues (22,23). Even from a clinician’s viewpoint, however, it is important to understand some overarching concepts of psychometric analysis; hence, the following sections approach these concepts from the clinician’s viewpoint. Practically, an item refers to a single question or statement, a factor or subscale is a collection of items addressing one domain (or subconcept) related to the overarching concept of interest, and an instrument refers to the entire collection of items related to the concept of interest (i.e., survey or questionnaire), in addition to the supporting documentation and associated standard procedures (Table 68.1); the terms questionnaire and survey are used interchangeably with instrument.
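The relationship among items, subscales, and instruments described above can be sketched as a small data structure. This is a minimal illustration only: the class names, the 0 to 4 response range, the example items, and the sum-based scoring rule are all assumptions made for the sketch and do not correspond to any published instrument, each of which has its own validated scoring algorithm.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """A single question or statement with its allowed response range."""
    item_id: str
    text: str
    min_value: int = 0  # assumed response range, for illustration only
    max_value: int = 4


@dataclass
class Subscale:
    """A collection of items addressing one domain (subconcept)."""
    domain: str
    items: list = field(default_factory=list)

    def score(self, responses):
        """Sum the responses to this subscale's items (hypothetical rule)."""
        total = 0
        for item in self.items:
            value = responses[item.item_id]
            if not item.min_value <= value <= item.max_value:
                raise ValueError(f"{item.item_id}: response {value} out of range")
            total += value
        return total


@dataclass
class Instrument:
    """The entire collection of items, grouped into subscales."""
    name: str
    subscales: list = field(default_factory=list)

    def score(self, responses):
        """Return one score per domain."""
        return {s.domain: s.score(responses) for s in self.subscales}


# Hypothetical two-domain questionnaire with made-up items.
demo = Instrument(
    name="Demo QoL questionnaire",
    subscales=[
        Subscale("physical", [Item("p1", "I have pain"), Item("p2", "I feel tired")]),
        Subscale("emotional", [Item("e1", "I feel anxious")]),
    ],
)
demo_scores = demo.score({"p1": 3, "p2": 1, "e1": 2})
```

Keeping one score per domain, rather than a single total, mirrors the definition of a domain in Table 68.1 as a subconcept represented by its own score.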


Instrument Development

In developing an instrument based on direct patient reports to measure QoL, a sound conceptual framework is necessary. In the context of PRO instrument development, a conceptual framework “explicitly defines the concepts measured by the instrument in a diagram that presents a description of the relationships between items, domain (subconcepts), and concepts measured and the scores produced by a PRO instrument” (7). That is, the conceptual framework identifies the factors that influence the concept of interest and how these factors and the concept are related. Typically, the initial conceptual framework is based upon expert opinion and literature review using a priori hypotheses. After an initial conceptual framework is developed, direct patient input is obtained, typically through focus groups or structured interviews of patients within the population of interest, to ensure that the a priori hypotheses are consistent with patients’ experience and perception. For complex concepts, such as breathlessness, multiple domains impact the overall concept, so identifying appropriate domains and then assessing these is paramount to assessing the overarching concept. The conceptual framework typically evolves over time, with each iterative change moving the framework closer to actual patient experience.


Validity

Validity is one of the key psychometric properties of measurement scales and reflects the extent to which an instrument measures the concept or domain of interest in the target population (14); that is, validity addresses the question “Is the instrument measuring what you think it is measuring?” From a psychometric standpoint, validity has three main forms: content, construct, and criterion validity. Content validity describes a qualitative assessment as to whether the items accurately reflect those experiences and perceptions that are important (14). Because it is a qualitative assessment, there is no formal, standardized metric to score content validity. Rather, the adequacy of content validity is based on expert opinion, literature review, or patient input (14). Content validity has received significant emphasis within the PRO literature, especially recently with the United States Food and Drug Administration (FDA) guidance document for use of PROs in product-labeling claims (7). The FDA guidance document clearly identifies content validity as the psychometric cornerstone for product-labeling claims based on PRO data. Construct validity describes the degree to which what was measured reflects the a priori conceptualization of what should be measured (9). Subcomponents of construct validity are convergent and discriminant validity, which assess the degree of similarity between measures that are theoretically similar (convergent validity) or the extent to which measures that are theoretically different actually differ (discriminant validity). For example, a new measure of anxiety would be expected to have high convergent validity with the anxiety subscale of the Hospital Anxiety and Depression Scale (24). Criterion validity describes the extent to which the scores of a PRO instrument reflect the gold standard measure of the same concept (7). Criterion validity is often difficult to assess in the PRO arena because identifying gold standard measures for many PRO concepts is difficult (7), implicitly deemphasizing criterion validity.
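One common way to examine convergent and discriminant validity quantitatively is to correlate scores on the new measure with scores on a theoretically similar measure and on a theoretically different one. The sketch below uses Pearson's correlation coefficient for this purpose; that choice, the variable names, and all score values are assumptions made for illustration, not data or methods taken from any cited study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable has zero variance")
    return sxy / sqrt(sxx * syy)


# Made-up scores for six hypothetical patients.
new_anxiety = [2, 5, 7, 9, 12, 14]          # new anxiety measure
established_anxiety = [3, 4, 8, 9, 11, 15]  # established anxiety subscale
physical_function = [60, 40, 70, 45, 65, 50]  # theoretically unrelated measure

# Convergent validity: expect a strong correlation with the similar measure.
r_convergent = pearson_r(new_anxiety, established_anxiety)
# Discriminant validity: expect a weak correlation with the different measure.
r_discriminant = pearson_r(new_anxiety, physical_function)
```

In this toy data set the new measure tracks the established anxiety scores closely and is essentially uncorrelated with physical function, which is the pattern that would support construct validity. What counts as "strong" or "weak" depends on the instruments and population and is not fixed by the sketch.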


Reliability

Reliability describes an instrument’s consistency (14) or the ability of an instrument to yield the same result on serial administrations when no change in the concept being measured is expected (10). It is important to note that validity depends upon reliability (i.e., an instrument that measures the concept of interest accurately must do so consistently), but that reliability does not depend upon validity (i.e., the instrument may consistently measure the wrong thing). Within the realm of PROs, reliability is most commonly assessed via test-retest and internal consistency. With test-retest methods, the same subjects complete the same instruments on two occasions. Any differences in responses between the two occasions that are not attributable to a true change in the experience are attributed to lack of reliability (14), placing great importance on the time interval between testing (10). Reliability can be quantitatively assessed with Cronbach’s α, which measures the internal consistency of an instrument. Internal consistency reflects the degree to which items within a scale measure the same concept, in a given population (14). Well-established thresholds for interpreting Cronbach’s α are available; in general, coefficient α > 0.7 is the minimum acceptable threshold for comparisons between groups (10). The dependence of reliability on the target population supports the importance of reassessing psychometric properties when instruments are introduced to new populations (14).
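To make the internal consistency calculation concrete, the sketch below computes Cronbach’s α using its standard formula, α = k/(k − 1) × (1 − Σ item variances / variance of total scores), where k is the number of items. The function name and the response data are assumptions made for illustration; real analyses would normally use an established statistics package and a far larger sample.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents' item responses.

    Each element of `rows` holds one respondent's answers to the same
    k items, in the same order.
    """
    if len(rows) < 2:
        raise ValueError("need at least 2 respondents")
    k = len(rows[0])
    if k < 2 or any(len(r) != k for r in rows):
        raise ValueError("need at least 2 items and equal-length rows")
    # Sample variance of each item across respondents.
    item_variance_sum = sum(variance(column) for column in zip(*rows))
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(r) for r in rows])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Made-up responses: five hypothetical patients, three items scored 0 to 4.
responses = [
    [3, 3, 4],
    [1, 2, 1],
    [4, 4, 3],
    [2, 1, 2],
    [0, 1, 0],
]
alpha = cronbach_alpha(responses)
meets_threshold = alpha > 0.7  # minimum threshold cited in the text (10)
```

Because α depends on the variances observed in a particular sample, the same instrument can yield different values in different populations, which is one reason psychometric properties are reassessed when an instrument is introduced to a new population.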


Ability to Detect Change

The ability of a PRO measure to detect change is intuitively important since many PROs are collected longitudinally. Demonstration of this ability, according to the FDA, requires that changes in the PRO measure parallel changes in other factors that indicate a change in the status of the concept of interest (7). For example, in patients receiving a new treatment for opioid-induced constipation, changes in a PRO measure designed to assess overall bowel health may be linked with the use of certain other bowel products, such as enemas, to establish the ability to detect change. The measure must demonstrate the ability to detect both improvements and declines in health status. Further, it is important to detect changes throughout the range of possible values.

A clinical trial that includes QoL as a primary or secondary outcome should include an explicit statement of the anticipated minimal change that will be considered evidence of meaningful effect; this should align with the minimally important clinical difference. In registry studies or routine care, where longitudinal collection and analysis are critical, understanding the concept of minimally important change detected (25), rather than establishing that number explicitly, may be sufficient. When interpreting results from intervention and observational studies, it is critical to consider the result within the context of the instrument’s ability to reliably and validly measure meaningful change, the magnitude of change observed, and the related clinical impact.
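As a rough illustration of interpreting observed change against a prespecified threshold, the sketch below summarizes paired baseline and follow-up scores. It reports the mean change, the standardized response mean (mean change divided by the standard deviation of the change scores, one commonly used responsiveness statistic), and the proportion of patients whose change meets a minimally important difference. The function name, the scores, the threshold of 10 points, and the convention that higher scores mean better QoL are all assumptions made for the example.

```python
from statistics import mean, stdev


def change_summary(baseline, follow_up, mid):
    """Summarize paired score changes against a prespecified threshold.

    `mid` is the minimally important difference chosen in advance;
    higher scores are assumed to indicate better QoL.
    """
    if len(baseline) != len(follow_up) or len(baseline) < 2:
        raise ValueError("need paired scores for at least 2 patients")
    changes = [after - before for before, after in zip(baseline, follow_up)]
    mean_change = mean(changes)
    sd_change = stdev(changes)
    return {
        "mean_change": mean_change,
        # Standardized response mean; undefined if every change is identical.
        "srm": mean_change / sd_change if sd_change > 0 else None,
        "proportion_improved": sum(c >= mid for c in changes) / len(changes),
        "mean_change_meets_mid": mean_change >= mid,
    }


# Made-up scores for five hypothetical patients on a 0 to 100 scale.
baseline_scores = [40, 55, 38, 60, 47]
follow_up_scores = [52, 58, 49, 59, 60]
summary = change_summary(baseline_scores, follow_up_scores, mid=10)
```

In this toy example three of five patients improve by at least the threshold, yet the group mean change falls short of it, which shows why the magnitude of change and its clinical meaning need to be considered together rather than relying on a single summary number.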


Areas of Controversy

The increasing emphasis placed upon content validity has generated some controversy as PRO developers attempt to improve content validity, in part by meticulously wording items and instructions to minimize variations in interpretation between patients (26). However, the ability to improve content validity likely is asymptotic, in that individual variability undoubtedly influences interpretation of questions in ways that are not controllable since responses to an instrument capture the patient’s true (and unique) perceptions. There are concerns that in the pursuit of greater content validity, other important characteristics of PRO measures may be underdeveloped or underappreciated (9). For example, in pursuing greater content validity, the constraints placed upon questions may actually limit patient perspective by forcing some degree of conformity or may result in misinterpretation of results. In palliative and supportive oncology, where the patient’s experience is the most important outcome, artificially constraining or limiting the range of experiences communicated risks undermining the foundation of the discipline.
