Lung cancer is a global health burden and is among the most common and deadly of all malignancies worldwide. Early detection of resectable and potentially curable disease may reduce the overall death rate from lung cancer. However, at the present time, screening for lung cancer is not recommended by most clinical societies and health care agencies in the United States. This article discusses the history of, and rationale for, lung cancer screening, addresses optimization of screening protocols, and describes our current approach for the evaluation of small pulmonary nodules referred for surgical management.
Lung cancer is a global health burden and is among the most common and deadly of all malignancies worldwide. In the United States, lung cancer accounts for more than 25% of all cancer deaths, exceeding deaths from breast, colon, and prostate cancers combined. More than 80% of individuals with lung cancer die of the disease. This is primarily because a large proportion of patients with lung cancer present with locally advanced or metastatic disease. Intuitively, early detection of resectable and potentially curable disease may reduce the overall death rate from lung cancer. However, at the present time, screening for lung cancer is not recommended by most clinical societies and health care agencies in the United States. Most notably, the American Cancer Society and the United States Preventive Services Task Force (USPSTF) do not recommend for or against screening for lung cancer but instead suggest that interested individuals discuss the merits of screening with their physicians. This position is based on the results of 3 randomized trials, conducted in the late 1970s, that examined the value of plain chest radiography (CXR) with or without sputum cytology for lung cancer screening in men who were active or former smokers. The 3 trials showed nearly identical lung cancer–related mortality in the screened populations and in the control groups, although arguably the prespecified 50% reduction in mortality may have been overly optimistic.
The introduction of multislice computed tomography (CT) technology has renewed interest in screening for lung cancer given that CT is more sensitive than CXR for the detection of small pulmonary nodules. CT-based observational studies have consistently shown that lung cancer is detected in approximately 1% to 2% of high-risk individuals and that most of these cases are early stage disease. These studies generated significant debate about the value of CT screening in reducing lung cancer–related mortality, both on a societal and an individual patient basis. To definitively determine the effect of CT screening on disease-related mortality, 2 large randomized trials have been launched including, most prominently, the National Lung Screening Trial (NLST) with 53,000 participants and the Dutch-Belgian randomized lung cancer screening trial (NELSON), which included 15,822 individuals. The National Cancer Institute recently announced that the NLST showed that, after 3 annual screening rounds and 8 years of follow-up, the CT screening arm of the trial was associated with a 20.3% reduction in lung cancer mortality and a 7% reduction in overall mortality compared with CXR screening ( http://www.cancer.gov/newscenter/pressreleases/2010/NLSTresultsRelease ).
This established mortality benefit seems to be a major step forward in lung cancer screening efforts and may prepare the way for national lung cancer screening programs. This article discusses the history of, and rationale for, lung cancer screening, addresses optimization of screening protocols, and describes our current approach for the evaluation of small pulmonary nodules referred for surgical management.
History of lung cancer screening
Interest in screening high-risk patients for lung cancer was sparked when the association between cigarette smoking and lung cancer was first appreciated by Doll and Hill in the 1950s. The first mass screening project was conducted by Brett in London from 1960 to 1964 (1968). Although the study was not a randomized trial, 55,034 men were assigned to undergo either CXR every 6 months for 3 years (the screened group), or a single CXR at the beginning of the study, followed by a repeat CXR at the end of the 3-year period (the unscreened group). At the end of the 3-year period, more lung cancers had been detected in the screened group than in the unscreened group (132 vs 96 cases). In addition, resectability was enhanced in the screened group. Despite these findings, lung cancer–specific mortality was not different between the 2 groups.
In the 1970s, the National Cancer Institute funded 3 randomized trials ( Table 1 ) for lung cancer screening using both CXR and sputum cytology. Two of these trials (the Johns Hopkins Lung Project and the Memorial Sloan-Kettering Cancer Center [MSKCC] trial) focused on the value of the addition of sputum cytology to annual CXRs. In the MSKCC study, patients were randomized to annual CXR alone or annual CXR plus sputum cytologic assessment every 4 months. The same number of cancers was detected in both groups. No difference was detected in resectability rates or lung cancer–specific mortality. This screening protocol was also used in the Johns Hopkins Lung Project randomized trial, with similar results. In the Mayo Lung Project, patients were randomized to undergo CXR and sputum cytologic assessment every 4 months for 6 years (the screened group), or given the usual recommendation of the Mayo Clinic, namely to undergo both of these examinations annually, but without reminders sent to these individuals (the unscreened group). With more than 10,000 participants, the study was powered to show a 50% reduction in lung cancer–related mortality. After a median follow-up period of 3 years, more lung cancers were detected in the screened group compared with the unscreened group. In addition, the resectability rate in the screened group was significantly higher. Nonetheless, there was no statistically significant difference in the lung cancer–specific mortality between the screened and unscreened populations. Several concerns were raised about the conduct of the study, most notably the significant contamination of the control arm as well as the lack of compliance with the screening protocol in the experimental or screened arm of the trial. For example, more than 50% of individuals in the unscreened group had CXRs performed during the course of the study and approximately 25% of participants in the screening group failed to comply with the screening regimen. 
Furthermore, the trial was significantly underpowered to detect lower, but clinically important, reductions in lung cancer mortality.
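The point about statistical power can be illustrated with a standard two-proportion sample-size approximation. The sketch below is not the calculation used by the trial investigators; the 3% control-arm cumulative mortality, the 5% two-sided significance level, and the 80% power are all assumed values chosen only for illustration, and `n_per_arm` is a hypothetical helper name.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(p_control: float, relative_reduction: float,
              alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate participants needed per arm to detect a relative
    reduction in a cumulative death proportion (two-sided test,
    normal approximation for two independent proportions)."""
    p_screened = p_control * (1.0 - relative_reduction)
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    variance = (p_control * (1.0 - p_control)
                + p_screened * (1.0 - p_screened))
    effect = p_control - p_screened
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Assumed (hypothetical) control-arm cumulative lung cancer mortality.
ASSUMED_CONTROL_MORTALITY = 0.03

n_50 = n_per_arm(ASSUMED_CONTROL_MORTALITY, 0.50)  # 50% reduction
n_20 = n_per_arm(ASSUMED_CONTROL_MORTALITY, 0.20)  # 20% reduction
print(f"per-arm n for 50% reduction: {n_50}")
print(f"per-arm n for 20% reduction: {n_20}")
```

Under these assumptions, the required sample size grows roughly with the inverse square of the absolute difference in mortality, so a trial sized for a 50% reduction has little chance of detecting a 20% reduction.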
Study Institution/Location | MSKCC | Johns Hopkins | Mayo | Czechoslovakia |
---|---|---|---|---|
Years of Accrual | 1974–1982 | 1973–1982 | 1971–1983 | 1976–1980 |
Screened Arm | | | | |
Sample size | 4968 | 5226 | 4618 | 3172 |
Protocol | Annual CXR; sputum cytology every 4 mo | Annual CXR; sputum cytology every 4 mo | CXR and sputum cytology every 4 mo for 6 y | CXR and sputum cytology every 6 mo for 3 y |
Baseline cancers | 30 | 39 | Data not available | Data not available |
Repeat screen cancers | 114 | 194 | 206 | 39 |
Lung cancer mortality | 2.7 | 3.4 | 3.2 | 3.6 |
Unscreened Arm | | | | |
Sample size | 5072 | 5161 | 4593 | 3174 |
Protocol | Annual CXR | Annual CXR | Advised for annual CXR and sputum cytology | CXR and sputum cytology initially and after 3 y |
Baseline cancers | 23 | 40 | Data not available | Data not available |
Repeat screen cancers | 121 | 202 | 160 | 27 |
Lung cancer mortality | 2.7 | 3.8 | 3.0 | 2.6 |
A similar screening trial was conducted by Kubik and Polak (see Table 1 ) in the late 1970s in Czechoslovakia, which also focused on the combined effects of CXR and sputum cytologic examination for lung cancer screening (1986). In this trial, participants in the screened group underwent CXR and evaluation of sputum cytology at baseline, then every 6 months for 3 years, whereas those in the unscreened group had only the baseline CXR and sputum cytologic examinations, both of which were repeated at the end of the 3-year period. After the initial screening period, both groups underwent annual CXR and sputum assessment for an additional 3 years. Once again, more lung cancers were diagnosed in the screened group compared with the unscreened group (39 vs 27 cases). However, there was no difference in lung cancer–specific mortality between the 2 groups.
Early CT screening trials
In the 1990s, increased resolution and data-acquisition speeds of modern CT scanners rekindled interest in screening for lung cancer. Initial findings from Henschke and colleagues of the Early Lung Cancer Action Project (ELCAP) showed that, in a high-risk population, CT was superior to CXR in detection of lung nodules (1999). Notably, 2.7% of those enrolled in the CT screening program had lung cancer, most of which were stage I (2001). Within the initial ELCAP patient population, 27 screen-diagnosed lung cancers were found at baseline screenings, of which 96% were resectable. A subsequent report by the I-ELCAP group addressed overall curability estimated through 10-year survival rates of patients found to have stage I lung cancer by CT screening. The investigators reported an estimated 88% 10-year survival rate, markedly higher than survival rates predicted by the current staging system or among those presenting as a result of symptoms. Because CT screening leads to early detection of lung cancer and because those lung cancers found as a result of CT screening are curable, they inferred that CT screening leads to a reduction in lung cancer mortality. Several other groups have also evaluated CT screening for lung cancer ( Table 2 ). A review by Black and colleagues published in 2007 identified 12 studies, including 2 randomized and 10 single-arm observational studies (2007). Significant variability existed in the study populations and in the definition of a positive finding in each. Across these studies, the percentage of positive screenings ranged from 5.1% to 51%. From baseline screenings, 1.8% to 18% of positive findings led to a diagnosis of cancer. Most of the tumors were stage I (53%–100%), with a high resectability rate (>78%). Only 1 of the studies reported 5-year survival: 76% for patients with cancer detected at baseline screening and 65% for patients with cancer detected at annual repeat scanning.
Study/Year | Number Screened | Positive Screen (%) | Total Lung Cancer (%) | Lung Cancer in Screen Positive Patients (%) | Percent Stage I (for NSCLC) | Percent Resectable (for NSCLC) |
---|---|---|---|---|---|---|
ELCAP 2001 | 1000 | 23.3 | 2.7 | 11.6 | 88 | 100 |
Sone et al, 2001 | 5483 | 5.1 | 0.4 | 7.9 | 22 | 100 |
Garg et al, 2002 | 92 | 33 | 3.2 | 10 | NR | NR |
Tiitola et al, 2002 | 602 | 18.4 | 0.8 | 4.5 | 0 | 20 |
Sobue et al, 2002 | 1611 | 11.5 | 0.8 | 7.0 | 77 | 92 |
Nawa et al, 2002 | 7956 | 6.8 | 0.45 | 6.7 | 86 | NR |
Pastorino et al, 2003 | 1035 | 5.9 | 1.1 | 18 | 55 | 91 |
Swensen et al, 2002, 2003 | 1520 | 51 | 1.7 | 3.3 | 69 | NR |
Diederich et al, 2002, 2004 | 817 | 43 | 2.1 | 4.9 | 56 | 100 |
Gohagan et al, 2004 | 1586 | 20.5 | 1.9 | 9.2 | 53 | NR |
MacRedmond et al, 2004 | 449 | 24 | 0.4 | 1.8 | NR | 100 |
Miller et al, 2004 | 3598 | 32 | 0.61 | 1.9 | NR | NR |
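The column "Lung Cancer in Screen Positive Patients (%)" in Table 2 appears to follow from the two columns before it. The sketch below recomputes it for 4 rows as the ratio of total lung cancers to positive screens. It assumes that every diagnosed cancer arose in a participant with a positive screen, which the source does not state, so small discrepancies with the reported values are to be expected.

```python
# (study, positive screen %, total lung cancer %, reported % among positives)
TABLE_2_ROWS = [
    ("ELCAP 2001", 23.3, 2.7, 11.6),
    ("Swensen et al, 2002, 2003", 51.0, 1.7, 3.3),
    ("Gohagan et al, 2004", 20.5, 1.9, 9.2),
    ("Miller et al, 2004", 32.0, 0.61, 1.9),
]


def cancer_yield_among_positives(positive_pct: float, cancer_pct: float) -> float:
    """Percentage of screen-positive participants diagnosed with lung cancer,
    assuming all cancers were found among the screen positives."""
    return 100.0 * cancer_pct / positive_pct


recomputed = {
    study: cancer_yield_among_positives(positive_pct, cancer_pct)
    for study, positive_pct, cancer_pct, _reported in TABLE_2_ROWS
}

for study, _pos, _cancer, reported in TABLE_2_ROWS:
    print(f"{study}: recomputed {recomputed[study]:.1f}%, reported {reported}%")
```

Read this way, the column is the yield of a positive screen: even in the studies with the highest yields, most positive findings did not lead to a cancer diagnosis.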
In a study published the same year, Bach and colleagues reported the findings from CT screening of 3246 high-risk patients from multiple institutions (2007). The investigators reported a threefold increase in individuals diagnosed with lung cancer and a tenfold increase in patients undergoing lung resection (compared with expected cases). They also found no evidence of a decline in the number of patients with advanced stages of disease or of deaths from lung cancer in the screened groups. The investigators concluded that CT screening may not meaningfully reduce the risk of dying from lung cancer and suggested that CT screening is inherently prone to overdiagnosis, thus exposing patients to unnecessary surgery. The study generated significant controversy given that the follow-up was short (3.9 years) and that at least 1 of the 3 studies did not require the exclusion of symptomatic individuals, possibly undermining the core concept of screening.
Recent CT screening trials
The NLST enrolled 53,454 smokers and ex-smokers between the ages of 55 and 74 years ( Table 3 ). The group compared low-dose CT screening with CXR screening using 3 annual screening rounds with 8 years of follow-up. In the CT screening arm, there were 354 deaths from lung cancer, compared with 442 in the CXR group, translating into a 20.3% reduction in lung cancer–related mortality. In addition, there was a 7% reduction in overall mortality in the CT arm of the trial. This mortality reduction is unprecedented in the history of lung cancer screening and has been greeted with enthusiasm by advocates of CT screening.
 | CT Screened Arm | CXR Screened Arm |
---|---|---|
Screening interval and follow-up | Three annual screenings with 8 y of follow-up (both arms) | |
Number of lung cancer deaths | 354 | 442 |
Relative decrease in lung cancer mortality in CT screened group (%) | 20.3 | |
Relative decrease in overall mortality in CT screened group (%) | 7 | |
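The relationship between the death counts and the relative reduction in Table 3 can be checked with simple arithmetic. The sketch below is a crude calculation from raw counts that assumes the 2 arms had equal size and equal follow-up time. The reported 20.3% is presumably based on death rates rather than raw counts, which would explain the small difference, although the source does not describe the method.

```python
def relative_reduction(events_control: int, events_screened: int) -> float:
    """Crude relative reduction in events, assuming equal-sized arms
    with equal follow-up (an assumption, not stated in the source)."""
    return (events_control - events_screened) / events_control


ct_deaths = 354   # lung cancer deaths, CT arm
cxr_deaths = 442  # lung cancer deaths, CXR arm

crude = relative_reduction(cxr_deaths, ct_deaths)
print(f"crude relative reduction: {crude:.1%}")  # 19.9%
```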
Two other recent studies have addressed the magnitude of lung cancer mortality reduction by CT screening using modeling approaches. McMahon and colleagues from the Mayo Clinic used 1520 current or former smokers undergoing CT screening to model predicted cases of lung cancer and deaths, which were compared with a simulated unscreened control arm (2008). The model ultimately simulated 500,000 cases per study arm based on 5 annual screening examinations to generate precise estimates of mortality. At 6 years of follow-up, the screening arm had an estimated 37% relative increase in lung cancer detection compared with the simulated control arm and a 28% relative reduction in cumulative lung cancer–specific mortality. Although the model included many assumptions, such as lung cancer incidence rates, adherence to the screening protocol, and treatment by established guidelines, the study made a compelling argument for a mortality benefit from CT screening.
Similarly, Foy and colleagues used a lung cancer mortality model developed within the Cancer Intervention and Surveillance Modeling Network (CISNET) to address the potential for mortality reduction by CT screening for lung cancer (2011). The comparison matched members of a CT screening trial (NY-ELCAP) with age, sex, and tobacco exposure–matched control patients from the β-Carotene and Retinol Efficacy Trial (CARET), with well-established lung cancer incidence rates (Goodman and colleagues). The simulation was repeated 5000 times to compare expected lung cancer mortality between the 2 groups. Although again subject to inherent assumptions made for the purposes of modeling, the study suggested a 45.6% relative reduction in lung cancer mortality in the group of patients screened with CT. These studies all suggest that CT screening protocols, logically followed by earlier treatment of lung cancer, do provide a mortality benefit.
Important statistical concepts
Before the report of the NLST, as judged by the mortality paradigm, it was argued that lung cancer screening was not beneficial and was potentially harmful. No randomized trial had yet shown a reduction in cancer-specific or overall mortality. However, the efficacy of screening in reducing cancer-specific mortality may be confounded by lead-time, length, and overdiagnosis biases. Although the statistical arguments may be examined from many different vantage points and are sometimes difficult to interpret, they have been well described previously by Strauss (2000). This article highlights some of those important concepts.
In all screening trials, lead time must be distinguished from lead-time bias. The success of any screening program is dependent on a lead time in diagnosis and treatment. In and of itself, this does not present a problem. Bias can arise when short-term survival rates are used to assess the value of screening in populations with and without lead time. Lead-time bias should not affect resectability or, more importantly, curability. In the subpopulations of patients with lung cancer in the older screening trials, there was an increased proportion of 5-year survivors in the screen-detected cases compared with those in the control arms in both the Mayo and Czech studies. The survival curves never converged, suggesting that screening did increase the cure rate of patients with cancer. These mature data imply that lead-time bias does not explain differences in survival between those groups. The I-ELCAP investigators’ effort to estimate 10-year survival rates, rather than shorter-term rates, was also an attempt to avoid any possible lead-time bias. Their strategy was to estimate the cure rate, which corresponds to the plateau (asymptote) of the survival curve, the point at which any additional deaths are from competing causes.
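Lead-time bias in short-term survival can be shown with a small simulation. In the sketch below, screening moves the date of diagnosis earlier by a fixed lead time but leaves every date of death unchanged, so mortality is identical by construction. The exponential survival distribution, the 2-year mean survival, and the 2-year lead time are hypothetical values chosen only for illustration.

```python
import random


def five_year_survival(n: int = 10_000, lead_time_years: float = 2.0,
                       mean_survival_years: float = 2.0, seed: int = 0):
    """Return (unscreened, screened) 5-year survival from diagnosis when
    screening only advances diagnosis and never changes the date of death."""
    rng = random.Random(seed)
    # Years from symptomatic diagnosis to death for each simulated patient.
    years_to_death = [rng.expovariate(1.0 / mean_survival_years)
                      for _ in range(n)]
    unscreened = sum(t > 5.0 for t in years_to_death) / n
    # Same deaths, but the survival clock starts lead_time_years earlier.
    screened = sum(t + lead_time_years > 5.0 for t in years_to_death) / n
    return unscreened, screened


unscreened_5y, screened_5y = five_year_survival()
print(f"5-year survival without screening: {unscreened_5y:.1%}")
print(f"5-year survival with screening:    {screened_5y:.1%}")
```

The screened group shows better 5-year survival even though no death was delayed, which is why short-term survival alone cannot establish a benefit of screening.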
Length bias refers to the tendency of screening to lead to the diagnosis of slower-growing cancers more frequently in the baseline round, because these tumors potentially have been present for a considerable amount of time before the screening study. For tumors detected only on repeat rounds of screening, this is less of a concern. However, a review of the Mayo data shows that survival rates were only slightly better in the prevalence cases than in the incidence cases, those diagnosed at repeat screening (40% vs 33%). In the I-ELCAP data, no distinction was made in survival rates between the prevalence and incidence groups.
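Length bias can be illustrated in the same way. The sketch below simulates a single baseline (prevalence) screen in a population in which half of all tumors are slow growing and half are fast growing. The mean preclinical detectable periods of 4 years and 1 year, the 50-year onset window, and the timing of the screen are hypothetical values chosen only for illustration.

```python
import random


def slow_fraction_at_prevalence_screen(n: int = 50_000,
                                       screen_time: float = 25.0,
                                       seed: int = 0) -> float:
    """Fraction of slow-growing tumors among those found by one baseline
    screen, when slow and fast tumors are equally common at onset."""
    rng = random.Random(seed)
    detected_total = 0
    detected_slow = 0
    for _ in range(n):
        is_slow = rng.random() < 0.5
        # Assumed mean length of the preclinical detectable period (years).
        mean_sojourn = 4.0 if is_slow else 1.0
        onset = rng.uniform(0.0, 50.0)
        sojourn = rng.expovariate(1.0 / mean_sojourn)
        # Detected only if the screen falls inside the preclinical period.
        if onset <= screen_time < onset + sojourn:
            detected_total += 1
            detected_slow += int(is_slow)
    return detected_slow / detected_total


slow_fraction = slow_fraction_at_prevalence_screen()
print(f"slow-growing share of all tumors: 50.0%")
print(f"slow-growing share at baseline screen: {slow_fraction:.1%}")
```

Because slow-growing tumors remain detectable for longer, they are overrepresented at the baseline screen, which is the pattern the comparison of prevalence and incidence cases is meant to probe.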
Similar to the length-bias argument, the overdiagnosis hypothesis is based on the idea that screen-detected cancers may be indolent and perhaps even clinically insignificant. The lung cancer detection rates were higher in the screened groups in both the Mayo and Czech studies. Despite this, mortality for the entire screened cohort in both studies was slightly higher. Similarly, in the more recent CT-based study by Bach and colleagues (2007), there was an increased rate of lung cancer detection (144 cases vs 44.5 expected cases). Despite this increase in detection, there was no decrease in the expected lung cancer mortality. The possibility of overdiagnosis has been used to explain these findings, as well as the excellent projected 10-year survival in the I-ELCAP study. Several authorities suggest that many lung cancers detected by screening would not progress rapidly to the point of clinical detection and would therefore be unlikely to account for a meaningful share of deaths among screened individuals.
There are several arguments against overdiagnosis. The first is based on the known epidemiologic evidence. For example, studies reported by Sobue and colleagues (1992) and by Flehinger and colleagues (1992) documented mortalities in excess of 80% for untreated screen-detected lung cancers. The high mortality of screen-detected small tumors argues against their presumed indolent or nonfatal nature. It seems that even the smallest lung cancers are almost always deadly. An analysis by Henschke and colleagues of the Surveillance, Epidemiology and End Results (SEER) database revealed an 87% 8-year fatality rate for untreated 6-mm to 15-mm primary non–small cell lung cancer (NSCLC) (2003). A more recent review of the California Cancer Center registry by Raz and colleagues examined long-term survival in untreated stage I NSCLC. Five-year overall survival was only 6%, with a median survival of 9 months.
More evidence against overdiagnosis may be found from autopsy studies. McFarlane reported that the rate of surprise lung cancer at autopsy was less than 1% and that many of these patients had died of those cancers (1986). Another study found a slightly higher (3.3%) rate of lung cancer at autopsy, but deemed none of the cancers to have been clinically insignificant, because lung cancer was believed to be the direct cause of death in more than half of the cases. Further evidence against overdiagnosis may also be found in the I-ELCAP data. Henschke and colleagues reported that, for the I-ELCAP screening trial, an expert panel of pulmonary pathologists confirmed that 95% of the patients with stage I cancer had invasive tumors that were morphologically indistinguishable from garden variety lung cancers (2006). In addition, a subgroup of the I-ELCAP screen-detected cancers was analyzed for biomarkers using immunohistochemistry and fluorescence in situ hybridization. The molecular alterations were found to be similar to those found in conventionally diagnosed cancers. All 8 I-ELCAP patients with untreated stage I cancer died within 5 years of screening. In conclusion, the balance of both epidemiologic and pathologic evidence does not seem to make lung cancer a good candidate for overdiagnosis by screening.
Perhaps the most challenging aspect of understanding the issue of overdiagnosis is defining the term itself. The phrase is often used synonymously with pseudodisease, which implies that the disease would progress slowly and would not cause death before the patient died of a competing illness. This definition allows for patients with lung cancer who die of competing (or accidental) causes to be considered as examples of overdiagnosis regardless of tumor stage. This situation is most evident in the Mayo Lung Project, in which the concept of overdiagnosis was first proposed. In that study, there was an excess of early stage lung cancers in the screened group, but no difference in mortality. Therefore, it was concluded that those excess, predominantly early stage, cancers were overdiagnosed. However, when examined critically, they did not fit the profile of indolent cancers. They were on average 2 cm in diameter, were not present on the baseline round, had a median doubling time of 101 days, and were nearly all invasive pathologically. Thus, in the screened arm, lung cancers were far more likely to be identified and cannot be described as indolent. The similar disease-specific survival rate between the groups may be explained by the high rate of competing causes of death in the screened group, in which the number of cardiovascular deaths was nearly 4 times the rate of lung cancer deaths.