Plain Language Summary
Screening Benefits
Screening Harms
False-Positive and False-Negative Mammography Results and Biopsies
Overdiagnosis and Overtreatment
Radiation Exposure
Anxiety, Distress, and Pain During Procedures
Shared Decision-Making
List of Acronyms and Abbreviations
Plain Language Summary
Women in their 40s face conflicting breast cancer screening guidelines that are based on different interpretations of research and considerations of the benefits and harms of screening.
How well mammography screening reduces death from breast cancer was evaluated in eight randomized controlled trials (RCTs) of women in their 40s. Neither the individual trials, nor an estimate that combined results of all of the trials, showed reduced deaths with screening. Deaths were reduced by 26% to 44% in two studies comparing women in community screening programs with those not screened. However, these studies provided less reliable results than the trials because women were not randomized to screening and control groups. Screening trials also showed that cases of advanced breast cancer as well as deaths from all causes were not reduced with screening.
A false-positive mammography result occurs when the initial report describes an area of suspicion that is later found to be normal after obtaining additional tests and sometimes biopsies. False-positive results are considered harms of screening because a healthy person is subjected to tests and procedures without direct benefit. A false-negative result occurs when a mammogram is interpreted as normal when a tumor actually exists. In this case, patients are harmed when diagnosis and treatment are delayed.
In a large US study, breast cancer rates were lowest among women in their 40s and increased with age, while rates of false-positive results were highest among women in their 40s and decreased with age. Higher false-positive rates were associated with risk factors for breast cancer. Over 50% of women in their 40s had false-positive results over a 10-year period. Rates of false-negative results and recommendations for biopsy did not differ by age. Rates of false-positives, false-negatives, and recommendations for biopsy did not differ by time since previous screening.
Studies suggest potential reductions in false-positive results with combined use of tomosynthesis and mammography compared to mammography alone, while effects on rates of false-negative results and biopsies and how rates vary by patient factors including age are presently unclear. Results of a single trial indicated increased false-positive and biopsy rates with combined mammography and ultrasonography compared with mammography alone.
Additional potential harms of screening include overdiagnosis and overtreatment; radiation exposure; and anxiety, distress, and pain during procedures. Overdiagnosis refers to diagnosing breast cancers that would never have caused symptoms. However, current methods of defining, measuring, and estimating the effects of these harms are inadequate, and how to determine their importance for individual women is unclear.
As a result of the current state of evidence and the dilemma of differing guidelines for women in their 40s, shared decision-making has become an important part of the screening process. In shared decision-making, each woman takes an active role in determining the appropriate screening services for her. Screening decisions could be improved by additional research on the effectiveness of screening women in their 40s who have specific risk factors, methods to reduce harms of screening, and approaches that improve communication and further involve women in the screening process.
Breast cancer screening guidelines are the most debated and least congruent for average-risk women in their 40s, and for this reason, it is particularly important for women, clinicians, researchers, and policy makers to understand the evidence behind them. While some guidelines recommend routine screening beginning at age 40, most others recommend beginning at age 50 or at ages in between. Recommendations also differ regarding annual versus biennial mammography and the role of clinical and self-breast examinations. The context of recommendations ranges from providing explicit directions for screening to relying on shared decision-making in which women take a major role in making the screening decision. For many women, screening choices are primarily driven by their health insurance plan’s coverage and by their access to health care services.
Screening guidelines are an important influence on screening practices, even when guidelines are dissimilar. A study of changes in screening practices in a large US health system indicated a significant decline in mammography screening among women in their 40s after release of the U.S. Preventive Services Task Force (USPSTF) recommendations in 2009 despite insurance coverage. During the same time period, screening increased among women age 50 to 74, the age group targeted by the USPSTF recommendations, but only after the Affordable Care Act was implemented soon thereafter, increasing coverage of screening services. These relationships differ across health systems worldwide, whether private or public, and are important to understand in order to deliver optimal, yet appropriate, breast cancer screening.
In contrast to the conflicting guidelines regarding screening women at average risk, guideline groups generally agree that high-risk women in their 40s may benefit from initiating more frequent screening at younger ages. Clinically significant risk factors associated with high risk for breast cancer include breast cancer susceptibility gene (BRCA) mutations, mutation status unknown but having a first-degree relative with a BRCA mutation, and other hereditary genetic syndromes associated with more than a 15% lifetime risk, including Li-Fraumeni syndrome, Cowden syndrome, or hereditary diffuse gastric cancer. The degree of risk associated with family history of breast cancer varies according to familial patterns of disease. Estimates of lifetime risk of breast cancer determined by kindred analysis of over 15% or 20% are considered high.
Previous diagnosis of invasive breast cancer, ductal carcinoma in situ (DCIS), or high-risk lesions including lobular carcinoma in situ, atypical ductal hyperplasia, atypical lobular hyperplasia, flat epithelial atypia, papillary atypia, and apocrine atypia significantly increases risk for breast cancer to various levels. Also, women with histories of high-dose radiation therapy to the chest between the ages of 10 and 30 years, such as for treatment of Hodgkin lymphoma, are considered at high risk.
Additional factors increase risk for breast cancer to lower levels than for high-risk women, and women in their 40s with these characteristics generally fall within routine screening guidelines. A modeling study indicated that the benefits and harms of screening for a hypothetical woman in her 40s are similar to those for an average-risk woman in her 50s when her risk is 2 to 4 times greater than average. However, this screening strategy has not been evaluated in clinical settings.
An important first step in considering risk factors in making screening choices is to identify those that are specific to women in their 40s and determine their impact. Results of a systematic review and analysis of data from a large screening population narrowed the potential risk factors to a concise list (Fig. 9.1). These include family history of breast cancer that is less extensive than for high-risk women; prior benign breast biopsy; current use of oral contraceptives; no previous pregnancies; and higher breast density, a radiographic measure of breast tissue that is associated with increased risk for breast cancer and reduced mammography sensitivity. Increased breast density is more common among younger women.
Empiric models that incorporate several of these risk factors have been developed to predict breast cancer risk for individual women. All of the models include age and first-degree family history of breast cancer in their calculations, but vary in their other components. However, studies of their diagnostic accuracy indicate that the models are poor predictors of an individual’s risk, and it remains unclear how to apply these models to selecting women for screening. Therefore, despite extensive research, screening decisions for women in their 40s remain complex, individualized decisions based on personal consideration of benefits and harms.
Breast Cancer Mortality
Breast cancer screening trials
Eight main RCTs of breast cancer screening enrolled women in their 40s and reported breast cancer mortality results specifically for them. These include the UK Age trial, Canadian National Breast Screening Study 1 (CNBSS-1), Health Insurance Plan of Greater New York (HIP) trial, Stockholm trial, Malmö Mammographic Screening Trial (referred to separately as MMST I and MMST II), Swedish Two-County Study (referred to separately as Östergötland and Kopparberg), and Gothenburg trial. The descriptions of trials and meta-analyses in this chapter are based on a systematic review of the most recent trial results. In these analyses, the number of trials was counted as the number of discrete data sources contributing to the summary estimate.
All trials were designed as RCTs, although they varied in their recruitment of participants and controls, sizes, methods of randomization, and screening protocols. Additional details of the trials, their screening protocols, and limitations are described in chapter: “Estimates of Screening Benefit: The Randomized Trials of Breast Cancer Screening.” Two trials enrolled only women in their 40s (CNBSS-1 and Age), while others enrolled various broader age ranges that included women in their 40s. The Canadian trials recruited volunteers, the Age trial identified women from general practice lists, and the HIP trial recruited women enrolled in a health insurance plan. The other trials enrolled participants based on their residence in communities.
The UK Age trial is particularly relevant to determining screening effectiveness in women in their 40s because it is the only trial designed to specifically evaluate this question. The Age trial is also the largest trial, randomizing a total of 160,921 women age 39 to 41 to screening with seven rounds of annual mammography until age 48, or usual care which did not involve routine screening. Women were individually randomized and stratified by group practice, and were offered routine screening at age 50 to 52 in accordance with practice standards at the time. The Age trial is the most recent trial, beginning in 1991, and provides 17.5 years of follow-up results. Adherence was relatively high: 68% of women in the trial attended the prevalent screen, and overall, 81% of women attended at least one screen. The mean number of screens in the trial was 4.5. Follow-up conducted through the National Health Service central register was complete, and the analysis included deaths from breast cancer during the trial and during follow-up.
Limitations of the Age trial include use of standard two-view mammography only at the baseline screen and one-view mammography at subsequent screens. The majority of screens used film mammography, rather than digital. These technical differences make the trial less clinically relevant to current practice where two-view digital mammography is the usual standard. Also, it is unclear how many women in the usual care group received mammography during the trial. The Age investigators have reported concerns about the trial’s lack of power to detect statistically significant differences between screening and usual care groups because the trial did not attain recruitment goals, and their initial assumptions regarding population estimates of breast cancer mortality were not accurate because there was an unanticipated decline in mortality rates during the years of the trial.
The initial results of the Age trial, based on a median of 10.7 years of follow-up, indicated a reduction in breast cancer mortality that was not statistically significant [relative risk (RR): 0.83; 95% confidence interval (CI), 0.66 to 1.04]. Results of long-term follow-up in the Age trial, as well as the other screening trials, are based on two methods of accrual of breast cancer cases and deaths (further described in chapter: “Estimates of Screening Benefit: The Randomized Trials of Breast Cancer Screening”). The short case accrual method includes only deaths occurring among cases of breast cancer diagnosed during the screening intervention period, and in some trials, within an additional defined case accrual period. The long case accrual method counts all of the breast cancer cases contributing to breast cancer deaths diagnosed during the screening intervention period plus the follow-up period. In the Age trial, follow-up after a median of 17 years indicated an RR of breast cancer mortality with screening of 0.88 (95% CI, 0.74 to 1.04) for cases diagnosed during the intervention phase; and 0.93 (95% CI, 0.80 to 1.09) for cases diagnosed during the intervention and follow-up phases. While long-term follow-up may have improved the power of the trial to detect differences between comparison groups, these results indicate an even weaker relationship between screening and breast cancer mortality.
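The pattern across the three Age trial estimates can be checked with simple arithmetic. The Python sketch below is illustrative only: it assumes each reported 95% CI is symmetric on the log scale (a common convention, not a detail stated by the trial), and the function names are hypothetical.

```python
import math

def log_rr_standard_error(ci_low, ci_high, z=1.96):
    """Approximate standard error of log(RR), assuming the 95% CI is
    symmetric around log(RR) (an assumption, not a trial detail)."""
    return (math.log(ci_high) - math.log(ci_low)) / (2 * z)

def excludes_no_effect(ci_low, ci_high):
    """True when the interval lies entirely below or entirely above RR = 1."""
    return ci_high < 1.0 or ci_low > 1.0

# (RR, lower bound, upper bound) as reported for the Age trial in the text.
age_trial_estimates = {
    "median 10.7 years of follow-up": (0.83, 0.66, 1.04),
    "17 years, intervention-phase cases": (0.88, 0.74, 1.04),
    "17 years, intervention plus follow-up cases": (0.93, 0.80, 1.09),
}

for label, (rr, low, high) in age_trial_estimates.items():
    se = log_rr_standard_error(low, high)
    significant = excludes_no_effect(low, high)
    print(f"{label}: RR={rr:.2f}, SE(log RR)={se:.3f}, excludes 1: {significant}")
```

Under this assumption, the approximate standard error shrinks with longer follow-up, which fits the suggestion of improved power, yet every interval still includes 1 and the point estimates move closer to 1.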
Meta-analysis of screening trials
Five of the screening trials reported results using the long case accrual method (Swedish Two-County [Kopparberg and Östergötland], Age, Gothenburg, and CNBSS-1), while four used only the short case accrual method (HIP, Malmö I, Malmö II, and Stockholm). Using the longest case accrual available across all trials, mean durations of screening were 3.5 to 18.8 years, case accrual 7.0 to 21.9 years, and follow-up 11.2 to 21.9 years.
Age-specific summary estimates of breast cancer mortality reduction were derived from meta-analyses that combined results from the trials. For women age 40 to 49, neither the individual trials nor the combined estimate indicated statistically significant differences between screening and control groups (RR: 0.92; 95% CI, 0.75 to 1.02; nine trials) (Fig. 9.2). The meta-analysis was repeated using estimates from trials with short case accrual methods to examine differences related to how outcomes were captured. Results indicated an RR of 0.87 (95% CI, 0.72 to 1.00), consistent with a borderline statistically significant difference, although power was reduced in this estimate because fewer cases were included.
Absolute rates of breast cancer mortality reduction were derived from results of meta-analyses of RCTs. Results indicated that three deaths could be prevented per 10,000 women screened for 10 years for both the long and short case accrual methods, assuming that differences between screening and control groups were actually statistically significant.
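The step from a relative risk to an absolute rate can be sketched as follows. This is a minimal Python illustration, not the method used in the meta-analysis: the function names are hypothetical, and the control-group death rate is a made-up value chosen only to show the arithmetic, because the underlying rates are not given in this chapter.

```python
def deaths_prevented_per_10000(control_deaths_per_10000, relative_risk):
    """Absolute reduction = control-group rate x (1 - RR)."""
    return control_deaths_per_10000 * (1.0 - relative_risk)

def number_needed_to_screen(prevented_per_10000):
    """Women screened (over the same period) per death prevented."""
    return 10000 / prevented_per_10000

# From the text: three deaths prevented per 10,000 women screened for 10 years.
print(round(number_needed_to_screen(3)))

# HYPOTHETICAL control-group rate (deaths per 10,000 over 10 years);
# not a trial result, used only to demonstrate the formula.
hypothetical_control_rate = 40.0
print(round(deaths_prevented_per_10000(hypothetical_control_rate, 0.92), 1))
```

Read this way, three deaths prevented per 10,000 women corresponds to roughly 3,333 women screened for 10 years per death prevented, under the same assumption of a real difference between groups.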
Observational studies of mammography screening provide additional information about screening effectiveness in contemporary populations and settings. However, observational studies are subject to biases that may limit their use in determining effectiveness. Most importantly, they lack comparability of comparison groups that is only attainable through randomization.
Recent comprehensive systematic reviews of observational studies summarize most of the relevant research, including reviews conducted by the EUROSCREEN Working Group to assess the effectiveness of population-based mammography screening on breast cancer mortality. In general, observational studies of screening indicate more favorable reductions in breast cancer mortality than results of the screening trials. However, most studies included women age 50 and older and their relevance to women in their 40s is uncertain.
Only two observational studies comparing outcomes of participants in screening programs versus nonparticipants reported results specifically for women in their 40s. A large prospective cohort study of the Mammography Screening of Young Women Cohort in Sweden indicated reduced risk for breast cancer deaths for women age 40 to 49 years invited to screening compared with women not invited (RR: 0.74; 95% CI, 0.66 to 0.83). Although the study included over 600,000 women, it is unclear whether comparable groups were maintained over time due to attrition, crossover, adherence, and contamination.
An incidence-based mortality study of over 2 million women age 40 to 79 in Canada compared screening program participants versus nonparticipants. Results were expressed as standardized mortality ratios (SMR), the ratio of the observed breast cancer mortality of screening participants to province-specific breast cancer mortality based on nonparticipant incidence and survival rates. Breast cancer mortality was reduced for women age 40 to 49 years (SMR: 0.56; 95% CI, 0.45 to 0.67). Although the analysis considered the influence of self-selection bias using historical trend data for women age 35 to 39, the validity of this approach is unclear. In the Canadian randomized trials, screening participants were more educated, had fewer pregnancies, and had overall higher risks for breast cancer suggesting that women who participate in mammography differ from those who do not. These differences introduce bias and influence results of comparisons between screening and nonscreening groups.
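The ratio measures reported by the two observational studies convert to percentage reductions with one line of arithmetic. The Python sketch below is illustrative: the function names are hypothetical, and the observed and expected counts passed to the SMR function are invented solely to show how the ratio is formed.

```python
def standardized_mortality_ratio(observed_deaths, expected_deaths):
    """SMR = observed deaths among participants / deaths expected
    from nonparticipant incidence and survival rates."""
    return observed_deaths / expected_deaths

def percent_reduction(ratio):
    """Percentage reduction implied by an RR or SMR below 1."""
    return (1.0 - ratio) * 100.0

# Ratios reported in the text.
print(round(percent_reduction(0.74)))  # Swedish cohort, RR 0.74 -> 26
print(round(percent_reduction(0.56)))  # Canadian study, SMR 0.56 -> 44

# HYPOTHETICAL counts, used only to illustrate the SMR definition.
print(standardized_mortality_ratio(56, 100))
```

These two values correspond to the 26% to 44% range of reduced deaths cited in the Plain Language Summary.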
Effects of screening intervals and imaging modalities
No trials directly compared the effect of different screening intervals on breast cancer mortality. A registry-based study in Finland indicated no breast cancer mortality differences between annual and triennial screening among women age 40 to 49 years. No trials or observational studies of screening using digital mammography, tomosynthesis, ultrasound, or magnetic resonance imaging (MRI) provide mortality outcomes for women in their 40s.
All-cause mortality outcomes were reported for women age 40 to 49 in seven trials [Age, CNBSS-1, Stockholm, Malmö II, Swedish Two-County Study (separately reported for Östergötland and Kopparberg), and Gothenburg]. A meta-analysis of results indicated no all-cause mortality differences between screening and control groups (RR: 0.99; 95% CI, 0.95 to 1.05) (details in chapter: “Estimates of Screening Benefit: The Randomized Trials of Breast Cancer Screening”).
Incidence of Advanced Breast Cancer
The incidence of advanced breast cancer is another clinically important outcome in determining the effectiveness of screening. The most commonly used measures in studies include clinical stage (0 to IV), number of involved lymph nodes (0, 1 to 3, 4+), and tumor size (mm). However, these measures vary across studies and do not capture important prognostic indicators, such as hormone receptor status. Also, most comparisons using these categories provide differences between screening and comparison groups that represent relatively early stages of disease, rather than advanced stages.
An analysis of the screening trials defined advanced breast cancer based on the most severe disease categories reported by the trials. These included stage III and IV disease (ie, regional and metastatic), size 50 mm or greater, or having four or more positive lymph nodes. Trials reporting results specifically for women in their 40s are listed in Table 9.1 . Combining estimates based on these definitions of advanced cancer in a meta-analysis indicated no difference with screening for women age 40 to 49 years (RR: 0.98; 95% CI, 0.74 to 1.37; four trials).
|Trial||Stage||+Lymph Nodes, n a||Size, mm b||Definition of Advanced Cancer c||RR for Advanced Cancer (95% CI) d|
|Age||NR||0, 1–3, 4+||1–9, 10–14, 15–19, 20–29, 30–49, ≥50||Size ≥50 mm||0.85 (0.57 to 1.23)|
|Age|| || || ||4+ lymph nodes||0.77 (0.53 to 1.13)|
|HIP||I, II, III, IV||NR||NR||Stage III−IV||0.87 (0.48 to 1.58)|
|CNBSS-1||NR||0, 1–3, 4+||1–9, 10–14, 15–19, 20–39, ≥40||Size ≥40 mm||1.18 (0.67 to 2.03)|
|CNBSS-1|| || || ||4+ lymph nodes||2.00 (1.20 to 3.34)|
|Swedish Two-County||I, II, III−IV||0, 1+||1–9, 10–14, 15–19, 20–29, 30–49, ≥50||Size ≥50 mm||1.57 (0.63 to 3.94)|
a Lymph nodes with micrometastases are classified as Stage IB, otherwise ≥1 positive lymph node is classified as Stage IIA or higher.
b Size ≥20 mm is classified as Stage IIA or higher; size ≥50 mm is classified as Stage IIB or higher.
c Represents the highest category of disease reported by the trials.
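How per-trial estimates such as those in Table 9.1 are combined into a summary RR can be sketched with inverse-variance pooling on the log scale. The Python example below is a simplified fixed-effect illustration under stated assumptions: the pooling model used by the published analysis is not specified here, the choice of the size-based rows for Age and CNBSS-1 is arbitrary, and standard errors are back-calculated from the reported 95% CIs. It is therefore not expected to reproduce the published estimate exactly.

```python
import math

def pool_fixed_effect(estimates, z=1.96):
    """Inverse-variance fixed-effect pooling of RRs on the log scale.

    estimates: list of (rr, ci_low, ci_high) tuples.
    Returns (pooled_rr, pooled_ci_low, pooled_ci_high).
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for rr, low, high in estimates:
        # SE of log(RR) recovered from the CI width (assumes log symmetry).
        se = (math.log(high) - math.log(low)) / (2 * z)
        weight = 1.0 / se ** 2
        weighted_sum += weight * math.log(rr)
        total_weight += weight
    pooled_log = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (math.exp(pooled_log),
            math.exp(pooled_log - z * pooled_se),
            math.exp(pooled_log + z * pooled_se))

# One row per trial from Table 9.1. Using the size-based rows for Age and
# CNBSS-1 is an ASSUMPTION made for illustration.
table_9_1 = [
    (0.85, 0.57, 1.23),  # Age, size >= 50 mm
    (0.87, 0.48, 1.58),  # HIP, stage III-IV
    (1.18, 0.67, 2.03),  # CNBSS-1, size >= 40 mm
    (1.57, 0.63, 3.94),  # Swedish Two-County, size >= 50 mm
]

pooled_rr, pooled_low, pooled_high = pool_fixed_effect(table_9_1)
print(f"Pooled RR: {pooled_rr:.2f} (95% CI, {pooled_low:.2f} to {pooled_high:.2f})")
```

Trials with narrower confidence intervals receive more weight, so the larger trials dominate the summary estimate. A random-effects model would widen the pooled interval when the trials disagree.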
The relationship of mammography screening and advanced breast cancer outcomes was also evaluated in analyses of data from the Breast Cancer Surveillance Consortium (BCSC), a collaborative network of mammography registries across the United States, supported by the National Cancer Institute (NCI). Registries were linked to pathology databases and tumor registries and included demographic and medical history information from questionnaires.
Since all of the women included in the BCSC database have had previous mammography, studies compared 1-year versus 2-year screening intervals and their relationships with cancer stage at diagnosis ( Table 9.2 ). These intervals represent the time between the two most recent screening mammograms prior to diagnosis. Information regarding influences related to selection of specific screening intervals is not available.
|Author, Year||Study Design||Population; Age, Years; Participants, n||Study Years; Comparison||Outcome Measures||Results|
|Buseman et al., 2003||Case series||US, Kaiser Permanente; 42–49 years; 247||1994 to 2000; screened vs unscreened||Stage II−IV; III−IV|
|Goel et al., 2007||Case series||US, Vermont Breast Cancer Surveillance System; >40 years; 1944||1994 to 2002; 1-year vs 2-year screening intervals||Either Stage IIB+; size >20 mm; >1 positive node||21% vs 24%, p = 0.262; no statistically significant differences by age|
|Hubbard et al., 2011||Case series||US, BCSC data, multisite; 40–59 years; 4492||1996 to 2006; 1-year vs 2-year screening intervals||Stage IIB+||Adjusted proportion of cancer stage for 2-year vs 1-year intervals|
|Kerlikowske et al., 2013||Case series||US, BCSC data, multisite; 40–74 years; 11,474||1996 to 2008; 1-year vs 2-year vs 3-year screening intervals||Stage IIB−IV||Adjusted OR (95% CI) for 2-year vs 1-year intervals|
|Miglioretti et al., 2015||Case series||US, BCSC data, multisite; 40–49 years; 15,440||1996 to 2012; 1-year vs 2-year screening intervals||Stage IIB−IV; size >15 mm; >1 positive node||Adjusted OR (95% CI) for 2-year vs 1-year intervals, age 40 to 49|