Estimates of Screening Benefit: The Randomized Trials of Breast Cancer Screening

Chapter Outline
Plain Language Summary
Introduction
Breast Cancer Screening Trials
- Design and Methods
- Limitations
- Individual Trials
Breast Cancer Mortality Outcomes
- Relative Risk Reduction
- Absolute Risk Reduction
All-Cause Mortality
Effects of Screening Intervals on Mortality
Effects of Imaging Modality on Mortality
Incidence of Advanced Breast Cancer
Treatment Related Morbidity
Conclusions
Acknowledgments
Glossary
List of Acronyms and Abbreviations
References

Plain Language Summary

The benefits of mammography screening have been studied in nine randomized controlled trials comparing women screened with those not screened. Randomized controlled trials are regarded as the strongest type of study to determine whether screening is effective because they provide the most reliable results about the differences in outcomes between screening and nonscreening groups. It is important to appreciate the differences between the trials and their limitations in order to understand the results.

The trials began between 1965 and 1991, and involved over 600,000 women in the United States, Canada, United Kingdom, and Sweden. The trials varied in their recruitment of women, the numbers of women enrolled, and methods of assigning screening. Some trials enrolled only women in their 40s or 50s, while others included various broader age groups ranging from 39 to 74. Screening intervals ranged from every 12 months in some trials to as long as every 33 months in others. The lengths of screening varied from 4 to 10 years, and follow-up ranged from 11 to 25 years. The trials also differed in the ways they counted breast cancer cases and analyzed their results.

Several limitations of the trials have been recognized. An important concern has been whether the screening and control groups are truly similar at the start of the trials, as well as throughout their duration. Also, the relevance of the screening trials to current populations and practice is likely to have diminished over time. All of the trials were conducted in the past, when imaging technologies and breast cancer therapies were markedly different from those used today. Whether results from trials using these outdated practices would be the same if they were conducted today is questionable. In addition, as with all research studies, the effects of screening for women in trials may differ from those for women in the general population. Research has found that women who enrolled in some of the breast cancer screening trials were more educated and had higher risks of breast cancer compared with the general population.

When the results of the trials were combined using statistical methods, summary estimates indicated that death from breast cancer was lower for women undergoing screening compared with those who were not screened, but results varied by age. Results for women in their 40s and 50s were of borderline statistical significance and varied by how cases were counted in the trials. Trials did not enroll enough women age 70 and older to provide reliable results for older women. Assuming that differences actually exist, the estimates can be expressed as the number of breast cancer deaths prevented per 10,000 women screened for 10 years: 3 deaths for women age 40–49; 5–8 deaths for age 50–59; and 12–21 deaths for age 60–69. The trials also reported that age-specific deaths from all causes, not just deaths from breast cancer, were not reduced with screening. Cases of advanced breast cancer were reduced for women age 50 and older, but not for younger women.
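The same estimates can be read in the other direction, as the number of women who would need to be screened for 10 years to prevent one breast cancer death. The Python sketch below is illustrative only. It takes the figures quoted above at face value (which, as noted, assumes that the differences are real), and the function name `number_needed_to_screen` and the dictionary of estimates are labels introduced here, not part of the trials' own analyses.

```python
# Deaths prevented per 10,000 women screened for 10 years, as quoted in the text.
# Each entry is a (low, high) range; a single estimate is repeated.
DEATHS_PREVENTED_PER_10000 = {
    "40-49": (3, 3),
    "50-59": (5, 8),
    "60-69": (12, 21),
}


def number_needed_to_screen(deaths_prevented, women_screened=10_000):
    """Women screened for 10 years per breast cancer death prevented."""
    if deaths_prevented <= 0:
        raise ValueError("deaths_prevented must be positive")
    return women_screened / deaths_prevented


for age_group, (low, high) in DEATHS_PREVENTED_PER_10000.items():
    # More deaths prevented means fewer women per death prevented,
    # so the high estimate gives the smaller number.
    fewest = round(number_needed_to_screen(high))
    most = round(number_needed_to_screen(low))
    print(f"Age {age_group}: about {fewest} to {most} women screened per death prevented")
```

On these figures, roughly 3,333 women age 40–49 would need to be screened for 10 years to prevent one breast cancer death, compared with roughly 476 to 833 women age 60–69.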

Although the screening trials provide the strongest type of research to determine whether mammography screening is beneficial, the trials are outdated and may not represent current practices. Results indicate only small reductions in breast cancer deaths among women who were screened. These reductions vary by age, with the largest effects (the most benefit) among women age 60–69.

Introduction

The benefits of mammography screening have been studied in nine main randomized controlled trials (RCTs) beginning between 1965 and 1991, and involving over 600,000 women in the United States, Canada, United Kingdom, and Sweden. Trials provide estimates of the effectiveness of routine mammography screening by comparing breast cancer-specific mortality between women randomized to screening versus no screening. Additional outcomes related to benefit include reductions in all-cause mortality and in the incidence of advanced breast cancer.

Age is an important factor in the design, analysis, and interpretation of the screening trials because the risk for breast cancer, types of breast cancer, and performance of screening technologies differ by age. Accordingly, guideline development groups depend on age-specific evidence to provide screening recommendations. To address these issues, this chapter presents results of the trials according to the age of participants when available.

RCTs are regarded as the strongest study design to determine effectiveness, because a well-designed RCT is less susceptible to bias than other study designs. Bias refers to any effect of the design or conduct of a study that systematically favors one comparison group over others. Studies with greater risks of bias are more likely to yield incorrect results. RCTs provide direct comparisons between screened and unscreened groups, while studies of women who are screened but not compared with unscreened women are unable to account for other factors that may explain outcomes.

Randomization is an important aspect of an RCT’s design and refers to how participants are allocated to screening and control groups. A truly randomized method of allocation means that assignment to the screening or control group cannot be predicted by participants, researchers, or anyone else. Randomization largely eliminates the problem of selection bias and associated confounding. With successful randomization and a large enough sample size, important confounders are equally distributed among comparison groups, including confounders that are unrecognized or not measured. Ideally, in this way, the comparison groups are similar except for whether they are screened or not, and differences in outcomes can be correctly attributed to screening.
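The balancing effect of randomization and sample size can be illustrated with a small simulation. The Python sketch below is a toy model under stated assumptions, not a reconstruction of any trial's allocation method: it supposes a hypothetical unmeasured risk factor carried by 20% of women, allocates each woman to screening or control by a simulated coin flip, and reports how often the factor appears in each group. The function name `simulate_allocation` and all parameter values are invented for illustration.

```python
import random


def simulate_allocation(n_women, high_risk_share=0.2, seed=0):
    """Randomly allocate women to two groups; return the high-risk share in each."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    counts = {"screening": [0, 0], "control": [0, 0]}  # [women, high-risk women]
    for _ in range(n_women):
        # Hypothetical confounder that the researchers never measure.
        high_risk = rng.random() < high_risk_share
        # Allocation ignores the confounder entirely (simple 1:1 randomization).
        group = "screening" if rng.random() < 0.5 else "control"
        counts[group][0] += 1
        counts[group][1] += high_risk
    return {g: (hr / n if n else 0.0) for g, (n, hr) in counts.items()}


for n in (100, 10_000, 200_000):
    shares = simulate_allocation(n)
    gap = abs(shares["screening"] - shares["control"])
    print(f"n={n:>7,}: screening {shares['screening']:.3f}, "
          f"control {shares['control']:.3f}, gap {gap:.3f}")
```

In runs like this, the difference between groups tends to shrink as enrollment grows, even though the allocation never uses the risk factor. Simple coin-flip allocation is used here for clarity; the trials described in this chapter used other methods, including stratified and cluster randomization.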

In general, the strength of evidence is determined by judging how well individual studies were designed and executed to reduce bias, as well as their place on a study design-based evidence hierarchy (Fig. 2.1). Results of appropriately performed RCTs are generally more reliable than results of nonrandomized or observational studies, and they are placed at the highest level of the evidence hierarchy. However, the evidence hierarchy assumes that RCTs are well conducted. A poor-quality RCT could yield results that are as misleading as, or more misleading than, those from studies lower in the evidence hierarchy. That is why it is important to critically evaluate and understand the quality of each trial and its contribution to the overall strength of evidence in order to determine the effectiveness of breast cancer screening.

Breast Cancer Screening Trials

The nine main RCTs of breast cancer screening include the Health Insurance Plan of Greater New York (HIP) trial, Edinburgh trial, Canadian National Breast Screening Study 1 (CNBSS-1), Canadian National Breast Screening Study 2 (CNBSS-2), United Kingdom Age trial, Stockholm trial, Malmö Mammographic Screening Trial (referred to separately as MMST I and MMST II), Gothenburg trial, and Swedish Two-County Study (referred to separately as Östergötland and Kopparberg). For some trials, results are combined in some reports (eg, Canadian) and provided separately in others (eg, CNBSS-1, CNBSS-2), which leads to different ways of counting the number of trials. The descriptions of trials and meta-analyses in this chapter are based on a systematic review of the most recent trial results. In these analyses, the number of trials was counted as the number of discrete data sources contributing to each summary estimate.

Design and Methods

All trials were designed as RCTs, although they varied in their recruitment of participants and controls, sizes, methods of randomization, and screening protocols (Tables 2.1 and 2.2). Some trials enrolled only women in their 40s (CNBSS-1 and Age) or 50s (CNBSS-2), while others included various broader age ranges. The two Canadian trials recruited volunteers, the Age trial identified women from general practice lists, and the HIP trial recruited women enrolled in a health insurance plan. The other trials enrolled participants based on their residence in communities.

Table 2.1

Breast Cancer Screening Trials

Adapted from Ref. .

Trial	Age, y	Year Trial Began	Screening, n; Control, n^a	Population	Comparison Groups	Method of Randomization
Health Insurance Plan of Greater New York (HIP)	40–64	1965	30,239; 30,765	New York health plan members	Mammography + clinical breast examination vs usual care	Individual based on stratification by age and family size and drawn from a list
Canadian National Breast Screening Studies (CNBSS-1 & CNBSS-2)	CNBSS-1: 40–49; CNBSS-2: 50–59	1980	CNBSS-1: 25,214; 25,216; CNBSS-2: 19,711; 19,694	Self-selected from 15 centers in Canada	Mammography + clinical breast examination vs usual care^b	Block stratified by center and 5-year age group
Edinburgh	45–64	1978	28,628; 26,015	All women from 87 general practices in Edinburgh, Scotland	Mammography + clinical breast examination vs usual care	Cluster based on general practitioner practices
Malmö Mammographic Screening Trial (MMST I & II)	43–69	1976–78	MMST I: 21,088; 21,195; MMST II: 9,581; 8,212	All women born between 1908 and 1945 living in Malmö, Sweden	Mammography vs usual care	Individual within birth year
Stockholm	40–64	1981	40,318; 19,943	Residents of Stockholm, Sweden	Mammography vs usual care	Individual by day of month
Swedish Two-County	40–70	1977	77,080; 55,985	Women from Östergötland and Kopparberg counties in Sweden	Mammography vs usual care	Cluster based on demographically homogeneous geographic units
Gothenburg	39–59	1982	21,650; 29,961	All women born between 1923 and 1944 living in Gothenburg, Sweden	Mammography vs usual care	Cluster based on day of birth for women born 1923–35 (18%); individual for women born 1936–44 (82%)
Age	39–41	1991	53,884; 106,956	Women from 23 National Health Service breast screening units in England, Scotland, and Wales	Mammography vs usual care	Individual stratified by general practitioner group^c

a Numbers of participants in screening and control groups vary by publication.

b All women were prescreened with clinical breast examinations and instructed in breast self-examination. For women 50–59, usual care involved annual clinical breast examinations.

c Used random number generation between 1991 and 1992, then Health Authority computer system.
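As a quick arithmetic check, the enrollment counts in Table 2.1 can be tallied to confirm that they are consistent with the figure of over 600,000 women cited in the text. The Python sketch below simply transcribes the screening and control counts from the table. Because these numbers vary by publication (footnote a), the totals should be read as approximate, and the variable names are illustrative.

```python
# (screening, control) counts transcribed from Table 2.1.
# Footnote a: these numbers vary by publication, so totals are approximate.
ENROLLMENT = {
    "HIP": (30_239, 30_765),
    "CNBSS-1": (25_214, 25_216),
    "CNBSS-2": (19_711, 19_694),
    "Edinburgh": (28_628, 26_015),
    "MMST I": (21_088, 21_195),
    "MMST II": (9_581, 8_212),
    "Stockholm": (40_318, 19_943),
    "Swedish Two-County": (77_080, 55_985),
    "Gothenburg": (21_650, 29_961),
    "Age": (53_884, 106_956),
}

total_screening = sum(s for s, _ in ENROLLMENT.values())
total_control = sum(c for _, c in ENROLLMENT.values())
total_women = total_screening + total_control

print(f"Screening: {total_screening:,}")
print(f"Control:   {total_control:,}")
print(f"Total:     {total_women:,}")

# Allocation ratios show which trials did not use 1:1 allocation.
for trial, (s, c) in ENROLLMENT.items():
    print(f"{trial}: screening-to-control ratio {s / c:.2f}")
```

The tally gives 327,393 women in screening groups and 343,942 in control groups, or 671,335 in total. The ratios also make the unequal allocation in some trials visible: about 2:1 in Stockholm and about 1:2 in the Age trial.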

Table 2.2

Protocols for Breast Cancer Screening Trials

Adapted from Ref. .

Trial	Screening Interval, months	Rounds, n	Views, n	Adherence, %	Screening Duration, y	Controls Screened	Longest Follow-Up, y
Health Insurance Plan of Greater New York (HIP)	12	4	2	46	4	After trial completed	18
Canadian National Breast Screening Studies (CNBSS-1 & CNBSS-2)	12	4–5	2	85	4.5	At age ≥50 after trials completed	25
Edinburgh	24	2–4, varied by cohort	1–2	61	2–8, varied by cohort	After 6–10 years, varied by cohort	10–14, varied by cohort
Malmö Mammographic Screening Trial (MMST I & II)	18–24	9	1–2	70	10+	After 14 years	11–13; 15.5
Stockholm	24–28	2	1	81	4.8	After 5 years	11.4
Swedish Two-County	24–33	3	1	84	7	After 7 years	20; 15.5
Gothenburg	18	5	1–2	75	9	After 5 years	12
Age	12	4–6, varied by center	2	57	9	At ages 50–52	17