Study Design

Stephen J. Gange

Elizabeth T. Golub

INTRODUCTION

Epidemiologic methods constitute the scientific frameworks, concepts, and tools that are used for epidemiological evaluations. This chapter presents an overview of these methods in the context of evaluating the epidemiology of infectious diseases. As an organizational framework, it is useful to conceptualize epidemiologic methods as addressing key questions (Figure 3-1):

Who is to be studied? Epidemiologists study diseases in populations of individuals. While studies of entire populations may be of interest, typically only a sample of individuals contributes to an epidemiologic evaluation, selected according to specific study designs.
Which data will be collected? Observing the occurrence of diseases and their determinants requires measurement for purposes of constructing measures of disease occurrence.
Which inferences will be made from the analysis? Epidemiology, as the study of the distribution, determinants, and control of disease, is predicated on the fact that disease occurrence in humans does not occur at random. Epidemiologic evaluations play an important descriptive role in characterizing diseases among populations. When evaluating the determinants of diseases, epidemiologic evaluations typically require comparisons of populations for making causal inferences.

Figure 3-1 A Framework for Epidemiologic Methods

In infectious disease epidemiology, an intricate network of causal determinants influences the susceptibility to and development of disease. A useful framework for organizing these important determinants is the epidemiologic triangle (Figure 3-2), a diagram that emphasizes the interrelationship between three components:

Host. Human hosts differ in susceptibility to infections because of genetic, environmental, behavioral, and other characteristics. Major epidemic diseases, such as malaria, tuberculosis, smallpox, and plague, have led to selective genetic changes in human populations. For example, the evolution of several genetic mutations among Africans and Asians has resulted primarily from the selective pressure of hyperendemic malaria. Sickle hemoglobin, glucose-6-phosphate dehydrogenase deficiency, thalassemia, hemoglobin C, and hemoglobin E may be disadvantageous in homozygous individuals, but these traits have evolved in certain populations because they confer significant protection from malaria in heterozygous individuals.¹
Agent. The agent constitutes the infecting pathogen (e.g., virus, bacterium, parasite, or fungus). Agents have certain characteristics that influence their infectivity. For example, one important characteristic of a pathogen is its escape mechanism, such as the evolution of resistance to antibiotics and antiviral therapies. This capability is recognized as a growing threat to public health and was predicted as early as 1945 by the discoverer of penicillin, Sir Alexander Fleming, who said, “The greatest possibility of evil in self-medication is the use of too-small doses, so that, instead of clearing up the infection, the microbes are educated to resist penicillin and a host of penicillin-fast organisms is bred out which can be passed on to other individuals.”²
Environment. The environment constitutes the setting in which transmission occurs. It is important to understand and characterize this setting and to be aware of environmental factors that may facilitate the agent’s survival or infectivity. It is not difficult to envision the role of environment for some agents, such as hookworm, where soil humidity, temperature, and other soil characteristics can influence the development of infectious Ancylostoma duodenale larvae. However, the environment is also important in the transmission of airborne viruses, such as influenza and varicella, because it affects the length of time that the viral particles remain infective as an aerosol. The winter environment in temperate climates also facilitates transmission of influenza by bringing people indoors. Conversely, influenza epidemics have been interrupted by extreme cold weather that has forced schools to close, thereby interrupting transmission among children and introduction of the virus into the home.³

Figure 3-2 Epidemiologic Triangle (agent/host/environment)

By understanding these determinants and their interrelationship, we can identify and implement interventions for disease prevention and treatment. While the prescription and use of individual medications (e.g., antibiotics, antivirals, antifungals) may be the natural “intervention” that comes to mind when thinking about infectious diseases, it is important to recognize that “interventions” actually include the vast breadth of public health measures, policies, and guidance that affect large populations (e.g., vaccination programs, chlorination of the water supply, social marketing).

POPULATIONS

Epidemiologists take a multifaceted approach to defining populations. First, populations are generally identified in terms of three basic characteristics: person, place, and time. Second, it is rarely possible to study the entire population about which we want to make inferences for a particular disease (the “target population”). Instead, we must identify a source from which study participants can be identified and then define a study population of those persons who will ultimately be included in the study. Finally, we must consider multiple factors that influence study populations, such as eligibility criteria, feasibility of enrollment, and the refusal rate among those invited to participate.

How Epidemiologists Describe Populations: Person, Place, and Time

The central focus of studying diseases in populations reflects an assumption that individual persons can be aggregated by some common characteristics. The population chosen for study ultimately depends on the purpose of the investigation. We usually describe populations in terms of factors that are well known to influence disease risk. Epidemiologists generally classify such factors as those that are related to person, place, and time.

Attributes of person include individual-level characteristics believed to influence disease. These might include demographic characteristics (e.g., age, sex, race/ethnicity), socioeconomic characteristics (e.g., education, income), or biologic factors (e.g., genetics).

The description of place spans different geographical characteristics, which can be as broad as a continent, more mid-level such as a country or city, or as specific as a neighborhood. It might also include even more specific attributes, such as place of employment, patients in a certain clinic or hospital ward, or distance from a certain environmental site. The characteristics of place add to those of person in providing a specific description of our population; for example, we might describe our population as women ages 18-59 living in Baltimore, Maryland.

It is clear that defining a population using these two criteria alone may be insufficient. Continuing the preceding example, are we studying women in that age group who ever lived in Baltimore? To provide more specificity, we can include in our definition an aspect of time. Time, like the attributes of person and place, can be thought to influence disease. For example, it is reasonable to assume that a 25-year-old woman in Baltimore in 1882 had a much different disease risk profile than a woman of the same age living in Baltimore today. Likewise, time might refer to a scale other than calendar time (for example, we might characterize individuals in terms of their life course), although most commonly we choose calendar time as our parameter of time when describing a population.

Types of Populations: Target, Source, and Study Populations

Epidemiologists conceptualize several different types of populations (Figure 3-3). A target population comprises those individuals about whom we want to make inferences based on the results of our study. Identifying a target population is often subjective because the group to which we want to generalize may be a conceptual construct and not a group of individuals who can be specifically enumerated. Ideally, this population is the one most relevant to the research question being investigated, in terms of its person, place, and time characteristics.

The target population serves as the background for the source population. A source population is a subset of the target population that can be enumerated and further studied. For example, in a study of the prevalence of sexually transmitted infections (STIs), we may wish to make inferences from the findings to a broad community and, therefore, identify as our target population sexually active men and women ages 18-49 living in the United States during 2005-2010. Of course, it would not be feasible to enumerate, or enroll into a study, all individuals meeting those criteria. Instead, we might identify a source population from which we can enroll study participants. One example of a source population for this study might be individuals attending specific STI clinics in Chicago or Baltimore during a certain period of time. There are, of course, alternative source populations that could be chosen to conduct a study whose results would be relevant to the same target population.

Figure 3-3 Populations Moving Through Time

A study population comprises those individuals in the source population who contribute data to an epidemiologic investigation. In some settings, the study population might be equal to an entire source population (or even to an entire target population). Whether a member of the source population ultimately becomes a study participant is influenced by a number of factors. Eligibility criteria are those characteristics that are necessary for an individual to be considered for enrollment, and may include attributes of person, place, and/or time. It may also not be feasible to study the entire source population (due to cost or logistics). Finally, some individuals in the source population may decline the invitation to participate.
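The narrowing from source population to study population described above can be sketched in a few lines of code. This is only an illustration under stated assumptions: the roster, the field names (`age`, `city`, `visit_year`, `consented`), the eligibility thresholds, and the enrollment cap are all hypothetical and are not drawn from any actual study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    pid: int
    age: int          # attribute of person
    city: str         # attribute of place
    visit_year: int   # attribute of time
    consented: bool   # whether the person agrees to participate when invited


def is_eligible(p, min_age=18, max_age=49,
                cities=("Chicago", "Baltimore"), years=(2005, 2010)):
    """Hypothetical eligibility criteria expressed as person, place, and time."""
    return (min_age <= p.age <= max_age
            and p.city in cities
            and years[0] <= p.visit_year <= years[1])


def select_study_population(source, max_invited):
    """Apply eligibility, then a feasibility cap, then consent."""
    eligible = [p for p in source if is_eligible(p)]
    invited = eligible[:max_invited]            # cost/logistics limit who is approached
    return [p for p in invited if p.consented]  # some of those invited decline


# Hypothetical source population (e.g., attendees of specific clinics).
source_population = [
    Person(1, 25, "Baltimore", 2006, True),
    Person(2, 52, "Chicago", 2007, True),     # outside the age range
    Person(3, 30, "Chicago", 2008, False),    # eligible but declines
    Person(4, 19, "Boston", 2009, True),      # outside the places studied
    Person(5, 40, "Baltimore", 2004, True),   # outside the time window
    Person(6, 33, "Chicago", 2010, True),
    Person(7, 22, "Baltimore", 2009, True),   # eligible, but beyond the cap
]

study_population = select_study_population(source_population, max_invited=3)
print([p.pid for p in study_population])
```

Each filter corresponds to one of the influences named in the text: eligibility criteria, feasibility, and refusal. Changing any of them changes who contributes data, which is why selection into the study population matters when interpreting results.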

In summary, identifying the characteristics of these different populations is an essential component of epidemiology. In the next section, we outline various descriptive and analytical designs that are used in epidemiology as applied to infectious diseases.

EPIDEMIOLOGIC STUDY DESIGNS

Epidemiologic studies of infectious diseases aim to evaluate the contributions of various factors in the transmission and acquisition of infectious pathogens, as well as those factors favoring endemic transmission and epidemics. The design of such studies must optimize the researcher’s ability to measure and evaluate the relationships between exposures and the occurrence of disease in the study population.

Studies of infectious disease can be designed to explore landmarks along the entire temporal process during which an individual is at risk, acquires infection, develops an infectious disease, or succumbs to it. The duration of this process can be short, such as with highly virulent infections (e.g., Ebola virus), or it can be very long, as with chronic infectious diseases such as HIV/AIDS. Epidemiologists strive to understand the population-level burden of disease, including the reasons for increased susceptibility of one population relative to another, the factors that affect the susceptibility of particular individuals in a population, and the factors leading to epidemics.

Several study designs are used to address research questions regarding the risk factors for, and burden of, disease in human populations. For example, descriptive designs are typically not initiated to make comparisons across populations but rather provide an opportunity to describe important characteristics of individuals with disease. Such descriptive designs include case reports, case series, and ecological and surveillance studies. In contrast, analytical designs are initiated to draw particular conclusions regarding the association between exposures and outcomes; they include cohort studies, case-control and other nested studies, and randomized clinical trials. Meta-analyses and systematic reviews, wherein either primary or published data from individual studies are systematically combined to investigate a research question, are also being conducted with increasing frequency. The optimal study design is a function of the hypothesis under investigation. In this section, we review several important and frequently used epidemiologic study designs and illustrate their use in evaluating infectious diseases.

Descriptive Study Designs

When a new disease is recognized, it may be of interest to describe the nature of the disease and to evaluate the probable means of transmission, reservoir, and natural history. Sometimes a new disease can be quickly linked to a specific organism, such as staphylococcal toxic shock syndrome. More often, however, epidemiologic studies contribute to the discovery and characterization of new pathogens, as with hantavirus pulmonary syndrome, Legionnaires’ disease, and AIDS.

Early studies may consist of descriptions of cases that may be linked by a route of transmission or common exposure. Descriptive studies do not typically make inferential comparisons of cases to individuals without disease (controls); rather, they only describe aspects of the disease and circumstances surrounding the acquisition and occurrence of disease. Surveillance methods capture cases of disease and are an excellent source for identifying individuals for further follow-up. At times, case reports or case series provide considerable insight into the epidemiology of an infectious disease.

Case Reports

Case reports involve a careful evaluation of a single case of disease, and may describe the transmission, natural clinical history, and/or response to treatment. Although case reports are based on an infection in a single patient, they may yield important new epidemiologic information regarding the disease. Examples of illustrative case reports follow.

Rabies

Rabies is a zoonotic viral infection that is spread to humans by contact with body fluids (most commonly saliva) from an infected animal. Prior to rabies vaccination of domestic animals, most transmissions of this disease in the United States were associated with domestic animal bites.⁴ Rabies transmission was initially believed to require direct inoculation via a bite or other invasive contact with the infected animal. Infection is initially confined to the site of exposure without systemic viremia; hence the rabies vaccine can be given after exposure to prevent infection of the central nervous system (CNS). Such postexposure prophylaxis is usually successful. Moreover, it was universally believed that, in the absence of such treatment, rabies was fatal once the virus infected the CNS and signs and symptoms of CNS infection occurred. Early case reports of rabies overturned these long-held beliefs about the means of transmission of the virus and its natural history.

Aerosol transmission of rabies was described in a cave explorer, a spelunker, who developed rabies after exploring a cave inhabited by large numbers of bats in Frio, Texas.⁵ In this case, there was no history of a bite. This case was followed by a series of experiments in which animals were placed in the cave and protected from bites and even insect transmission but were exposed to the air in the infected cave. After several animals developed rabies during this exposure, the classic concepts of rabies transmission were challenged.⁵ This hypothesis was confirmed in additional laboratory studies that showed rodents could be infected by aerosol inoculation.⁶, ⁷ This route of infection was further explored by a review of case reports of rabies in the United States between 1980 and 1996, which found that, in the majority of human cases, there was no clear documented evidence of a bite; however, the reports also suggested that an unreported or undetected bite remained the most plausible hypothesis regarding route of transmission.⁸ The control of rabies in domestic animals in the United States has resulted in fewer human cases, with a higher proportion of cases being attributed to wild animal exposures. These exposures are not as readily recognized as rabies risks, so preventive vaccination and postexposure prophylaxis may not be initiated. Reports of individual cases continue to provide information that contributes to our general understanding of the transmission of rabies.

The uniform fatality of rabies has also been challenged. In October 1970, a 6-year-old boy was bitten by a rabid bat. He was given 14 doses of duck embryo rabies vaccine but developed rabies 21 days later. He eventually recovered completely after treatment in an intensive care unit for nearly 2 months.⁹

Figure 3-4 Temporal trends in the diagnosis of rabies in the United States, 1994 to 2002. From the New England Journal of Medicine, Rupprecht, C.E., Gibbons, RV Prophylaxis against Rabies, Vol. 351; 25 pp. 2627, Figure 1. Copyright © 2004 Massachusetts Medical Society. Reprinted with permission from Massachusetts Medical Society.

A second case of survival from clinical rabies was reported in October 2004 in a previously healthy 15-year-old Wisconsin girl who was bitten on the left finger after handling a bat.¹⁰, ¹¹ Three weeks later she complained of fatigue, double vision, vomiting, and tingling in her left arm. Within 3 days she developed diplopia, and subsequently slurred speech and an unsteady gait. On the sixth day of her illness, the diagnosis of rabies was considered when the history of a bat bite was obtained. The patient was transferred to a tertiary care hospital and treated aggressively with a series of medications that included ketamine, midazolam, ribavirin, and amantadine. Ketamine had been shown in laboratory studies to inhibit rabies viral transcription.¹² The use of gamma-aminobutyric acid (GABA) receptor agonists with benzodiazepines and barbiturates was intended to reduce excitotoxicity, brain metabolism, and autonomic reactivity. Clinical reports of rabies cases had suggested that death resulted from the secondary complications of infections, primarily “neurotransmitter imbalance” and autonomic failure, rather than from direct cytolysis of the rabies virus. The patient survived and regained most of her cognitive function and has, so far, been able to live without major impairments.¹³ This was the first case of human rabies reported to have survived without the use of rabies vaccine or rabies immunoglobulin; since then, the same aggressive treatment protocol has been used on additional cases and is credited with saving the life of an 8-year-old girl in Cali, Colombia. A third patient, a 15-year-old boy in Brazil, was also successfully treated with this protocol after receiving postexposure prophylaxis.¹⁴

Table 3-1 Sources of Human Exposure to Rabies in the United States

Values are number of cases (percent).

Year	Domestic Animal*	Wildlife	Other Sources†	Unknown‡	Total No. of Cases
1946-1955	86 (72)	8 (7)	0	26 (22)	120
1956-1965	21 (55)	7 (18)	0	10 (26)	38
1966-1975	6 (38)	7 (44)	1 (6)	2 (12)	16
1976-1985	6 (30)	1 (5)	2 (10)	11 (55)	20
1986-1995	2 (12)	1 (6)	0	14 (82)	17
1996-2003	4 (19)	2 (10)	0	15 (71)	21

*After 1979, there were no cases involving documented exposure to a domestic animal known to be rabid or probably rabid. Thereafter, all cases originated in countries where canine rabies was endemic.
†Other sources of exposure include laboratory aerosol (in 1972 and 1977) and corneal transplantation (in 1978).
‡If a definitive source of exposure was not identified in the patient’s history, the source of exposure was considered to be unknown, regardless of the source suspected on the basis of antigenic or genetic characterization.

From the New England Journal of Medicine, Rupprecht, C.E., Gibbons, RV Prophylaxis against Rabies, Vol. 351; 25 pp. 2627, Figure 1. Copyright © 2004 Massachusetts Medical Society. Reprinted with permission from Massachusetts Medical Society.
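The percentages in Table 3-1 are row percentages: each count divided by the total number of cases in that period. As a rough check, the following sketch recomputes them from the counts transcribed from the table; the variable and function names are illustrative choices, not part of the original source.

```python
# Counts transcribed from Table 3-1, in column order:
# (domestic animal, wildlife, other sources, unknown)
TABLE_3_1 = {
    "1946-1955": (86, 8, 0, 26),
    "1956-1965": (21, 7, 0, 10),
    "1966-1975": (6, 7, 1, 2),
    "1976-1985": (6, 1, 2, 11),
    "1986-1995": (2, 1, 0, 14),
    "1996-2003": (4, 2, 0, 15),
}


def row_percentages(counts):
    """Return (row total, whole-number percentage of the total for each count)."""
    total = sum(counts)
    # Note: Python's round() sends exact halves to the nearest even integer,
    # which may differ from the rounding rule used in a published table.
    return total, [round(100 * c / total) for c in counts]


for period, counts in TABLE_3_1.items():
    total, pct = row_percentages(counts)
    print(f"{period}: total={total}, percent={pct}")
```

Read down the last percentage in each row, the share of cases with an unknown source rises from about one fifth in 1946-1955 to more than two thirds after 1986, which is the pattern the surrounding text describes as exposures becoming harder to recognize.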

Although rabies still has one of the highest case-fatality rates of any infectious disease, the successful treatment of these patients provides important insight into the pathophysiology of human rabies and offers promise for advances in treatment.

HIV/AIDS: Cure of HIV

HIV is unique among infectious diseases in that clearance of disease was not believed to ever occur. Bryson and her colleagues from the University of California at Los Angeles reported the case of an infant who was born after 36 weeks’ gestation to an asymptomatic HIV-infected woman.¹⁵ The pregnancy was uncomplicated, and the mother had a CD4+ T-cell count of more than 1000 cells/mm³ at the time of delivery. At birth, the infant required hospitalization for 8 days because of mild respiratory distress syndrome. Laboratory studies on the newborn found a negative culture of cord blood for HIV. However, the infant’s blood culture was positive at 19 and 51 days of age, and the PCR was positive at 33 days of life. Subsequently, HIV antibodies disappeared by 12 months of age. Multiple cultures of peripheral blood lymphocytes and plasma for HIV were negative between 3 months and 5 years of age. The child was asymptomatic and had no laboratory evidence of HIV infection at 5 years of age. The authors believed that the infant was infected but cleared the HIV infection by immunologic or other mechanisms. This case report was followed up by a search for similar cases of spontaneous resolution of perinatal HIV infection in infants by other investigators; to date, no such cases have been reported.

More recently, physicians reported a case of an HIV-infected man who may have been cured with allogeneic stem-cell transplantation. The biological mechanism by which HIV binds to cells during infection has been well established: attachment occurs with a primary receptor (CD4) and with a co-receptor (either CCR5 or CXCR4). Epidemiologic studies played a critical role in identifying the importance of these co-receptors by showing that individuals homozygous for a 32-base-pair deletion in the CCR5 gene (Δ32/Δ32) were at greatly reduced risk for HIV infection.¹⁶ In this case, an HIV-infected man who was undergoing allogeneic stem-cell transplantation to treat acute myeloid leukemia was specifically given cells from a matched donor who was also homozygous for the Δ32 mutation. The initial report in 2009 documented a lack of detectable virus through 20 months.¹⁷ A 2011 follow-up study documented continued absence of evidence of viral infection and recovery of the patient’s immune system through 3.5 years.¹⁸ These case reports have helped make the search for a cure for HIV a top research priority.

Case Series

A second type of descriptive epidemiologic study is the case series. In this kind of study, data from a cluster or series of cases are reported. No comparison is made with controls; instead, the case series may be reported in sufficient epidemiologic detail that it is possible to infer the means of transmission and the risk factors for infection. A case series of AIDS patients, which was reported early in the epidemic and prior to the identification of HIV, is described next.

AIDS Cluster

A cluster of homosexual men with Kaposi’s sarcoma (KS) and/or Pneumocystis carinii pneumonia (PCP) was reported in 1984, prior to the identification of HIV.¹⁹ The investigators enumerated the sexual contacts of the first 19 homosexual male AIDS patients reported from Southern California. One man reported sexual contact with 12 of the AIDS patients within 5 years of the onset of their symptoms. Four of the patients from Southern California had contact with a non-California AIDS patient, who was also the sex partner of four AIDS patients from New York City. Ultimately, 40 AIDS patients in 10 cities were linked by sexual contact in this extensive sexual network (Figure 3-5). At the epicenter of this cluster was “patient zero,” who estimated that he had had approximately 250 different male sexual partners each year from 1979 through 1981 and was able to name 72 of his 750 partners during this 3-year period; 8 of these partners had developed AIDS. The sexual network linking these patients with the new disease (AIDS) was remarkably similar to the networks of patients with syphilis that were described four decades earlier. This remarkable study led the investigators to conclude that AIDS was caused by a sexually transmitted agent.

Figure 3-5 Sexual contacts among homosexual men with AIDS. Each circle represents an AIDS patient. Lines connecting the circles represent sexual exposures. Indicated city or state is place of residence of a patient at the time of diagnosis. “0” indicates Patient 0. Reprinted from the American Journal of Medicine, Vol. 76, D. Auerbach et al., Cluster of Cases of the Acquired Immune Deficiency Syndrome: Patients Linked by Sexual Contact, pp. 487-492, © 1984, with permission from Excerpta Medica Inc.

Ecologic Studies

Ecologic studies utilize populations with different levels of exposure and examine the correlation of exposure levels with population-level disease frequency. In a typical ecologic study, data are not available at the individual level to determine whether those individuals who are truly exposed have a higher (or lower) occurrence of disease; the researcher simply knows that in the population with greater exposure, there is more (or less) disease.

Ecologic studies may be useful in exploring hypothesized associations by comparing disease frequencies among populations from different geographic regions or from different time periods. Population-level data may be available from national or community-wide surveys of exposure frequencies and disease rates, which can often be obtained inexpensively. Ecologic studies also allow for comparisons where the range of exposure in one particular population may be too narrow to correlate with a disease outcome at the individual level. For example, the association of vitamin A deficiency with an infectious outcome would be difficult to evaluate in a population consisting of only vitamin A-deficient individuals. Alternatively, an ecologic study comparing infection outcomes across populations with varying prevalence of vitamin A deficiency would permit a better assessment of the correlation. Similarly, studies of the relationship between infectious agents and unusual outcomes (such as the liver fluke Opisthorchis viverrini and bile duct cancer, and Helicobacter pylori and stomach cancer) can be strengthened by ecologic data from populations with widely varying levels of infections and cancer.
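The core computation in an ecologic analysis is a correlation between population-level exposure and population-level disease frequency. The sketch below computes a Pearson correlation coefficient across five populations; the exposure prevalences and disease rates are invented for illustration and do not come from any study discussed in this chapter.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical population-level data: one value per population, not per person.
exposure_prevalence_pct = [10, 20, 30, 40, 50]   # % of each population exposed
disease_rate_per_100k = [12, 15, 21, 24, 30]     # disease rate in each population

r = pearson_r(exposure_prevalence_pct, disease_rate_per_100k)
print(f"ecologic correlation r = {r:.3f}")
```

Note what the inputs do not contain: there is no record of whether any diseased individual was actually exposed. A strong correlation at this level is therefore consistent with, but does not establish, an association at the individual level.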

Two ecologic studies, one of rheumatic fever and one of HIV infection, are described here.

Figure 3-6 The correlation between the incidence of rheumatic heart disease per 100,000 and the number of persons per room (×100), as found by Perry and Roberts in various districts of the city of Bristol, England, in 1927-1930. (The size of the dots indicates roughly the comparative population size of the districts.) Reproduced from E. Kass. Infectious Diseases and Social Change. Journal of Infectious Diseases, Vol. 123(1):110-114. © 1971. By permission of Oxford University Press.

Crowding and Rheumatic Fever

Early studies led to the hypothesis that household crowding was an important environmental factor in the transmission of group A streptococci and in high rates of acute rheumatic fever. Moreover, it has been hypothesized that the reduction in household crowding may have been one factor leading to the decreased rates of acute rheumatic fever in the last half of the 1900s in comparison with earlier periods.²⁰ The data in Figure 3-6 show the association between the incidence of rheumatic heart disease and crowding (as measured by the number of persons per room) in various districts in the city of Bristol, England, 1927-1930. Compared to districts with high household crowding, those with low crowding show lower rates of disease.

Circumcision and HIV Transmission

Male circumcision (removal of the foreskin) is a common surgical procedure undertaken for a variety of cultural and medical reasons. Biologically, the foreskin is rich in immune cells and may develop micro-tears that may serve as an entry point for HIV. The foreskin may also trap HIV in a warm moist environment, allowing more time for infection to occur. Given these factors, it is not surprising that circumcised men have been found to have lower rates of sexually transmitted diseases.²¹

In the late 1980s, data began to emerge suggesting that circumcised men were at lower risk for HIV infection. An ecologic study contributed to this evidence by examining the association of the prevalence of circumcision and HIV in several African countries.²² Data on circumcision practices were extracted from an ethnographic database and were combined with published HIV seroprevalence data. By mapping these data, the authors identified a strong inverse correlation between the practice of male circumcision and the prevalence of HIV infection: areas where male circumcision was customary tended to have lower HIV seroprevalence (Figure 3-7).

The challenge in conducting this analysis was that a variety of behavioral, cultural, and religious differences between ethnic groups may alter the risk of HIV acquisition. Most notably, circumcised men in the study were more likely to be Muslim, and it was possible that behavioral factors may have contributed to their lower risk of infection. As noted by Gray:

Figure 3-7 Map of Africa showing political boundaries and usual male circumcision practice, with point estimates of general adult population HIV seroprevalence superimposed. Reproduced from Moses et al., Geographical Patterns of Male Circumcision Practices in Africa: Association with HIV Seroprevalence. International Journal of Epidemiology, Vol. 19, pp. 693-697. © 1990. By permission of Oxford University Press.

[M]arried Muslim men are predominantly polygamous, and polygamous unions may provide a closed sexual network reducing the risk of HIV introduction. Also, Muslim men abstain from alcohol consumption, and alcohol is associated with high-risk behaviors. Key informant interviews suggest that penile hygiene may be important. Under Islam, individuals are considered unclean after intercourse, and Muslim men and women are required to perform post-coital ablutions. In addition, observant Muslims will often wash before daily prayer. Hygienic practices associated with religion may thus partly explain the protective effects of circumcision among Muslims.²³

Because an ecologic study design does not collect individual-level data, it cannot account for cultural or hygienic practices that may differ between men who are circumcised and men who are not.

Based on the strength of the ecologic studies and other emerging data, three randomized clinical trials of male circumcision were initiated in 2001 in Kenya, South Africa, and Uganda. The results of these studies demonstrated convincingly that male circumcision reduced the incidence of HIV acquisition by more than 50%.²⁴, ²⁵, ²⁶ Although no benefits were seen for circumcision of HIV-infected men in protecting against transmission to their female partners,²⁷ additional studies have demonstrated benefits of circumcision for genital ulcer disease²⁸ and high-risk human papillomavirus.²⁹ The next generation of studies will need to evaluate the expansion of circumcision as part of national HIV prevention strategies and its impact on regional HIV incidence, which will again require ecologic designs.
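A statement such as "reduced the incidence by more than 50%" is a relative reduction: one minus the ratio of the incidence rate in the intervention arm to that in the control arm. The sketch below shows the arithmetic. The event counts and person-years are hypothetical numbers chosen for illustration; they are not the results of the Kenya, South Africa, or Uganda trials.

```python
def incidence_rate(events, person_years):
    """New infections per person-year of follow-up."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return events / person_years


def relative_reduction(rate_intervention, rate_control):
    """Proportional reduction in incidence: 1 - (rate ratio)."""
    return 1.0 - rate_intervention / rate_control


# Hypothetical trial arms (illustrative numbers only).
rate_circumcised = incidence_rate(events=20, person_years=2000.0)
rate_control = incidence_rate(events=45, person_years=2000.0)

rate_ratio = rate_circumcised / rate_control
reduction = relative_reduction(rate_circumcised, rate_control)
print(f"rate ratio = {rate_ratio:.2f}, relative reduction = {reduction:.1%}")
```

Because randomization balances behavioral and cultural factors across arms on average, a reduction estimated this way is not subject to the confounding that limits the ecologic comparison.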

Analytical Study Designs

Analytical study designs are fundamental tools for epidemiological inference. In contrast to descriptive study designs, analytical studies collect individual-level data and compare the occurrence of disease with exposure. These designs are best described in the context of a study population moving through time, as illustrated in Figure 3-3, which emphasizes several interrelated methodological considerations:

Selection of the study population from the target and source populations. As noted previously, there are usually individuals who are part of the target population but who are outside the study population. Thus the manner in which individuals are selected (or self-select) for participation into the study population is an important consideration in evaluating the inferences of a study.
Determination of time metric and follow-up. The choice of an appropriate metric for conceptualizing the study population moving through time is an important design consideration. Typically, studies are described in terms of the calendar time during which individuals are enrolled and followed. Keep in mind, however, that time can be defined by any number of measures, such as chronological age, biological life stages (e.g., before or after menopause), or other events (e.g., jobs, marriage, retirement).³⁰

Each line in Figure 3-3 represents the time when an individual begins and ends his or her time at risk for experiencing the disease outcome. Determining whether an individual is at risk incorporates both biological and methodological considerations. For example, individuals vaccinated against measles virus (genus Morbillivirus) will not be susceptible to measles and, therefore, cannot be considered at risk for this outcome. Furthermore, if an individual moves out of a study catchment area, he or she would also no longer be considered at risk (if the event occurred, it would not be recorded by the study). Depending on the time metric of interest, not all study participants might enter into a study at the same time. For example, if “time” is measured by age, then participants might enter at different ages, even if they are all enrolled during the same calendar period.

The importance of a well-defined and relevant time metric becomes more evident when we are thinking about the group of individuals who are at risk for an event at a particular time point—the risk set. As we will discuss in more detail later, risk sets are important in both the design and the analysis of epidemiologic studies.
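The idea of a risk set can be made concrete with a small sketch. The cohort below is entirely hypothetical, time is measured as age in years, and the names `Participant` and `risk_set` are illustrative choices rather than anything defined in this chapter. The sketch also assumes one particular boundary convention (a person who exits at exactly time t is still counted as at risk at t); other conventions are possible.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    pid: str
    entry: float  # time on the chosen metric (here, age in years) when follow-up starts
    exit: float   # time when follow-up ends (disease outcome or censoring)
    event: bool   # True if follow-up ended with the disease outcome


def risk_set(cohort, t):
    """Return the ids of participants under observation and at risk at time t.

    Assumed convention: a participant is at risk at t if they entered
    strictly before t and have not exited before t.
    """
    return [p.pid for p in cohort if p.entry < t <= p.exit]


# Hypothetical cohort: participants enter at different ages (delayed entry),
# even though they might all have enrolled in the same calendar period.
cohort = [
    Participant("A", entry=30.0, exit=42.0, event=True),
    Participant("B", entry=35.0, exit=50.0, event=False),
    Participant("C", entry=41.0, exit=45.0, event=True),
    Participant("D", entry=44.0, exit=60.0, event=False),
]

print(risk_set(cohort, 42.0))  # ['A', 'B', 'C']
print(risk_set(cohort, 45.0))  # ['B', 'C', 'D']
```

Participant D illustrates why the time metric matters: D is enrolled in the study but has not yet reached age 42 under observation, so D does not belong to the risk set for A's event.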

Exposure assessment. In the simplest studies evaluating the link between a particular exposure (e.g., “exposed” and “unexposed,” illustrated as solid and dashed lines, respectively, in Figure 3-3) and outcome, it is important to determine whether and how exposure may change over time. Some exposures may be unchanging (“fixed”) within an individual (e.g., genetics). Others may be time varying but may change in different ways. For example, some infectious exposures may be transitory and recurrent (e.g., influenza), whereas others are persistent and lifelong once acquired (e.g., HIV). When exposures can change, an important aspect of the study design is the time when they are assessed. It is vital that they are measured at a relevant time point so that they can be temporally linked to a disease outcome.
Outcome assessment. At the end of the period at risk, some individuals in the study population may develop the disease of interest (Figure 3-3, circles). Like exposures, some disease outcomes may be transient and recur; others are lifelong or defined that way in the analysis (e.g., defining a disease outcome as the first occurrence).

Furthermore, a study will usually end before all individuals develop disease; the term we use for those individuals whose period of follow-up ends while they are still disease-free and at risk is “censored.” Thus we need to think about the different types of censoring that may occur within our study design. Individuals may be censored due to logistics of cutting off their follow-up time for an analysis (e.g., administrative censoring). Alternatively, censored individuals may include those who drop out or otherwise become lost to follow-up during the course of the study.
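To illustrate why censoring must be handled explicitly, the following sketch uses the Kaplan-Meier (product-limit) estimator, a standard technique that this section does not itself describe, and compares it with a naive calculation that treats censored individuals as if they had been followed, disease-free, to the end of the study. The follow-up records are hypothetical, and the function name `kaplan_meier` is an illustrative choice.

```python
def kaplan_meier(records):
    """Estimate the probability of remaining disease-free over time.

    records: list of (time, event) pairs, where event is True if the
    individual developed disease at that time and False if censored.
    Returns a list of (time, estimated_survival) at each event time.
    """
    n_at_risk = len(records)
    survival = 1.0
    curve = []
    for t in sorted({time for time, _ in records}):
        events = sum(1 for time, ev in records if time == t and ev)
        leaving = sum(1 for time, _ in records if time == t)
        if events:
            # Only those still at risk at t contribute to the denominator.
            survival *= 1 - events / n_at_risk
            curve.append((t, survival))
        # Both cases and censored individuals leave the risk set after t.
        n_at_risk -= leaving
    return curve


# Hypothetical follow-up in years: two cases, three censored individuals.
records = [(2, True), (3, False), (5, True), (5, False), (8, False)]

curve = kaplan_meier(records)
naive = 1 - sum(ev for _, ev in records) / len(records)

print(curve)  # survival of 0.8 after year 2, about 0.533 after year 5
print(naive)  # 0.6, which ignores that censored people were not followed throughout
```

The naive proportion is higher than the product-limit estimate because it counts censored individuals as disease-free for the entire study, although they were observed for only part of it.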

In this section, we provide a brief survey of the various study designs used in epidemiologic analysis, highlight key issues for each design in light of these four methodologic characteristics, and provide examples in the context of infectious disease studies.

Randomized Clinical Trials

Clinical trials evaluate the effect of planned interventions in an experimental manner. The investigator assigns certain participants to receive one treatment (the experimental group) and others to receive another treatment (the control or comparison group). In this subsection, we highlight several key methodological issues in clinical trial designs.

Selection of the Study Population from the Target and Source Populations. Clinical trials are initiated to make inferences about an intervention among a specific target population. Thus the eligibility criteria that apply to the study population are a key element of the design. Some trials aim to produce widely generalizable results, exemplified by the advocacy for “large simple trials.”³¹ With this design, study investigators impose few specific eligibility criteria, with the goal of making inferences about a large and diverse population. More typically, however, clinical trials incorporate strict eligibility criteria, with the goal of eliminating potential factors that might obscure the evaluation of the safety and efficacy of an intervention. This effort would include identifying participants who will be compliant with the study protocol and are either healthy enough to gain some benefit from the intervention or more ill with fewer clinical options.

The decision to impose strict eligibility criteria may particularly affect the enrollment of minorities, women, and persons who have other comorbidities. While not limited to studies of infectious diseases, several studies have documented these effects in HIV disease.³², ³³ These disparities remain despite evidence that minorities are willing to participate in research studies.³⁴, ³⁵

Determination of Time Metric and Follow-up. The natural time origin in a clinical trial is the date of randomization, and the time elapsed since randomization is the natural metric for measuring follow-up. The study protocol will usually specify the target minimal time for follow-up of the primary endpoint after which data are analyzed; for example, the trial may continue until all patients have a minimum follow-up of 5 years. Completeness of follow-up is a vital part of clinical trials to ensure the results are not subject to selection bias. The Consolidated Standards of Reporting Trials (CONSORT) Group has developed a variety of initiatives and recommendations that address problems arising from inadequate reporting of randomized controlled trials (RCTs).³⁶

Many studies conduct planned interim analyses, with the possibility of stopping a trial early for futility or if strong safety and/or efficacy signals are observed. A substantial number of statistical issues arise with interim analyses, and careful planning and adaptation of specialized methods are necessary to ensure the study maintains its statistical integrity.³⁷ Further, the decision to stop a study early requires the study team, usually in collaboration with an independent data safety and monitoring board, to balance a variety of complex considerations.³⁸, ³⁹

Exposure Assessment. In an RCT, the exposure of interest (i.e., the intervention) is randomized and administered to participants according to a prespecified study protocol. In the classic double-masked (or double-blinded) trial, subjects are assigned by a random procedure to receive either an experimental treatment or placebo, and neither the subject nor the investigator knows which treatment the subject is receiving. Under some circumstances, this type of trial is not possible. It may not be possible to conceal the treatment group from the trial participants or the investigators: for instance, the assignment may be obvious in trials of medical procedures, or a medication may have recognizable side effects. In addition, a suitable placebo may not be available.
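The "random procedure" used for assignment can take several forms. As one illustration, the sketch below generates a 1:1 allocation schedule using permuted blocks, a common approach that is assumed here for the example and is not specified in this chapter. The function name, the block size of 4, and the seed are all hypothetical choices.

```python
import random


def permuted_block_assignments(n_participants, block_size=4, seed=None):
    """Generate a 1:1 allocation schedule using randomly permuted blocks.

    Within each block, exactly half of the slots are 'experimental' and
    half are 'control', so the arms stay balanced as enrollment proceeds.
    """
    if block_size % 2 != 0:
        raise ValueError("block_size must be even for 1:1 allocation")
    rng = random.Random(seed)
    assignments = []
    while len(assignments) < n_participants:
        half = block_size // 2
        block = ["experimental"] * half + ["control"] * half
        rng.shuffle(block)  # random order within the block
        assignments.extend(block)
    return assignments[:n_participants]


# Hypothetical schedule for 12 participants.
schedule = permuted_block_assignments(12, block_size=4, seed=2024)
print(schedule)
print(schedule.count("experimental"), schedule.count("control"))  # 6 6
```

In a masked trial, a schedule of this kind would be held apart from the investigators who enroll and assess participants, so that upcoming assignments cannot be predicted or revealed.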

In an ideal setting, all individuals would receive and adhere to the intervention to which they were randomized. However, issues of crossovers between treatment groups and adherence are important and raise a key consideration of whether individuals should be analyzed as they are randomized (“intention to treat” analysis) or as they actually use the treatment (“as-treated” analysis). Describing the crossover and adherence in a standardized manner is aided by guidelines such as the CONSORT statement.
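The distinction between the two analyses can be seen in a small worked example. The trial data below are entirely hypothetical (100 participants randomized to each arm, with some crossing over), and the helper names `make_records` and `risk_by_group` are illustrative. The sketch simply computes the proportion developing the outcome, grouped either by the arm assigned or by the treatment actually received.

```python
from collections import defaultdict


def make_records(assigned, received, n, n_events):
    """Build n hypothetical participant records, n_events of whom had the outcome."""
    return [
        {"assigned": assigned, "received": received, "outcome": 1 if i < n_events else 0}
        for i in range(n)
    ]


def risk_by_group(records, group_key):
    """Proportion with the outcome, grouped by 'assigned' or 'received'."""
    totals = defaultdict(int)
    events = defaultdict(int)
    for r in records:
        totals[r[group_key]] += 1
        events[r[group_key]] += r["outcome"]
    return {g: events[g] / totals[g] for g in totals}


# Hypothetical trial with crossovers in both directions.
records = (
    make_records("experimental", "experimental", n=80, n_events=8)
    + make_records("experimental", "control", n=20, n_events=6)
    + make_records("control", "control", n=90, n_events=27)
    + make_records("control", "experimental", n=10, n_events=1)
)

itt = risk_by_group(records, "assigned")         # analyzed as randomized
as_treated = risk_by_group(records, "received")  # analyzed as actually treated

print(itt)         # experimental 0.14, control 0.28
print(as_treated)  # experimental 0.10, control 0.30
```

In this made-up example, the as-treated comparison shows a larger difference between groups, but it no longer compares the groups formed by randomization, so participants who crossed over may differ systematically from those who did not.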
