Immunogenicity and Antigen Structure

Jay A. Berzofsky

Ira J. Berkower

THE NATURE OF ANTIGENIC DETERMINANTS RECOGNIZED BY ANTIBODIES

Haptens

In the antigen-antibody binding reaction, the antibody-binding site is often unable to accommodate the entire antigen. The part of the antigen that is the target of antibody binding is called an antigenic determinant, and there may be one or more antigenic determinants per molecule. To study antibody specificity, we need to have antibodies against single antigenic determinants. Small functional groups that correspond to a single antigenic determinant are called haptens. For example, these may be organic compounds, such as trinitrophenyl or benzene arsonate; a monosaccharide or oligosaccharide, such as glucose or lactose; or an oligopeptide, such as pentalysine. Although these haptens can bind to antibody, immunization with them usually will not provoke an antibody response (for exceptions, see Goodman¹). Immunogenicity often can be achieved by covalently attaching haptens to a larger molecule, called the carrier. The carrier is immunogenic in its own right, and immunization with the hapten-carrier conjugate elicits an antibody response to both hapten and carrier. However, the antibodies specific for hapten can be studied by equilibrium dialysis using pure hapten (without carrier), by immunoprecipitation using hapten coupled to a different (and non-cross-reacting) carrier, or by inhibition of precipitation with free hapten.

This technique was pioneered by Landsteiner² and helped to elucidate the exquisite specificity of antibodies for antigenic determinants. For instance, the relative binding affinity of antibodies prepared against succinic acid-serum protein conjugates shows marked specificity for the maleic acid analog, which is in the cis configuration, as compared to the fumaric acid (trans) form.³ Presumably, the immunogenic form of succinic acid corresponds to the cis form.³ This ability of antibodies to distinguish cis from trans configurations was reemphasized in later studies measuring relative affinities of antibodies to maleic and fumaric acid conjugates⁴ (Table 23.1A). Table 23.1B shows the specificity of antibodies prepared against p-azobenzenearsonate coupled to bovine gamma globulin.⁵ As the hapten is coupled through the p-azo group to aromatic amino acids of the carrier, haptens containing bulky substitutions in the para position would most resemble the immunizing antigen. In fact, p-methyl-substituted benzene arsonate has a higher binding affinity than unsubstituted benzene arsonate. However, methyl substitution elsewhere in the benzene ring reduces affinity, presumably due to interference with the way hapten fits into the antibody-binding site. Thus, methyl substitutions can have positive or negative effects on binding energy, depending on where the substitution occurs. Table 23.1C shows the specificity of antilactose antibodies for lactose versus cellobiose.⁶ These disaccharides differ only by the orientation of the hydroxyl attached to C4 of the first sugar either above or below the hexose ring. The three examples in this table, as well as many others,¹ show the marked specificity of antibodies for cis-trans, ortho-meta-para, and stereoisomeric forms of the antigenic determinant.

Comparative binding studies of haptens have been able to demonstrate antibody specificity despite the marked heterogeneity of antibodies. Unlike the antibodies against a multideterminant antigen, the population of antibodies specific for a single hapten determinant is a relatively restricted population due to the shared structural constraints necessary for hapten to fit within the antibody-combining site. However, the specificity of an antiserum depends on the collective specificities of the entire population of antibodies, which are determined by the structures of the various antibody-binding sites. When studying the cross-reactions of hapten analogs, some haptens bind all antibodies but with reduced K_A. Other hapten analogs reach a plateau of binding because they fit some antibody-combining sites quite well but not others (see discussion of cross-reactivity in Chapter 7). Antibodies raised in different animals may show different cross-reactivities with related haptens. Even within a single animal, antibody affinity and specificity are known to increase over time following immunization under certain conditions.⁷ Thus any statements about the cross-reactivity of two haptens reflect both structural differences between the haptens that affect antigen-antibody fit and the diversity of antibody-binding sites present in a given antiserum.

TABLE 23.1 Exquisite Specificity of Antihapten Antibodies

Part	Hapten	Structure	K_rel of Antibody Specific for
A.			Maleic (cis)	Fumaric (trans)
	Maleanilate		1.0	<0.01
	Fumaranilate		<0.01	1.0
B.			Para-substituted benzene arsonate
	Benzene arsonate		1.0
	o-Methyl benzene arsonate		0.2
	m-Methyl benzene arsonate		0.8
	p-Methyl benzene arsonate		1.9
C.			Lactose
	Lactose	β-Gal(1 → 4)Glu	1.00
	Cellobiose	β-Glu(1 → 4)Glu	0.0025
Part A from Pressman and Grossberg,⁴ part B from Pressman et al.,⁵ and part C from Karush,⁶ with permission.
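
The relative affinities in Table 23.1 can be restated as differences in binding free energy through the standard thermodynamic relation ΔG° = −RT ln K, which gives ΔΔG° = −RT ln K_rel. The following is a minimal sketch of that conversion. The temperature of 25 °C and the function name are assumptions made for illustration; they are not taken from the cited studies.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ddg_from_krel(k_rel, temp_k=298.15):
    """Return the binding free energy difference (kcal/mol) relative to the
    reference hapten, using ddG = -R*T*ln(K_rel).

    Positive values mean weaker binding than the reference hapten.
    temp_k = 298.15 K (25 C) is an assumed temperature.
    """
    if k_rel <= 0:
        raise ValueError("K_rel must be positive")
    return -R_KCAL * temp_k * math.log(k_rel)


# K_rel values copied from Table 23.1 (parts B and C).
table_23_1 = {
    "o-methyl benzene arsonate": 0.2,
    "m-methyl benzene arsonate": 0.8,
    "p-methyl benzene arsonate": 1.9,
    "cellobiose (vs. lactose)": 0.0025,
}

for hapten, k_rel in table_23_1.items():
    print(f"{hapten}: ddG = {ddg_from_krel(k_rel):+.2f} kcal/mol")
```

Under these assumptions, the 400-fold preference for lactose over cellobiose corresponds to roughly 3.5 kcal/mol, whereas the para-methyl substitution improves binding by only about 0.4 kcal/mol.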

Carbohydrate Antigens

The antigenic determinants of a number of biologically important substances consist of carbohydrates. These often occur as glycolipids or glycoproteins. Examples of the former include bacterial cell wall antigens and the major blood group antigens, whereas the latter group includes “minor” blood group antigens such as Rh. In addition, the capsular polysaccharides of bacteria are important for virulence and are often targeted by protective antibodies. A number of spontaneously arising myeloma proteins have been found to show carbohydrate specificity, possibly reflecting the fact that carbohydrates are common environmental antigens. In the days prior to hybridoma technology, these carbohydrate-specific myeloma proteins provided an important model for studying the reaction of antigen with a monoclonal antibody.

Empirically, the predominant antigenic determinants of polysaccharides often consist of short oligosaccharides (one to five sugars long) at the nonreducing end of the polymer chain.⁸ This situation is analogous to a hapten consisting of several sugar residues linked to a large nonantigenic polysaccharide backbone. The remainder of the polysaccharide is important for immunogenicity, just as the carrier molecule was important for haptens. In addition, branch points in the polysaccharide structure allow for multiple antigenic determinants to be attached to the same macromolecule. This is important for immunoprecipitation by lattice formation, as discussed in Chapter 5. Several examples illustrating structural studies of oligosaccharide antigens are given later.

The technique used most widely to analyze the antigenic determinants of polysaccharides is called hapten inhibition.⁸ In this method, the precipitation reaction between antigen and antibody is inhibited by adding short oligosaccharides. These oligosaccharides are large enough to bind with the same affinity and specificity as the polysaccharide, but because they are monomeric, no precipitate forms. As more inhibitor is added, fewer antibody-combining sites remain available for precipitation. Using antiserum specific for a single antigenic determinant, it is often possible to block precipitation completely with a short oligosaccharide corresponding to the nonreducing end of the polysaccharide chain. Besides showing the “immunodominance” of the nonreducing end of the chain, this result also shows that the structure of the antigenic determinant of polysaccharides depends on the sequence of carbohydrates and their linkage, rather than their conformation. For inhibition by hapten to be complete, the antigen-antibody system studied must be made specific for a single antigenic determinant. For optimal sensitivity, the equivalence point of antigen and antibody should be used.

We illustrate the types of carbohydrate antigens encountered by examining three classic examples in more detail: the salmonella O antigens, the blood group antigens, and dextrans that bind to myeloma proteins.

Immunochemistry of Salmonella O Antigens

The antigenic diversity among numerous salmonella species resides in the structural differences of the lipopolysaccharide (LPS) component of the outer membrane.⁹ These molecules are the main target for antisalmonella antibodies. The polysaccharide moiety contains the antigenic determinant, whereas the lipid moiety is responsible for endotoxin effects. The chemical structure of LPS can be divided into three regions (Fig. 23.1). Region I contains the antigenic O-specific polysaccharide, usually made up of repeated oligosaccharide units, which vary widely among different strains. Region II contains an oligosaccharide “common core” shared among many different strains. Failure to synthesize region II oligosaccharide or to couple completed region I polysaccharide to the growing region II core results in R (rough) mutants, which have “rough” colony morphology and lack the O antigen. Region III is the lipid part, called lipid A, which is shared among all salmonellae and serves to anchor LPS on the outer membrane. Early immunologic attempts to classify the O antigens of different salmonellae revealed a large number of cross-reactions between different strains. These were detected by preparing antiserum to one strain of salmonella and using it to agglutinate bacteria of a second strain. Each cross-reacting determinant was assigned a number, and each strain was characterized by a series of O antigen determinants (in aggregate, the “serotype” of the strain) based on its pattern of cross-reactivity. Each strain was classified within a group, based on sharing a strong O determinant. For example, group A strains share determinant 2, whereas group B strains share determinant 4 (Table 23.2). However, within a group, each strain possesses additional O determinants, which serve to differentiate it from other members of that group. Thus, determinant 2 coexists with determinants 1 and 12 on Salmonella paratyphi A. This problem of cross-reactivity based on sharing of a subset of antigenic determinants is commonly encountered in complex antigen-antibody systems. The problem may be simplified by making antibodies monospecific for individual antigenic determinants. To do this, antibodies are absorbed to remove irrelevant specificities, or cross-reactive strains are chosen that share only a single determinant with the immunizing strain. The reaction of each determinant with its specific antibody can be thought of as an antigen-antibody system. Thus, for the strains shown in Table 23.2, antiserum to Salmonella typhi (containing anti-9 and anti-12 antibodies) may be absorbed with S. paratyphi A to remove anti-12, leaving a reagent specific for antigen 9 (see Table 23.2). Alternatively, the unabsorbed antiserum may be used to study the system antigen 12-anti-12 by allowing it to agglutinate S. paratyphi B, which shares only antigen 12 with the immunogen. Because the other determinants on S. paratyphi B were absent from the immunizing strain, the antiserum contains no antibodies to them.
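
The logic of absorption and of choosing a cross-reactive test strain can be expressed as set operations on the O determinants. The sketch below is an idealization: it assumes that absorption is complete and that each determinant is either fully present or absent. The function names are hypothetical and introduced only for illustration.

```python
# O determinants per strain, copied from Table 23.2.
O_DETERMINANTS = {
    "S. paratyphi A": {1, 2, 12},
    "S. paratyphi B": {1, 4, 5, 12},
    "S. typhi": {9, 12},
}


def antiserum_specificities(immunizing_strain, absorbed_with=()):
    """Specificities left in an antiserum after absorption.

    The antiserum starts with antibodies to every determinant of the
    immunizing strain; each absorbing strain removes the antibodies
    against the determinants it carries.
    """
    remaining = set(O_DETERMINANTS[immunizing_strain])
    for strain in absorbed_with:
        remaining -= O_DETERMINANTS[strain]
    return remaining


def determinants_measured(antiserum, test_strain):
    """Determinants through which the antiserum can agglutinate the test strain."""
    return antiserum & O_DETERMINANTS[test_strain]


anti_typhi = antiserum_specificities("S. typhi")
absorbed = antiserum_specificities("S. typhi", absorbed_with=["S. paratyphi A"])

print(determinants_measured(absorbed, "S. typhi"))          # {9}
print(determinants_measured(anti_typhi, "S. paratyphi B"))  # {12}
```

Both routes isolate a single antigen-antibody system: absorption removes the unwanted antibody specificity, whereas the choice of test strain removes the unwanted antigen.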

FIG. 23.1. Structure of Salmonella Lipopolysaccharide. Region I contains the unique O-antigen determinants, which consist of repeating units of oligosaccharides. These are attached to the lipid moiety through the core polysaccharide. Three examples of oligosaccharide units are shown.⁹ (Part A adapted from Kabat,⁸ with permission; part B based on Jann and Westphal.⁹)

TABLE 23.2 Salmonella O Antigen Serotyping

Salmonella Strain	Serogroup	O Antigenic Determinants
S. paratyphi A	A	1, 2, 12
S. paratyphi B	B	1, 4, 5, 12
S. typhi	D	9, 12

Antiserum	Absorbed with	Tested on	Single Determinant Measured
Anti-S. typhi	None	S. paratyphi B	12
Anti-S. typhi	S. paratyphi A	S. typhi	9
Reprinted with permission from Kabat.⁸

Once the antigen-antibody reaction is made specific for a single determinant, a variety of oligosaccharides can be added to test for hapten inhibition. Because the O antigens contain repeating oligosaccharide units, it is often possible to obtain model oligosaccharides by mild chemical or enzymatic degradation of the LPS itself. Once the most inhibitory oligosaccharide is found, its chemical structure is determined. Alternatively, a variety of synthetic monosaccharides, disaccharides, trisaccharides, and larger oligosaccharides are tested for hapten inhibition of precipitation. For example, as shown in Table 23.3, antigen 1-anti-1 antibody precipitation is inhibited by methyl-α-D-glucoside. Therefore, various disaccharides incorporating this structure were tested, of which α-D-Glu-(1 → 6)-D-Gal was the most inhibitory. Then various trisaccharides incorporating this sequence were tested. The results indicate the sequence and size of the determinant recognized by anti-1 antibodies to be a disaccharide with the previously discussed structure. The test sequences can be guessed by analyzing the oligosaccharide breakdown products of the LPS, which include tetramers of D-Glu-D-Gal-D-Man-L-Rham. The results in Table 23.3 also suggest that the difference between determinants 1 and 19 is the length of oligosaccharide recognized by antibodies specific for each determinant. This hypothesis is supported by the observation that determinant 1 is found in some strains with, and in other strains without, determinant 19, whereas determinant 19 is always found with determinant 1. As shown in Table 23.3, determinant 19 requires the full tetrasaccharide for maximal hapten inhibition, including the sequence that specifies determinant 1. Besides identifying the antigenic structures, these results indicate that there is variation in the size of different antigenic determinants of polysaccharides.

TABLE 23.3 Analysis of Salmonella O-Antigen Structure by Hapten Inhibition

	Maximum Inhibition by Hapten (%)
Hapten	Antigen System 1: anti-1	Antigen System 19: anti-19
D-Glu	-	0
Me-α-D-Glu	35	10
α-D-Glu(1 → 6)-D-Gal	80	25
Glu.Gal.Man	80	70
Glu.Gal.Man.L-Rham	>70	>70
Deduced structure	α-D-Glu(1 → 6)-D-Gal	D-Glu-D-Gal-D-Man-L-Rham
Reprinted with permission from Kabat.⁸
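
The reasoning used to deduce the size of a determinant from hapten inhibition data can be sketched as a search for the shortest hapten that already gives the maximal inhibition. The example below applies this to the antigen 1: anti-1 column of Table 23.3. The function name and the optional tolerance parameter are assumptions introduced for illustration, and the tetrasaccharide is left out because the table reports it only as a lower bound (>70).

```python
def minimal_determinant(series, tolerance=0.0):
    """Return the shortest hapten whose maximum inhibition is within
    `tolerance` percentage points of the best value in the series.

    `series` is a list of (hapten_name, number_of_sugars, max_inhibition_pct).
    """
    best = max(pct for _, _, pct in series)
    candidates = [row for row in series if row[2] >= best - tolerance]
    return min(candidates, key=lambda row: row[1])


# Antigen 1: anti-1 column of Table 23.3 (tetrasaccharide omitted, see above).
anti_1 = [
    ("Me-alpha-D-Glu", 1, 35),
    ("alpha-D-Glu(1->6)-D-Gal", 2, 80),
    ("Glu.Gal.Man", 3, 80),
]

name, size, pct = minimal_determinant(anti_1)
print(name, size, pct)  # alpha-D-Glu(1->6)-D-Gal 2 80
```

Extending the disaccharide to a trisaccharide adds nothing for anti-1, which is the basis for assigning determinant 1 to the disaccharide.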

Blood Group Antigens

The major blood group antigens A and B were originally detected by the ability of serum from individuals lacking either determinant to agglutinate red blood cells bearing them (for reviews, see Kabat,⁸ Springer,¹⁰ Marcus,¹¹ and Watkins¹²). In addition, group O individuals have an H antigenic determinant that is distinct from A or B types, and individuals in all three groups may have additional determinants such as the Lewis (Le) antigens. Although the ABH and Le antigenic determinants are found on a carbohydrate moiety, the carbohydrate may occur in a variety of biochemical forms. On cell surfaces, they are either glycolipids that are synthesized within the cell (AB and H antigens) or glycoproteins taken up from serum (Le antigens). In mucinous secretions, such as saliva, they occur as glycoproteins. Milk, ovarian cyst fluid, and gastric mucosa contain soluble oligosaccharides containing blood group reactivity. In addition, these antigens occur frequently in other species, including about half of the bacteria in the normal flora of the gut.¹⁰ This widespread occurrence may account for the ubiquitous anti-AB reactivity of human sera, even in people never previously exposed to human blood group substances through transfusion or pregnancy.

The immunochemistry of these antigens was simplified greatly by the use of oligosaccharides in hapten inhibition studies. Group A oligosaccharides, for example, would inhibit the agglutination of group A red blood cells by anti-A antibodies. They could also inhibit the immunoprecipitation of group A-bearing glycoproteins by anti-A antibodies. Because the oligosaccharides are monomeric, their reaction with antibody does not form a precipitate but does block an antibody-combining site.

The inhibitory oligosaccharides from cyst fluid were purified and found to contain D-galactose, L-fucose, N-acetylgalactosamine, and N-acetylglucosamine. The most inhibitory oligosaccharides for each antigen are indicated in Figure 23.2. As can be seen in Figure 23.2, the ABH and Le antigens all share a common oligosaccharide core sequence, and the antigens appear to differ from each other by the sequential addition of individual sugars at the end or at branch points. Besides hapten inhibition, other biochemical data support this relationship among the different determinants. Enzymatic digestion of A, B, or H antigens yields a common core oligosaccharide from each. This product cross-reacts with antiserum specific for pneumococcal polysaccharide type XIV, which contains structural elements shared with blood group determinants, as shown at the bottom of Figure 23.2. In addition, this structure, known as precursor substance, has been isolated from ovarian cyst fluid.

Starting from precursor substance, the H determinant results from the addition of L-fucose to galactose, whereas Le^a determinant results from the addition of L-fucose to N-acetylglucosamine and Le^b from the addition of L-fucose to both sugars. Addition of N-acetylgalactosamine to H substance produces the A determinant, whereas addition of galactose produces the B determinant, in each case blocking reactivity of the H determinant.

The genetics of ABH and Le antigens is explained by this sequential addition of sugars via glycosyltransferases. The allelic nature of the AB antigens is explained by the addition of N-acetylgalactosamine, galactose, or nothing to the H antigen. The rare inherited trait of inability to synthesize the H determinants from precursor substance (Bombay phenotype) also blocks the expression of A and B antigens because the A and B transferases lack an acceptor substrate. However, the appearance of the Le^a antigen on red cells is independent of H antigen synthesis. Its structure, shown in Figure 23.2, can be derived directly from precursor substance without going through an H antigen intermediate. Comparing different individuals, the appearance of Le^a antigen on red blood cells correlates with its presence in saliva, as the Le^a antigen is not an intrinsic membrane component but must be absorbed from serum glycoproteins, which, in turn, depend on secretion. In addition to the independent synthetic pathway, the secretion of Le^a antigen is also independent of the secretory process for ABH antigens. Therefore, salivary nonsecretors of ABH antigens (which occur in 20% of individuals) may still secrete Le^a antigen if they have the fucosyl transferase encoded by the Le gene. In contrast, salivary secretion of ABH is required for red blood cells to express Le^b.
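
The sequential-addition rules described above can be summarized as a small decision function. This is a deliberately simplified sketch, not a complete genetic model: the parameter names are hypothetical, H is reported only when no A or B sugar masks it, and treating Le^b as replacing Le^a in ABH secretors is an assumption of the sketch rather than a statement from the text.

```python
def red_cell_antigens(abo_alleles, makes_h, has_le_transferase, secretes_abh):
    """Predict blood group determinants on red cells from the rules in the text.

    abo_alleles        : pair of ABO alleles, each "A", "B", or "O"
    makes_h            : False models the Bombay phenotype (no H from precursor)
    has_le_transferase : True if the Le-encoded fucosyl transferase is present
    secretes_abh       : True for salivary secretors of ABH substances
    """
    antigens = set()

    # A and B transferases need H substance as their acceptor.
    if makes_h:
        if "A" in abo_alleles:
            antigens.add("A")
        if "B" in abo_alleles:
            antigens.add("B")
        if not antigens:
            # Simplification: H is listed only when no A or B sugar masks it.
            antigens.add("H")

    # Le^a is made directly from precursor substance, so it does not depend
    # on H; Le^b additionally requires ABH secretion.
    if has_le_transferase:
        antigens.add("Le^b" if secretes_abh else "Le^a")

    return antigens


print(sorted(red_cell_antigens(("A", "B"), True, False, True)))   # ['A', 'B']
print(sorted(red_cell_antigens(("A", "O"), False, True, False)))  # ['Le^a']
```

The second call models a Bombay individual carrying an A allele: no A antigen appears because the acceptor is missing, yet Le^a is still expressed.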

FIG. 23.2. Oligosaccharide Chain Specificity. Structure of the ABH and Le blood group antigens as determined by hapten inhibition studies.⁸,¹¹ There are two variants of each of these determinants. In type 1, the Gal-GNAc linkage is β(1 → 3), whereas in type 2, the Gal-GNAc linkage is β(1 → 4). In addition, there is heterogeneity in the A and B antigens with respect to the presence of the Le fucose attached to the GNAc. In the molecules that contain the extra fucose, when the Gal-GNAc linkage is β(1 → 3) (type 1), the fucose must be linked α(1 → 4), whereas the type 2 molecules, with the β(1 → 4) Gal-GNAc linkage, contain α(1 → 3)-linked fucose. The asterisks indicate the sites of this variability in linkage.

Dextran-Binding Myeloma Proteins

Because polysaccharides are common environmental antigens, it is not surprising that randomly induced myeloma proteins were frequently found to have carbohydrate specificities. Careful studies of these monoclonal antibodies support the clonal expansion model of antibody diversity: heterogeneous antisera behave as the sum of many individual clones of antibody with respect to affinity and specificity. In the case of the IgAκ myeloma proteins W3129 and W3434, both antibodies were found to be specific for dextrans containing α-glu(1 → 6)glu bonds.¹² Hapten inhibition with a series of monosaccharides and oligosaccharides of increasing chain length indicated that the percentage of binding energy derived from the reaction with one glucose was 75%, two glucoses 95%, three glucoses 95% to 98%, and four glucoses 100%. This suggests that most binding energy between antidextran antibodies and dextran derives from the terminal monosaccharide, and that oligosaccharides of chain length four to six commonly fill the antibody-combining site. Human antidextran antisera behaved similarly, with tetrasaccharides contributing 95% of the binding energy. These experiments provided the first measure of the size of an antigenic determinant, four to six residues.¹³,¹⁴ In addition, as was observed for antisera, binding affinity of myeloma proteins was highly sensitive to modifications of the terminal sugar and highly specific for α(1 → 6) versus α(1 → 3) glycosidic bonds. However, modification of the third or fourth sugar of an oligosaccharide had relatively less effect on hapten inhibition of either myeloma protein or of antisera reacting with dextran.
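
The statement that most binding energy derives from the terminal sugar follows from converting the cumulative percentages into the increment gained at each chain length. The sketch below does this for the values quoted for W3129 and W3434. The function name is hypothetical, and the trisaccharide is left out because its value is given only as a range (95% to 98%).

```python
def incremental_contributions(cumulative):
    """Convert cumulative percent binding energy into per-step increments.

    `cumulative` maps oligosaccharide length -> cumulative percent of the
    total binding energy. Returns a list of (length, increment) pairs in
    order of increasing length.
    """
    increments = []
    previous = 0
    for length in sorted(cumulative):
        increments.append((length, cumulative[length] - previous))
        previous = cumulative[length]
    return increments


# Values quoted in the text for the myeloma proteins W3129 and W3434.
terminal_specific = {1: 75, 2: 95, 4: 100}

for length, gain in incremental_contributions(terminal_specific):
    print(f"up to {length} glucose(s): +{gain}% of binding energy")
```

The first glucose accounts for 75%, the second adds 20%, and the third and fourth together add only the remaining 5%.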

Studies with additional dextran-binding myeloma proteins¹⁵ revealed that not all antipolysaccharide monoclonal antibodies are specific for the nonreducing end, as exemplified by QUPC 52. Competitive inhibition with monosaccharides and oligosaccharides revealed that <5% of binding energy derived from monosaccharides or disaccharides, 72% from trisaccharides, 88% from tetrasaccharides, and 100% from hexasaccharides, in marked contrast to other myeloma proteins. A second distinctive property of myeloma protein QUPC 52 was its ability to precipitate unbranched dextran of chain length 200. As the unbranched dextran has only one nonreducing end, and as the myeloma protein has only one specificity, lattice formation due to cross-linking between the nonreducing ends is impossible, and precipitation must be explained by binding some other determinant. Therefore, QUPC 52 appears to be specific for internal oligosaccharide units three to seven residues in length. In contrast, W3129 is specific for end determinants and will not precipitate unbranched dextran chains. Antibodies precipitating linear dextran were also detected in six antidextran human sera, comprising 48% to 90% of the total antibodies to branched chain dextran. Thus, antidextrans can be divided into those specific for terminal oligosaccharides and those specific for internal oligosaccharides; monoclonal examples of both types are available, and both types are present in human immune serum. Cisar et al.¹⁵ speculated as to the different topology of the binding sites of W3129 or QUPC 52 necessary for terminal or internal oligosaccharide specificity. Both terminal and internal oligosaccharides have nearly identical chemical structures, differing at a single C-OH or glycoside bond. Perhaps the terminal oligosaccharide specificity of W3129 is due to the shape of the antibody-combining site (a cavity into which only the end can fit), whereas the internal oligosaccharide-binding site of QUPC 52 could be a surface groove in the antibody, which would allow the rest of the polymer to protrude at both ends. A more definitive answer depends on x-ray crystallographic studies of the combining sites of monoclonal antibodies with precisely defined specificity, performed with antigen occupying the binding site.
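
The precipitation argument rests on antigen valence: a lattice can form only if each antigen molecule offers at least two determinants to the antibody. A rough sketch of that count for unbranched dextran follows. The function names are hypothetical, the site size of six residues is an assumed value within the range given above, and counting non-overlapping stretches ignores steric crowding, so the internal valence is only an upper-bound estimate.

```python
def antigen_valence(chain_length, nonreducing_ends, site_size, specificity):
    """Estimate how many antibody sites one dextran molecule can occupy.

    specificity "terminal": one determinant per nonreducing end.
    specificity "internal": non-overlapping stretches of `site_size`
    residues along the chain (a rough upper bound; steric crowding is
    ignored).
    """
    if specificity == "terminal":
        return nonreducing_ends
    if specificity == "internal":
        return chain_length // site_size
    raise ValueError("specificity must be 'terminal' or 'internal'")


def can_form_lattice(valence):
    """Cross-linking into a lattice needs at least two determinants per antigen."""
    return valence >= 2


# Unbranched dextran of chain length 200 has a single nonreducing end.
for label, specificity in [("W3129", "terminal"), ("QUPC 52", "internal")]:
    valence = antigen_valence(200, 1, 6, specificity)
    print(label, valence, can_form_lattice(valence))
```

With a single nonreducing end, a terminal-specific antibody sees a monovalent antigen and cannot precipitate it, whereas an internal-specific antibody sees many repeats of its determinant along the same chain.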

With the advent of hybridoma technology, it became possible to produce monoclonal antibodies of any desired specificity. Immunizing mice with nearly linear dextran (the preferred antigen of QUPC 52), followed by fusion and screening (with linear dextran) for dextran-binding antibodies, yielded 12 hybridomas,¹⁶ all with specificity similar to QUPC 52. First, oligosaccharide inhibition of all 12 monoclonals showed considerable increments in affinity up to hexasaccharides, with little affinity for disaccharides and only 49% to 77% of binding energy derived from trisaccharides.¹⁷ Second, all 12 monoclonals had internal α(1 → 6) dextran specificity, as they could all precipitate linear dextran. Third, 9 of 11 BALB/c monoclonals shared a cross-reactive idiotype with QUPC 52, whereas none shared idiotype with W3129.¹⁸ These data support the hypothesis that different antibodies with similar specificity and similar groove-type sites may be derived from the same family of germline V_H genes bearing the QUPC 52 idiotype.¹⁸

The large number of environmental carbohydrate antigens and the high degree of specificity of antibodies elicited in response to each carbohydrate antigen suggest that a tremendous diversity of antibody molecules must be available, from which some antibodies can be selected for every possible antigenic structure. Studies of a series of 17 monoclonal anti-α(1 → 6) dextran hybridomas¹⁹,²⁰ have investigated whether the binding sites of closely related antibodies were derived from a small number of variable region genes, for both heavy and light chains, or whether antibodies of the same specificity could derive from variable region genes with highly divergent sequences. Each monoclonal had a groove-type site that could hold six or seven sugar residues (with one exception), based on inhibition of immunoprecipitation by different length oligosaccharides. Thus, unlike monoclonals to haptenated proteins, the precise epitope could be well characterized and was generally quite similar among the entire series.

Studies of the V_κ sequences revealed that only three V_κ groups were used in these hybridomas. Use of each V_κ group correlated with the particular antigen used to immunize the animals, whether linear dextran or short oligosaccharides, so that 10 of the monoclonals from mice immunized the same way all used the same V_κ.

In contrast, the 17 V_H chains were derived from at least five different germline genes from three different V_H gene families.²¹ The two most frequently used germline V_H genes were found in seven and five monoclonals, respectively, with minor variations explainable by somatic mutations. The remarkable finding is that very different V_H chains (about 50% homologous) can combine with the same V_κ to produce antibody-binding sites with nearly the same size, shape, antigen specificity, and affinity. Even when different V_H sequences combine with different V_κ sequences, they can produce antibodies with very similar properties. Dextran binding depends on the antigen fitting into the groove and interacting favorably with the residues forming the sides and bottom of the groove. The results indicate that divergent variable region sequences, both in and out of the complementarity-determining regions, can be folded to form similar binding site contours, which result in similar immunochemical characteristics. Similar results have been reported in other antigen-antibody systems, such as phenyloxazolone.²²

Additional studies of carbohydrate-binding monoclonal antibodies have revealed significant information about how the antibody variable regions can bind a carbohydrate structure with high affinity and specificity. Several examples are now available of crystal structures of carbohydrates bound to antibodies.

For example, monoclonal antibody Se155-4 is specific for the group B determinant of the salmonella O antigen, which consists of the sugars Gal-Abequose-Man.²³,²⁴ The crystal structure of antibody bound to the polysaccharide shows that one hexose, abequose, fits into the binding pocket, while the rest of the interactions occur along the surface of the antibody, similar to the groove-type sites described previously. Binding energy depends on hydrogen bonds formed between the protein residues and the hydroxyl groups of the carbohydrate. The protein residues include aromatic amines, such as His 32, Trp 91, and Trp 96 of the light chain, as well as His 97 and His 35 of the heavy chain. In addition, one of the sugars is hydrogen bonded via a water molecule bridge to the amide bonds of the protein backbone. About three quarters of all sugar hydroxyl groups are involved in hydrogen bonds with the protein. Although each H-bond is relatively weak by itself, the combined effect of eight hydrogen bonds results in high-affinity binding. Antibody specificity derives from the fact that the carbohydrate fits into a binding pocket, where H-bond formation depends on precise interactions with amino acid residues that are oriented about the pocket. Surprisingly, most of these bonds are formed between sugar hydroxyls and aromatic amino acids that are neither charged nor very polar at neutral pH.

Similarly, monoclonal antibody BR96 and the humanized monoclonal hu3S193 are specific for the Le^y antigen, which resembles the Le^b antigen described in Figure 23.2, except that the fucose-N-acetylglucosamine bond is α(1 → 3) instead of α(1 → 4). The Le^y antigen is commonly expressed on tumor cells of epithelial origin. The crystal structures have revealed the sources of the binding energy that results in affinity and specificity for this carbohydrate antigen.²⁵,²⁶ These two monoclonals bind Le^y antigen in a large, deep pocket, which accommodates all four hexoses and corresponds to the cavity-type binding site predicted by Kabat.¹⁵ The terminal fucose goes in first while the other three sugars are hydrogen bonded to amino acid side chains lining the pocket, including Tyr 33, Tyr 35, and Gln 52 of the heavy chain and His 27 of the light chain. Once again, hydrogen bonds between hydroxyl groups of the sugars and aromatic amines (Trp and Tyr) of the protein play a dominant role in determining affinity and specificity of binding. A smaller number of H-bonds depend on amide groups of the protein backbone.

A third example is provided by human monoclonal 2G12, which has neutralizing activity against a broad spectrum of human immunodeficiency virus (HIV) isolates. This antibody binds the mannose-rich oligosaccharide side chains that form a protective surface, called a glycoshield, on the envelope glycoprotein gp120. The crystal structure shows that the two terminal mannose sugars of each oligosaccharide bind end-on into a deep pocket of the antibody, in a cavity-type site.²⁷,²⁸ Twelve hydrogen bonds form between the two terminal mannose residues and the protein, depending mainly on the amide groups of the protein backbone. Additional hydrogen bonds form between the third mannose residue and the side chain of Asp 100 of the heavy chain and between the fourth mannose residue and Tyr 94 of the light chain and Tyr 56 of the heavy chain. A unique feature of this antibody is the crossover of variable regions between heavy and light chains so that each binding site is made up of the V_H from one HL pair combining with the V_L chain of the opposite pair. This arrangement allows the antibody to bind one branch of an oligosaccharide and the opposite branch of a nearby oligosaccharide and makes it ideally suited for cross-linking the densely clustered oligosaccharides of gp120.

Immunogenicity of Polysaccharide Conjugates

Capsular polysaccharides are the main target of protective antibodies against bacterial infection, and, as such, are important vaccine antigens. In adults, the chain length of the polysaccharide is an important determinant of immunogenicity, and the polysaccharides induce a T-independent response that cannot be boosted on repeat exposure. In young children, whose maternal antibodies wane by 6 months of age and who most need immunity to pathogens such as Haemophilus influenzae type b and Streptococcus pneumoniae of multiple serotypes, the T-independent response to these polysaccharides is weak, regardless of chain length. To immunize children, the polysaccharides were coupled to a protein carrier to create a new T-dependent antigen that gained immunogenicity from T-cell help and boosted antibody titers with each successive dose. This strategy has produced highly successful conjugate vaccines against H. influenzae type b,²⁹ resulting in a markedly reduced incidence of meningitis caused by this agent in immunized children³⁰,³¹ and evidence of herd immunity even among unimmunized children. The same strategy has produced an effective vaccine against invasive disease³² and otitis media³³ caused by the most prevalent serotypes of S. pneumoniae.

Protein and Polypeptide Antigenic Determinants

Like the proteins themselves, the antigenic determinants of proteins consist of amino acid residues in a particular three-dimensional array. The residues that make contact with complementary residues in the antibody-combining site are called contact residues. To make contact, of course, these residues must be exposed on the surface of the protein, not buried in the hydrophobic core. As the complementarity-determining residues in the hypervariable regions of antibodies have been found to span as much as 30 to 40 Å × 15 to 20 Å × 10 Å (D. R. Davies, personal communication), these contact residues comprising the antigenic determinant may cover a significant area of protein surface, as measured by x-ray crystallography of antibody-protein antigen complexes.³⁴,³⁵,³⁶,³⁷ The size of the combining sites has also been estimated using simple synthetic oligopeptides of increasing length, such as oligolysine. In this case, a series of elegant studies³⁸,³⁹,⁴⁰ suggested that the maximum chain length a combining site could accommodate was six to eight residues, corresponding closely to that found earlier for oligosaccharides,¹³,¹⁴ as discussed previously.

Several types of interactions contribute to the binding energy. Many of the amino acid residues exposed to solvent on the surface of a protein antigen will be hydrophilic. These are likely to interact with antibody contact residues via polar interactions. For instance, an anionic glutamic acid carboxyl group may bind to a complementary cationic lysine amino group on the antibody, or vice versa, or a glutamine amide side chain may form a hydrogen bond with the antibody. However, hydrophobic interactions can also play a major role. Proteins cannot exist in aqueous solution as stable monomers with too many hydrophobic residues on their surface. Those hydrophobic residues that are on the surface can contribute to binding to antibody for exactly the same reason. When a hydrophobic residue in a protein antigenic determinant or, similarly, in a carbohydrate determinant⁸ interacts with a corresponding hydrophobic residue in the antibody-combining site, the water molecules previously in contact with each of them are excluded. The result is a significant stabilization of the interaction. A thorough review of these aspects of the chemistry of antigen-antibody binding is in Getzoff et al.⁴¹

Mapping Epitopes: Conformation versus Sequence

The other component that defines a protein antigenic determinant, besides the amino acid residues involved, is the way these residues are arrayed in three dimensions. As the residues are on the surface of a protein, we can also think of this component as the topography of the antigenic determinant. Sela⁴² divided protein antigenic determinants into two categories, sequential and conformational, depending on whether the primary sequence or the three-dimensional conformation appeared to contribute the most to binding. On the other hand, as the antibody-combining site has a preferred topography in the native antibody, it would seem a priori that some conformations of a particular polypeptide sequence would produce a better fit than others and therefore would be energetically favored in binding. Thus, conformation or topography must always play some role in the structure of an antigenic determinant.

Moreover, when one looks at the surface of a protein in a space-filling model, one cannot ascertain the direction of the backbone or the positions of the helices (contrast Figs. 23.3A and 23.3B).⁴³,⁴⁴,⁴⁵,⁴⁶,⁴⁷ It is hard to recognize whether two residues that are side by side on the surface are adjacent on the polypeptide backbone or whether they come from different parts of the sequence and are brought together by the folding of the molecule. If a protein maintains its native conformation when an antibody binds, then it must similarly be hard for the antibody to discriminate between residues that are covalently connected directly and those connected only through a great deal of intervening polypeptide. Thus, the probability that an antigenic determinant on a native globular protein consists of only a consecutive sequence of amino acids in the primary structure is likely to be rather small. Even if most of the determinant were a continuous sequence, other nearby residues would probably play a role as well. Only if the protein were cleaved into fragments before the antibodies were made would there be any reason to favor connected sequences.

FIG. 23.3. A: Artist’s representation of the polypeptide backbone of sperm whale myoglobin in its native three-dimensional conformation. The α helices are labeled A through H from the amino terminal to the carboxyl terminal. Side chains are omitted, except for the two histidine rings (F8 and E7) involved with the heme iron. Methionines at positions 55 and 131 are the sites of cleavage by cyanogen bromide (CNBr), allowing myoglobin to be cleaved into three fragments. Most of the helicity and other features of the native conformation are lost when the molecule is cleaved. A less drastic change in conformation is produced by removal of the heme to form apomyoglobin, as the heme interacts with several helices and stabilizes their positions relative to one another. The other labeled residues (Glu 4, Lys 79, Glu 83, Lys 140, Ala 144, and Lys 145) are residues that have been found to be involved in antigenic determinants recognized by monoclonal antibodies.⁴³ Note that cleavage by CNBr separates Lys 79 from Glu 4 and separates Glu 83 from Ala 144 and Lys 145. The “sequential” determinant of Koketsu and Atassi⁴⁴ (residues 15 to 22) is located at the elbow, lower right, from the end of the A helix to the beginning of the B helix. (Adapted from Dickerson.⁴⁵) B: Stereoscopic views of a computer-generated space-filling molecular model of sperm whale myoglobin, based on the Takano⁴⁶ x-ray diffraction coordinates. This orientation, which corresponds to that in Panel A, is arbitrarily designated the “front view.” The computer method was described by Feldmann et al.⁴⁷ The heme and aromatic carbons are shaded darkest, followed by carboxyl oxygens, then other oxygens, then primary amino groups, then other nitrogens, and finally side chains of aliphatic residues. The backbone and the side chains of nonaliphatic residues, except for the functional groups, are shown in white. Note that the direction of the helices is not apparent on the surface, in contrast to the backbone drawing in Panel A. The residues Glu 4, Lys 79, and His 12 are believed to be part of a topographic antigenic determinant recognized by a monoclonal antibody to myoglobin.⁴³ This stereo pair can be viewed in three dimensions using an inexpensive stereoviewer such as the “stereoscopes” sold by Abrams Instrument Corp. (Lansing, MI) or Hubbard Scientific Co. (Northbrook, IL). (Adapted from Berzofsky et al.⁴³)

This concept was analyzed and confirmed quantitatively by Barlow et al.,⁴⁸ who examined the atoms lying within spheres of different radii from a given surface atom on a protein. As the radius increases, the probability that all the atoms within the sphere will be from the same continuous segment of protein sequence decreases rapidly. Correspondingly, the fraction of surface atoms that would be located at the center of a sphere containing only residues from the same continuous segment falls dramatically as the radius of the sphere increases. For instance, for lysozyme, with a radius of 8 Å, fewer than 10% of the surface residues would lie in such a “continuous patch” of surface. These are primarily in regions that protrude from the surface. With a radius of 10 Å, almost none of the surface residues fall in the center of a continuous patch. Thus, for a contact area of about 20 Å × 25 Å, as found for a lysozyme-antibody complex studied by x-ray crystallography, none of the antigenic sites could be completely continuous segmental sites (see following discussion and Fig. 23.4). On the other hand, other analyses did not find a correlation of epitope residues with surface accessibility, suggesting that the situation is more complex.⁴⁹
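The sphere test described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure of Barlow et al.: it assumes one pseudo-atom per residue, treats every atom as a surface atom, and uses a hypothetical two-strand hairpin (`toy_hairpin`) in place of real coordinates. All function names are invented for the example.

```python
import math

def residues_in_sphere(center, atoms, radius):
    """Sorted residue numbers of all atoms within `radius` of `center`."""
    _, cx, cy, cz = center
    return sorted({
        res for res, x, y, z in atoms
        if math.dist((cx, cy, cz), (x, y, z)) <= radius
    })

def is_continuous_patch(center, atoms, radius):
    """True if the sphere contains only one unbroken run of residue numbers."""
    residues = residues_in_sphere(center, atoms, radius)
    return all(b - a == 1 for a, b in zip(residues, residues[1:]))

def continuous_fraction(surface_atoms, atoms, radius):
    """Fraction of surface atoms that sit at the center of a continuous patch."""
    hits = sum(is_continuous_patch(a, atoms, radius) for a in surface_atoms)
    return hits / len(surface_atoms)

def toy_hairpin(n_per_strand=10, spacing=4.0, separation=6.0):
    """Hypothetical two-strand hairpin, one pseudo-atom per residue (not real data)."""
    strand1 = [(i, spacing * (i - 1), 0.0, 0.0)
               for i in range(1, n_per_strand + 1)]
    strand2 = [(i, spacing * (2 * n_per_strand - i), separation, 0.0)
               for i in range(n_per_strand + 1, 2 * n_per_strand + 1)]
    return strand1 + strand2

atoms = toy_hairpin()
for radius in (5.0, 8.0):
    # Each tuple is (residue_number, x, y, z) in angstroms.
    print(radius, continuous_fraction(atoms, atoms, radius))
```

In this toy model, every sphere of radius 5 Å holds a single continuous run of residues, whereas at 8 Å only the four residues at the turn still do (a fraction of 0.2). This mirrors the qualitative trend reported for lysozyme, although the numbers themselves carry no biological meaning.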

Antigenic sites consisting of amino acid residues that are widely separated in the primary protein sequence but brought together on the surface of the protein by the way it folds in its native conformation have been called “assembled topographic” sites⁵⁰,⁵¹ because they are assembled from different parts of the sequence and exist only in the surface topography of the native molecule. By contrast, the sites that consist of only a single continuous segment of protein sequence have been called “segmental” antigenic sites.⁵⁰,⁵¹

FIG. 23.4. Assembled Topographic Sites of Lysozyme Illustrated by the Footprints of Three Nonoverlapping Monoclonal Antibodies. Shown are the α-carbon backbones of lysozyme in the center and the Fv portions of three antilysozyme monoclonal antibodies D1.3, HyHEL-5, and HyHEL-10. The footprints of the antibodies on lysozyme and lysozyme on the antibodies (i.e., their interacting surfaces) are shown by a dotted representation. Note that the three antibodies each contact more than one continuous loop of lysozyme and so define assembled topographic sites. Reproduced from Davies and Padlan³⁷ with permission.

In contrast to T-cell recognition of “processed” fragments retaining only primary and secondary structures, the evidence is overwhelming that most antibodies are made against the native conformation when the native protein is used as immunogen. For instance, antibodies to native staphylococcal nuclease were found to have about a 5000-fold higher affinity for the native protein than for the corresponding polypeptide on which they were isolated (by binding to the peptide attached to Sepharose).⁵² An even more dramatic example is that demonstrated by Crumpton⁵³ for antibodies to native myoglobin or to apomyoglobin. Antibodies to native ferric myoglobin produced a brown precipitate with myoglobin but did not bind well to apomyoglobin, which, without the heme, has a slightly altered conformation. On the other hand, antibodies to the apomyoglobin, when mixed with native (brown) myoglobin, produced a white precipitate. These antibodies so strongly favored the conformation of apomyoglobin, from which the heme was excluded, that they trapped those molecules that vibrated toward that conformation and pulled the equilibrium state over to the apo form. One could almost say, figuratively, that the antibodies squeezed the heme out of the myoglobin. Looked at thermodynamically, it is clear that the conformational preference of the antibody for the apo versus native forms, in terms of free energy, had to be greater than the free energy of binding of the heme to myoglobin. Thus, in general, antibodies are made that are very specific for the conformation of the protein used as immunogen. Other more recent examples also show that antibodies can enforce structure on disordered or denatured regions in proteins such as HIV-1 Tat⁵⁴ or influenza hemagglutinin.⁵⁵

Synthetic peptides corresponding to segments of the protein antigen sequence can be used to identify the structures bound by antibodies specific for segmental antigenic sites. To identify assembled topographic sites, more complex approaches have been necessary. The earliest was the use of natural variants of the protein antigen with known amino acid substitutions, where such evolutionary variants exist.⁵⁰ Thus, substitution of different amino acids in proteins in the native conformation can be examined. The use of this method, which is illustrated later, is limited to studying the function of amino acids that vary among homologous proteins, that is, those that are polymorphic. It may now be extended to other residues by use of site-directed mutagenesis. A second method is to use the antibody that binds to the native protein to protect the antigenic site from modification⁵⁶ or proteolytic degradation.⁵⁷ A related but less sensitive approach makes use of competition with other antibodies.⁵⁸,⁵⁹,⁶⁰ A third approach, taking advantage of the capability of producing thousands of peptides on a solid-phase surface for direct binding assays,⁶¹ is to study binding of a monoclonal antibody to every possible combination of six amino acids.⁶¹ If the assembled topographic site can be mimicked by a combination of six amino acids not corresponding to any continuous segment of the protein sequence but structurally resembling a part of the surface, then one can produce a “mimotope” defining the specificity of that antibody.⁶¹ Mimotopes have become widely used and can be combined with mutational analysis to map assembled topographic epitopes.⁶² Mimetics have even been made for quaternary structural epitopes.⁶³ Many mimotope approaches use phage display peptide libraries to map epitopes of monoclonal antibodies.⁶⁴,⁶⁵,⁶⁶ However, other studies have been less optimistic about the ability to predict assembled topographic or discontinuous epitopes from mimotope binding⁶⁷ or random peptide libraries.⁶⁸
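To give a sense of the scale of these peptide-based approaches, the following Python sketch enumerates the overlapping hexapeptides that would cover a sequence for segmental-site mapping and counts the hexapeptides needed to cover every possible combination of six amino acids. The function name `overlapping_peptides` and the ten-residue test sequence are hypothetical and are not taken from any of the studies cited.

```python
def overlapping_peptides(sequence, length=6, step=1):
    """Return (start_position, peptide) pairs; positions are 1-based."""
    return [
        (i + 1, sequence[i:i + length])
        for i in range(0, len(sequence) - length + 1, step)
    ]

# Number of distinct hexapeptides from the 20 standard amino acids.
N_ALL_HEXAPEPTIDES = 20 ** 6

# Hypothetical 10-residue sequence, used only to show the windowing.
toy_sequence = "ACDEFGHIKL"
for start, peptide in overlapping_peptides(toy_sequence):
    print(start, peptide)
print(N_ALL_HEXAPEPTIDES)
```

A protein of N residues needs only N − 5 overlapping hexapeptides to scan its segmental sites, whereas the exhaustive hexapeptide set contains 64 million members. This may help explain why library-based methods are attractive for mimotope searches.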

Myoglobin also serves as a good model protein antigen for studying the range of variation of antigenic determinants from those that are more sequential in nature to those that do not even exist without the native conformation of the protein (see Fig. 23.3). A good example of the first more segmental type of determinant is that consisting of residues 15 to 22 in the amino terminal portion of the molecule. Crumpton and Wilkinson⁶⁹ first discovered that the chymotrypsin cleavage fragment consisting of residues 15 to 29 had antigenic activity for antibodies raised to either native or apomyoglobin. Two other groups⁴⁴^,⁷⁰ then found that synthetic peptides corresponding to residues 15 to 22 bind antibodies made to native sperm whale myoglobin, even though the synthetic peptides were only seven to eight residues long. Peptides of this length do not spend much time (in solution) in a conformation corresponding to that of the native protein. On the other hand, these synthetic peptides had a several hundred-fold lower affinity for the antibodies than did the native protein. Thus, even if most of the determinant was included in the consecutive sequence 15 to 22, the antibodies were still much more specific for the native conformation of this sequence than for the random conformation peptide. Moreover, there was no evidence to exclude the participation of other residues, nearby on the surface of myoglobin but not in this sequence, in the antigenic determinant.⁷¹^,⁷²^,⁷³^,⁷⁴^*

A good example of the importance of secondary structure is the case of the loop peptide (residues 64 to 80) of hen egg white lysozyme.⁷⁵ This loop in the protein sequence is created by the disulfide linkage between cysteine residues 64 and 80 and has been shown to be a major antigenic determinant for antibodies to lysozyme.⁷⁵ The isolated peptide 60 to 83, containing the loop, binds antibodies with high affinity, but opening of the loop by cleavage of the disulfide bond destroys most of the antigenic activity for antilysozyme antibodies.⁷⁵

At the other end of the range of conformational requirements are those determinants involving residues far apart in the primary sequences that are brought close together on the surface of the native molecule by its folding in three dimensions, called assembled topographic determinants.⁵⁰,⁵¹ Of six monoclonal antibodies to sperm whale myoglobin studied by Berzofsky et al.,⁴³,⁷⁶ none bound to any of the three cyanogen bromide (CNBr) cleavage fragments of myoglobin that together span the whole sequence of the molecule. Therefore, these monoclonal antibodies (all with affinities between 2 × 10⁸ and 2 × 10⁹ M⁻¹) were all highly specific for the native conformation. These were studied by comparing the relative affinities for a series of native myoglobins from different species with known amino acid sequences. This approach allowed the definition of some of the residues involved in binding to three of these antibodies. Two of these three monoclonal antibodies were found to recognize topographic determinants, as defined previously. One recognized a determinant including Glu 4 and Lys 79, which come within about 2 Å of each other to form a salt bridge in the native molecule (see Fig. 23.3A, B). The other antibody recognized a determinant involving Glu 83, Ala 144, and Lys 145 (see Fig. 23.3A). Again, these are far apart in the primary sequence but are brought within 12 Å of each other by the folding of the molecule in its native conformation. Similar examples have been reported for monoclonal antibodies to human myoglobin⁷⁷ and to lysozyme³⁷,⁵⁸ as well as the HIV-1 envelope protein (neutralizing epitopes)⁷⁸,⁷⁹ and the prion protein.⁸⁰ Other examples of such conformation-dependent antigenic determinants have been suggested using conventional antisera to such proteins as insulin,⁸¹ hemoglobin,⁸² tobacco mosaic virus,⁸³ and cytochrome c.⁸⁴ Moreover, the crystallographic structures of lysozyme-antibody³⁴,³⁶,³⁷ and neuraminidase-antibody³⁵ complexes, as well as HIV-1 envelope antibody complexes,⁷⁸,⁷⁹ show clearly that, in each case, the epitope bound is an assembled topographic site. In the case of the three monoclonal antibodies binding to nonoverlapping sites of lysozyme (Fig. 23.4), it is clear that the footprints of all three antibody-combining sites cover more than one loop of polypeptide chain, and thus, each encompasses an assembled topographic site.³⁷ This result illustrates the concept that most antibody-combining sites must interact with more than a continuous loop of polypeptide chain and thus must define assembled topographic sites.⁴⁸ Another important example is represented by neutralizing antibodies to the HIV envelope protein that similarly bind assembled topographic sites⁸⁵,⁸⁶ (see the end of this section).

How frequent are antibodies specific for topographic determinants compared to those that bind consecutive sequences when conventional antisera are examined? This question was studied by Lando et al.,⁸⁷ who passed goat, sheep, and rabbit antisera to sperm whale myoglobin over columns of myoglobin fragments, together spanning the whole sequence. After removal of all antibodies binding to the fragments, 30% to 40% of the antibodies remained that still bound to the native myoglobin molecule with high affinity but did not bind to any of the fragments in solution by radioimmunoassay. Thus, in four of four antimyoglobin sera tested, 60% to 70% of the antibodies could bind peptides, and 30% to 40% could bind only native-conformation intact protein.

On the basis of studies such as these, it has been suggested that much of the surface of a protein molecule may be antigenic,⁵⁰,⁸⁸ but that the surface can be divided up into antigenic domains.⁴³,⁷³,⁷⁴,⁷⁷ Each of these domains consists of many overlapping determinants recognized by different antibodies.

An additional interesting point is that in three published crystal structures of protein antigen-antibody complexes, the contact surfaces were broad, with local complementary pairs of concave and convex regions in both directions.³⁴,³⁵,³⁶,³⁷ Thus, the concept of an antigen binding in the groove or pocket of an antibody may be oversimplified, and antibodies may sometimes bind by extending into pockets on an antigen.

Further information on the subjects discussed in this section is available in the reviews by Sela,⁴² Crumpton,⁵³ Reichlin,⁸⁹ Kabat,⁹⁰ Benjamin et al.,⁵⁰ Berzofsky,⁵¹ Getzoff et al.,⁴¹ and Davies and Padlan.³⁷

Conformational Equilibria of Protein and Peptide Antigenic Determinants

There are several possible mechanisms to explain why an antibody specific for a native protein will bind a peptide fragment in random conformation with lower affinity. Of course, the peptide may not contain all the contact residues of the antigenic determinant so that the binding energy would be lower. However, for cases in which all the residues in the determinant are present in the peptide, several mechanisms still remain. First, the affinity may be lower because the topography of the residues in the peptide may not produce as complementary a fit in the antibody-combining site as the native conformation would. Second, the apparent affinity may be reduced because only a small fraction of the peptide molecules are in a native-like conformation at any time, assuming that the antibody binds only to the native conformation. Because the concentration of peptide molecules in native conformation is lower than the total peptide concentration by a factor that corresponds to the conformational equilibrium constant of the peptide, the apparent affinity is also lower by this factor. This model is analogous to an allosteric model. A third, intermediate hypothesis would suggest that initial binding of the peptide in a nonnative conformation occurs with submaximal complementarity and is followed by an intramolecular conformational change in the peptide to achieve energy minimization by assuming a native-like conformation. This third hypothesis corresponds to an induced fit model. The loss of affinity is due to the energy required to change the conformation of the peptide, which in turn corresponds to the conformational equilibrium constant in the second hypothesis. To some extent, these models could be distinguished kinetically, as the first hypothesis predicts a faster “on” rate and a faster “off” rate than does the second hypothesis.⁹¹ Such kinetic approaches have likewise been used to support an “encounter-docking” model related to this concept.⁹²

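Although not the only way to explain the data, the second hypothesis is useful because it provides a method to estimate the conformational equilibria of proteins and peptides.⁵²,⁹³ The method assumes the second hypothesis, which can be expressed as follows:

$$A + P_r \overset{K_{\text{conf}}}{\rightleftharpoons} A + P_n \overset{K_{\text{native}}}{\rightleftharpoons} AP_n$$

where A = antibody, P_n = native peptide, and P_r = random conformation peptide so that

$$K_{\text{app}} = \frac{[AP_n]}{[A][P_r]} = K_{\text{native}} \, K_{\text{conf}}$$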

Thus, the ratio of the apparent association constant for peptide to the measured association constant for the native molecule should give the conformational equilibrium constant of the peptide. Note the implicit assumption that the total peptide concentration can be approximated by [P_r]. This will generally be true, as most peptide fragments of proteins demonstrate little native conformation, that is, K_conf = [P_n] / [P_r] is much less than one. Also note that if the first hypothesis (or third) occurs to some extent, this method will overestimate K_conf. On the other hand, if the affinity for the peptide is lower because it lacks some of the contact residues of the determinant, this method will underestimate K_conf (by assuming that all the affinity difference is due to conformation). To some extent, the two errors may partially cancel out. When this method was used to determine the K_conf for a peptide from staphylococcal nuclease, a value of 2 × 10⁻⁴ (unitless because it is a ratio of two concentrations) was obtained.⁵² Similarly, when antibodies raised to a peptide fragment were used, it was possible to estimate the fraction of time the native nuclease spends in nonnative conformations.⁹³ In this case, the K_conf was found to be about 3000-fold in favor of the native conformation.
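As a worked illustration of this estimate, the short Python sketch below computes K_conf as the ratio of the apparent association constant for the peptide to the association constant for the native protein, and converts it to a free energy. Only the 5000-fold affinity ratio comes from the text. The absolute association constants, the temperature of 298.15 K, and the function names are assumptions chosen for the example.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def k_conf(k_app_peptide, k_native):
    """Conformational equilibrium constant, assuming the antibody binds
    only the native-like conformer (the second hypothesis)."""
    return k_app_peptide / k_native

def delta_g(k, temperature=298.15):
    """Standard free energy change (kcal/mol) for an equilibrium constant k."""
    return -R_KCAL * temperature * math.log(k)

# Hypothetical absolute affinities with the 5000-fold ratio quoted in the text.
k_native = 1.0e9          # M^-1, assumed
k_app = k_native / 5000   # M^-1, apparent affinity for the peptide

kc = k_conf(k_app, k_native)
print(kc)            # fraction-like ratio [P_n]/[P_r]
print(delta_g(kc))   # free energy cost of adopting the native-like conformer
```

Under these assumptions, the ratio reproduces the K_conf of 2 × 10⁻⁴ reported for the nuclease peptide, which corresponds to a cost of roughly 5 kcal/mol for the peptide to adopt the native-like conformation at room temperature.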

Antipeptide Antibodies that Bind to Native Proteins at a Specific Site

In light of the conformational differences between native proteins and peptides and the observed K_conf effects shown by antibodies to native proteins when tested on the corresponding peptides, it was somewhat surprising to find that antibodies to synthetic peptides show extensive cross-reactions with native proteins.⁹⁴,⁹⁵ These two types of cross-reactions can be thought of as working in opposite directions: The binding of antiprotein antibodies to the peptide is inefficient, whereas the binding of antipeptide antibodies to the protein is quite efficient and commonly observed. This finding is quite useful, as automated solid-phase peptide synthesis has become readily available. This has been particularly useful in three areas: exploitation of protein sequences deduced by recombinant deoxyribonucleic acid (DNA) methods, preparation of site-specific antibodies, and the attempt to focus the immune response on a single protein site that is biologically important but may not be particularly immunogenic. This section focuses on the explanation of the cross-reaction, uses of the cross-reaction, and the potential limitations regarding immunogenicity.

The basic assumption is that antibodies raised against peptides in an unfolded structure will bind the corresponding site on proteins folded into the native structure.⁹⁵ This is not immediately obvious, as antibody binding to antigen is the direct result of the antigen fitting into the binding site. Affinity is the direct consequence of “goodness of fit” between antibody and antigen, whereas antibody specificity is due to the inability of other antigens to occupy the same site. How then can the antipeptide antibodies overcome the effect of K_conf and still bind native proteins with good affinity and specificity? The whole process depends on the antibody-binding site forming a three-dimensional space and the antigen filling it in an energetically favorable way.

Because the peptides are randomly folded, they rarely occupy the native conformation, so they are not likely to elicit antibodies against a conformation they do not maintain. If the antibodies are specific for a denatured structure, then, like the myoglobin molecules that were denatured to apomyoglobin by antibody binding,⁵³ the cross-reaction may depend on the native protein’s ability to assume different conformational states. If the native protein is quite rigid, then the possibility of it assuming a random conformation is quite small; if it is a flexible three-dimensional spring, then local unfolding and refolding may occur all the time. Local unfolding of protein segments may permit the immunologic cross-reaction with antipeptide antibodies, as a flexible segment could assume many of the same conformations as the randomly folded peptide.⁹⁵ On the other hand, peptides with more stable conformations may be more likely to elicit antibodies that bind both the peptide and the native protein.⁹⁶ To this end, scaffolding has been used to maintain the conformation of peptides or protein fragments to be used as immunogens/vaccines, such as for respiratory syncytial virus⁹⁷ or HIV epitopes.⁹⁸^,⁹⁹

In contrast, the ability of proteins to crystallize (a feature that allows the study of their structure by x-ray crystallography) has long been taken as evidence of protein rigidity.¹⁰⁰ In addition, the existence of discrete functional states of allosteric enzymes¹⁰¹ provides additional evidence of stable structural states of a protein. Finally, the fact that antibodies can distinguish native from denatured forms of intact proteins is well known for proteins such as myoglobin.⁵³

However, protein crystals are a somewhat artificial situation, as the formation of the crystal lattice imposes order on the components, each of which occupies a local energy minimum at the expense of considerable loss of randomness (entropy). Thus, the crystal structure may have artificial rigidity that exceeds the actual rigidity of protein molecules in solution. On the contrary, we may attribute some of the considerable difficulty in crystallizing proteins to disorder within the native conformation. Second, allosterism may be explained by two distinct conformations that are discrete without being particularly rigid. Finally, the ability to generate antiprotein antibodies that are conformation specific does not rule out the existence of antipeptide antibodies that are not. All antibodies are probably specific for some conformation of the antigen, but this need not be the crystallographic native conformation in order to achieve a significant affinity for those proteins or protein segments that have a “loose” native conformation.

Antipeptide antibodies have proved to be very powerful reagents when combined with recombinant DNA methods of gene sequencing.⁹⁵^,¹⁰² From the DNA sequence, the protein sequence is predicted. A synthetic peptide is constructed, coupled to a suitable carrier molecule, and used to immunize animals. The resulting polyclonal antibodies can be detected with a peptide-coated enzyme-linked immunosorbent assay plate (see Chapter 7). They are used to immunoprecipitate the native protein from a ³⁵S-labeled cell lysate and thus confirm expression of the gene product in these cells. The antipeptide antibodies can also be used to isolate the previously unidentified gene product of a new gene. The site-specific antibodies are also useful in detecting posttranslational processing, as they bind all precursors and products that contain the site. In addition, because the antibodies bind only to the site corresponding to the peptide, they are useful in probing structure-function relationships. They can be used to block the binding of a substrate to an enzyme or the binding of a virus to its cellular receptor.

Immunogenicity of Proteins and Peptides

Up to this point, we have considered the ability of antibodies to react with proteins or peptides as antigens. However, immunogenicity refers to the ability of these compounds to elicit antibodies following immunization. Several factors limit the immunogenicity of different regions of proteins, and these have been divided into those that are intrinsic to protein structure itself versus those extrinsic to the antigen that are related to the responder and vary from one animal or species to another.⁵¹ In addition, we consider the special case of peptide immunogenicity, as it applies to vaccine development. The features of protein structure that have been suggested to explain the results include surface accessibility of the site, hydrophilicity, flexibility, and proximity to a site recognized by helper T cells.

When the x-ray crystallographic structure and antigenic structure are known for the same protein, it is not surprising to find that a series of monoclonal antibodies binding to a molecule such as influenza neuraminidase choose an overlapping pattern of sites at the exposed head of the protein.¹⁰³

The stalk of neuraminidase was not immunogenic apparently because it was almost entirely covered by carbohydrate. Beyond such things as carbohydrate, which may sterically interfere with antibody binding to protein, accessibility on the surface is clearly a sine qua non for an antigenic determinant to be bound by an antibody specific for the native conformation, without any requirement for unfolding of the structure.⁵¹ Several measures of such accessibility have been suggested. All these require knowledge of the x-ray crystallographic three-dimensional structure. Some have measured accessibility to solvent by rolling a sphere with the radius of a water molecule over the surface of a protein.¹⁰⁴^,¹⁰⁵ Others have suggested that accessibility to water is not the best measure of accessibility to antibody and have demonstrated a better correlation by rolling a sphere with the radius of an antibody-combining domain.¹⁰⁶ Another approach to predicting antigenic sites on the basis of accessibility is to examine the degree of protrusion from the surface of the protein.¹⁰⁷ This was done by modeling the body of the protein as an ellipsoid and examining which amino acid residues remain outside ellipsoids of increasing dimensions. The most protruding residues were found to be part of antigenic sites bound by antibodies, but usually, these sites had been identified by using short synthetic peptides and so were segmental in nature. As noted previously, for an antigenic site to be contained completely within a single continuous segment of protein sequence, the site is likely to have to protrude from the surface, as otherwise residues from other parts of the sequence would fall within the area contacting the antibody.⁴⁸ However, inability of such surface or protrusion information to predict antigenic sites has also been encountered in some studies.⁴⁹

Because the three-dimensional structure of most proteins is not known, other ways of predicting surface exposure have been proposed for the vast majority of antigens. For example, hydrophilic sites tend to be found on the water-exposed surface of proteins. Thus, hydrophilicity has been proposed as a second indication of immunogenicity.¹⁰⁸^,¹⁰⁹^,¹¹⁰ This model has been used to analyze 12 proteins with known antigenic sites: The most hydrophilic site of each protein was indeed one of the antigenic sites. However, among the limitations are the facts that a significant fraction of surface residues can be nonpolar,¹⁰⁴^,¹⁰⁵ and that several important examples of hydrophobic and aromatic amino acids involved in the antigenic sites are known.⁴²^,⁸³^,¹¹¹^,¹¹² Specificity of antibody binding likely depends on the complementarity of surfaces for hydrogen bonding and polar bonding as well as van der Waals contacts, whereas hydrophobic interactions and the exclusion of water from the interacting surfaces of proteins may contribute a large but nonspecific component to the energy of binding.¹¹³ Another study suggested that amino acid pairs were better predictors of epitopes.¹¹⁴
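A hydrophilicity analysis of this kind can be sketched as a sliding-window average over the sequence. The sketch below is only an illustration: the window length of six residues is an assumption, and the scale is the Kyte-Doolittle hydropathy scale with its sign reversed, used here as a stand-in that may differ from the scales used in the studies cited above. The function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
# Assumption: the sign-reversed values serve as a stand-in hydrophilicity scale.
KD_HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophilicity_profile(sequence, window=6):
    """Return (start, mean hydrophilicity) per window; start is 1-based."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [-KD_HYDROPATHY[aa] for aa in sequence.upper()]
    profile = []
    for i in range(len(scores) - window + 1):
        mean = sum(scores[i:i + window]) / window
        profile.append((i + 1, mean))
    return profile


def most_hydrophilic_window(sequence, window=6):
    """Return the (start, mean) of the highest-scoring window."""
    return max(hydrophilicity_profile(sequence, window), key=lambda item: item[1])
```

The peak of such a profile is only a candidate site. As the limitations noted above imply, nonpolar and aromatic residues that contribute to real antigenic sites would be scored against by any purely hydrophilic measure.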

A third factor suggested to play a role in immunogenicity of protein epitopes is mobility. Measurement of mobility in the native protein is largely dependent on the availability of a high-resolution crystal structure, so its applicability is limited to only a small subset of proteins. Furthermore, it has been studied only for antibodies specific for segmental antigenic sites; therefore, it may not apply to the large fraction of antibodies to assembled topographic sites. Studies of mobility have taken two directions. The case of antipeptide antibodies has already been discussed, in which antibodies made to peptides corresponding to more mobile segments of the native protein were more likely to bind to the native protein.⁹⁵^,¹¹⁵ This is not considered just a consequence of the fact that more mobile segments are likely to be those on the surface and therefore more exposed because in the case of myohemerythrin (which was used as a model), two regions of the native protein that were equally exposed but less mobile did not bind nearly as well to the corresponding antipeptide antibodies.¹¹⁶ However, as is clear from the previous discussion, this result applies to antibodies made against short peptides and therefore is not directly relevant to immunogenicity of parts of the native protein. Rather, it concerns the cross-reactivity of antipeptide antibodies with the native protein and therefore is of considerable practical importance for the purposes outlined in the section on antipeptide antibodies.

Studies in the other direction—that is, of antibodies raised against native proteins—would be by definition more relevant to the question of immunogenicity of parts of the native protein. Westhof et al.¹¹⁷ used a series of hexapeptides to determine the specificity of antibodies raised against native tobacco mosaic virus protein and found that six of the seven peptides that bound antibodies to native protein corresponded to peaks of high mobility in the native protein. The correlation was better than could be accounted for just by accessibility because three peptides that corresponded to exposed regions of only average mobility did not bind antibodies to the native protein. However, when longer peptides—on the order of 20 amino acid residues—were used as probes, it was found that antibodies were present in the same antisera that bound to less mobile regions of the protein.¹¹⁸ They simply had not been detected with the short hexapeptides with less conformational stability. Thus, it was not that the more mobile regions were necessarily more immunogenic but rather that antibodies to these were more easily detected with short peptides as probes. A similar good correlation of antigenic sites with mobile regions of the native protein in the case of myoglobin¹¹⁷ may also be attributed to the fact that seven of the nine sites were defined with short peptides of six to eight residues.⁷¹ Again, this result becomes a statement about cross-reactivity between peptides and native protein rather than about the immunogenicity of the native protein. For reviews, see Van Regenmortel¹¹⁹ and Getzoff et al.⁴¹

To address the role of mobility in immunogenicity, an attempt was made to quantitate the relative fraction of antibodies specific for different sites on the antigen myohemerythrin.¹²⁰ The premise was that, although the entire surface of the protein may be immunogenic, certain regions may elicit significantly more antibodies than others and therefore may be considered immunodominant or at least more immunogenic. Because this study was done with short synthetic peptides from 6 to 14 residues long based on the protein sequence, it was limited to the subset of antibodies specific for segmental antigenic sites. Among these, it was clear that the most immunogenic sites were in regions of the surface that were most mobile, convex in shape, and often of negative electrostatic potential. Other more recent studies corroborate the greater immunogenicity of more flexible segments of protein structures.¹²¹ The role of these parameters has been reviewed.⁴¹

These results have important practical and theoretical implications. First, to use peptides to fractionate antiprotein antisera by affinity chromatography, peptides corresponding to more mobile segments of the native protein should be chosen when possible. If the crystal structure is not known, it may be possible to use peptides from amino or carboxyl termini or from exon-intron boundaries, as these are more likely to be mobile.¹¹⁵ Second, these results may explain how a large but finite repertoire of antibody-producing B cells can respond to any antigen in nature or even artificial antigens never encountered in nature. Protein segments that are more flexible may be able to bind by induced fit in an antibody-combining site that is not perfectly complementary to the average native structure.⁴¹^,⁵¹ Indeed, evidence from the crystal structure of antigen-antibody complexes¹²²^,¹²³^,¹²⁴ suggests that mobility in the antibody-combining site as well as in the antigen may allow both reactants to adopt more complementary conformations on binding to each other, that is, a two-way induced fit. A nice example comes from the study of antibodies to myohemerythrin,¹²³ in which the data suggested that initial binding of exposed side chains of the antigen to the antibody promoted local displacements that allowed exposure and binding of other, previously buried residues that served as contact residues. 
The only way this could occur would be for such residues to become exposed during the course of an induced fit conformational change in the antigen.⁴¹^,¹²³ In a second very clear example of induced fit, the contribution of antibody mobility to peptide binding was demonstrated for a monoclonal antibody to peptide 75 to 110 of influenza hemagglutinin, which was crystallized with or without peptide in the binding site and analyzed by x-ray crystallography for evidence of an induced fit.¹²⁴ Despite flexibility of the peptide, the antibody-binding site probably could not accommodate the peptide without a conformational change in the third complementarity determining region of the heavy chain, in which an asparagine residue of the antibody was rotated out of the way to allow a tyrosine residue of the peptide to fit in the binding pocket of the antibody.¹²⁴

Regarding host-limited factors, immunogenicity is certainly limited by self-tolerance. Thus, the repertoire of potential antigenic sites on mammalian protein antigens such as myoglobin or cytochrome c can be thought of as greatly simplified by the sharing of numerous amino acids with the endogenous host protein. For mouse, guanaco, or horse cytochrome c injected into rabbits, each of the differences between the immunogen and rabbit cytochrome c is seen as an immunogenic site on a background of immunologically silent residues.⁵⁰^,⁸⁴^,¹²⁵ In another example, rabbit and dog antibodies to beef myoglobin bound almost equally well to beef or sheep myoglobin.¹²⁶ However, sheep antibodies bound beef but not sheep myoglobin, even though these two myoglobins differ by just six amino acids. Thus, the sheep immune system was able to screen out those clones that would be autoreactive with sheep myoglobin.

Ir genes of the host also play an important role in regulating the ability of an individual to make antibodies to a specific antigen.¹²⁷ These antigen-specific immune response genes are among the major histocompatibility complex (MHC) genes that code for transplantation antigens. Structural mutations, gene transfer experiments, and biochemical studies¹²⁷ all indicate that Ir genes are actually the structural genes for MHC antigens. The mechanism of action of the MHC antigens works through their effect on helper T cells (described later in this chapter). There appear to be constraints on which B and T cells of a given specificity can help,¹²⁸^,¹²⁹ a process called T-B reciprocity.¹³⁰ Thus, if Ir genes control helper T-cell specificity, they will in turn limit which B cells are activated and which antibodies are made.

The immunogenicity of peptide antigens is also limited by intrinsic and extrinsic factors. With less structure to go on, each small peptide must presumably contain some non-self-structural feature in order to overcome self-tolerance. In addition, the same peptide must contain antigenic sites that can be recognized by helper T cells as well as by B cells. When no T-cell site is present, three approaches may be helpful: graft on a T-cell site, couple the peptide to a carrier protein, or overcome T-cell nonresponsiveness to the available structure with various immunologic agents, such as interleukin 2.

An example of a biologically relevant but poorly immunogenic peptide is the asparagine-alanine-asparagine-proline (NANP) repeat unit of the circumsporozoite (CS) protein of malaria sporozoites. A monoclonal antibody to the repeat unit of the CS protein can protect against murine malaria.¹³¹ Thus, it would be desirable to make a malaria vaccine of the repeat unit of Plasmodium falciparum (NANP)_n. However, only mice of one MHC type (H-2^b) of all mouse strains tested were able to respond to (NANP)_n.¹³²^,¹³³ One approach to overcome this limitation is to couple (NANP)_n to a site recognizable by T cells, perhaps a carrier protein such as tetanus toxoid.¹³⁴ In human trials, this conjugate was weakly immunogenic and only partially protective. Moreover, as helper T cells produced by this approach are specific for the unrelated carrier, a secondary or memory response would not be expected to be elicited by the pathogen itself.

Another choice might be to identify a T-cell site on the CS protein itself and couple the two synthetic peptides together to make one complete immunogen. The result with one such site, called Th2R, was to increase the range of responding mouse MHC types by one, to include H-2^k as well as H-2^b.¹³⁵ This approach has the potential advantage of inducing a state of immunity that could be boosted by natural exposure to the sporozoite antigen. As CS-specific T and B cells are both elicited by the vaccine, natural exposure to the antigen could help maintain the level of immunity during the entire period of exposure.

Another strategy to improve the immunogenicity of peptide vaccines is to stimulate the T- and B-cell responses artificially by adding interleukin 2 to the vaccine. Results with myoglobin indicate that genetic nonresponsiveness can be overcome by appropriate doses of interleukin 2.¹³⁶ The same effect was found for peptides derived from malaria proteins.¹³⁷^,¹³⁷ᵃ

One of the most important possible uses of peptide antigens is as synthetic vaccines. However, even though it is possible to elicit with synthetic peptides anti-influenza antibodies to nearly every part of the influenza hemagglutinin,⁹⁴ antibodies that neutralize viral infectivity have not been elicited by immunization with synthetic peptides. This may reflect the fact that antibody binding by itself often does not result in virus inactivation. Viral inactivation occurs only when antibody interferes with one of the steps in the life cycle of the virus, including binding to its cell surface receptor, internalization, and virus uncoating within the cell. Apparently, antibodies can bind to most of the exposed surface of the virus without affecting these functions. Only those antibodies that bind to certain “neutralizing” sites can inactivate the virus. In addition, as in the case of the VP1 coat protein of poliovirus, certain neutralizing sites are found only on the native protein and not on the heat-denatured protein.¹³⁸ Thus, not only the site but also the conformation that is bound by the antibodies may be important for the antibody to inactivate the virus. These sites may often be assembled topographic sites not mimicked by peptide segments of the sequence. Perhaps binding of an antibody to such an assembled site can alter the relative positions of the component subsites so as to induce an allosteric neutralizing effect. Alternatively, antibodies to such an assembled site may prevent a conformational change necessary for activity of the viral protein.

One method of mapping neutralizing sites is based on the use of neutralizing monoclonal antibodies. The virus is grown in the presence of neutralizing concentrations of the monoclonal antibody, and virus mutants are selected for the ability to overcome antibody inhibition. These are sequenced, revealing the mutation that permits “escape” by altering the antigenic site for that antibody. This method has been used to map the neutralizing sites of influenza hemagglutinin¹³⁹ as well as poliovirus capsid protein VP1.¹⁴⁰ The influenza escaping mutations are clustered to form an assembled topographic site, with mutations distant from each other in the primary sequence of hemagglutinin but brought together by the three-dimensional folding of the native protein. At first, it was thought that neutralization was the result of steric hindrance of the hemagglutinin-binding site for the cell surface receptor of the virus.¹⁴¹ However, similar work with poliovirus reveals that neutralizing antibodies that bind to assembled topographic sites may inactivate the virus at less than stoichiometric amounts, when at least half of the sites are unbound by antibody.¹⁴² The neutralizing antibodies all cause a conformational change in the virus, which is reflected in a change in the isoelectric point of the particles from pH 7 to pH 4.¹⁴⁰^,¹⁴³ Antibodies that bind without neutralizing do not cause this shift. Thus, an alternative explanation for the mechanism of antibody-mediated neutralization is the triggering of the virus to self-destruct. Perhaps the reason that neutralizing sites are clustered near receptor-binding sites is that occupation of such sites by antibody mimics events normally caused by binding to the cellular receptor, causing the virus to prematurely trigger its cell entry mechanisms. 
However, in order to transmit a physiologic signal, the antibody may need to bind viral capsid proteins in the native conformation (especially assembled topographic sites), which antipeptide antibodies may fail to do. Antibodies of this specificity are similar to the viral receptors on the cell surface, some of which have been cloned and expressed without their transmembrane sequences as soluble proteins. The soluble recombinant receptors for poliovirus¹⁴⁴ and HIV-1¹⁴⁵^,¹⁴⁶^,¹⁴⁷ exhibit high-affinity binding to the virus and potent neutralizing activity in vitro. The HIV-1 receptor, cluster of differentiation (CD)4, has been combined with the human Ig heavy chain in a hybrid protein CD4-Ig,¹⁴⁸ which spontaneously assembles into dimers and resembles a monoclonal antibody, in which the binding site is the same as the receptor-binding site for HIV-1. In these recombinant constructs, high-affinity binding depends on the native conformation of the viral envelope glycoprotein gp120. Binding of CD4 to gp120 elicits a conformational change exposing a CD4-induced epitope, and fusions of CD4 domains to gp120 can be used as vaccines to elicit such antibodies.¹⁴⁹

For HIV-1, two types of neutralizing antibodies have been identified. The first type binds a continuous or segmental determinant, such as the “V3 loop” sequence between amino acids 296 and 331 of gp120.¹⁵⁰^,¹⁵¹^,¹⁵² Antipeptide antibodies against this site can neutralize the virus.¹⁵⁰ However, because this site is located in a highly variable region of the envelope, these antibodies tend to neutralize a narrow range of viral variants with nearly the same sequence as the immunogen. Even for this highly variable site, more broadly neutralizing antibodies can be obtained that recognize conserved conformations.¹⁵³^,¹⁵⁴^,¹⁵⁵ The second type of neutralizing antibody binds conserved sites on the native structure of gp120, allowing them to neutralize a broad spectrum of HIV-1 isolates. These antibodies are commonly found in the sera of infected patients,¹⁵⁶ and a panel of neutralizing monoclonals derived from these subjects has been analyzed.

These monoclonals can be divided into three types. One group, possibly the most common in human polyclonal sera, binds at or near the CD4 receptor-binding site of gp120.⁷⁹^,¹⁵⁷^,¹⁵⁸^,¹⁵⁹^,¹⁶⁰^,¹⁶¹ A second type of monoclonal, called 2G12, binds a conformational site on gp120 that also depends on glycosylation but has no direct effect on CD4 binding.¹⁶² A third type, quite rare in human sera, is represented by monoclonal antibody 2F5¹⁶³ and binds a conserved site on the transmembrane protein gp41. Although this site is contained on a linear peptide ELDKWA, antibodies such as 2F5 cannot be elicited by immunizing with the peptide, again suggesting the conformational aspect of this site.¹⁶⁴^,¹⁶⁵ Indeed, the binding of antibodies to this membrane-proximal site has even been found to involve interaction of the antibody with the lipid membrane.¹⁶⁶ One might view this intriguing case as an example of a discontinuous or assembled topographic site created by the proximity of residues of the protein with structures in the lipid membrane, thus spanning more than just different parts of the antigenic protein.

These monoclonals neutralize fresh isolates, as well as laboratory-adapted strains, and they neutralize viruses tropic for T cells or macrophages,¹⁶⁷ regardless of the use of CXCR4 or CCR5 as the second receptor. These monoclonals, which target different sites, act synergistically. A cocktail combining all three types of monoclonals can protect monkeys against intravenous or vaginal challenge with a simian immunodeficiency virus/HIV hybrid virus, indicating the potential for antibodies alone to prevent HIV infection.¹⁶⁸^,¹⁶⁹ Because each of the three conserved neutralizing determinants depends on the native conformation of the protein,¹⁷⁰ a prospective gp120 vaccine (or gp160 vaccine) would need to be in the native conformation to be able to elicit these antibodies.

ANTIGENIC DETERMINANTS RECOGNIZED BY T CELLS

Studies of T-cell specificity for antigen were motivated by the fact that the immune response to protein antigens is regulated at the T-cell level. A hapten, not immunogenic by itself, will elicit antibodies only when coupled to a protein that elicits a T-cell response in that animal. This ability of the protein component of the conjugate to confer immunogenicity on the hapten has been termed the “carrier effect.” Recognition of the carrier by specific helper T cells induces the B cells to make antibodies. Thus, the factors contributing to a good T-cell response appear to control the B-cell response as well.

“Nonresponder” animals display an antigen-specific failure to respond to a protein antigen, both for T cells and antibody responses. The “high responder” phenotype for each antigen is a genetically inheritable, usually dominant trait. Using inbred strains of mice, the genes controlling the immune response were found to be tightly linked to the MHC genes.¹²⁷^,¹⁷¹ MHC-linked immune responsiveness has been shown to depend on the T-cell recognition of antigen bound within a groove of MHC antigens of the antigen-presenting cell (APC) (discussed below; see also Chapters 21 and 22). The recognition of antigen in association with MHC molecules of the B cell is necessary for carrier-specific T cells to expand and provide helper signals to B cells.

In contrast to the range of antigens recognized by antibodies, the repertoire recognized by helper and cytotoxic T cells appears to be limited largely to protein and peptide antigens, although exceptions such as the small molecule tyrosine-azobenzene arsonate¹⁷² exist. Once the antigenic determinants on proteins recognized by T cells are identified, it may be possible to better understand immunogenicity and perhaps even to manipulate the antibody response to biologically relevant antigens by altering the helper T-cell response to the antigen.

Defining Antigenic Structures

Polyclonal T-Cell Response

Significant progress in understanding T-cell specificity was made possible by focusing on T-cell proliferation in vitro, mimicking the clonal expansion of antigen-specific clones in vivo. The proliferative response depends on only two cells: the antigen-specific T cell and an APC, usually a macrophage, dendritic cell, or B cell. The growth of T cells in culture is measured as the incorporation of [³H]thymidine into newly formed DNA. Under appropriate conditions, thymidine incorporation increases with antigen concentration. This assay permits the substitution of different APCs and is highly useful in defining the MHC and antigen-processing requirements of the APCs.

Using primarily this assay, several different approaches have been taken to mapping T-cell epitopes. First, T cells immunized to one protein have been tested for a proliferative response in vitro to the identical protein or to a series of naturally occurring variants. By comparing the sequences of stimulatory and nonstimulatory variants, it was possible to identify potential epitopes recognized by T cells. For example, the T-cell response to myoglobin was analyzed by immunizing mice with sperm whale or horse myoglobin and testing the resulting T cells for proliferation in response to a series of myoglobins from different species with known amino acid substitutions.¹⁷³ Reciprocal patterns were observed in T cells from mice immunized with sperm whale or horse myoglobin. The response to the cross-stimulatory myoglobins was as strong as to the myoglobin used to immunize the mice. This suggested that a few shared amino acid residues formed an immunodominant epitope, and that most substitutions had no effect on the dominant epitope. A comparison of the sequences revealed that substitutions at a single residue could explain the pattern observed. All myoglobins that cross-stimulated sperm whale-immune T cells had Glu at position 109, whereas all that cross-stimulated horse-immune T cells had Asp at 109. No member of one group could stimulate T cells from donors immunized with a myoglobin of the other group. This suggested that an immunodominant epitope recognized by T cells was centered on position 109, regardless of which amino acid was substituted. Usually, this approach has led to correct localization of the antigenic site in the protein,¹⁷³^,¹⁷⁴^,¹⁷⁵ but the possibility of long-range effects on antigen processing must be kept in mind (see the section on “Antigen Processing”). Also, this approach using natural variants is limited in that it can focus on the correct region of the molecule but cannot define the boundaries of the site. 
Site-directed mutagenesis may therefore expand the capabilities of this approach.
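The comparison of stimulatory and nonstimulatory variants can be sketched as a search for aligned positions at which the residues found in the stimulatory group never occur in the nonstimulatory group. This is a minimal illustration that assumes pre-aligned sequences of equal length. The function name and the short sequences in the usage note are hypothetical and are not actual myoglobin data.

```python
def discriminating_positions(variants, stimulatory):
    """Find positions whose residues separate stimulatory from nonstimulatory variants.

    variants: dict mapping a variant name to its aligned sequence.
    stimulatory: set of variant names that stimulated the immune T cells.
    Returns a list of (position, stimulatory residues, nonstimulatory residues),
    with 1-based positions.
    """
    names = list(variants)
    lengths = {len(variants[name]) for name in names}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to the same length")
    length = lengths.pop()

    hits = []
    for i in range(length):
        pos_res = {variants[n][i] for n in names if n in stimulatory}
        neg_res = {variants[n][i] for n in names if n not in stimulatory}
        # A candidate position needs both groups present and no shared residue.
        if pos_res and neg_res and pos_res.isdisjoint(neg_res):
            hits.append((i + 1, sorted(pos_res), sorted(neg_res)))
    return hits
```

For example, given the hypothetical aligned segments `AKEHLG`, `AREHLG`, and `SKEHLG` as stimulatory and `AKDHLG` and `SKDHLA` as nonstimulatory, only position 3 (Glu versus Asp) is returned. As noted above, such a position localizes the region of the site but says nothing about its boundaries, and a substitution acting at long range through antigen processing would be flagged in the same way.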

A second approach is to use short peptide segments of the protein sequence, taking advantage of the fact that T cells specific for soluble protein antigens appear to see only segmental antigenic sites, not assembled topographic ones.¹²⁷^,¹⁷⁶^,¹⁷⁷^,¹⁷⁸^,¹⁷⁹^,¹⁸⁰ These may be produced by chemical or enzymatic cleavage of the natural protein,¹⁷⁸^,¹⁷⁹^,¹⁸⁰^,¹⁸¹^,¹⁸²^,¹⁸³^,¹⁸⁴^,¹⁸⁵^,¹⁸⁶ solid-phase peptide synthesis,¹⁸⁵^,¹⁸⁷^,¹⁸⁸^,¹⁸⁹^,¹⁹⁰ or recombinant DNA expression of cloned genes or gene fragments.¹⁹¹ In the case of class I MHC molecule-restricted cytotoxic T cells, viral gene deletion mutants expressing only part of the gene product have also been used.¹⁹²^,¹⁹³^,¹⁹⁴

In the case of myoglobin-specific T cells, mapping of an epitope to residue Glu 109 was confirmed by use of a synthetic peptide 102 to 118, which stimulated the T cells.¹⁸⁹^,¹⁹⁵ The T cells elicited by a myoglobin with either Glu or Asp 109 could readily distinguish between synthetic peptides containing Glu or Asp at this position. Similar results were obtained with cytochrome c, where the predominant site recognized by T cells was localized with sequence variants to the region around residue 100 at the carboxyl end of cytochrome.¹⁷⁴ Furthermore, the response to cytochrome c peptide 81 to 104 was as great as the response to the whole molecule. This indicated that a 24 amino acid peptide contained an entire antigenic site recognized by T cells. The T cells could distinguish between synthetic peptides with Lys or Gln at position 99, although both were immunogenic with the same MHC molecule.¹⁹⁶^,¹⁹⁷^,¹⁹⁸ This residue determined T-cell memory and specificity, and so presumably was interacting with the T-cell receptor (TCR). A similar conclusion could be drawn for residue 109 of myoglobin. However, this type of analysis must be used with caution. When multiple substitutions at position 109 were examined for T-cell recognition and MHC binding, residue 109 was found to affect both functions.¹⁹⁹ The ultimate use of synthetic peptides to analyze the segmental sites of a protein that are recognized by T cells was to synthesize a complete set of peptides, each staggered by just one amino acid from the previous peptide, corresponding to the entire sequence of hen egg lysozyme.²⁰⁰ Around each immunodominant site, a cluster of several stimulatory peptides was found. The minimum “core” sequence consisted of just those residues shared by all antigenic peptides within a cluster, whereas the full extent of sequences spanning all stimulatory peptides within the same cluster defined the “determinant envelope.” These two ways of defining an antigenic site differ, and one interpretation is that each core sequence corresponds to an MHC-binding site, whereas the determinant envelope includes the many ways for T cells to recognize the same peptide bound to the MHC.
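The two definitions can be made concrete with a short sketch. It assumes that each stimulatory peptide in a cluster is described by its first and last residue numbers (1-based, inclusive). The function name and the peptide coordinates in the usage note are hypothetical and are not the published lysozyme data.

```python
def core_and_envelope(stimulatory_peptides):
    """Return (core, envelope) for one cluster of stimulatory peptides.

    stimulatory_peptides: list of (start, end) residue numbers, inclusive.
    core: residues shared by every peptide, or None if nothing is shared.
    envelope: full span covered by all peptides in the cluster.
    """
    if not stimulatory_peptides:
        raise ValueError("cluster must contain at least one peptide")
    starts = [start for start, _ in stimulatory_peptides]
    ends = [end for _, end in stimulatory_peptides]

    # Envelope: from the earliest start to the latest end.
    envelope = (min(starts), max(ends))

    # Core: from the latest start to the earliest end, if they still overlap.
    core_start, core_end = max(starts), min(ends)
    core = (core_start, core_end) if core_start <= core_end else None
    return core, envelope
```

For a hypothetical cluster of 15-residue peptides starting at residues 46 through 50, `core_and_envelope([(46, 60), (47, 61), (48, 62), (49, 63), (50, 64)])` gives a core of 50 to 60 and an envelope of 46 to 64. The core shrinks as more stimulatory peptides are added to a cluster, whereas the envelope can only grow.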

In each case, the polyclonal T-cell response could be mapped to a single predominant antigenic site. These results are consistent with the idea that each protein antigen has a limited number of immunodominant sites (possibly one) recognized by T cells in association with MHC molecules of the high-responder type. If none of the antigenic sites could associate with MHC molecules on the APCs, then the strain would be a low responder, and the antigen would have little or no immunogenicity.

Monoclonal T Cells

Further progress in mapping T-cell sites depended on the analysis of cloned T-cell lines. These were either antigen-specific T-cell lines made by the method of Kimoto and Fathman²⁰¹ or T-cell hybridomas made by the method of Kappler et al.²⁰² In the former method, T cells are allowed to proliferate in response to antigen and APCs, rested, and then restimulated. After stimulation, the blasts can be cloned by limiting dilution and grown from a single cell in the presence of interleukin 2. In the second method, enriched populations of antigen-specific T cells are fused with a drug-sensitive T-cell tumor, and the fused cells are selected for their ability to grow in the presence of the drug. Then the antigen specificity of each fused cell line must be determined. The key to determining this in a tumor line is that antigen-specific stimulation of a T-cell hybridoma results in release of interleukin 2, even though proliferation is constitutive. T cells produced by either method are useful in defining epitopes, measuring their MHC associations, and studying antigen-processing requirements.

Monoclonal T cells may be useful in identifying which of the many proteins from a pathogen are important for T-cell responses. For instance, Young and Lamb²⁰³ have developed a way to screen proteins separated by SDS-polyacrylamide gel electrophoresis and blotted onto nitrocellulose for stimulation of T-cell clones and have used this to identify antigens of Mycobacterium tuberculosis.²⁰⁴ Mustafa et al.²⁰⁵ have even used T-cell clones to screen recombinant DNA expression libraries to identify relevant antigens of Mycobacterium leprae. Use of T cells to map epitopes has also been important in defining tumor antigens.²⁰⁶^,²⁰⁷^,²⁰⁸^,²⁰⁹^,²¹⁰^,²¹¹

Precise mapping of antigenic sites recognized by T cells was made possible by the fact that T cells would respond to peptide fragments of the antigen when they contain a complete antigenic determinant. A series of overlapping peptides can be used to walk along the protein sequence and find the antigenic site. Then, by truncating the peptide at either end, the minimum antigenic peptide can be determined. For example, in the case of myoglobin, a critical amino acid residue, such as Glu 109 or Lys 140, was found by comparing the sequences of stimulatory and nonstimulatory myoglobin variants and large CNBr cleavage fragments²¹² as previously discussed, and then a series of truncated peptides containing the critical residue was synthesized with different overlapping lengths at either end.¹⁸⁵^,¹⁸⁹ Because solid-phase peptide synthesis starts from a fixed carboxyl end and proceeds toward the amino end, it can be stopped at various positions to produce a nested series of peptides that vary in length at the amino end. In this way, it was found that two of the Glu 109-specific T-cell clones responded to synthetic peptides 102 to 118 and 106 to 118 but not to peptide 109 to 118.¹⁸⁹ One clone responded to peptide 108 to 118, whereas the other did not. Thus, the amino end of the peptide recognized by one clone was Ser 108, whereas the other clone required Phe 106 and/or Ile 107. Similar fine specificity differences have been observed with T-cell clones specific for the peptides 52 to 61 and 74 to 96 of hen egg lysozyme,¹⁸²^,²¹³^,²¹⁴ the peptide 323 to 339 of chicken ovalbumin,¹⁸³ and the peptide 81 to 104 of pigeon cytochrome c¹⁸⁸: The epitopes recognized by several T-cell clones overlap but are distinct. 
In addition, nine T-cell clones recognized a second T-cell determinant in myoglobin located around Lys 140, and each one responded to the CNBr cleavage fragment 132 to 153.²¹⁵ Further studies with a nested series of synthetic peptides showed that the stimulatory sequence is contained in peptide 136 to 145.¹⁸⁵
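The reasoning used to locate the amino end of a site from a nested series can be written out as a short sketch. It assumes that all peptides in the series share the same carboxyl end and that each response has been scored simply as positive or negative. The function name and the "clone A" and "clone B" labels are hypothetical, although the response patterns in the usage note follow the two Glu 109-specific clones described above.

```python
def amino_terminal_boundary(responses):
    """Infer which amino-terminal residues a clone requires.

    responses: dict mapping the first residue number of each nested peptide
    (all sharing one carboxyl end) to True (stimulates) or False (does not).
    Returns (first, last), an inclusive range of residues of which at least
    one is required, or None if the series does not bracket the boundary.
    """
    last_positive = None
    first_negative = None
    for start in sorted(responses):
        if responses[start]:
            if first_negative is not None:
                raise ValueError("a shorter peptide stimulates but a longer one does not")
            last_positive = start
        elif first_negative is None:
            first_negative = start

    if last_positive is None or first_negative is None:
        return None
    # Residues lost between the shortest stimulatory peptide and the
    # longest nonstimulatory one must include a required residue.
    return (last_positive, first_negative - 1)
```

With peptides beginning at residues 102, 106, 108, and 109 and all ending at 118, a clone that responds to the first three (clone A) yields `(108, 108)`, that is, residue 108 is required. A clone that responds only to the first two (clone B) yields `(106, 107)`, meaning residue 106 and/or 107 is required, which is as far as this series of peptides can resolve the boundary.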

These findings can be generalized to characterize a large number of epitopes recognized by T cells from a number of protein antigens (Table 23.4).²¹²^,²¹³^,²¹⁴^,²¹⁵^,²¹⁶^,²¹⁷^,²¹⁸^,²¹⁹^,²²⁰^,²²¹^,²²²^,²²³^,²²⁴^,²²⁵^,²²⁶^,²²⁷ What these studies and others demonstrated about epitopes recognized by T cells is that in each case, the entire site is contained on a short peptide. MHC class I-restricted antigens also follow this rule,²²⁸ even when the protein antigen is normally expressed on the surface of infected cells. This applies to viral glycoproteins, such as influenza hemagglutinin, which are recognized by cytolytic T cells after antigen processing²²⁹ (see section on “Antigen Processing”). These peptides consist of no more than about 12 to 17 amino acid residues for class II MHC or 8 to 10 residues for class I. Within this size, they must contain all the information necessary to survive processing within the APC, associate with the MHC antigen, and bind to the TCR, as discussed in the following sections.

TABLE 23.4 Examples of Immunodominant T-Cell Epitopes Recognized in Association with Class II MHC Molecules

Protein	T-Cell Antigenic Sites and Reference	Amphipathic Segments
Sperm whale myoglobin	69-78¹⁴⁸	64-78
	102-118¹⁵⁹	99-117
	132-145¹⁵⁵	128-145
Pigeon cytochrome c	93-104¹⁵⁸	92-103
Beef cytochrome c	11-25¹⁹²	9-29
	66-80¹⁹³	58-78
Influenza hemagglutinin A/PR/8/34	109-119¹⁸⁶	97-120
	130-140¹⁸⁷	—
	302-313¹⁸⁷,¹⁸⁸	291-314
Pork insulin	B 5-16¹⁵⁷	4-16
	A 4-14¹⁸⁹	1-21
Chicken lysozyme	46-61¹⁸⁵	—
	74-86¹⁸⁴	72-86
	81-96¹⁷⁵	86-102
	109-119¹⁴⁵	—
Chicken ovalbumin	323-339¹⁵³	329-346
Foot and mouth virus VP1	141-160¹⁹¹	148-165
Hepatitis B virus pre-S	120-132¹⁹⁰	121-135
Hepatitis B virus major surface antigen	38-52¹⁹⁴	36-49
	95-109¹⁹⁴	—
	140-154¹⁹⁴	—
λ Repressor protein cI	12-26¹⁹⁵	8-25
Rabies virus spike glycoprotein precursor	32-44¹⁹⁶	29-46
Adapted with permission from Schwartz et al.¹⁸⁸

Sequential Steps that Focus the T-Cell Response on Immunodominant Determinants

In contrast to antibodies that bind all over the surface of a native protein⁵⁰ (see “Protein and Polypeptide Antigenic Determinants” section), it has been observed that T cells elicited by immunization with the native protein tend to be focused on one or a few immunodominant sites.²³⁰,²³¹,²³² This is true whether one deals with model mammalian or avian proteins such as cytochrome c,¹⁷⁹ myoglobin,¹⁷⁸,¹⁸⁰ lysozyme,¹⁸²,²¹⁴,²³³,²³⁴ insulin,¹⁸⁷,²¹⁹ and ovalbumin,¹⁸³ or with bacterial, viral, and parasitic proteins from pathogens, such as influenza hemagglutinin²¹⁷ or nucleoprotein (NP),²²⁸ staphylococcal nuclease,²³⁵ or malarial CS protein.¹³⁵,²³⁶ Because the latter category of proteins shares no obvious homology to mammalian proteins, the immunodominance of a few sites cannot be attributed simply to tolerance for the rest of the protein because of homologous host proteins. Moreover, immunodominance is not simply the preemption of the response by a single clone of predominant T cells because it has been observed that immunodominant sites tend to be the focus for a polyclonal response of a number of distinct T-cell clones recognizing overlapping subsites within the antigenic site or having different sensitivities to substitutions of amino acids within the site.¹⁸²,¹⁸³,¹⁸⁸,¹⁸⁹,²⁰⁰,²¹³,²¹⁴,²³⁷

Immunodominant antigenic sites appear to be qualitatively different from other sites. For example, in the case of myoglobin, when the number of clones responding to different epitopes after immunization with native protein was quantitated by limiting dilution, it was observed that the bulk of the response to the whole protein in association with the high-responder class II MHC molecules was focused on a single site within residues 102 to 118¹⁹⁵ (Fig. 23.5). When T cells in the (high × low responder) F1 hybrid restricted to each MHC haplotype were compared, there was little difference in the responses to nondominant epitopes, and all the overall difference in magnitude of response restricted by the high versus low responder MHC could be attributed to the high response to the immunodominant determinant in the former and the complete absence of this response in the latter (see Fig. 23.5). Similar results were found for two different high-responder and two different low-responder MHC haplotypes.¹⁹⁵ Why did the response to the other sites not compensate for the lack of response to the immunodominant site in the low responders?
The greater frequency of T cells specific for the immunodominant site may in part be attributed to the large number of ways this site can be recognized by different T-cell clones, as mentioned previously, but this only pushes the problem back one level. Why is an immunodominant site the focus for so many different T-cell clones? Because the answer cannot depend on any particular T cell, it must depend on other factors primarily involved in the steps in antigen processing and presentation by MHC molecules.

FIG. 23.5. Frequency of High- and Low-Responder Major Histocompatibility Complex (MHC)-Restricted T Cells in F1 Hybrid. High responsiveness may be accounted for by the response to a single immunodominant epitope. Lymph node T cells from (low-responder [H-2^k] × high-responder [H-2^d]) F1 hybrid mice immunized with whole myoglobin were plated at different limiting dilutions in microtiter wells with either high- or low-responder presenting cells and myoglobin as antigen. The cells growing in each well were tested for responsiveness to whole myoglobin and to various peptide epitopes of myoglobin. The frequency of T cells of each specificity and MHC restriction was calculated from Poisson statistics and is plotted on the ordinate. Most of the difference in T-cell frequency between high- and low-responder restriction types (solid bars) can be accounted for by the presence of T cells responding to the immunodominant site at residues 102 to 118, accounting for more than two-thirds of the high-responder myoglobin-specific T cells, in contrast to the absence of such T cells restricted to the low-responder MHC type. (Based on the data in Kojima et al.¹⁹⁵)
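The caption states only that frequencies were calculated from Poisson statistics. A common way to do this in limiting dilution analysis is the single-hit, zero-term estimate: if responding cells are distributed among wells at random, the fraction of negative wells is F0 = exp(-f × n), where f is the frequency of responding cells and n is the number of cells plated per well. The sketch below assumes this model. The function name and the well counts are hypothetical and are not data from Kojima et al.

```python
import math


def precursor_frequency(cells_per_well, negative_wells, total_wells):
    """Zero-term Poisson estimate: F0 = exp(-f * n), so f = -ln(F0) / n."""
    if not 0 < negative_wells < total_wells:
        raise ValueError("need some negative and some positive wells")
    fraction_negative = negative_wells / total_wells
    return -math.log(fraction_negative) / cells_per_well


# Hypothetical plate: 96 wells, 10,000 cells per well, 48 wells negative.
f = precursor_frequency(10_000, negative_wells=48, total_wells=96)
print(f"about 1 responding cell per {1 / f:,.0f} cells plated")
```

In practice, several cell doses are usually plated and f is estimated from the slope of ln(F0) against n rather than from a single dose.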

It has also been observed that some peptides may be immunogenic themselves, but the T-cell response they elicit is specific only for the peptide and does not cross-react with the native protein, nor do T cells specific for the native protein recognize this site.²³⁸,²³⁹,²⁴⁰ These are called cryptic determinants.²⁴⁰ The reasons for these differences may involve the way the native protein is processed to produce fragments distinct from, but including or overlapping, the synthetic peptides used in experiments and also the competition among sites within the protein for binding to the same MHC molecules, as discussed further in the next section. To understand these factors that determine dominance or crypticity, one must understand the steps through which an antigen must go before it can stimulate a T-cell response.

Unlike antigen recognition by B cells, T-cell recognition of antigen depends on the function of another cell, the APC.²⁴¹ Antigen must pass through a number of intracellular compartments and survive processing and transport steps before it can be effectively presented to T cells. Following antigen synthesis in the cell (as in a virally infected cell) or antigen uptake via phagocytosis, pinocytosis, or, in some cases, receptor-mediated endocytosis, the subsequent steps include 1) partial degradation (“processing”) into discrete antigenic fragments that can be recognized by T cells, 2) transport of these fragments into a cellular compartment where MHC binding can occur, 3) MHC binding and assembly of a stable peptide-MHC complex, and 4) recognition of that peptide-MHC complex by the expressed T-cell repertoire. At each step, a potential antigenic determinant runs the risk of being lost from the process, for example, by excessive degradation or failure to meet the binding requirements needed for transport to the next step. Only those peptides that surmount the four selective hurdles will prove to be antigenic for T cells. We will now consider each step in detail for its contribution to the strength and specificity of the T-cell response to protein antigens.
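The stepwise selection described above behaves like a chain of filters in which failure at any one step removes a candidate determinant. The sketch below is a deliberately simplified illustration of that logic only: the four boolean fields and the example candidates are hypothetical placeholders, not measured properties of any real peptide.

```python
from dataclasses import dataclass

# The four hurdles, in the order listed in the text.
STEPS = (
    "survives_processing",       # 1) processing yields a usable fragment
    "reaches_mhc_compartment",   # 2) transport to where MHC binding occurs
    "forms_stable_mhc_complex",  # 3) MHC binding and stable complex assembly
    "recognized_by_repertoire",  # 4) an expressed T cell recognizes the complex
)


@dataclass
class Candidate:
    name: str
    survives_processing: bool
    reaches_mhc_compartment: bool
    forms_stable_mhc_complex: bool
    recognized_by_repertoire: bool


def first_failed_step(candidate):
    """Return the name of the first hurdle not cleared, or None if all are."""
    for step in STEPS:
        if not getattr(candidate, step):
            return step
    return None


# Hypothetical candidates, for illustration only.
candidates = [
    Candidate("site A", True, True, True, True),
    Candidate("site B", True, True, False, True),  # lost at MHC binding
    Candidate("site C", False, True, True, True),  # lost during processing
]

antigenic = [c.name for c in candidates if first_failed_step(c) is None]
print(antigenic)  # ['site A']
```

Reporting the first failed step, rather than a simple pass or fail, mirrors the question asked in the following sections: at which stage a given determinant is lost.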

Antigen Processing

Influence of Antigen Processing on the Expressed T-Cell Repertoire. Several lines of evidence indicate that antigen processing plays a critical role in determining which potential antigenic sites are recognized and, therefore, what part of the potential T-cell repertoire is expressed upon immunization with a protein antigen. Because the T cell does not see the native antigen but only the products of antigen processing, it is not unreasonable that the nature of these products would at least partly determine which potential epitopes could be recognized by T cells.

One line of evidence that processing plays a major role in T-cell repertoire expression came from comparisons that were made of the immunogenicity of peptide versus native molecule in the cases of myoglobin²³⁸ or lysozyme.²³⁹ In the case of myoglobin, a site of equine myoglobin (residues 102 to 118) that did not elicit a response when H-2^k mice were immunized with native myoglobin nevertheless was found to be immunogenic when such mice were immunized with the peptide.²³⁸ Thus, the low responsiveness to this site in mice immunized with the native myoglobin was not due to either of the classical mechanisms of Ir gene defects—namely, a hole in the T-cell repertoire or a failure of the site to interact with MHC molecules of that strain. However, the peptide-immune T cells responded only poorly to native equine myoglobin in vitro. Thus, the peptide and the native molecule did not cross-react well in either direction. The problem was not simply a failure to process the native molecule to produce this epitope because (H-2^k × H-2^s) F1-presenting cells could present this epitope to H-2^s T cells when given native myoglobin but could not present it to H-2^k T cells. Also, because the same results applied to individual T-cell clones, the failure to respond to the native molecule was apparently not due to suppressor cells induced by the native molecule. Similar observations were made for the response to the peptide 74 to 96 of hen lysozyme in B10.A mice.²³⁹ The peptide, not the native molecule, induced T cells specific for this site, and these T cells did not cross-react with the native molecule. With these alternative mechanisms excluded, we are left with the conclusion that an appropriate peptide was produced, but it differed from the synthetic peptide in such a way that a hindering site outside the minimal antigenic site interfered with presentation by presenting cells of certain MHC types. 
Further evidence consistent with this mechanism came from the work of Shastri et al.,²⁴² who found that different epitopes within the 74 to 96 region of lysozyme were immunodominant in H-2^b mice when different forms of the immunogen were used.

Another line of evidence came from fine specificity studies of individual T-cell clones. Shastri et al.²⁴³ observed that H-2^b T-cell clones specific for hen lysozyme were about 100-fold more sensitive to ring-necked pheasant lysozyme than to hen lysozyme. Nevertheless, they were equally sensitive to the CNBr cleavage fragments containing the antigenic sites from both lysozymes. Thus, regions outside the minimal antigenic site removable by CNBr cleavage presumably interfered with processing, presentation, or recognition of the corresponding site in hen lysozyme. Similarly, it was observed that a T-cell clone specific for sperm whale myoglobin, and not equine myoglobin, responded equally well to the minimal epitope synthetic peptides from the two species.²³⁸ Here too, residues outside the actual site must be distinguishing equine from sperm whale myoglobin. Experiments using F1-presenting cells that can clearly produce this epitope for presentation to other T cells proved that the problem was not a failure to produce the appropriate fragment from hen lysozyme²³⁹ or equine myoglobin.²³⁸ Thus, these cases provide evidence that a structure outside the minimal site can hinder presentation in association with a particular MHC molecule.

Such a hindering structure was elegantly identified in a study by Grewal et al.²⁴⁴ comparing hen egg lysozyme peptides presented by strains C57BL/6 and C3H.SW that share H-2^b but differ in non-MHC genes. After immunization with whole lysozyme, a strong T-cell response was seen to peptide 46 to 61 in C3H.SW mice but not at all in C57BL/6 mice. Because the F1 hybrids of these two strains responded, the lack of response in one strain was not due to a hole in the T-cell repertoire produced by self-tolerance. It was found that peptide 46 to 60 bound directly to the I-A^b class II MHC molecule, whereas peptide 46 to 61 did not, indicating that the C-terminal Arg at position 61 hindered binding. Evidently, a non-MHC-linked difference in antigen processing allowed this Arg to be cleaved off the 46 to 61 peptide in C3H.SW mice, in which the peptide was dominant, but not in C57BL/6 mice, in which the peptide was cryptic.

Even a small peptide that does not need processing may nevertheless be processed, and that processing may affect its interaction with MHC molecules. Fox et al.²⁴⁵ found that substitution of a tyrosine for isoleucine at position 95 of cytochrome c peptide 93 to 103 enhanced presentation with Eβ^b but diminished presentation with Eβ^k when live APCs were used but not when the APCs were fixed and could not process antigen. Therefore, the tyrosine residue was not directly interacting with the different MHC molecule but was affecting the way the peptide was processed, which in turn affected MHC interaction.

Besides the mechanisms suggested previously, Gammon et al.²³⁹ and Sercarz et al.²⁴⁶ have proposed the possibility of competition between different MHC-binding structures (“agretopes”) within the same processed fragment. If a partially unfolded fragment first binds to MHC by one such site already exposed, further processing may stop, and other potential binding sites for MHC may never become accessible for binding. Such competition could also occur between different MHC molecules on the same presenting cell.²³⁹ For instance, BALB/c mice, expressing both A^d and E^d, produce a response to hen lysozyme specific for 108 to 120, not for 13 to 35,²³⁹ and this response is restricted to E^d. However, B10.GD mice that express only A^d respond well to 13 to 35 when immunized with lysozyme. The BALB/c mice clearly express an A^d molecule, so the failure to present this 13 to 35 epitope may be due to competition from E^d, which may preempt by binding the 108 to 120 site with higher affinity and preventing the 13 to 35 site from binding to A^d. Competition between different peptides binding to the same MHC molecule could also occur.