Roughly 30% of patients with chronic lymphocytic leukemia (CLL) carry immunoglobulin receptors with highly similar primary sequences. Highly similar, quasi-identical immunoglobulins are termed stereotyped. Patients with CLL can be assigned to different subsets expressing different types of stereotyped immunoglobulin receptors. Reliable identification of stereotypy may assist in the molecular classification of CLL and thus better-guided, compartmentalized research. In several major subsets, stereotypy extends from shared primary sequences to shared clinicobiological features and outcome. Reliable identification of stereotypy in CLL may pave the way for tailored treatment strategies applicable to each major stereotyped subset.
Key points
- Roughly 30% of patients with chronic lymphocytic leukemia (CLL) carry immunoglobulin receptors with highly similar primary sequences.
- Highly similar, quasi-identical immunoglobulins are termed stereotyped.
- Patients with CLL can be assigned to different subsets expressing different types of stereotyped immunoglobulin receptors.
- Just a few subsets account for more than 10% of the entire cohort.
- Reliable identification of stereotypy may assist in the molecular classification of CLL and thus better-guided, compartmentalized research.
- In several major subsets, stereotypy extends from shared primary sequences to shared clinicobiological features and outcome, even beyond IGHV gene mutational status.
- Reliable identification of stereotypy in CLL may pave the way for tailored treatment strategies applicable to each major stereotyped subset.
Introduction: a primer on mechanisms of immunoglobulin diversity
Every B cell expresses multiple identical copies of a single immunoglobulin (IG) on its surface. The IGs, together with accessory proteins, constitute the surface complexes known as B-cell receptors (BcRs), which are critical for specific antigen recognition. Each IG molecule is a tetramer composed of 2 identical heavy chains (HCs) and 2 identical light chains (LCs), either κ or λ, each subdivided into a variable (V) and constant (C) domain. The V domain is the part of the molecule that binds antigen, whereas the C domain determines the isotype of the molecule and has effector function. Each V domain comprises 4 areas of limited diversity, known as the framework regions (FRs), interspersed with 3 regions of high variability, known as the complementarity determining regions (CDRs), which confer on the IG molecule its unique specificity.
The V domain of the IG HC of each B cell is generated somatically by the recombinatorial process and joining of distinct variable (IGHV), diversity (IGHD) and joining (IGHJ) genes at the IGH locus. The V domain of IG LCs is generated in a similar fashion; however, the joining involves only variable (IGKV or IGLV, for κ or λ light chains, respectively) and joining (IGKJ or IGLJ) genes at the IGK and IGL loci. This recombinatorial assembly of IG HC and LC variable domains is known as V(D)J recombination.
V(D)J recombination is the basis of BcR IG diversity. The random assembly of one each of multiple distinct V, D (for IG HCs), and J genes leads to a variety of combinations (combinatorial diversity). IG diversity increases significantly during V(D)J recombination through the trimming of nucleotides from the ends of the recombining genes and the insertion of random (nontemplated) nucleotides at the V-D, D-J, or V-J junctions located within the CDR3, the most diverse part of the V domain (junctional diversity). It has been estimated that the combinatorial events of the IGH, IGK, and IGL loci create more than 1.6 × 10⁶ possible combinations for BcR IGs.
When the B cell encounters antigen to which the BcR adequately binds, affinity maturation of the IG occurs in specialized structures of the secondary lymphoid organs. During this antigen-dependent phase of B-cell differentiation, diversity is increased exponentially as a result of somatic hypermutation (SHM) and class-switch recombination (CSR). SHM is characterized by the introduction of mutations within recombined genes, which increases IG diversity and produces IGs with higher specificity. CSR replaces the constant (IGHC) gene to be expressed from IGHM to IGHG or IGHE or IGHA, switching the IG HC isotype without, however, changing antigen specificity. SHM and CSR have been estimated to increase the potential for diversity 10³-fold to 10⁶-fold. Hence, altogether, the B-cell repertoire comprises, in principle, 10¹² different antigen specificities. The theoretic probability that two independent B-cell clones might carry exactly the same BcR IG by chance alone is virtually negligible (10⁻¹²).
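The arithmetic behind these estimates can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it multiplies the figures quoted above and, as a simplifying assumption that is not made in the text, treats every receptor as equally likely and every clone as independent. The cohort size of 10,000 clones is hypothetical.

```python
import math

# Figures quoted in the text (orders of magnitude, not measurements).
COMBINATORIAL = 1.6e6        # V(D)J combinations across the IGH, IGK, and IGL loci
SHM_CSR_FOLD = (1e3, 1e6)    # fold increase attributed to SHM and CSR


def repertoire_range(base=COMBINATORIAL, fold=SHM_CSR_FOLD):
    """Return (low, high) estimates of the potential repertoire size."""
    return base * fold[0], base * fold[1]


def expected_identical_pairs(n_clones, repertoire_size=1e12):
    """Expected number of clone pairs sharing one receptor by chance.

    Assumes all receptors are equally likely and clones are independent,
    which is a simplification made only for this illustration.
    """
    return math.comb(n_clones, 2) / repertoire_size


low, high = repertoire_range()
print(f"potential repertoire: {low:.1e} to {high:.1e}")

# Hypothetical cohort of 10,000 unrelated clones.
pairs = expected_identical_pairs(10_000)
print(f"expected identical pairs by chance: {pairs:.1e}")
```

The upper bound lands at roughly 10¹², matching the figure in the text. Under the stated assumptions, even 10,000 unrelated clones would be expected to yield far fewer than one identical pair by chance (on the order of 10⁻⁵).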
BcR stereotypy in CLL: how it all started
Studies from the early 1990s offered the first hints for restrictions in the IG heavy variable (IGHV) gene repertoire of CLL. Unrelated CLL cases were reported to carry distinctive VH CDR3 characterized by shared amino acid motifs. A milestone in the immunogenetic study of CLL came in 1998, when it was convincingly shown that the IGHV gene repertoire of CLL is restricted, with certain genes, such as IGHV1-69, IGHV4-34, and IGHV3-7, overrepresented in CLL compared with normal IgM+ B cells. Specific associations were identified between certain IGHV genes with certain IGHD and IGHJ genes, as shown by rearrangements using the IGHV1-69 gene, which were frequently recombined with the IGHD3-3 and IGHJ6 genes. SHM was not uniform among rearrangements of IGHV genes: for example, the IGHV1-69 gene carried few or no mutations as opposed to other genes (eg, IGHV3-7, IGHV3-23, and IGHV4-34), which bore a significant SHM load. On these grounds, it was proposed that selection by antigen is implicated in CLL ontogeny.
Another milestone followed soon. Less than a year later, 2 groups independently reported that the mutational status of the rearranged IGHV genes strongly correlated with patient survival. In particular, patients carrying mutated IGHV genes were reported to follow a more indolent course than those with unmutated IGHV genes, who tend to show adverse cytogenetic profiles and follow aggressive disease courses characterized by clonal evolution and resistance to therapy. All studies since have corroborated the general rule: IGHV-unmutated = bad prognosis, IGHV-mutated = good prognosis.
So far, the rule has had a single exception. Since the initial report in 2002, accumulating evidence has suggested that usage of the IGHV3-21 gene in CLL BcR IGs may represent an adverse prognostic factor, regardless of the SHM status. CLL using the IGHV3-21 gene is also notable because many cases express distinctive IGs with highly similar VH CDR3s and biased association with λ LCs using the IGLV3-21 gene. Because this situation could not happen by chance alone, it was justifiably considered as evidence for common antigenic drive, perhaps of pathogenic significance.
Stereotyped BcRs in CLL: concepts, definitions, and strategies for identification
Soon thereafter, as large datasets of IG sequences became available, groups from both Europe and the United States reported subsets of CLL cases with highly similar BcR IGs. Hence, rather than an interesting curiosity, it was established beyond doubt that distinct prototypic BcR IGs existed in CLL, repeated with limited or no variation in different subsets of patients with either mutated or unmutated IGHV genes. This remarkable restriction, at odds with the logistics of IG synthesis, strongly supports an antigen-driven pathway to CLL development.
Quasi-identical BcR IGs in unrelated CLL cases conform to a fixed or general pattern, thus amply fulfilling the definition of stereotype. After almost a decade of intensive research, it is now established that different subsets of cases with distinct stereotyped VH CDR3 sequences within their BcRs collectively account for almost one-third of the CLL repertoire. Furthermore, just a few major stereotyped subsets represent a substantial proportion of the entire cohort (Fig. 1), showing CLL-biased and often highly distinctive clinicobiological features and even outcome (see later discussion). Therefore, the detection of BcR IG stereotypy may assist in the molecular classification of CLL into different categories, potentially paving the way for tailored treatment strategies applicable to each major stereotyped subset.
This brings us to a fundamental question. How can BcR IG stereotypes be identified? The first set of criteria for the identification of stereotypy required potentially stereotyped BcR IGs to use the same IGHV, IGHD, and IGHJ germline genes, use the same IGHD gene reading frame, and show VH CDR3 amino acid identity equal to or greater than 60% (Table 1). However, the first criterion could not be met in all instances, because cases with overall high BcR IG similarity (ie, stereotyped) were found to use different IGHV genes. Hence, revised criteria were proposed that (1) put emphasis on functional conservation of amino acids in case of sequence differences; and (2) allowed the usage of different IGHV genes (see Table 1).
Criterion | Messmer et al, 2004 | Stamatopoulos et al, 2007 | Darzentas et al, 2010 | Agathangelidis et al, 2012 |
---|---|---|---|---|
VH CDR3 amino acid identity (%) | ≥60 | ≥60 | ≥50 | ≥50 |
VH CDR3 amino acid similarity (%) | n/a | n/a | ≥70 | ≥70 |
IGHV genes | Same | Any | Any | Same clan |
VH CDR3 length difference | ? | ≤3 | ≤2 | =0 |
Offset of the pattern | n/a | n/a | ≤2 | =0 |
IGHD gene reading frame | Same | n/a | n/a | n/a |
The validity of this approach was made evident by a set of cases now defined as subset 1 that are characterized by usage of the IGHD6-19 and IGHJ4 genes in association with different IGHV genes (namely IGHV1-2, IGHV1-3, IGHV1-18, IGHV1-8, IGHV5-a, and IGHV7-4-1). The IGHV genes used in rearrangements assigned to subset 1 are all members of phylogenetic clan I, which means that their germline sequences are closely and evolutionarily related, thus most probably producing overall similar VH domains when recombining with identical IGHD and IGHJ genes.
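The difference between the original and the revised criteria can be sketched as two pairwise checks in Python. This is an illustration only, not the published procedure: the `Rearrangement` type, both function names, the reading-frame numbers, and the two CDR3 strings are hypothetical. Identity is computed position by position over the longer sequence, which is an assumption, and the emphasis on functional conservation of amino acids is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rearrangement:
    """Hypothetical record for one IG heavy chain rearrangement."""
    ighv: str
    ighd: str
    ighj: str
    ighd_frame: int   # IGHD reading frame (illustrative numbering)
    cdr3: str         # VH CDR3 amino acid sequence


def percent_identity(a: str, b: str) -> float:
    """Position-wise identity, as a percentage of the longer sequence.

    No gaps or offsets are considered; this is a simplifying assumption.
    """
    if not a or not b:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / max(len(a), len(b))


def stereotyped_first_criteria(r1, r2, min_identity=60.0):
    """Same IGHV/IGHD/IGHJ genes, same IGHD reading frame, identity >= 60%."""
    return (
        r1.ighv == r2.ighv
        and r1.ighd == r2.ighd
        and r1.ighj == r2.ighj
        and r1.ighd_frame == r2.ighd_frame
        and percent_identity(r1.cdr3, r2.cdr3) >= min_identity
    )


def stereotyped_revised_criteria(r1, r2, min_identity=60.0, max_len_diff=3):
    """Any IGHV gene, CDR3 length difference <= 3, identity >= 60%."""
    return (
        abs(len(r1.cdr3) - len(r2.cdr3)) <= max_len_diff
        and percent_identity(r1.cdr3, r2.cdr3) >= min_identity
    )


# Made-up sequences: same IGHD/IGHJ genes, different clan I IGHV genes.
case_a = Rearrangement("IGHV1-2", "IGHD6-19", "IGHJ4", 3, "ARDQWLVLGYFDY")
case_b = Rearrangement("IGHV1-3", "IGHD6-19", "IGHJ4", 3, "ARDQWLVLNYFDY")

print(stereotyped_first_criteria(case_a, case_b))
print(stereotyped_revised_criteria(case_a, case_b))
```

In this toy pair the two CDR3s differ at a single position, yet the first criteria reject the pair because the IGHV genes differ, whereas the revised criteria accept it. This mirrors the situation described for subset 1.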
With the accumulation of IG sequence data for thousands of patients with CLL, these relatively straightforward approaches for identifying BcR IG stereotypy, essentially based on multiple sequence alignment, proved to have serious limitations with regard to efficiency, accuracy, and sensitivity. For this reason, purpose-built bioinformatics methods were developed based on de novo sequence pattern discovery and clustering, enabling sophisticated identification of VH CDR3 sequence similarities regardless of the IGHV/IGHD/IGHJ genes used (see Table 1).
A major advantage of this novel approach concerned the ability to document more distant relationships between sequences, thus forming a fuzzy treelike scheme, with first-level (ground-level) clusters merging, under certain conditions, into higher-level clusters with more relaxed membership criteria and more members (Fig. 2). More recently, realizing that the phylogenetic relatedness of IGHV genes can be reflected in the gene composition of subsets, we refined our criteria for BcR IG stereotypy and now require that only sequences carrying IGHV genes of the same clan can be assigned to the same subset. In addition, more stringent criteria were adopted related, indirectly, to the three-dimensional structure of the BcR, including the requirement for identical VH CDR3 lengths and identical locations of the shared patterns within the VH CDR3 of connected sequences (see Table 1).
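As a rough illustration of the most recent criteria in Table 1, the following Python sketch links two sequences when they share an IGHV clan, have identical VH CDR3 lengths, and reach at least 50% identity and 70% similarity, and then groups linked sequences into subsets. It is not the purpose-built pattern discovery method described above. The amino acid similarity groups are an assumed, simplified physicochemical grouping, the clan II and III assignments follow the commonly used IGHV clan classification rather than this text, the single-level grouping by connected components ignores the higher-level clusters, and all sequences are made up.

```python
import re
from itertools import combinations

# IGHV subgroup -> clan. Clan I membership is as described in the text;
# clans II and III follow the commonly used classification (assumption).
CLANS = {"1": "I", "5": "I", "7": "I", "2": "II", "4": "II", "6": "II", "3": "III"}

# Assumed similarity groups, for illustration only.
GROUPS = ["AVLIMC", "FWY", "ST", "NQ", "DE", "KRH", "G", "P"]
GROUP_OF = {aa: i for i, group in enumerate(GROUPS) for aa in group}


def clan(ighv):
    """Return the clan of an IGHV gene name such as 'IGHV1-69', or None."""
    m = re.match(r"IGHV(\d+)", ighv)
    return CLANS.get(m.group(1)) if m else None


def related(s1, s2, min_identity=50.0, min_similarity=70.0):
    """Pairwise check: same clan, same CDR3 length, identity and similarity.

    Residues are compared at the same positions, so the shared pattern
    has offset 0 by construction.
    """
    (v1, c1), (v2, c2) = s1, s2
    if not c1 or len(c1) != len(c2):
        return False
    if clan(v1) is None or clan(v1) != clan(v2):
        return False
    n = len(c1)
    identical = sum(a == b for a, b in zip(c1, c2))
    similar = sum(
        a == b or (a in GROUP_OF and GROUP_OF[a] == GROUP_OF.get(b))
        for a, b in zip(c1, c2)
    )
    return 100 * identical / n >= min_identity and 100 * similar / n >= min_similarity


def subsets(seqs):
    """Group indices of connected sequences (union-find), dropping singletons."""
    parent = list(range(len(seqs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(len(seqs)), 2):
        if related(seqs[i], seqs[j]):
            parent[find(i)] = find(j)
    groups = {}
    for i in range(len(seqs)):
        groups.setdefault(find(i), []).append(i)
    return sorted(g for g in groups.values() if len(g) > 1)


# Made-up (IGHV gene, VH CDR3) pairs.
seqs = [
    ("IGHV1-2", "ARDQWLVLGYFDY"),
    ("IGHV1-3", "ARDQWLVLNYFDY"),
    ("IGHV5-a", "ARDQWLALNYFDY"),
    ("IGHV3-21", "ARDQWLVLGYFDY"),   # same CDR3 as the first, but clan III
    ("IGHV4-34", "ARGYSSSWYFDL"),
]
print(subsets(seqs))
```

With these toy inputs the three clan I sequences form one subset. The IGHV3-21 sequence stays unassigned despite carrying a CDR3 identical to the first sequence, because the clan requirement takes precedence over CDR3 similarity.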
Clues to the ontogeny of CLL from the analysis of stereotyped BcRs
CLL cells frequently express polyreactive and self-reactive antibodies with a reactivity profile similar to that of natural antibodies. Natural antibodies, mainly produced from B1 and marginal zone (MZ) B cells, constitute a first line of defense against invading pathogens. They are multireactive and tend to recognize nonprotein antigens that are widespread in diverse pathogenic and commensal organisms and are occasionally also expressed by the host; therefore, many, if not all, natural antibodies may be both autoreactive (against self) and alloreactive (against antigens expressed by pathogens). These attributes of natural antibodies may explain why B1 and MZ B cells show restricted IG gene usage and limited junctional diversity, thus ensuring the formation of archetypical BcR IGs, perhaps selected over evolutionary time for showing broad antigen reactivities.
On these grounds, it is relevant that archetypical BcR IGs with widely shared sequence features characterize CLL with stereotyped BcRs. Within the stereotyped fraction of CLL, a few genes (namely, IGHV1-69, IGHV1-3, IGHV1-2, IGHV3-21, IGHV4-34, and IGHV4-39) account for more than 75% of cases. This finding indicates that (1) the IG gene repertoire restrictions reported as typical for CLL are essentially a feature of cases expressing stereotyped BcR IGs, whereas non-stereotyped CLL cases express a more diverse IGHV gene repertoire; and (2) certain germline specificities may be selected for in the progenitors of at least some subsets of CLL cases.
A notable feature of several CLL subsets with stereotyped BcR IGs concerns the usage of different yet phylogenetically related IGHV genes, including subset 1 (IGHV genes belonging to clan I) as well as other examples (eg, subsets 12 [IGHV1-2 and IGHV1-46], 59 [IGHV1-58 and IGHV1-69], and 77 [IGHV4-4 and IGHV4-59]), the latter including cases with mutated IGs. This finding is similar to what has been described for recombinant monoclonal antibodies (mAbs) with similar reactivities using CDR shuffling approaches and, recently, for potent broadly neutralizing CD4-binding site anti-human immunodeficiency virus antibodies that mimic binding to CD4.
An animal model that reproducibly replicates abnormalities typical of CLL in the human is still lacking. However, CD5+ B-cell lymphoproliferative disorders similar to human CLL develop in mice manipulated in several different ways. The leukemic BcR IGs in the mice show several molecular features reminiscent of the human disease, including BcR stereotypy. As recently shown by comparison of all available VH CDR3s of CD5+ lymphoproliferations from CLL mouse models, 44% (29/66) of all cases could be assigned to 8 subsets of quasi-identical VH CDR3s. In TCL1 transgenic mice, one of the best characterized of CLL animal models, antigen-binding studies have confirmed that selected TCL1 clones were polyreactive or autoreactive, binding to a glycerophospholipid (PtC based on reactivity with Br-treated red blood cells), a lipoprotein (low-density lipoprotein), or polysaccharides (Fucα1-linkages). On these grounds, it has been suggested that the TCL1 clones likely derive from the B-1a subset, consistent with finding the initial, preleukemic clonal expansions in the peritoneal cavity.
Altogether, asymmetries in both the usage of IG genes and the molecular features of the antigen-binding site in CLL with stereotyped BcR IGs strongly recall B-1 cells, which show a distinct and considerably more biased IG repertoire than conventional B cells, with frequent occurrence of identical IG HC and LC rearrangements. The repertoire and reactivity pattern of B-1 cells are stable within each species and even across species, likely reflecting evolutionary pressure for maintaining broadly protective specificities against exogenous (ie, microbial pathogens) and endogenous (ie, apoptotic debris) risks. Whether and how these observations are relevant for CLL ontogeny is unknown; however, the analogies to human disease are tantalizing and difficult to ignore.