The Molecular Basis of α-Thalassemia: A Model for Understanding Human Molecular Genetics




Down-regulation of α-globin synthesis causes α-thalassemia with underproduction of fetal (HbF, α2γ2) and adult (HbA, α2β2) hemoglobin. This article focuses on the human α-globin cluster, which has been characterized in great depth over the past 30 years. In particular the authors describe how the α genes are normally switched on during erythropoiesis and switched off as hematopoietic stem cells commit to nonerythroid lineages. In addition, the principles by which α-globin expression may be perturbed by natural mutations that cause α-thalassemia are reviewed.


Over the past 50 years, our understanding of the normal production of hemoglobin and how errors in this process give rise to α- and β-thalassemia has been at the center of many developments that have established the principles underlying normal globin gene regulation and how this may be perturbed by natural mutations. This field has advanced as a 2-way process; improved understanding of normal globin gene expression has guided the search for mutations, and the identification of genetic variation associated with specific red cell phenotypes has often provided unexpected insights into the normal mechanisms underlying globin gene regulation. Although the globin genes are often considered as a “special case” with respect to our understanding of gene regulation, on the contrary, there is very little that we have learnt from this system that has not been of general relevance to human molecular genetics.


This review concentrates on the human α-globin cluster, which has been characterized in great depth over the past 30 years. In particular the authors describe how the α genes are normally switched on during erythropoiesis and switched off as hematopoietic stem cells commit to nonerythroid lineages. In addition, the principles by which α-globin expression may be perturbed by natural mutations that cause α-thalassemia are reviewed.


Down-regulation of α-globin synthesis causes α-thalassemia with underproduction of fetal (HbF, α2γ2) and adult (HbA, α2β2) hemoglobin. The considerable diversity of mutations that have been identified is explained by the fact that carriers for α-thalassemia are (to some extent) protected from falciparum malaria in a manner that we still do not fully understand (reviewed in Ref. ). Consequently, in tropical and subtropical regions of the world (where malaria is endemic) α-thalassemia reaches very high frequencies, making it one of the most common of all human genetic disease traits. Amongst people originating from these areas, compound heterozygotes and homozygotes for some mutations may have severe hematologic phenotypes. When α-globin synthesis is reduced to around 25% or less, patients may suffer from a moderately severe hemolytic anemia associated with readily detectable excess β-globin chains in the form of β4 tetramers (referred to as HbH); hence the condition is called HbH disease. When α-globin synthesis is even further reduced (or even abolished), this gives rise to a severe intrauterine anemia associated with excess γ-globin chains in the form of γ4 tetramers (referred to as Hb Bart’s); hence this condition is referred to as the Hb Bart’s hydrops fetalis syndrome (BHFS).


The high frequency of α-thalassemia trait, HbH disease, and Hb Bart’s hydrops fetalis means that tens of thousands of patient samples have been analyzed and that probably most of the natural mutations affecting these genes are now known. However, when all known mutations have been excluded, new genetic variants, often rare but informative, are still being found. Nevertheless, in most instances accurate genetic counseling and (when appropriate) prenatal testing can be made available in countries where the health care infrastructure is able to support this.


In clinical practice α-thalassemia is much less of a problem than β-thalassemia. When genetic counseling is available, most families choose to avoid continuing pregnancies in which the fetus has the BHFS, although rare infants with this condition survive and have been treated with lifelong blood transfusion or bone marrow transplantation. Most patients with HbH disease can lead a relatively normal life, blood transfusion and iron chelation being rarely required. In fact a major clinical interest in α-thalassemia comes from its interaction with β-thalassemia. In some circumstances, the coinheritance of α-thalassemia appears to have a significant impact on the clinical phenotype of compound heterozygotes and homozygotes for β-thalassemia (reviewed in Refs. ), presumably by reducing the production of free α-globin chains (the main cause of red cell damage in β-thalassemia). Understanding in detail how expression of the α-globin genes is normally regulated may therefore identify pathways by which α-globin synthesis could be down-regulated and thereby ameliorate the phenotype of β-thalassemia.


Normal structure and regulation of the human α-globin gene cluster


Expression of the α-like and β-like globin chains is regulated by clusters of genes on chromosomes 16 and 11, respectively. The α-like gene cluster is located close to the telomere of chromosome 16 (16p13.3) including an embryonic gene (ζ) and 2 fetal/adult genes arranged along the chromosome in the order telomere-ζ-α2-α1-centromere, surrounded by widely expressed genes (Fig. 1). The normal cluster is denoted αα. Approximately 25 to 65 kb upstream of the α-globin genes there are 4 highly conserved noncoding sequences, or multispecies conserved sequences (MCS), called MCS-R1 to -R4, which are thought to be involved in the regulation of the α-like globin genes (see Fig. 1). It is of interest that 3 of these elements (MCS-R1–3) lie within the introns of the adjacent widely expressed gene C16orf35. Of these elements, only MCS-R2 (also known as HS-40) has been shown to be essential for α-globin expression.




Fig. 1


The human α-globin cluster (5′ζ-α2-α1-3′) surrounded by widely expressed genes (MPG, C16orf35, and Luc7L) on chromosome 16 (16p13.3). Below this the multispecies conserved elements (MCS-Rs) are shown. The X, Y, and Z boxes are the regions of duplication that play a part in the generation of the common α-thalassemias, as discussed in the text. The most common deletions removing one (-α3.7 and -α4.2) or both (--MED and --SEA) α genes and causing thalassemia are shown as horizontal bars, as are the unusual deletions α-ZF and (αα)TM. The duplications of the cluster, BS and FD, are also indicated. For a full catalog of all deletions that cause α-thalassemia see Ref. At the bottom of the figure the positions of common repeats, variable number tandem repeats (VNTRs), and SNPs are shown. The region containing all SNPs and VNTRs corresponding to the classic haplotype used in population studies (as discussed in the text) is illustrated as a thin horizontal line. The regulatory SNP (rSNP) that creates a new functional GATA1 binding site seen in the Melanesian nondeletional α-thalassemia is shown with an asterisk. The extent of the α-globin locus present in the humanized mouse is shown as a thin line.


As multipotent hemopoietic progenitors commit to the erythroid lineage and begin to differentiate to form mature red blood cells, key erythroid transcription factors (eg, GATA-2, GATA-1, SCL, SpXKLF, NF-E2) and various cofactors (eg, FOG, pCAF, p300) progressively bind to the upstream MCS-R elements and the promoters of the α-like globin genes (Fig. 2). Binding of these factors is associated with widespread modifications of the associated chromatin reflecting activation (eg, histone acetylation). Finally, RNA polymerase II (PolII) is recruited to both the upstream regions and the globin promoters as transcription starts in early erythroid progenitors. At the same time, the upstream elements and promoters of the globin genes interact with one another via the formation of chromatin loops. It has recently been shown that HS-40 is critically required for looping to occur and for the stable recruitment of PolII to the α-globin promoters.




Fig. 2


In stem cells and early progenitors the cluster is silenced by the Polycomb repressive complex. In multipotent cells (CMP), the cluster is primed in the upstream region (MCS-R2) by multiprotein complexes containing SCL and NF-E2 nucleated by GATA-2. In committed erythroid progenitors (U-MEL, proerythroblast stage), additional remote regulatory sequences are bound by multiprotein complexes containing various combinations of SCL, NF-E2, and GATA-1 replacing GATA-2. At this stage, the α-globin promoter is also occupied by a combination of factors including NF-Y and is poised for expression. In differentiating erythroid cells the preinitiation complex (PIC), including PolII, is recruited to the enhancers in a cooperative manner but independently of the promoter. Krüppel-like transcription factors are also recruited to the promoter, independently of the upstream elements. At this final stage, the α-globin promoter is occupied by a multiprotein complex that represents a docking site for the recruitment of the PIC, which is entirely dependent on the presence of the upstream elements that interact with the promoter, forming a loop.


It has become clear that this series of up-regulatory events taking place in erythroid cells is only a part of the complete story of α-globin regulation. Although the α-globin genes are expressed in a strictly tissue-specific manner, they are contained in a large chromosomal region which is broadly transcriptionally active and bears the hallmarks of constitutively active chromatin. Therefore, it was predicted that within such a region, mechanisms should exist to maintain the α-globin genes in a silent state in nonerythroid cells. An extensive survey of chromatin modifications in erythroid and nonerythroid cells showed that the α-globin cluster is marked by a specific signature (H3K27me3) in nonerythroid cells but less so in erythroid cells. The H3K27me3 mark is imposed by a histone methyltransferase (EZH2), which forms part of a repressive Polycomb complex called PRC2. Consistent with this, components of this complex (EZH2 and SUZ12) have also been found at the α-globin cluster in nonerythroid cells. The PRC2 complex recruits histone deacetylases (HDACs) and another repressive Polycomb complex (called PRC1; Lynch and colleagues, unpublished data, 2009). Experimental data suggest that together, these complexes (PRC1, PRC2, and HDACs) maintain the silence of the α genes in multipotent cells and differentiated nonerythroid cells but that they are cleared from the α-globin promoter before the activation steps described (see Fig. 2 ).


Activation of the α-globin genes can thus be thought of as a multistep process in which transcription of α-globin is at first somewhat leaky, albeit expressed at very low levels, in hematopoietic stem cells when the α genes are partially repressed by the Polycomb system. As cells differentiate down the nonerythroid pathway, the genes become increasingly silenced by Polycomb so that in mature nonerythroid cells no transcription can be detected. By contrast, as cells differentiate along the erythroid pathway, Polycomb silencing is removed as the activating events are played out on the cluster (see Fig. 2 ). All results to date have been obtained by analyzing populations of cells so that we only see the average effects. Seeing the true order of events in single cells is a challenge for the future.




Normal variation within the human α-globin cluster


Over the past few years whole genome analysis has revealed the scale of variation across the human genome in apparently normal individuals. Such variation results from single nucleotide polymorphisms (SNPs), variations in the numbers of tandem repeats (VNTRs), variations in microsatellites, copy number variants (CNVs, caused by homologous or illegitimate recombination), segmental duplications and deletions, and chromosomal translocations. Although heralded as a surprise, in fact the extent of variation and mechanisms underlying such polymorphic changes were anticipated from detailed characterization of the regions containing the globin loci in large numbers (thousands) of nonthalassemic individuals. In addition to SNPs forming ancestral haplotypes, it had been shown that extensive polymorphic variation caused by genomic rearrangements was found in hematologically normal individuals. These variants are summarized in detail in Ref. but the most notable small-scale variations in the α cluster occur in the many G-rich VNTRs in this region together with the large-scale variation commonly observed in the subtelomeric region of chromosome 16. Studies of the different arrangements of these polymorphisms and VNTRs in various population samples have made it possible to define a series of α-globin-gene haplotypes that have been of considerable value both in the analysis of evolutionary aspects of the gene clusters and in defining the origins of many of the α-thalassemia mutations. In addition to their value as genetic markers, these variants have been useful in distinguishing functionally important areas of the α-globin cluster from regions that appear to be of little functional significance.








α-Thalassemia caused by sequence variations in the structural genes


Although most sequence variants in the α-globin cluster (including variation within HS-40) are caused by neutral SNPs, the authors currently know of 69 point mutations or oligonucleotide variants that alter gene expression, referred to as nondeletional forms of α-thalassemia (denoted αTα or ααT depending on whether the α2 or α1 gene is affected). As for many other human genetic diseases, these mutations may affect the canonical sequences that control gene expression, including the CCAAT and TATA box sequences associated with the promoter, the initiation codon (ATG), splicing signals (GT/AG), the termination codon (TAA), and the polyadenylation signal (AATAAA). In addition to these mutations, α-thalassemia may also be caused by in-frame deletions, frameshift mutations, and nonsense mutations, which often lead to nonsense-mediated decay of the RNA and/or to the production of abnormal protein. Some variants alter the structure of the hemoglobin molecule, making the dimer (αβ) or tetramer (α2β2) unstable. Such molecules may precipitate in the red cell, forming insoluble inclusions that damage the red cell membrane. Over the past few years it has become apparent that some α-globin structural variants are so unstable that they undergo very rapid postsynthetic degradation and thereby cause the phenotype of α-thalassemia. A full list of sequence variations in the structural genes that cause α-thalassemia is available.
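As a rough illustration of how a point mutation can abolish one of the canonical signals listed above, the following Python sketch applies a single-base substitution to a signal and checks whether the canonical sequence survives. The function names, the dictionary, and the example A-to-G change are hypothetical and chosen only for illustration; the sketch compares exact strings and makes no claim about how much any given change reduces expression.

```python
# Canonical signals named in the text (the TATA box and the GT/AG splice
# signals are omitted because they are not single fixed strings here).
CANONICAL_SIGNALS = {
    "CCAAT box": "CCAAT",
    "initiation codon": "ATG",
    "termination codon": "TAA",
    "polyadenylation signal": "AATAAA",
}


def apply_substitution(seq: str, pos: int, alt: str) -> str:
    """Return seq with the base at 0-based index pos replaced by alt."""
    if not 0 <= pos < len(seq):
        raise ValueError("position outside sequence")
    if len(alt) != 1:
        raise ValueError("alt must be a single base")
    return seq[:pos] + alt + seq[pos + 1:]


def signal_disrupted(signal_name: str, observed: str) -> bool:
    """True if the observed sequence no longer matches the canonical signal."""
    return observed.upper() != CANONICAL_SIGNALS[signal_name]


# Hypothetical example: change the last base of the polyadenylation signal.
mutant = apply_substitution(CANONICAL_SIGNALS["polyadenylation signal"], 5, "G")
print(mutant, signal_disrupted("polyadenylation signal", mutant))
```

A real variant classifier would work on genomic coordinates and annotated transcripts; this toy version only makes the idea of a "canonical sequence" concrete.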




Translocations and duplications of the α-globin cluster


Rarely, chromosomal translocations involving 16p13.3 place the α-globin locus at the tip of another chromosome, as seen, for example, in some relatives of patients with the ATR-16 syndrome (see later discussion). To date the authors know of 16 individuals with such balanced translocations and none of them has α-thalassemia. Because the closest centromeric breakpoint of these chromosomal translocations lies only 1.2 Mb from the α-globin genes, these findings demonstrate that the cis-acting sequences required for full α-globin regulation are contained within this region and that expression is not perturbed by rearrangements on this scale. In 2 individuals with unbalanced translocations, and 3 copies of 16p13.3, the α/β globin chain synthesis ratios were 1.5 and 1.6 (Refs. and Fichera M, unpublished data, 1998), again indicating that the additional, mis-localized copy of the α complex is expressed even though its genomic position has been altered.


Two large duplications of the terminal region of chromosome 16 have also been described in patients with β-thalassemia intermedia. Because these patients are simply heterozygotes for the β-thalassemia mutations, the implication is that their relatively severe phenotype results from the production of excess α-globin chains from the α genes in the duplicated regions. In one pedigree, 3 α clusters (αα:αα:αα) are present on one copy of chromosome 16 (Ref. and unpublished data). Provisional data suggest that at least 2 and possibly all 3 clusters in the duplicated region are fully active. A carrier for this abnormal chromosome (αα:αα:αα/αα), with a total of 8 α genes, has an α/β-globin chain synthesis ratio of 2.7. A recent study has more fully characterized what appears to be a very similar rearrangement in another Italian family with β-thalassemia intermedia revealing a duplication of approximately 260 kb (see BS in Fig. 1 ). These investigators also characterized another duplication of approximately 175 kb of chromosome 16 lying between the end of the α cluster and the telomere (see FD in Fig. 1 ). Again the phenotype of a compound heterozygote for this rearrangement (αα:αα/αα) and β-thalassemia trait suggested that the additional α clusters in the duplicated segment are fully active, indicating that all sequences required for fully regulated α-globin expression lie in this duplicated segment of chromosome 16.
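The gene-dosage reasoning in this section can be made explicit with a small Python sketch. It assumes, purely for illustration, that every α gene contributes equally and that β-globin output is normal, so the expected α/β ratio is simply the α gene count divided by the normal count of 4. The function names and the genotype string used for 3 copies of 16p13.3 are hypothetical.

```python
NORMAL_ALPHA_GENES = 4  # αα/αα: two α genes on each copy of chromosome 16


def count_alpha_genes(genotype: str) -> int:
    """Count α genes in a genotype string such as 'αα:αα:αα/αα'."""
    return genotype.count("α")


def naive_alpha_beta_ratio(genotype: str) -> float:
    """Expected α/β synthesis ratio under the equal-contribution assumption."""
    return count_alpha_genes(genotype) / NORMAL_ALPHA_GENES


# Three copies of 16p13.3 (hypothetical notation): 6 α genes.
print(naive_alpha_beta_ratio("αα/αα/αα"))
# Carrier of the triplicated-cluster chromosome: 8 α genes.
print(naive_alpha_beta_ratio("αα:αα:αα/αα"))
```

Under this assumption 6 α genes give a ratio of 1.5, in line with the measured values of 1.5 and 1.6, whereas 8 α genes give 2.0 against a measured 2.7. The sketch is therefore only a gene-count baseline, not a model of the measured ratios.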


These findings, delimiting the region required to direct fully regulated expression of the human α-globin cluster, are consistent with experimental data from a mouse model in which the mouse α-globin cluster was replaced with approximately 135 kb of the human α-globin cluster. This region contains all sequences within a region of conserved synteny between the human and mouse, including the globin genes and their regulatory elements. The pattern and levels of expression of the human transgenes in this segment of DNA suggest that this region of approximately 135 kb contains all of the sequences required to express the α-globin genes correctly from an appropriate chromosomal environment, although, as discussed previously, the level of expression in a mouse environment is less than in the human.




α-Thalassemia caused by deletions removing one of the duplicated structural genes


Heteroduplex and DNA sequence analysis has shown that the duplicated α-globin genes (αα) are embedded within 2 highly homologous, 4-kb duplication units whose sequence identity appears to have been maintained throughout evolution by gene conversion and unequal crossover events. These regions are divided into homologous subsegments (X, Y, and Z) by nonhomologous elements (I, II, and III, see Figs. 1 and 3). Reciprocal homologous recombination between Z segments, which are 3.7 kb apart, produces chromosomes with only one α gene (-α3.7, rightward deletion, see Figs. 1 and 3) that cause α-thalassemia and others with 3 α genes (αααanti3.7). Recombination between homologous X boxes, which are 4.2 kb apart, also gives rise to an α-thalassemia determinant (-α4.2, leftward deletion, see Figs. 1 and 3) and an αααanti4.2 chromosome. Further recombination events between the resulting chromosomes (α, αα, ααα) may give rise to quadruplicated α genes (αααα), or quintuplicated (ααααα) or other unusual patchwork rearrangements.




Fig. 3


The mechanism by which the common deletions underlying α-thalassemia occur. Crossovers between misaligned Z boxes give rise to the -α3.7 and αααanti3.7 chromosomes. Crossovers between misaligned X boxes give rise to -α4.2 and αααanti4.2 chromosomes.
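The reciprocal exchange described above can be sketched as a toy model in Python. Each chromosome is represented as an ordered list of homology boxes, with one α gene assumed per Z box; that representation, the function names, and the index choices are illustrative assumptions. The nonhomologous elements (I, II, III) and the real 3.7-kb and 4.2-kb distances are not modeled, so the sketch shows only why a misaligned crossover yields one single-gene and one triplicated chromosome.

```python
from typing import List, Tuple

Chromosome = List[str]

# Two duplication units, each with X, Y, and Z boxes (one α gene per Z box).
NORMAL: Chromosome = ["X", "Y", "Z", "X", "Y", "Z"]


def unequal_crossover(
    a: Chromosome, i: int, b: Chromosome, j: int
) -> Tuple[Chromosome, Chromosome]:
    """Reciprocal exchange between box a[i] and the homologous box b[j].

    The first product takes the left part of a and the right part of b;
    the second product is the reciprocal.
    """
    if a[i] != b[j]:
        raise ValueError("crossover requires homologous boxes")
    return a[:i] + b[j:], b[:j] + a[i:]


def alpha_gene_count(chrom: Chromosome) -> int:
    """Number of α genes, assuming one per Z box."""
    return chrom.count("Z")


# Misaligned Z boxes: first Z of one chromosome pairs with second Z of the other.
deleted, triplicated = unequal_crossover(NORMAL, 2, NORMAL, 5)
print(alpha_gene_count(deleted), alpha_gene_count(triplicated))

# Misaligned X boxes: first X pairs with second X.
deleted_x, triplicated_x = unequal_crossover(NORMAL, 0, NORMAL, 3)
print(alpha_gene_count(deleted_x), alpha_gene_count(triplicated_x))
```

In both cases the products carry 1 and 3 α genes, mirroring the -α and ααα chromosomes. Because the toy model ignores distances and sequence content, it cannot distinguish the 3.7 from the 4.2 rearrangements.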


Although these long-standing observations have pointed to the mechanism by which -α and ααα chromosomes arise, it has only recently been shown (using single DNA molecule polymerase chain reaction) how they may occur in vivo. From this work, the overall picture is one of reciprocal recombination (and unequal exchange) occurring in mitosis (premeiotic) in the germ line. The estimated frequencies of -α and ααα arrangements in sperm are in the order of 1–5 × 10⁻⁵. Deletions and duplications may also occur in somatic tissues by related mechanisms, although deletions detected in blood occur by intrachromosomal rather than interchromosomal recombination.


In addition to the common -α chromosomes, several rare deletions that remove either the α1 or α2 gene, generally by nonhomologous recombination, have been described. Because all of these deletions leave one α gene intact (-α), it is possible to assess their influence on expression of the remaining gene. When all of the data from these deletions that cause α-thalassemia are considered alongside the observations from nonthalassemic variants (see above), it appears that large segments of the α-globin cluster are not essential for α-globin expression. Figures showing the full list of currently known deletions removing a single α gene are presented by Rugless and colleagues.




α-Thalassemia caused by deletions removing both of the duplicated structural genes


The authors currently know of approximately 50 deletions from the α-globin cluster that either completely or partially delete both α-globin genes, so that no α-chain synthesis is directed by these chromosomes in vivo. Two common examples, --MED and --SEA, are shown in Fig. 1. Homozygotes for such chromosomes (--/--) have BHFS. Compound heterozygotes for these deletions and deletions removing a single α gene (see above) have HbH disease (--/-α).
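The genotype-to-phenotype relationships stated in the text can be summarized in a short Python sketch, writing a chromosome with both α genes deleted as `--` and one with a single gene deleted as `-α`. The function names are hypothetical. Labeling every genotype with 2 or 3 remaining α genes as "α-thalassemia trait" is an assumption that extends the --/αα example given in the text, and nondeletional mutations are ignored entirely.

```python
def functional_alpha_genes(genotype: str) -> int:
    """Count remaining α genes in a deletional genotype such as '--/-α'."""
    return genotype.count("α")


def expected_phenotype(genotype: str) -> str:
    """Map a deletional genotype to the phenotype described in the text.

    Assumption: only whole-gene deletions are considered, and every
    remaining α gene is treated as fully functional.
    """
    n = functional_alpha_genes(genotype)
    if n == 0:
        return "Hb Bart's hydrops fetalis syndrome (BHFS)"
    if n == 1:
        return "HbH disease"
    if n < 4:
        return "α-thalassemia trait"
    return "no α-thalassemia expected"


for g in ["--/--", "--/-α", "--/αα", "αα/αα"]:
    print(g, "->", expected_phenotype(g))
```

The thresholds follow the text: no remaining genes corresponds to BHFS, and one remaining gene (around 25% of normal output) corresponds to HbH disease.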


With completion of the DNA sequence of 16p13.3 and beyond, it has been possible to define the full extent of many of the deletions that remove both α genes (--). These deletions can be grouped into those (like --MED and --SEA, see Fig. 1) that lie entirely within the α-globin cluster and deletions that extend up to approximately 800 kb beyond the α cluster to include the flanking genes. Although these deletions remove other genes, affected heterozygotes appear phenotypically normal, apart from their α-thalassemia: in some patients α-thalassemia trait (--/αα) and in others HbH disease (--/-α). In patients with more extensive deletions with monosomy for a large segment of 16p13.3, α-thalassemia is associated with developmental abnormalities and mental retardation (so-called ATR-16 syndrome, see later discussion). It is interesting to note that all of the α-thalassemia deletions that occur at polymorphic frequencies in human populations are limited to the α cluster and do not extend into the surrounding genes, suggesting that deletion of these genes (even in heterozygotes) may result in a selective disadvantage.


Detailed analysis of several of these determinants of α-thalassemia indicates that they often result from illegitimate or nonhomologous recombination events (eg, Refs. ). Such events may involve short regions of partial sequence homology at the breakpoints of the molecules that are rejoined, but they do not involve the extensive sequence-matching required for homologous recombination as described in the previous section.


Sequence analysis has shown that members of the dispersed family of Alu repeats are frequently found at or near the breakpoints of these deletions. Alu-family repeats occur frequently in the genome (3 × 10⁵ copies) and seem to be particularly common in and around the α-globin cluster, where they make up approximately 25% of the entire sequence. These repeats may simply provide partially homologous sequences that promote DNA-strand exchanges during replication, or possibly a subset of Alu sequences may be more actively involved in the process. Detailed sequence analysis of the junctions of the α-globin deletions has revealed several interesting features including palindromes, direct repeats, and regions of weak homology. Some deletions involve more complex rearrangements that introduce new pieces of DNA bridging the 2 breakpoints of the deletion. In 2 deletions this inserted DNA originates from upstream of the α cluster, and appears to have been incorporated into the junction in a manner suggesting that the upstream segment lies close to the breakpoint regions during replication. Orphan sequences from unknown regions of the genome are frequently found bridging the sequence breakpoints of other α-thalassemia deletions. At least 2 of the deletions result from chromosomal breaks in the 16p telomeric region that have been “healed” by the direct addition of telomeric repeats (TTAGGG)n. This mechanism is described later in further detail. All currently known deletions that remove both α genes are summarized in Ref.
