Thyroglobulin Structure, Function, and Biosynthesis
Héctor M. Targovnik
Thyroglobulin (TG), a homodimeric glycoprotein of 660 kDa (TG 19S), functions as the highly specialized matrix for thyroid hormone biosynthesis and for the storage of the inactive form of thyroid hormones and iodine (1). TG is translocated into the endoplasmic reticulum (ER). During translation/translocation, newly synthesized TG immediately starts to fold, and it acquires its final glycoprotein structure as it passes through the Golgi complex. Correct folding is determined in large part by the sequence of the protein, but it is also assisted by interaction with enzymes and chaperones of the ER. TG is finally secreted and stored in the follicular lumen as colloid. The iodine content of TG under normal conditions varies widely depending on iodine intake and species. For normal human TG, values from 0.1% to 1.1% (from 5 to 55 atoms of iodine per molecule of TG) have been reported (2). The iodine is covalently bound to amino acids within TG in the form of T3, T4, and their inactive iodotyrosine precursors, monoiodotyrosine (MIT) and diiodotyrosine (DIT) (1). TG also contains small amounts of 3,3′,5′-triiodothyronine (rT3), 3,3′-diiodothyronine (T2), and monoiodohistidine (3,4,5). The thyroid cells release free thyroid hormones by proteolytic cleavage of TG (6), and these hormones are delivered to the blood circulation for action at their peripheral target tissues. Biosynthesis of thyroid hormones requires the integrity of a complex protein system and several sequential steps, and it is critically dependent upon the native three-dimensional structure of TG. The general organization of the TG gene, its mRNA, and protein domains has been studied extensively (1). However, little is known about the structure–function relationship of TG because its three-dimensional structure has not been determined. Unfortunately, there are no X-ray crystallographic data for any TG region, owing to the numerous posttranslational modifications.
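The relation between iodine content by mass and iodine atoms per TG molecule quoted above can be checked with a short calculation. The sketch below is illustrative only: it assumes the 660-kDa dimer mass given in the text and the standard atomic mass of iodine (126.904 g/mol), and the function name is hypothetical.

```python
# Standard atomic mass of iodine (g/mol) and the TG dimer mass quoted in the text.
IODINE_ATOMIC_MASS = 126.904
TG_DIMER_MASS = 660_000  # 660 kDa expressed in g/mol


def iodine_atoms_per_tg(percent_by_mass: float) -> float:
    """Convert iodine content (% of TG mass) into iodine atoms per TG dimer."""
    iodine_mass = TG_DIMER_MASS * percent_by_mass / 100.0
    return iodine_mass / IODINE_ATOMIC_MASS


if __name__ == "__main__":
    for pct in (0.1, 1.1):
        atoms = iodine_atoms_per_tg(pct)
        print(f"{pct}% iodine by mass ~ {atoms:.1f} atoms per TG dimer")
```

Under these assumptions the two limits come out at roughly 5 and 57 atoms, which is in reasonable agreement with the reported range of 5 to 55 atoms given the rounding of the mass and percentage values.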
The TG is also a possible regulator of thyroid follicle function (7,8,9) or could be involved in some unknown mechanisms that remain to be explored.
Thyroglobulin and Evolution
The TG protein is composed of four structural and functional regions (Figs. 5.1 and 5.2). The N-terminal and central parts of the monomer contain three types of repetitive motifs, called TG type-1, TG type-2, and TG type-3, organized in three regions (I, II, and III). Region I contains 10 of the 11 TG type-1 repeats, followed by a 247-residue hinge region. Region II contains multiple type-2 repeats and the 11th TG type-1 repeat, while region III contains multiple type-3 repeats. Finally, the fourth region, located at the carboxy terminus, contains a nonrepetitive domain that shows significant homology with acetylcholinesterase (ACHE) and is named the ACHE-like domain (10).
Domain duplication and shuffling by recombination represent the basic evolutionary mechanisms that form proteins and contribute to complexity of the proteome (11). Interestingly, the combinations of domains in a protein can be represented as a network in which the nodes denote functional domains from different superfamilies. The internal protein organization makes TG an example of gene evolution by intragenic duplication events and gene fusions. The TG type-1, type-2, and type-3 motifs derive from ancestral genes that were duplicated and modified during evolution by mutations. Homology between the esterases and TG was first discovered by comparison of the primary structures of Torpedo californica ACHE and bovine TG (10). This suggested that both evolved from a common ancestral gene.
Figure 5.2 The complete primary structure of human thyroglobulin deduced from thyroglobulin cDNA sequences. The preprotein monomer is composed of a 19-amino acid signal peptide followed by a 2,749-residue polypeptide. The amino acids are indicated by the single-letter code. The number 1 indicates the first amino acid of the mature protein. The repetitive motifs of type-1, type-2, and type-3, and the acetylcholinesterase-homology domain (ACHE-like domain) are shown. The putative glycosylated asparagine residues are boxed (positions 57, 91, 179, 465, 477, 510, 729, 797, 928, 1,201, 1,330, 1,346, 1,697, 1,755, 1,850, 1,994, 2,103, 2,231, 2,276, and 2,563) and the amino acid polymorphisms are indicated by underlined letters (p.G58S, p.S715A, p.S715L, p.G796R, p.Q811E, p.R969P, p.M1009V, p.G1293D, p.T1479M, p.N1819D, p.R1980W, p.P2213L, p.W2482R, and p.R2511Q). The tyrosine acceptor sites at positions 5, 1,291, 2,554, and 2,747 are shown enlarged in bold letters. See color plate.
During the evolution of chordates, the formation of thyroid hormones precedes the morphological differentiation of thyroid cells and their organization in follicles (12). The storage of thyroid hormone bound to TG in the follicular lumen is characteristic of vertebrates. The lamprey, a freshwater cyclostome, produces TG 19S before the metamorphosis of the larva, and thus before the organization of the thyroid follicles found in adults (13). In contrast, the hagfish, a saltwater cyclostome regarded as the lowest class of vertebrates, contains highly iodinated glycoproteins of non-TG nature (14). The amphioxus endostyle is widely considered to be a homolog of the vertebrate thyroid gland. Amphioxus, or the lancelet, is a small worm-like marine animal that is considered a modern survivor of an ancient chordate lineage, with a fossil record dating back to the Cambrian period (15). The sequencing of the ∼520-megabase genome of the lancelet Branchiostoma floridae (15) identified several genes for components of the thyroid hormone production system: a thyroid hormone receptor, a sodium iodide symporter, a thyroid peroxidase, deiodinases that convert T4 to active T3, and a cytosolic thyroid hormone–binding protein (16,17). However, the genome does not include a gene orthologous to the gene encoding the vertebrate TG protein (16,17). This result is surprising because a protein with biochemical properties similar to those of TG has been described in amphioxus. The amphioxus endostyle synthesizes a protein with a sedimentation coefficient of 17–19S, corresponding to a molecular weight of about 660 kDa, and in vivo and in vitro enzymatic iodination of this protein forms thyroid hormones in a manner qualitatively similar to that of vertebrate TG (18,19).
Two scenarios might explain the lack of a TG gene: (1) the amphioxus TG type-1 domain proteins might be the source of the Tyr residues required for thyroid hormone synthesis, or (2) the Tyr residues might be derived from an unidentified protein able to incorporate iodine to form iodothyronines (17). The identification and characterization, in the near future, of the TG homolog in amphioxus may provide important insights into structure–function relationships and may expand our knowledge of the ancient route for thyroid hormone production.
Thyroglobulin Gene: Structure, Expression, and Regulation
The molecular analysis of TG began in 1973 when the bovine TG mRNA was translated in Xenopus oocytes by Vassart et al. (20,21,22,23,24). Subsequently, the cDNA and its corresponding gene were isolated and widely characterized (25,26,27,28,29). The TG gene is organized in 48 exons, spanning over 270 kb (30,31,32,33,34,35,36,37,38) on human chromosome 8q24.2 (39,40,41,42) (Fig. 5.1), with exon sizes ranging between 63 and 1,101 nucleotides. The 5′ region has an exonic/total sequence ratio of 0.9, the same as found in many other mammalian genes, while the 3′ region has a ratio of 0.02 owing to the presence of giant introns up to 64,000 bases long (33). The 64-kb intron of the human TG gene contains a gene encoding the human Src-like adaptor protein (hSLAP) (43).
The human TG mRNA is 8.5 kb long (44,45,46,47,48) (Fig. 5.1). The general organization of the sequence shows a 41-nucleotide 5′-untranslated segment, followed by a single open reading frame of 8,307 bases and a 3′-untranslated segment ranging from 101 to 120 nucleotides (46,48).
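As a quick consistency check on these figures, the three segments can be added up and compared with the quoted overall length of 8.5 kb. The sketch below uses only the numbers given above; the constant and function names are hypothetical, and the poly(A) tail is deliberately left out.

```python
# Segment lengths of the human TG mRNA as quoted in the text (nucleotides).
FIVE_PRIME_UTR = 41
OPEN_READING_FRAME = 8_307
THREE_PRIME_UTR_RANGE = (101, 120)  # shortest and longest reported 3' UTR


def transcript_length(three_prime_utr: int) -> int:
    """Total transcript length, excluding the poly(A) tail."""
    return FIVE_PRIME_UTR + OPEN_READING_FRAME + three_prime_utr


if __name__ == "__main__":
    shortest = transcript_length(THREE_PRIME_UTR_RANGE[0])
    longest = transcript_length(THREE_PRIME_UTR_RANGE[1])
    print(f"Transcript length: {shortest} to {longest} nt")
    # An open reading frame must be a whole number of codons.
    print(f"ORF is a multiple of 3: {OPEN_READING_FRAME % 3 == 0}")
```

The sum comes to 8,449 to 8,468 nucleotides, which rounds to the stated 8.5 kb, and 8,307 is a whole number of codons, as a reading frame must be.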
TG mRNA in human thyroid tissues is very heterogeneous owing to alternatively spliced transcripts and polyadenylation cleavage site variants. The splicing map of human TG mRNA comprises at least 11 alternative splicing variants (49,50,51). These form a series of transcripts in which complete exons, or groups of consecutive complete exons, are skipped. Bertaux et al. found four transcripts smaller than the previously known TG mRNA (49,51). The regions absent in these RNA molecules correspond to exon 17; exons 17 and 18; exons 17–19; and exons 17–20 (49,51). Exons 3, 4+6, 22, 26, and 46 are also alternatively spliced (48,50). Deletions of nucleotides between positions 3,430–3,736 and 7,301–7,561 were also identified (48). Four possible polyadenylation sites were identified in human TG mRNA: cgg(A)n, cggtga(A)n, cggtgaagca(A)n, and cggtgaagcattgttgactcta(A)n (48).
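The four polyadenylation variants listed above are nested: each shorter sequence is a prefix of the next longer one, so the variants differ only in how far the transcript extends before the poly(A) tail. The sketch below makes that explicit. The sequences are copied from the text, and the helper names are hypothetical.

```python
# 3'-terminal sequences preceding the poly(A) tail, as listed in the text.
POLYA_SITE_VARIANTS = [
    "cgg",
    "cggtga",
    "cggtgaagca",
    "cggtgaagcattgttgactcta",
]


def is_nested(variants: list[str]) -> bool:
    """True if every variant is a prefix of the next longer variant."""
    ordered = sorted(variants, key=len)
    return all(longer.startswith(shorter)
               for shorter, longer in zip(ordered, ordered[1:]))


def extra_nucleotides(variants: list[str]) -> list[int]:
    """Nucleotides each variant adds beyond the shortest cleavage site."""
    shortest = min(len(v) for v in variants)
    return [len(v) - shortest for v in variants]


if __name__ == "__main__":
    print("Nested cleavage sites:", is_nested(POLYA_SITE_VARIANTS))
    print("Extra 3' nucleotides:", extra_nucleotides(POLYA_SITE_VARIANTS))
```

The longest variant extends 19 nucleotides beyond the shortest. This happens to equal the spread of the reported 3′-untranslated segment (101 to 120 nucleotides), although the source does not state that link explicitly.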
Highly informative DNA polymorphic markers have been characterized in the TG gene (48,52,53,54,55) and can be used in linkage and association studies in families with congenital hypothyroidism, autoimmune thyroid diseases (55), non-medullary thyroid cancer risk (56), and premature ovarian failure (57). The TG DNA polymorphisms have also proved to be informative genetic markers for investigating whether a common ancestral chromosome or a mutational hot spot accounts for the occurrence of the same mutation in all the affected individuals. The term DNA polymorphism refers to a wide range of variations in nucleotide base composition: single nucleotide polymorphisms (SNPs), insertion and deletion sequences (indels, also called copy number variants, CNVs), and variations in the length of nucleotide repeats. This latter group includes the short tandem repeats (STRs), or microsatellites.
Twenty-one SNPs were identified and characterized in the coding sequence of the TG gene, 14 of them resulting in amino acid polymorphisms: p.G58S, p.S715A, p.S715L, p.G796R, p.Q811E, p.R969P, p.M1009V, p.G1293D, p.T1479M, p.N1819D, p.R1980W, p.P2213L, p.W2482R, and p.R2511Q (1) (Fig. 5.2).
A large insertion/deletion (indel) polymorphism of 1,464 bp localized in intron 18 of the human TG gene was identified (54). Genetic evidence indicates that small additions and deletions can occur spontaneously during replication. Deletions and insertions also result from recombination events or from the activity of transposable elements. A GenBank database search showed that the 1,464-bp indel polymorphism does not correspond to any known interspersed repetitive human sequence. However, it is not possible to exclude that some ancient transposable element, not identified in intron 18, might have been involved in the development of this polymorphism.
Four STRs were identified and characterized within introns 10, 27, 29, and 30 of the TG gene, named Tgms1, Tgms2, TGrI29, and TGrI30, respectively (53,55). Tgms1 has 5 alleles and is a CA repeat located 28 bp downstream from the 3′ end of exon 10 (55), whereas Tgms2 has 16 alleles and is also a CA repeat, located 3,224 bp upstream from the 5′ end of exon 28 (55). TGrI29 exhibited four clearly distinguishable alleles and TGrI30 showed eight alleles (53). Sequencing analysis indicated that both loci are complex repeats, TGrI29 containing two types of variable motifs, (tc)n and (tg)n, and TGrI30 containing tetranucleotide tandem units, (atcc)n (53).
TG gene expression is stimulated by TSH through the modulation of the intracellular level of cyclic adenosine monophosphate (cAMP) (58,59,60). TSH exerts its function via a G-protein–coupled receptor, the TSH receptor (TSHR) (61), which relies on the associated G protein to transmit and amplify the signal inside the cell. Transcription of the TG gene is under the control of the coordinated action of a master set of transcription factors that includes the homeodomain protein NKX2.1 (TTF-1) (62,63,64), the forkhead-domain protein FOXE1 (TTF-2) (65,66,67), and the paired-domain protein PAX8 (68), which bind to their consensus sequences on the TG promoter. NKX2.1 is expressed in embryonic diencephalon, thyroid, and lung (69), FOXE1 in pituitary and thyroid (65), and PAX8 in kidney, in the developing excretory system, and in the thyroid (70). These three transcription factors are present together only in the thyroid, suggesting that this unique combination is responsible for the morphogenesis of the thyroid gland and for maintenance of the differentiated thyroid phenotype (71). The proximal promoter region of the TG gene, apart from the canonical TATA box homology, also contains binding sites for the transcription factors NKX2.1, FOXE1, and PAX8. Three NKX2.1-binding sites are present in the TG proximal promoter. In addition, an upstream enhancer region contains further binding sites for NKX2.1 (72). Binding sites for nonspecific factors, such as cAMP responsive elements (CRE), have been described for this enhancer sequence (73). PAX8 and NKX2.1 synergistically activate transcription from the promoter of the TG gene and require a complex mechanism based on the functional protein–protein interaction between both transcription factors and the target protein (74,75,76). PAX8 binds to a single site on the proximal TG promoter, and the PAX8-binding site overlaps with one of the NKX2.1-binding sites (74).
Thyroglobulin Protein Domains
The human TG mRNA codes for a polypeptide chain of 2,767 amino acids (Mr = 302,773) (46,48). The amino acid composition of TG deduced from the mRNA sequence is in agreement with the amino acid composition data previously obtained on the protein (46,48). A leader peptide of 19 amino acids is followed by a polypeptide of 2,748 amino acids, corresponding to the monomeric human TG (48) (Figs. 5.1 and 5.2). Each TG chain contains 67 Tyr and 122 Cys residues, representing 2.44% and 4.44% of the total amino acids, respectively (46,48). Hydrophobic and charged amino acid residues are homogeneously distributed along the polypeptide (46). Eighty percent of the full-length TG monomer consists of the three repeated domains, TG type-1, type-2, and type-3, which are Cys-rich repeat domains covalently bound by disulfide bonds (46,48,77) (Figs. 5.1 and 5.2). The remaining 20%, at the C-terminal end of the molecule, consists of the ACHE-like domain (residues 2,192 to 2,716) (46,48,78,79,80) (Figs. 5.1 and 5.2). The TG monomer contains 11 type-1 repeat domains located between positions 12 and 1,191 and between 1,492 and 1,546; three type-2 domains located between amino acids 1,437 and 1,484; and five type-3 domains between residues 1,584 and 2,168 (46,48) (Figs. 5.1 and 5.2). Detailed analysis of the relation between the three families of Cys-rich repetitive units and the organization of the intron–exon junctions showed the following distribution: (i) Type-1 repeats 2, 4, 7, 10, and 11 are each encoded by a single exon (exons 4, 8, 10, 16, and 22, respectively), repeats 1 and 9 by two exons (exons 2 and 3, and 14 and 15, respectively), repeats 3 and 8 by three exons (exons 5, 6, and 7, and 11, 12, and 13, respectively), and repeats 5 and 6 each correspond to a fraction of exon 9 (38). The N-terminal limit of repeat 1–5 is ambiguous. (ii) The three type-2 repetitive elements map between exons 20 and 21 (38). (iii) The type-3 domain includes two subtypes, 3a and 3b, and maps between exons 23 and 37 (3a-1: between exons 23 and 26; 3b-1: between exons 26 and 30; 3a-2: between exons 30 and 33; 3b-2: between exons 33 and 36; and 3a-3: between exons 36 and 37) (38).
Each TG type-1 repeat is composed of approximately 60 amino acids, in which the positions of the Cys, Pro, and Gly residues are highly conserved (46). Some insertions of variable length are found in fixed positions (46). The type-1 domains are grouped into two subgroups, type-1A and type-1B. Type-1A repeats (type 1–1 to 1–8 and 1–10) have a total of six Cys residues, whereas type-1B repeats (type 1–9 and 1–11) contain only four Cys residues. The consensus sequence of type-1 domains includes a central core of two highly conserved motifs, QC and CWCV. The proportion of Cys and Tyr in the type-1 domains is high compared with that of the entire protomer. Conserved intradomain disulfide bond patterns, Cys1–Cys2, Cys3–Cys4, and Cys5–Cys6, prevail in all type-1A repeats (77). TG type-1 domains have been found as parts of many proteins with different domain architectures, functions, and phyletic distributions (81). In total, 170 protein sequences were found containing 333 type-1 modules. Six architecturally distinct groups containing the type-1 domain were identified in vertebrates in addition to the TG group: testicans, secreted modular calcium-binding proteins (SMOCs), trops, a splice variant of the major histocompatibility complex class II–associated invariant chain, insulin-like growth factor–binding proteins (IGFBPs), and nidogen. Interestingly, type-1 repeats could function as binders and reversible inhibitors of proteases in the lysosomal pathway (82). Several proteins that contain type-1 domains, named thyropins (83), inhibit peptidases; examples are saxiphilin and equistatin, which are inhibitors of papain-like Cys proteinases (84,85).
Analysis of the type-1 domain of the TG shows that this part of the molecule experienced large and complex rearrangements during vertebrate evolution (81). This fact is probably related to the appearance of multicellular animals, which was accompanied by rapid evolution of proteins involved in cell–cell interactions and in interactions of cells with their environment (81).
TG and ACHE share certain common tertiary structural features. The six Cys residues involved in ACHE intrachain disulfide bonds are conserved within TG (86). The ACHE-like domain is required for protein dimerization and consequently plays a critical structural and functional role in the TG protein: it is essential for normal conformational maturation and for intracellular transport of TG, via the secretory pathway, to the site of its iodination and hormonogenesis (87). Truncated TG comprising only regions I, II, and III and devoid of the ACHE-like domain is blocked within the ER; it is incompetent for cellular export and consequently fails to be transported to the site of thyroid hormone synthesis (88). On the other hand, studies showed that the ACHE-like domain with an artificial signal peptide attached is by itself sufficient for rapid and efficient intracellular transport to the extracellular space (88). Within 4 hours after synthesis, the secretory ACHE protein was nearly completely released from cells. Interestingly, co-expression of the secretory ACHE domain with truncated TG (containing regions I, II, and III), as separate proteins within the ER, increased intracellular folding, promoted oxidative maturation, and facilitated secretion of truncated TG, indicating that the ACHE-like domain may function as an intramolecular chaperone and as a molecular escort for TG regions I, II, and III (88). In wild-type TG, the ACHE-like domain is physically contiguous within the TG protein, whereas the truncated TG protein becomes physically associated with the secretory ACHE protein within the ER and remains associated throughout the secretory pathway. The ACHE-like domain thus interacts with upstream TG regions. These findings provide significant evidence of the contribution of the ACHE-like domain to the intracellular transport and stabilization of the TG dimer complex.
Recently, Lee and Arvan reported that truncated TG containing regions II and III with an artificial signal peptide attached is a fully efficient secretory protein, whereas truncated TG containing only region I, with or without the hinge sequences and with its wild-type signal peptide, cannot be secreted and remains in the ER (89). The ACHE-like domain with the artificial signal peptide rescues the secretion of TG containing regions I, II, and III, while the rescue of TG containing regions I and II is minimal and the rescue of TG with only region I is not detectable (89). However, the ACHE-like domain rescues region I in cells that also co-express TG with regions II and III (89). These experimental observations suggest that conformational maturation of region I is a limiting step in the TG maturation process, and that this step is facilitated by the presence of both regions II and III and the ACHE-like domain.
It is possible to speculate that a short amino-terminal portion of TG with a single hormonogenic site would be sufficient to provide the physiological levels of thyroid hormones needed for human survival. However, environmental circumstances such as the limited availability of iodide have promoted the evolution of a more complex TG structure for iodide storage in vertebrate organisms.
Thyroglobulin Posttranslational Modifications in the Endoplasmic Reticulum
TG, like other secretory glycoproteins, is cotranslationally translocated into the lumen of the ER via a protein-conducting channel, the translocon (90,91) (Fig. 5.3). When ribosomes translate proteins destined for secretion, they are directed to translocation pores on the cytosolic side of the ER by an RNA–multiprotein complex, the signal recognition particle (SRP) (90,91). The N-terminal signal peptide sequence of TG is recognized by the SRP while the protein is still being synthesized on the ribosome. Synthesis pauses while the ribosome–mRNA–protein–SRP complex is transferred to an SRP receptor (SR) on the ER (Fig. 5.3). There, the nascent protein is inserted into the Sec61 translocation complex, which passes through the ER membrane to the luminal side (91,92). The SRP then releases the signal sequence, restarting translation and translocation of the nascent chain. The signal sequence is cleaved from the polypeptide by a membrane-bound signal peptidase once about 100 amino acids have been translocated into the ER.
Intensive posttranslational modifications take place in the ER and include glycosylation and the formation and isomerization of intrachain disulfide bonds (93) (Fig. 5.3). TG acquires its three-dimensional structure in the lumen of the ER. With an estimated translation rate of 6.5 amino acids per second, one can deduce that it should take about 7 minutes for the entire nascent TG polypeptide to be delivered into the ER. TG requires about 60 to 90 minutes to complete formation of its 60 disulfide bonds (94,95), while the half-time of arrival at the medial Golgi is about 90 to 120 minutes (95,96). For a protein as large as TG, the folding of its four main regions must follow a specific and complex pathway. It is possible that the folding of the N-terminal part (regions I, II, and III) is independent of that of the C-terminal part (ACHE-like domain).
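The 7-minute estimate follows directly from the chain length and the translation rate. The sketch below reproduces the arithmetic, assuming the 2,767-residue preprotein length given in the section on protein domains and the 6.5 residues per second quoted above; the function name is hypothetical.

```python
# Values quoted in the text.
TRANSLATION_RATE = 6.5       # amino acids per second
PREPROTEIN_LENGTH = 2_767    # residues, signal peptide included


def translation_time_minutes(residues: int,
                             rate: float = TRANSLATION_RATE) -> float:
    """Time needed to synthesize a chain of the given length, in minutes."""
    return residues / rate / 60.0


if __name__ == "__main__":
    minutes = translation_time_minutes(PREPROTEIN_LENGTH)
    print(f"Estimated synthesis time: {minutes:.1f} min")
```

The result is just over 7 minutes, an order of magnitude shorter than the 60 to 90 minutes needed for disulfide bond formation, which illustrates that most of the oxidative folding of TG is posttranslational.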
TG acquires its N-glycans during translocation and elongation of the polypeptide chain into the ER. The addition of oligosaccharide by an oligosaccharyltransferase (a membrane protein complex) can stabilize the protein and increase its solubility (97). Approximately 10% of the mass of TG consists of covalently bound carbohydrates (93). The cDNA sequence of bovine TG predicts 14 putative N-linked glycosylation sites, of which 13 were confirmed as glycopeptides in the mature protein (98). Of the 20 putative N-linked glycosylation sites in the human TG polypeptide chain, 16 are glycosylated in the mature protein (99) (Fig. 5.2). First, while the protein is still associated with the translocon, two N-acetylglucosamine (GlcNAc) residues, nine mannoses, and three terminal glucose residues are assembled into a core oligosaccharide (Glc3Man9GlcNAc2), which is then transferred en bloc from dolichol pyrophosphate to the consensus N-glycosylation sites (NXS/T, where X is any amino acid residue except Pro) on nascent TG chains (Fig. 5.3). Next, the three terminal glucoses of this core are trimmed by the sequential action of α-glucosidase I (a type II membrane protein) and α-glucosidase II (a soluble luminal enzyme). α-Glucosidase I removes the terminal α1-2-linked glucose, whereas α-glucosidase II sequentially removes the two remaining α1-3-linked residues. Finally, a terminal mannose is trimmed by α-mannosidase I, with the formation of a high-mannose-type oligosaccharide (100) (Fig. 5.4).
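The consensus N-glycosylation site (NXS/T, with X any residue except Pro) is simple enough to locate with a pattern search. The sketch below shows one way to do this; the function name and the short test peptide are hypothetical and are not taken from the TG sequence. As the counts above indicate (20 putative sites, 16 confirmed), a sequence match identifies only a putative site, not one known to be glycosylated.

```python
import re

# Zero-width lookahead so that overlapping sequons (e.g. "NNTS") are all found.
# N, then any residue except P, then S or T.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return 1-based positions of Asn residues in N-X-S/T motifs (X != Pro)."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


if __name__ == "__main__":
    # Hypothetical peptide for illustration only.
    toy_peptide = "MNGSANPTQNNTS"
    # N2 (NGS), N10 (NNT), and N11 (NTS) match; N6 (NPT) is excluded by Pro.
    print(find_sequons(toy_peptide))  # [2, 10, 11]
```

Applied to the mature human TG sequence, a scan of this kind would be expected to return the 20 putative positions listed in the legend of Fig. 5.2, provided the numbering starts at the first residue of the mature protein.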
Several ER enzymes and molecular chaperones, such as calnexin (CNX), calreticulin (CRT), GRP94, BiP, protein disulfide isomerase (PDI), ERp57, ERp29, and ERp72, interact both concurrently and sequentially with TG during its folding and assembly (94,95,101,102,103,104,105,106). The TSH-induced elevation of GRP94, BiP, and PDI in thyroid cells accelerates early TG folding (96). Such interactions of molecular chaperones with TG may also serve to prevent premature export of incorrectly folded or incompletely assembled secretory proteins from the ER by a process known as ER quality control (107). At least three main chaperones, BiP, CNX, and CRT, are active in ER quality control.
CNX and CRT are among the ER chaperones that interact, through a cycle of binding and release, with partially folded glycoproteins and determine whether the proteins are released from the ER to the Golgi system or, alternatively, sent to the proteasome for degradation (100). These chaperones serve to prevent aggregation, protect proteins from premature degradation, and ensure the correct folding status of monomers before they continue along the intracellular trafficking pathway (100). CRT is a soluble protein found within the lumen of the ER, whereas CNX is a type I membrane protein composed of a luminal ER domain homologous to CRT and a cytoplasmic domain (100) (Fig. 5.3). Both possess a lectin-like binding site that prefers association with monoglucosylated oligosaccharide processing intermediates (Glc1Man9GlcNAc2), and both require the presence of luminal calcium. CNX and CRT bind to newly synthesized TG with similar kinetics; this binding is concomitant with the formation of TG intrachain disulfide bonds and precedes TG dimerization and exit from the ER (94). TG folds in the CNX/CRT pathway with the formation of ternary complexes (TG–CNX/CRT), both chaperones acting together and at the same time to facilitate the maturation and export of TG (94). Loss of ER calcium causes premature exit of TG from the CNX/CRT system, extending the association of TG with BiP and GRP94 and consequently its retention in the ER (94). The binding of proteins to CNX or CRT is terminated by removal of the third glucose by glucosidase II (Fig. 5.3). If the protein is not correctly folded, it can be reglucosylated by UDP-glucose:glycoprotein glucosyltransferase (UGGT) and reassociate with CNX or CRT (100).
GRP94 is a member of the heat-shock protein-90 family that associates with nascent TG (104). BiP, a member of the heat-shock protein-70 family, is a glucose-regulated protein (GRP) that binds to unfolded polypeptides and prevents protein aggregation through non-covalent associations (108). BiP works with PDI to promote oxidative protein folding (101,103,109), whereas the ERp57 oxidoreductase, a thiol reductase member of the PDI superfamily, works in a complex with CNX/CRT and promotes formation/isomerization of disulfide bonds (110) (Fig. 5.3). Consequently, the oxidative folding of TG proceeds with the aid of two chaperone–oxidoreductase complexes, CNX/CRT/ERp57 and BiP/PDI (110), forming mixed-disulfide folding intermediates between newly synthesized TG and ERp57 and PDI (95). The formation of intradomain disulfide bonds can be catalyzed near CNX/CRT-bound sites by the ERp57 oxidoreductase and near BiP-bound sites by PDI (95). ERp29 (105) and ERp72 (102,106) are further ER proteins that belong to the TG folding complex and probably carry out different specific actions.
TG conformational maturation culminates in TG homodimerization, without any interchain disulfide bridges (111), with progression to a compact structure. The TG homodimerization occurs only after TG has dissociated from CNX/CRT system, and after the appearance of fully oxidized TG (94). While correctly folded proteins are exported from the ER, misfolded proteins are retained and selectively degraded by the ER-associated degradation (ERAD) pathway. The ERAD pathway retrotranslocates misfolded proteins back into the cytosol where proteasomal degradation takes place. BiP might keep misfolded proteins in a reversibly aggregated state, whereas PDI directly targets misfolded proteins for retrotranslocation (107).
Thyroglobulin Posttranslational Modifications in the Golgi
The late stage of the main posttranslational modifications of TG N-glycans occurs within the Golgi and presumably follows the same general process described for other proteins. Unfortunately, no specific studies have so far explored the posttranslational events affecting TG in the Golgi. Several of the high-mannose-type oligosaccharides of the TG imported from the ER are trafficked through the Golgi and post-Golgi vesicular compartments without further processing, whereas the structure of other high-mannose-type units is altered in the Golgi (Fig. 5.4). By analogy with other proteins, after removal of mannose residues by Golgi mannosidases, various N-acetylglucosamine glycosyltransferases may catalyze branching and elongation of the carbohydrate chains, producing hybrid-type or complex-type oligosaccharides (112) (Fig. 5.4). The action of these enzymes results in different N-glycans that contain from 1 to 2 branches in the hybrid type and from 2 to 5 branches in the complex type (Fig. 5.4). Individual branches in both types of N-glycans are elongated by addition of galactose, fucose, and sialic acid residues, catalyzed by galactosyltransferases, fucosyltransferases, and sialyltransferases in the Golgi (112).
Endo-β-N-acetylglucosaminidase H (endo H) is used to monitor posttranslational modification in the Golgi system. Endo H digests high-mannose-type oligosaccharides, but oligosaccharides that have been converted into complex sugars in the Golgi become endo H–resistant. Endo H sensitivity is therefore a useful indicator of the efficiency of TG export from the ER. Eight of the sixteen confirmed glycosylation sites in human TG (at asparagine residues 57, 465, 510, 729, 797, 1,697, 1,755, and 2,231; Fig. 5.2) appear to be linked to complex-type oligosaccharide units containing fucose and galactose in addition to mannose and glucosamine (99). Five sites (at positions 1,201, 1,330, 1,994, 2,276, and 2,563; Fig. 5.2) contain high-mannose-type units (mannose and glucosamine), and two sites (at positions 179 and 1,346; Fig. 5.2) are linked to oligosaccharide units that contain galactose in addition to mannose and glucosamine but no fucose, and may be either hybrid oligosaccharide structures or complex oligosaccharide moieties lacking fucose (99). Finally, very different oligosaccharide composition types (complex or high mannose) were found associated with position 928 (99) (Fig. 5.2). N-glycans have an important role in the intracellular transport and apical sorting of proteins, including protein folding, ER quality control, ERAD, ER-to-Golgi trafficking, and apical delivery of glycosylated proteins (113).
In addition, the human TG also contains O-linked glycosylation (114,115). O-glycans are attached to a subset of serine and threonine residues. Chondroitin sulfate proteoglycans synthesis in TG is initiated by the transfer of a galactose unit, by galactosyltransferase, to a xylosyl-serine in the TG peptide (114).
Exocytosis Pathways of Thyroglobulin
Thyrocytes are highly specialized epithelial cells that require vectorial transport and polarized distribution of transporters and receptors to apical or basolateral membrane domains. TG is transferred to the follicle lumen concentrated into secretory vesicles.
The exocytotic vesicles accumulate in the most apical cell region, but vesicles are also seen near the Golgi area and in transit between the Golgi area and the apical plasma membrane. The exocytotic vesicles have a diameter of about 150 nm, and their exocytosis is induced by TSH. Their major soluble protein component is 19S TG; 12S TG is also present.
In the trans-Golgi network, proteins are sorted into vesicles bound for different destinations, including the plasma membrane, the endosome/lysosome, and secretory granules. The apical secretory pathway can be initiated by the formation and fusion, in the plane of the trans-Golgi network membrane, of lipid–protein microdomains known as sphingolipid–cholesterol rafts (118) (Figs. 5.4 and 5.5). These microdomains serve as platforms upon which apical secretory proteins can be transported (118). In Madin-Darby canine kidney (MDCK) cells and PC Cl3 thyrocytes, a subpopulation of newly synthesized recombinant TG is recovered in a Triton X-100–insoluble, glycosphingolipid/cholesterol raft fraction (118). These findings imply that TG utilizes a cargo-selective mechanism for apical sorting. VIP36 and VIP17 (vesicular integral membrane proteins of 36 and 17 kDa; VIP17 is also known as MAL) are lectins identified as components of the raft machinery in MDCK cells (119,120) (Figs. 5.4 and 5.5). MAL is a non-glycosylated protein containing multiple hydrophobic segments (120). Human thyroid and epithelial FRT cells (a polarized rat cell line of thyroid origin) express MAL transcripts (121). MAL is a major protein component of the raft system and is restricted to the apical zone of thyroid FRT cells (121). These observations suggest that TG may be transported to the apical surface and delivered to the follicular lumen via the raft pathway, possibly in association with one or more lectin components.
Thyroglobulin and Thyroid Hormone Synthesis
The central steps in thyroid hormone synthesis take place at the cell–colloid interface of follicular thyroid cells. Once TG has reached the follicular lumen, several Tyr residues are iodinated (122,123). The subsequent coupling between two DIT residues, or between a DIT and a MIT residue, results in the formation of T4 or T3, respectively, within the TG molecule. The coupling reaction involves iodophenyl transfer from a donor MIT or DIT residue to an acceptor DIT residue, and a dehydroalanine residue remains at the donor site within the polypeptide chain (123). The efficiency of thyroid hormone synthesis in TG is extremely high, especially under rather low iodide conditions (123). The molecular structure of TG seems to have physical characteristics by which a hormonogenic iodotyrosine residue plays only one of two roles, acceptor or donor. With the establishment of the cDNA sequence of TG, the major hormonogenic sites could be localized within the TG polypeptide chain. Four hormonogenic acceptor Tyr residues have been identified at positions 5, 1,291, 2,554, and 2,747 in human TG (Figs. 5.1, 5.2, and 5.6), and three Tyr residues at positions 130, 847, and 1,448 have been proposed as donor sites (46,48,122) (Fig. 5.2). The homology of rat TG with mouse, bovine, and human TG is 90%, 76%, and 78%, respectively, at the nucleotide level, and 90%, 71%, and 74%, respectively, at the amino acid level. The relative positions of the four human hormonogenic sites are conserved in the other mammalian species (Fig. 5.6). The most important T4-forming site in all vertebrate species examined is at Tyr5 (124,125), whereas Tyr130 is an important outer ring donor for thyroxine formation at Tyr5 (125). In most species, Tyr5 accounts for 44% of T4 and 25% of T3, Tyr2554 for 24% of T4 and 18% of T3, Tyr2747 for 50% of T3, and Tyr1291, which is prominent in guinea pigs and rabbits, for 17% of T4 (126).