Molecular Endocrinology, Endocrine Genetics, and Precision Medicine

Introduction

The study of the endocrine system has undergone a dramatic evolution since the 1990s, from the traditional physiologic studies that dominated the field for many years to the discoveries of molecular endocrinology and endocrine genetics. At present, the major impact of molecular medicine on the practice of pediatric endocrinology relates to diagnosis and genetic counseling for a variety of inherited endocrine disorders. In contrast, the direct therapeutic application of this new knowledge is still in its infancy. Endocrine oncology has greatly benefited from new drugs designed to target specific mutations in, for example, thyroid cancer. A notable recent therapeutic advance that followed the identification of the molecular basis of an endocrine disorder is the development of burosumab, a monoclonal antibody directed against the fibroblast growth factor-23 protein, to treat X-linked hypophosphatemic rickets. This chapter is an introduction to the basic principles of molecular biology, common laboratory techniques, and some examples of the recent advances made in clinical pediatric endocrinologic disorders, with an emphasis on endocrine genetics. Most new diagnostic testing, pharmacogenetics, and molecular therapies are discussed in the disease-specific chapters of this book; this chapter includes only examples that highlight the principle or strategy under discussion.

Basic molecular tools

Isolation and Digestion of DNA and Southern Blotting

The human chromosome comprises a long double-stranded helical molecule of deoxyribonucleic acid (DNA) associated with different nuclear proteins. As DNA forms the starting point of the synthesis of all the protein molecules in the body, molecular techniques using DNA have proven to be crucial in the development of diagnostic tools to analyze endocrine diseases. DNA can be isolated from any human tissue, including circulating white blood cells. About 200 μg of DNA can be obtained from 10 to 20 mL of whole blood, with the efficiency of DNA extraction being dependent on the technique used and the method of anticoagulation used. The extracted DNA can be stored almost indefinitely at an appropriate temperature. Furthermore, lymphocytes can be transformed with the Epstein-Barr virus (or other means) to propagate indefinitely in cell culture as “immortal” cell lines, thus providing a renewable source of DNA. For performing molecular genetic studies, transformed lymphoid lines are routinely the tissue of choice, because a renewable source of DNA obviates the need to obtain further blood from the family. Fibroblast-derived cultures can also serve as a permanent source of DNA or ribonucleic acid (RNA) (once transformed), but they have to be derived from surgical specimens or a biopsy. It should be noted that, because the expression of many genes is tissue specific, immortalized lymphoid or fibroblastoid cell lines cannot be used to analyze the abundance or composition of messenger RNA (mRNA) for a specific gene. Hence, studies involving mRNA necessitate the analysis of the tissue(s) expressing the gene, as outlined in the section on “RNA Analysis.” More recently, the problem of limited amounts of DNA obtainable from certain sources has been circumvented by the utilization of the polymerase chain reaction (PCR), a versatile way to faithfully multiply segments of the original DNA.

DNA is present in extremely large molecules; one of the smallest autosomes (chromosome 22) has about 50 million base pairs, and the entire haploid human genome comprises approximately 3 billion base pairs. This extreme size precludes the analysis of DNA in its native form in routine molecular biology techniques. The techniques for identification and analysis of DNA became feasible and readily accessible with the discovery of enzymes termed restriction endonucleases. These enzymes, originally isolated from bacteria, cut DNA into smaller sizes on the basis of specific recognition sites that typically vary from four to eight base pairs in length. The term restriction refers to the function of these enzymes in bacteria. A restriction endonuclease destroys foreign DNA (such as bacteriophage DNA) by cleaving the DNA at specific sites, thereby “restricting” the entry of foreign DNA into the bacterium. Several hundred restriction enzymes with different recognition sites are now commercially available. Because the recognition site for a given enzyme is fixed, the number and sizes of fragments generated from a particular DNA molecule are determined by the number and positions of its recognition sites, and the fragments form a predictable pattern after separation by electrophoresis.
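
The fixed relationship between recognition sites and fragment sizes can be illustrated with a minimal sketch. It assumes a linear toy sequence and the EcoRI recognition site (GAATTC, cut after the first G); the function name `digest` and the toy sequence are hypothetical and chosen only for illustration.

```python
def digest(sequence: str, site: str, cut_offset: int) -> list[int]:
    """Return fragment lengths after cutting a linear DNA at every `site`.

    `cut_offset` is the position inside the site where the strand is cut
    (1 for EcoRI, which cuts G^AATTC).
    """
    sequence = sequence.upper()
    cut_positions = []
    start = sequence.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    # Fragment lengths are the gaps between consecutive boundaries.
    boundaries = [0] + cut_positions + [len(sequence)]
    return [b - a for a, b in zip(boundaries, boundaries[1:])]


# Hypothetical 52-base toy sequence containing two EcoRI sites.
toy = "ATGC" * 3 + "GAATTC" + "CCGG" * 5 + "GAATTC" + "TTAA" * 2
print(digest(toy, "GAATTC", 1))  # the same input always yields the same pattern
```

Because the sites are fixed, repeating the digest on the same molecule always returns the same set of fragment lengths, which is what makes the electrophoretic pattern predictable.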

The analysis of a few hundred base pairs of DNA in the region of interest is difficult when DNA from all human chromosomes is cut and separated on the same gel. These limitations are circumvented by the technique of Southern blotting (named after its originator, Edward Southern). Southern blotting involves digestion of DNA and separation by electrophoresis through agarose. After electrophoresis, the DNA is transferred to a solid support (such as nitrocellulose or nylon membranes), enabling the pattern of separated DNA fragments to be replicated onto the membrane (Fig. 2.1). The DNA is then denatured (i.e., the two strands are physically separated) and fixed to the membrane, and the dried membrane is incubated with a solution containing the DNA probe. A DNA probe is a fragment of DNA that contains a nucleotide sequence specific for the gene or chromosomal region of interest. For purposes of detection, the DNA probe is labeled with an identifiable tag, such as radioactive phosphorus (e.g., ³²P) or a chemiluminescent moiety; the latter has now almost exclusively replaced radioactivity. The process of mixing the DNA probe with the denatured DNA fixed to the membrane is called hybridization. It relies on the principle that the four nucleic acid bases in DNA, adenine (A), thymine (T), guanine (G), and cytosine (C), always remain complementary on the two strands of DNA, with A pairing with T and G pairing with C. Following hybridization, the membrane is washed to remove the unbound probe and exposed to an x-ray film, either in a process called radioautography (also referred to as autoradiography) to detect radioactive phosphorus or in a process used to detect the chemiluminescent tag. Only those fragments that are complementary and have bound to the probe containing the DNA of interest will be evident on the x-ray film, enabling the analysis of the size and pattern of these fragments. As routinely performed, the technique of Southern analysis can detect a single copy gene in as little as 5 μg of DNA, the DNA content of about 10⁶ cells.

Restriction Fragment Length Polymorphism

Restriction fragment length polymorphism (RFLP) is a technique that is currently rarely used but is widely present in endocrine genetic literature, as a number of endocrine genetic discoveries over the last 2 to 3 decades were based on this technique. The number and size of DNA fragments resulting from the digestion of any particular region of DNA form a recognizable pattern. Small variations in a sequence among unrelated individuals may cause a restriction enzyme recognition site to be present or absent; this results in a variation in the number and size pattern of the DNA fragments produced by digestion with that particular enzyme. Thus this region is said to be polymorphic for the particular enzyme tested—that is, an RFLP ( Fig. 2.2 ). The value of RFLP is that it can be used as a molecular tag for tracing the inheritance of the maternal and paternal alleles. Furthermore, the polymorphic region analyzed does not need to encode the genetic variation that is the cause of the disease being studied, but only to be located near the gene of interest. When a particular RFLP pattern can be shown to be associated with a disease, comparing the offspring’s RFLP pattern with the RFLP pattern of the affected or carrier parents can determine the likelihood of an offspring inheriting the disease. The major limitation of the RFLP technique is that its applicability for the analysis of any particular gene is dependent on the prior knowledge of the presence of convenient (“informative”) polymorphic restriction sites that flank the gene of interest by at most a few kilobases. Because these criteria may not be fulfilled in any given case, the applicability of RFLP cannot be guaranteed for the analysis of a given gene.
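
How a single-base change produces an RFLP can be sketched as follows. The example assumes two hypothetical alleles that differ at one base within an EcoRI site (GAATTC) lying between two invariant flanking sites; the sequences and the names `internal_fragments` and `banding_pattern` are illustrative assumptions, not data from any real locus.

```python
def internal_fragments(sequence, site="GAATTC", cut_offset=1):
    """Lengths of the fragments lying between the first and last cut site."""
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    return sorted((b - a for a, b in zip(cuts, cuts[1:])), reverse=True)


def banding_pattern(allele_1, allele_2):
    """Distinct fragment sizes expected from an individual carrying both alleles."""
    sizes = set(internal_fragments(allele_1)) | set(internal_fragments(allele_2))
    return sorted(sizes, reverse=True)


# Hypothetical alleles: invariant flanking sites, one polymorphic middle site.
flank_left = "GAATTC" + "ACGT" * 5
flank_right = "TGCA" * 10 + "GAATTC"
allele_a = flank_left + "GAATTC" + flank_right  # polymorphic site present
allele_b = flank_left + "GAGTTC" + flank_right  # single-base change abolishes the site

print(internal_fragments(allele_a))         # two smaller fragments
print(internal_fragments(allele_b))         # one larger fragment
print(banding_pattern(allele_a, allele_b))  # heterozygote shows all three sizes
```

In this sketch the pattern of an offspring can be compared with the parental patterns to infer which allele was transmitted, which is the sense in which the RFLP serves as a molecular tag.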

Polymerase Chain Reaction

PCR is a technique that was developed in the mid-1980s and revolutionized molecular biology (Fig. 2.3). PCR allows the selective exponential amplification of a desired fragment of DNA from a complex mixture of DNA that theoretically contains at least a single copy of the target fragment. In the typical application of this technique, some knowledge of the DNA sequences in the region to be amplified is necessary, so that a pair of short (approximately 18–25 bases in length) specific oligonucleotides (“primers”) can be synthesized. The primers are synthesized in such a manner that they define the limits of the region to be amplified. The DNA template containing the segment that is to be amplified is heat denatured, such that the strands are separated, and then cooled to allow the primers to anneal to the respective complementary regions. The enzyme Taq polymerase, a heat-stable enzyme originally isolated from the bacterium Thermus aquaticus, is then used to initiate synthesis (extension) of DNA. The DNA is repeatedly denatured, annealed, and extended in successive cycles in a machine called the thermocycler that permits this process to be automated. In the usual assay, these repeated cycles of denaturing, annealing, and extension result in the synthesis of approximately 1 million copies of the target region in about 2 hours. To establish the veracity of the amplification process, the identity of the amplified DNA can be analyzed by electrophoresis, hybridization to RNA or DNA probes, digestion with informative restriction enzyme(s), or direct DNA sequencing. The relative simplicity combined with the power of this technique has resulted in its widespread use and has spawned a wide variety of variations and modifications developed for specific applications. From a practical point of view, the major drawback of PCR is its propensity for cross-contamination of the target DNA. This drawback is the direct result of the extreme sensitivity of the method, which permits amplification from one molecule of the starting DNA template. Thus unintended transfer of amplified sequences to items used in the procedure will amplify DNA in samples that do not contain the target DNA sequence (i.e., a false-positive result). Cross-contamination should be suspected when amplification occurs in negative controls that did not contain the target template. One of the most common modes of cross-contamination is via aerosolization of the amplified DNA during routine laboratory procedures, such as vortexing, pipetting, and manipulation of microcentrifuge tubes. Meticulous attention to experimental technique, proper organization of the PCR workplace, and inclusion of appropriate controls are essential for the successful prevention of cross-contamination during PCR experiments.
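
The arithmetic behind PCR amplification, and the way a primer pair defines the limits of the amplified region, can be sketched briefly. The sketch assumes ideal doubling in every cycle (an `efficiency` of 1.0, which real reactions only approximate); the function names, the toy template, and the 6-base primers are hypothetical and far shorter than real primers.

```python
def copies_after(cycles, starting_copies=1, efficiency=1.0):
    """Copies of target after `cycles`, assuming a constant per-cycle efficiency."""
    return starting_copies * (1.0 + efficiency) ** cycles


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplicon(template, forward_primer, reverse_primer):
    """Region of `template` bounded by the two primers, or None if either fails to anneal."""
    start = template.find(forward_primer)
    # The reverse primer anneals to the opposite strand, so on the template
    # strand we look for its reverse complement.
    end = template.find(reverse_complement(reverse_primer))
    if start == -1 or end == -1 or end < start:
        return None
    return template[start:end + len(reverse_primer)]


print(copies_after(20))  # one template molecule, 20 ideal cycles

template = "TTTT" + "ACGTAC" + "GGGGCCCC" + "TTGACA" + "AAAA"  # hypothetical
print(amplicon(template, "ACGTAC", reverse_complement("TTGACA")))
```

Under ideal doubling, 20 cycles starting from a single molecule give 2²⁰ copies, which is the origin of the "approximately 1 million copies" figure; the same exponential behavior explains why a single contaminating molecule can produce a false-positive result.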

In general, PCR applications are either directed toward the identification of a specific DNA sequence in a tissue or body fluid sample or used for the production of relatively large amounts of DNA of a specific sequence, which then are used in further studies. Examples of the first type of application are common in many fields of medicine, such as in microbiology, wherein the PCR technique is used to detect the presence of DNA sequences specific for viruses or bacteria in a biological sample. Examples of such an application in pediatric endocrinology include the use of PCR of the SRY gene for detecting Y chromosome material in patients with karyotypically defined Turner syndrome and the rapid identification of chromosomal gender in cases of fetal or neonatal sexual ambiguity ( Fig. 2.4 ).

Most PCR applications, both as research tools and for clinical use, are directed toward the production of a target DNA or the complementary DNA of a target RNA sequence. The DNA that is made (“amplified”) is then analyzed by other techniques, such as DNA sequencing.

RNA Analysis

The majority (> 95%) of the chromosomal DNA represents noncoding sequences. These sequences harbor regulatory elements, serve as sites for alternate splicing, and are subject to methylation and other epigenetic changes that affect gene function. However, at present most disease-associated mutations in human genes have been identified in coding sequences. An alternate strategy to analyze mutations in a given gene is to study its mRNA, which is the product (via transcription) of the remaining 5% of chromosomal DNA that encodes proteins. In addition, because the mRNA repertoire is cell and tissue specific, the analysis of mRNA sequences provides unique information about the tissue-specific proteins produced in a particular organ or tissue.

There are many techniques for analyzing mRNA. The oldest, once widely used but now rarely performed, is Northern blotting (so named because it is based on the same principle as the Southern blot). In Northern blotting, RNA is denatured by treating it with an agent, such as formaldehyde, to ensure that the RNA remains unfolded and in the linear form. The denatured RNA is then electrophoresed and transferred onto a solid support (such as nitrocellulose membrane) in a manner similar to that described for the Southern blot. The membrane with the RNA molecules separated by size is probed with the gene-specific DNA probe labeled with an identifiable tag that, as in the case of Southern blotting, is either a radioactive label (e.g., ³²P) or, more commonly, a chemiluminescent moiety. The nucleotide sequence of the DNA probe is complementary to the mRNA sequence of the gene and is hence called complementary DNA (cDNA). It is customary to use labeled cDNA (and not labeled mRNA) to probe Northern blots because DNA molecules are much more stable and easier to manipulate and propagate (usually in bacterial plasmids) than mRNA molecules. The Northern blot provides information regarding the amount (estimated by the intensity of the signal on radioautography) and the size (estimated by the position of the signal on the gel in comparison to concurrently electrophoresed standards) of the specific mRNA. Although the Northern blot technique represents a versatile and straightforward method to analyze mRNA, it has major drawbacks and has now been supplanted by more sensitive and less time-consuming techniques that are discussed later.

One of the most sensitive methods for the detection and quantitation of mRNA currently available is the technique of quantitative reverse transcriptase (RT)-PCR (qRT-PCR). This technique combines the unique function of the enzyme reverse transcriptase with the power of PCR. qRT-PCR is exquisitely sensitive, permitting analysis of gene expression from very small amounts of RNA. Furthermore, this technique can be applied to a large number of samples or many genes (multiplex) in the same experiment. These two critical features endow this technique with a measure of flexibility unavailable in more traditional methods, such as Northern blot or solution hybridization analysis. The first step in qRT-PCR analysis is the production of DNA complementary to the mRNA of interest (cDNA). This is done by using enzymes with RNA-dependent DNA polymerase activity that belong to the RT group of enzymes (e.g., Moloney murine leukemia virus [MMLV] or avian myeloblastosis virus [AMV] reverse transcriptase). The RT enzyme, in the presence of an appropriate primer, will synthesize DNA complementary to RNA. The second step in the qRT-PCR analysis is the amplification of the target DNA, in this case the cDNA synthesized by the RT enzyme. The specificity of the amplification is determined by the specificity of the primer pair used for the PCR amplification. To establish the veracity of the amplification process, the identity of the amplified DNA can be analyzed by electrophoresis, hybridization to RNA or DNA probes, digestion with informative restriction enzyme(s), or direct DNA sequencing.

Whereas the detection of a specific mRNA by this technique is relatively straightforward, the precise quantitation of the mRNA in a given sample is more complicated. Because the production of DNA by PCR involves an exponential increase in the amount of DNA synthesized, relatively minor differences in any of the variables controlling the rate of amplification will cause a marked difference in the yield of the amplified DNA. In addition to the amount of template DNA, the variables that can affect the yield of the PCR include the concentration of the polymerase enzyme, magnesium, nucleotides (dNTPs), and primers. The specifics of the amplification procedure, including cycle length, cycle number, and the annealing, extension, and denaturing temperatures, also affect the yield of DNA. Because of the multitude of variables involved, routine RT-PCR is unsuitable for performing a quantitative analysis of mRNA. To circumvent these pitfalls, alternate strategies have been developed. One technique for determining the concentration of a particular mRNA in a biological sample is a modification of the basic PCR technique called competitive RT-PCR. This method is based on the coamplification of a mutant DNA that can be amplified with the same pair of primers being used for the target DNA. The mutant DNA is engineered in such a way that it can be distinguished from the DNA of interest by either size or the inclusion of a restriction enzyme site unique to the mutant DNA. The addition of equivalent amounts of this mutant DNA to all the PCR reaction tubes serves as an internal control for the efficiency of the PCR process, and the yield of the mutant DNA in the various tubes can be used to normalize the yield of the target DNA. For accurate quantitation of the DNA of interest, the concentrations of the mutant and target templates should be nearly equivalent. Because the use of mutated DNA for normalization does not account for the variability in the efficiency of the RT enzyme, a variation of the original method has been developed. In this modification, competitive mutated RNA transcribed from a suitably engineered RNA expression vector is substituted for the mutant DNA in the reaction before initiating the synthesis of the cDNA. Competitive RT-PCR can be used to detect changes of the order of two- to threefold in even very rare mRNA species. The major drawback of this method is the propensity for inaccurate results caused by contamination of samples with the mRNA of interest. In theory, as the technique is based on PCR, contamination by even one molecule of the mRNA of interest can invalidate the results. Hence, scrupulous attention to laboratory technique and setup is essential for the successful application of this technique.

In general, two types of methods are used for the detection and quantitation of PCR products: the “end-point” measurements of products and the newer “real-time” techniques. End-point determinations (e.g., the competitive RT-PCR technique described earlier) analyze the reaction after it is completed, whereas real-time determinations are made during the progression of the amplification process. In general, the real-time approach is more accurate and is currently the preferred method. Advances in fluorescence detection technologies have made real-time measurement possible for routine use in the laboratory. One of the popular techniques that takes advantage of real-time measurements is the TaqMan (fluorescent 5′ nuclease) assay (Fig. 2.5). The unique design of TaqMan probes, combined with the 5′ nuclease activity of the PCR enzyme (Taq polymerase), allows direct detection of PCR product by the release of a fluorescent reporter during the PCR amplification by using specially designed machines (ABI Prism 5700/7700). The TaqMan probe consists of an oligonucleotide synthesized with a 5′-reporter dye (e.g., FAM; 6-carboxy-fluorescein) and a downstream, 3′-quencher dye (e.g., TAMRA; 6-carboxy-tetramethyl-rhodamine). When the probe is intact, the proximity of the reporter dye to the quencher dye results in suppression of the reporter fluorescence, primarily by Förster-type energy transfer. During PCR, forward and reverse primers hybridize to a specific sequence of the target DNA. The TaqMan probe hybridizes to a target sequence within the PCR product. The Taq polymerase enzyme, because of its 5′→3′ nuclease activity, subsequently cleaves the TaqMan probe. The reporter dye and the quencher dye are separated by cleavage, resulting in increased laser-stimulated fluorescence of the reporter dye as a direct consequence of target amplification during PCR. This process occurs in every cycle and does not interfere with the exponential accumulation of product. Both primer and probe must hybridize to the target for amplification and cleavage to occur. The fluorescence signal is generated only if the target sequence for the probe is amplified during PCR. Because of these stringent requirements, nonspecific amplification is not detected. Fluorescent detection takes place through fiber optic lines positioned above optically nondistorting tube caps. Quantitative data are derived from a determination of the cycle at which the amplification product signal crosses a preset detection threshold. This cycle number (the threshold cycle) is inversely related to the logarithm of the amount of starting material: the more template present, the earlier the threshold is crossed. This relationship allows for a measurement of the level of specific mRNA in the sample. An alternate machine (Light Cycler) also uses fluorogenic hydrolysis or fluorogenic hybridization probes for quantification in a manner similar to the ABI system.
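
The relationship between the threshold cycle and the amount of starting material can be made concrete with a minimal sketch. It assumes ideal exponential amplification and an arbitrary detection threshold expressed in copies; the function names, the threshold value, and the use of a dilution-series standard curve are illustrative assumptions rather than the procedure of any particular instrument.

```python
import math


def threshold_cycle(starting_copies, threshold_copies=1e10, efficiency=1.0):
    """Fractional cycle at which product reaches the threshold (ideal growth assumed)."""
    return math.log(threshold_copies / starting_copies) / math.log(1.0 + efficiency)


def fit_standard_curve(standards):
    """Least-squares line Ct = slope * log10(copies) + intercept from (copies, Ct) pairs."""
    xs = [math.log10(copies) for copies, _ in standards]
    ys = [ct for _, ct in standards]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate starting copies of an unknown sample."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical tenfold dilution series used as standards.
standards = [(copies, threshold_cycle(copies)) for copies in (1e2, 1e3, 1e4, 1e5)]
slope, intercept = fit_standard_curve(standards)
unknown_ct = threshold_cycle(5e3)
print(round(slope, 2), round(copies_from_ct(unknown_ct, slope, intercept)))
```

In this idealized model each tenfold increase in template lowers the threshold cycle by about 3.3 cycles (log₂ 10), so a sample's threshold cycle can be read against the standard curve to estimate its starting amount.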

MicroRNA

One of the significant advances of the early 2000s in the field of RNA biology was the discovery of small (20–30 nucleotide) noncoding RNAs. In general, there are two categories of small noncoding RNAs: microRNA (miRNA) and small interfering RNA (siRNA). miRNAs are expressed products of an organism’s own genome, whereas siRNAs are synthesized in the cells from foreign double-stranded RNA (e.g., from viruses or transposons or from synthetic DNA introduced into the cell to study the function of a particular gene/process). In addition, there are differences in the biogenesis of these two classes of small RNAs. These differences notwithstanding, the overall biological effect of these small RNAs is translational repression or target degradation and gene silencing by binding to complementary sequences on the 3′ untranslated region of target mRNA; positive regulation of gene expression via such a mechanism is distinctly uncommon. The complexity of the phenomenon is increased by the fact that, in a cell- or tissue-specific context, a single miRNA can target multiple mRNAs and more than one miRNA can recognize the same mRNA target to amplify and strengthen the translational repression of the target gene. It is estimated that this phenomenon is present in several cell types and that the human genome codes for more than 1000 miRNAs that could target 60% to 70% of mammalian genes. miRNA-mediated events have been implicated in the regulation of cell growth and differentiation, apoptosis, and other cellular processes. To date, the major impact of the discovery of miRNA has been in the fields of developmental biology, organogenesis, and cancer. miRNA and miRNA-related events (e.g., proteins involved in miRNA processing) have been directly implicated in only a small number of nonneoplastic endocrine disorders (e.g., DiGeorge syndrome and X-linked mental retardation). It is predicted that as we learn more about the basic biology of this process, small noncoding RNAs will be implicated in the pathogenesis of a wider spectrum of endocrine diseases.

Detection of mutations in genes

Changes in the structural organization of a gene that impact its function involve deletions, insertions, or transpositions of relatively large stretches of DNA, or more frequently single-base substitutions in functionally critical regions. High throughput or next-generation sequencing (NGS) has revolutionized the identification of mutations in genes.

Direct Methods

DNA sequencing is the current gold standard for obtaining unequivocal proof of a point mutation. However, DNA sequencing has its limitations and drawbacks. A clinically relevant problem is that current DNA sequencing methods do not reliably and consistently detect all mutations. For example, in many cases where the mutation affects only one allele (heterozygous), the heights of the peaks of the bases on the fluorescent readout corresponding to the wild-type and mutant allele are not always present in the predicted (1:1) ratio. This limits the discerning power of “base calling” computer protocols and results in inconsistent or erroneous assignment of DNA sequence to individual alleles. Because of this limitation, clinical laboratories routinely determine the DNA sequence of both the alleles to provide independent confirmation of the absence/presence of a putative mutation. DNA sequencing can be labor intensive and expensive, although advances in pyrosequencing (discussed later), for example, have made it technically easier and cheaper.

Although the first DNA sequences were determined with a method that chemically cleaved the DNA at each of the four nucleotides, the enzymatic or dideoxy method developed by Sanger and colleagues in 1977 became the most commonly used for routine purposes (Fig. 2.6). This method uses the enzyme DNA polymerase to synthesize a complementary copy of the single-stranded DNA (“template”) whose sequence is being determined. Single-stranded DNA can be obtained directly from viral or plasmid vectors that support the generation of single-stranded DNA or by partial denaturing of double-stranded DNA by treatment with alkali or heat. The enzyme DNA polymerase cannot initiate synthesis of a DNA chain de novo but can only extend a fragment of DNA. Hence the second requirement for the dideoxy method of sequencing is the presence of a “primer.” A primer is a synthetic oligonucleotide, 15 to 30 bases long, whose sequence is complementary to the sequence of the short corresponding segment of the single-stranded DNA template. The dideoxy method exploits the observation that DNA polymerase can use both dNTP and 2′,3′-dideoxynucleoside triphosphates (ddNTPs) as substrates during elongation of the primer. Whereas DNA polymerase can use dNTP for continued synthesis of the complementary strand of DNA, the chain cannot elongate any further after addition of the first ddNTP, because ddNTPs lack the crucial 3′-hydroxyl group. To identify the nucleotide at the end of the chain, four reactions are carried out for each sequence analysis, with only one of the four possible ddNTPs included in any one reaction. The ratio of the ddNTP and dNTP in each reaction is adjusted so that these chain terminations occur at each of the positions in the template where the nucleotide occurs. To enable detection, the newly synthesized DNA is labeled, either by including radioactively labeled dATP in the reaction mixture (in the older manual methods, which are read by radioautography) or, most commonly at present, by using fluorescent dye terminators (in automated techniques). The newly synthesized DNA strands are separated by high-resolution denaturing polyacrylamide electrophoresis in manual methods or by capillary electrophoresis in automatic sequencers. Fluorescent detection methods have enabled automation and enhanced throughput. In capillary electrophoresis, DNA molecules are driven to migrate through a viscous polymer by a high electric field to be separated on the basis of charge and size. Although this technique is based on the same principle as that used in slab gel electrophoresis, the separation is done in individual glass capillaries rather than gel slabs, facilitating loading of samples and other aspects of automation. Whereas manual methods allow the detection of about 300 nucleotides of sequence information with one set of sequencing reactions, automated methods using fluorescent dyes and laser technology can analyze 7500 or more bases per reaction. To sequence larger stretches of DNA, it is necessary to divide the large piece of DNA into smaller fragments that can be individually sequenced. Alternatively, additional sequencing primers can be chosen near the end of the previous sequencing results, allowing the initiation point of new sequence data to be moved progressively along the larger DNA fragment.
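
The logic of reading a sequence from four chain-termination reactions can be sketched as follows. This is a simplified model that assumes every possible termination product is present and perfectly resolved by size; the function names and the toy sequence are hypothetical.

```python
def chain_termination_fragments(new_strand):
    """For each ddNTP reaction, the lengths of the terminated chains.

    `new_strand` is the strand synthesized by the polymerase (5' to 3'),
    with lengths counted from the first base added to the primer.
    """
    reactions = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand, start=1):
        # A chain of this length ends in `base`, so it appears only in
        # the reaction that contained the matching ddNTP.
        reactions[base].append(length)
    return reactions


def read_gel(reactions):
    """Read the four lanes from shortest to longest fragment."""
    bands = sorted((length, base)
                   for base, lengths in reactions.items()
                   for length in lengths)
    return "".join(base for _, base in bands)


lanes = chain_termination_fragments("GATTACA")  # hypothetical newly made strand
print(lanes)
print(read_gel(lanes))
```

Because each fragment length occurs in exactly one of the four reactions, ordering the fragments by size and noting the reaction in which each appears recovers the sequence of the newly synthesized strand, which is complementary to the template.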

One of the seminal technological advances has been the introduction of microarray-based methods for the detection and analysis of nucleic acids. Microarrays contain thousands of oligonucleotides deposited or synthesized in situ on a solid support, typically a coated glass slide or a membrane. In this technique, a robotic device is used to print DNA sequences onto the solid support. The DNA probes immobilized on the microarray slide as spots can be cloned cDNA or gene fragments (expressed sequence tags [ESTs]), or oligonucleotides corresponding to known genes or putative open reading frames. The arrays are hybridized with fluorescent targets prepared from RNA extracted from the tissue or cells of interest; the RNA is labeled with fluorescent tags, such as Cy3 and Cy5. The prototypic microarray experimental paradigm consists of comparing mRNA abundance in two different samples. One fluorescent target is prepared from control mRNA, and the second target, with a different fluorescent label, is prepared from mRNA isolated from the treated cells or tissue under investigation. Both targets are mixed and hybridized to the microarray slide, resulting in target gene sequences hybridizing to their complementary sequences on the microarray slide. The microarray is then excited by laser, and the fluorescent intensity of each spot is determined, with the relative intensities of the two colored signals on individual spots being proportional to the amounts of specific mRNA transcripts in each sample (Fig. 2.7). Analysis of the fluorescent intensity data yields an estimate of the relative expression levels of the genes in the test sample and the control sample. Microarrays enable individual investigators to perform large-scale analyses of model organisms and to customize arrays for special genome applications.
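
The comparison of two-color signal intensities is commonly summarized as a log ratio per spot, which can be sketched as follows. The assignment of Cy5 to the treated sample and Cy3 to the control, the gene names, and the intensity values are all hypothetical; background correction and normalization, which real analyses require, are omitted.

```python
import math


def log2_ratios(treated_intensity, control_intensity):
    """Per-gene log2(treated / control) from two dictionaries of spot intensities."""
    return {
        gene: math.log2(treated_intensity[gene] / control_intensity[gene])
        for gene in treated_intensity
    }


# Hypothetical spot intensities (arbitrary fluorescence units).
cy5_treated = {"gene_a": 800.0, "gene_b": 100.0, "gene_c": 500.0}
cy3_control = {"gene_a": 200.0, "gene_b": 400.0, "gene_c": 500.0}

for gene, ratio in log2_ratios(cy5_treated, cy3_control).items():
    print(gene, ratio)
```

On this scale a value of 0 indicates equal abundance in the two samples, positive values indicate higher abundance in the treated sample, and each unit corresponds to a twofold difference.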

The method of choice for global expression profiling depends on several factors, including technical aspects, labor, price, time, and effort involved, and, most important, the type of information that is sought. Technical advances in the development of expression arrays, their abundance and commercial availability, and the relative speed with which analysis can be done are all factors that make arrays more useful in routine applications. In addition, array content can now be readily customized to cover from gene clusters and pathways of interest to the entire genome: some studies examine series of tissue-specific transcripts or genes known to be involved in particular pathology; others directly use arrays covering the whole genome. Another factor that needs to be considered before embarking on any high-throughput approach is whether individual or pooled samples will be investigated. Series of pooled samples reduce the price, the time spent, and the number of the experiments down to the most affordable. Investigating individual samples, however, is important for identifying unique expression ratios in a given type of tissue or cell. There are limitations of microarray-based techniques; for example, similar to direct DNA sequencing methods, microarray-based methods also suffer from the disadvantage of not being able to reliably and consistently detect heterozygous mutations. Furthermore, microarrays cannot be used to detect insertions of multiple nucleotides without exponentially increasing the number of oligonucleotides that must be immobilized on the glass slides.

A more reliable technique in mutation identification is pyrosequencing, which is based on an enzymatic real-time monitoring of DNA synthesis by bioluminescence ( Fig. 2.8 ). Pyrosequencing is performed by the addition of dNTPs individually, in a predefined dispensation order, so that the nascent nucleotide chain is extended one nucleotide residue per dispensation event. Detection of nucleotide sequence is performed by way of a chain of enzymatic reactions involving the activities of DNA polymerase, apyrase, ATP sulfurylase, and luciferase, respectively, allowing for the incorporation of complementary nucleotide, degradation of unused dNTP, generation of luciferase-substrate from pyrophosphate and adenosine 5’-phosphosulfate, and emission of light from the ATP-driven conversion of luciferin to oxyluciferin. Incorporation of a particular nucleotide is displayed graphically in the form of a chart recording of nucleotide dispensation event versus the intensity of emitted light. This cascade of enzyme reactions is quantitative, in that increased light intensity is produced upon incorporation of multiple nucleotides.
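
The dispensation scheme and the quantitative nature of the light signal can be illustrated with a minimal simulation. It assumes an idealized reaction in which light intensity is exactly proportional to the number of nucleotides incorporated per dispensation; the cyclic dispensation order, the function names, and the toy sequence are hypothetical.

```python
def pyrogram(strand_to_synthesize, dispensation_order="ACGT", max_dispensations=100):
    """Return (nucleotide, peak height) for each dispensation event.

    Peak height is the number of consecutive bases incorporated, so a
    run of identical bases gives a proportionally taller peak.
    """
    peaks = []
    position = 0
    event = 0
    while position < len(strand_to_synthesize) and event < max_dispensations:
        nucleotide = dispensation_order[event % len(dispensation_order)]
        incorporated = 0
        while (position < len(strand_to_synthesize)
               and strand_to_synthesize[position] == nucleotide):
            incorporated += 1
            position += 1
        peaks.append((nucleotide, incorporated))  # height 0 means no light emitted
        event += 1
    return peaks


def sequence_from_pyrogram(peaks):
    """Reconstruct the synthesized sequence from the peak heights."""
    return "".join(nucleotide * height for nucleotide, height in peaks)


peaks = pyrogram("GGATCC")  # hypothetical strand being synthesized
print(peaks)
print(sequence_from_pyrogram(peaks))
```

In this sketch the two consecutive G residues and the two consecutive C residues each produce a peak of height 2, mirroring the increased light intensity observed when multiple nucleotides are incorporated in a single dispensation event.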