Mechanisms of Mutation




Abstract


Mutation is a sudden, heritable change in DNA sequence. Single nucleotide variation (SNV), insertion or deletion of a few nucleotides (indels), rearrangements, change in the number of copies of larger stretches of DNA (copy number variation, CNV), change in the structure or number of chromosomes, and insertion or movement of transposable elements are all mutations. Many of these changes occur as a result of replication and recombination. Others result from the alteration of DNA by exogenous agents (mutagens) followed by replication by polymerases of relaxed specificity. The mutation rate reflects the balance between error-producing processes and repair processes that detect changes and restore DNA to the parental sequence. Studies of the cancer genome have identified new mechanisms for the production of multiple closely linked mutations. A portion of human genetic disease is the result of de novo mutation. However, it is still not possible to predict the precise phenotypic consequence of a particular mutation in a particular individual.




Keywords

transition, transversion, indel, single nucleotide variation (SNV), copy number variation (CNV), excision repair, mismatch repair (MMR), double-strand break repair (DSBR), translesion synthesis, recombination

 




Introduction


Darwin realized the need for variation to provide a basis for natural selection, but he had no way of understanding the mechanisms by which such variation arose. Vague ideas of the origin of variation in the late nineteenth century gave way to the term mutation, coined by de Vries to describe the discontinuous variation associated with Mendelian traits. Genes were first recognized and defined by mutations with an extreme phenotype (see Table 1.1). Further progress led to conceptualizing the gene as a more complex structure with multiple “sites” for mutation available within the same gene. The nature of such sites was not clear, nor was the relationship between the physiological effects of gene mutations and the structural change involved. A typical pre-Watson–Crick examination question, “What is a gene?”, could be answered in terms of function, of mutation, or of recombination, and the question for geneticists was the relationship between these definitions.



Table 1.1

Special Abbreviations and Definitions

Term/Abbreviation | Definition
Abasic (apurinic/apyrimidinic) site | Site in DNA missing a base attached to the 1′-position of the sugar
APOBEC/AID | Apolipoprotein B mRNA editing enzyme (APOBEC) and activation-induced deaminase (AID). A family of cytidine deaminases
Aneuploidy | Eucaryotic cells with the normal diploid (2n) number of chromosomes are “euploid.” “Haploid” cells are n. “Polyploid” cells are 3n, 4n, etc. “Aneuploid” cells are 2n ± a number other than n
Base excision repair (BER) | A repair mechanism in which single nucleotide bases are removed and replaced by a patch of one or, at most, a few nucleotides
Chromothripsis | Multiple localized chromosome rearrangements occurring in a single event and in one or a few chromosomes
Copy number variation (CNV) | Altered number of copies of a gene or extended DNA sequence present in the genome
Double-strand break repair (DSBR) | Joining together of two DNA fragments to make a single molecule
Epigenetic | Heritable changes in gene expression that cannot be tied to DNA sequence variation and involving the active perpetuation of local chromatin states
Fidelity | A measure of the relative ability of DNA polymerases to insert the “correct” complementary base
Frameshift mutation | The insertion or deletion of a number of nucleotides not divisible by 3, properly speaking in a coding region of a gene. Largely replaced by the term “indel.”
Genome | (1) The complete set of genetic material present in an organism. (2) The complete sequence of the DNA in an organism
Genotype | The genetic constitution of an organism
Holliday junction | A mobile junction formed in recombination between four strands of DNA. It is “resolved” by specific enzymes to regenerate two double-stranded molecules
Homologous recombination (HR) | A DSBR process involving the use of an allelic DNA sequence as a source of information
Indel | Insertion or deletion of a small number of nucleotides in the DNA structure
Insertional mutagenesis | Mutation by insertion of one or more nucleotides. Often used to denote inactivation of the genes by insertion of large transposable elements
Inversion | A rearrangement of the chromosome so that the order of the nucleotide pairs is reversed: if the normal order is ABCDEF, the order AEDCBF would constitute an inversion
Kataegis | Multiple, localized mutations, mostly C→T
L1 element | A common retrotransposon found in the human genome
Microhomology-mediated end joining (MMEJ) | An end joining DSBR mechanism utilizing the homology of a relatively few bases to orient the broken strands
Mismatch repair (MMR) | An excision repair process mainly devoted to correcting errors in replication
Missense mutation | A change in a gene, which results in a change in the meaning of a codon, e.g., the change from GAA (glutamic acid) to GUA (valine)
Mobile element insertion (MEI) | See transposon below. Mutational event in which a mobile element is inserted at a new position in the genome
Mutator | A mutation, often of a repair gene, that has the effect of increasing the spontaneous mutation rate
Nucleotide excision repair (NER) | The paradigm of an excision repair pathway. NER recognizes a wide range of damage and proceeds by cutting out and replacing an extensive series of nucleotides
Nonallelic homologous recombination (NAHR) | HR in which the complement is a homologous sequence other than the normal allele and which can lead to chromosome aberrations
Nonsense mutation | A mutation that results in one of the termination codons UAA, UAG, or UGA
Phenotype | The observable traits of an organism
Point mutation | A mutation involving one or a few nucleotides as distinguished from insertions, deletions, and duplications involving hundreds, thousands, or more nucleotides
Proofreading | In DNA synthesis, the process where an exonuclease checks a newly inserted nucleotide for goodness of fit. Sometimes referred to as editing
Pseudogene | A copy of a gene made inactive by the accumulation of mutations and often devoid of introns
Retrotransposon | A transposable element that can shift its position in DNA via an RNA intermediate
Reactive oxygen species (ROS) | Chemically reactive radicals containing oxygen formed in metabolism and produced in clusters by ionizing radiation
Somatic hypermutation (SHM) | Process producing multiple mutations in mature B cells, mostly but not exclusively in the immunoglobulin gene during antibody maturation
Single nucleotide variation (SNV) | A point mutation involving a single nucleotide pair
Syn/anti base configuration | In the anti configuration, the bulky part of the base of a nucleoside or nucleotide rotates away from the sugar. In the syn configuration the bulky part rotates over the sugar
Synonymous/silent mutation | A nucleotide change that does not change the meaning of a codon, e.g., the change from GGU (glycine) to GGA (glycine). Not all synonymous mutations are silent, that is, without phenotypic effect
Translocation | Attachment of a segment of one chromosome to a different (nonhomologous) chromosome
Transposition | The movement of a transposable element from one position in the genome to another
Transition | The mutational change from a purine to another purine or a pyrimidine to another pyrimidine. G↔A and C↔T are the possible transitions
Transposon, mobile element (ME) | A DNA sequence able to move from one position to another within the genome. Movements are generally rare and are catalyzed by special enzymes coded for by the transposon
Transversion | The mutational change from a purine to a pyrimidine or a pyrimidine to a purine. A↔T, G↔T, C↔G, C↔A are possible transversions
Translesion synthesis (TLS) | Synthesis of DNA by specialized polymerases utilizing a damaged template
Transcription coupled nucleotide excision repair (TC-NER) | Specialized NER mechanism targeted to genes in the process of transcription
Ubiquitin | A conserved small (76 amino acids in humans) protein, which when covalently added to proteins in single or multiple copies serves as a signal for processes such as degradation and/or changes in conformation


Modern understanding of the mechanism of mutation is based on the Watson–Crick DNA structure. Recognition of the importance of nucleotide sequence, followed by the deciphering of the genetic code, led to a change from a biological and formalistic or mathematical view of mutation to a more biochemical approach. This chapter presents the problem of mutation as mainly one of biochemistry.


The immediate response of investigators to the Watson–Crick structure was to focus attention on the base changes that resulted in mutation and on the chemical changes that might alter base pairing. The specificity of particular mutagenic agents was initially ascribed to chemical changes in either the incoming or template nucleotide, resulting in altered pairing properties, mainly involving hydrogen bonding. Benzer and Freese and Brenner et al. defined mutation in terms of substitutions, additions, and deletions of nucleotides. DNA in eukaryotes is organized into discrete chromosomes. Changes in the structure (rearrangements and translocations) and numerical distribution of these chromosomes that leave a viable organism are also mutations, but it is only recently, with the availability of extensive DNA sequence information, that these changes can even be partially accounted for biochemically.


Advances in our understanding of the complex biochemistry of DNA replication and its interaction with the various DNA repair and recombination pathways have led to a more mechanistic approach to understanding mutation (Fig. 1.1). The discovery in the late 1990s of a series of DNA polymerases with altered fidelity and ability to replicate past damaged sites in DNA advanced a view of mutation as an event involving both initial changes in the DNA and the interaction of these changes with the protein complement of the cell. Most recently, the advent of rapid and relatively inexpensive DNA sequencing technology has permitted a direct measurement of normal human mutation rates and the recognition that de novo mutation plays a role in human disease. The identification of thousands of mutational changes in individual tumors, only a small minority of which are involved as “drivers” in the etiology of the tumors, has permitted recognition of a set of mutational “signatures” that implicate particular repair processes in the generation of the mutations. The observation of numerous closely linked mutations in tumor cells suggests the operation of unique mutagenic mechanisms whose operation in normal cells remains an open question.




Figure 1.1


The interactions between repair and mutation.

Abbreviations: TC-NER, transcription-coupled nucleotide excision repair; BER, base excision repair; MMR, mismatch repair; HR, homologous recombination; NAHR, nonallelic homologous recombination; TLS, translesion synthesis; NER, nucleotide excision repair; DSBR, double-strand break repair; NHEJ, nonhomologous end joining; MMEJ, microhomology-mediated end joining; CNV, copy number variation. See Table 1.1 for definitions.




The types of mutation


Mutations are defined in this chapter as changes in the parental sequence of the DNA (Table 1.1). This definition is not without problems, since it is sometimes difficult to distinguish such changes from the normal process of recombination. Mutations include single nucleotide variation (SNV), insertion or deletion of small numbers of nucleotides (indels, frameshifts), rearrangements of the DNA sequence, change in the number of copies of larger stretches of DNA (copy number variation, CNV), and changes in the structure (inversions and translocations) or number of chromosomes (aneuploidy). The insertion or movement of transposable elements may affect phenotype and be obviously mutagenic. Nucleotide changes may occur outside the exome, the protein coding region of the genome, and these may or may not have an observable effect on phenotype. This view of mutation as a sequence change anywhere in the genome is a product of the sequencing revolution, since the recognition of mutation in the presequencing era required some observable change in the phenotype. The definition of mutation as any change in DNA sequence results in classifying sequence changes that have no obvious phenotypic effect as mutations. There are about 20,000–25,000 human genes, and the exome comprises about 2% of the total number of nucleotides. Much of the remainder of the DNA is transcribed into RNA, and some of this plays an important regulatory role in gene function, but as yet there is no automatic way to predict whether, or how, a change in DNA sequence will affect physiology.


The possible single base changes were first cataloged by Ernst Freese and Seymour Benzer. Freese coined the term “transition” to denote the change from one purine to another, or of one pyrimidine to another. The four possible transitions are cytosine (C) to thymine (T) and its reverse, and adenine (A) to guanine (G) and its reverse. Freese defined “transversions” as changes from a purine to a pyrimidine or the reverse, that is, from an A or a G to a C or a T, or from a C or a T to an A or a G. These definitions remain even though some of the putative transversions described were actually insertions or deletions of a few nucleotides. These were later called frameshifts because of the discovery that the genetic code was read in groups of three nucleotides to specify a particular amino acid. The addition (or deletion) of any number of nucleotides not divisible by three results in a change in the reading “frame,” thereby changing every amino acid specified downstream of the change. The details of the genetic code, as elucidated in the 1960s, also indicated that such frameshifts might not only result in major changes in the amino acid composition of a protein but might also produce unexpected termination codons as a result of the shift. Point mutations that resulted in protein termination were at first termed “nonsense” mutations, as opposed to missense mutations, which resulted in the substitution of one amino acid for another. The nonsense mutations did not make “sense,” that is, did not specify any amino acid. There are three such codons, now called termination (ter) codons: UAA, UAG, and UGA. Since messenger RNA is the molecule that is actually read by the protein-synthesizing machinery, the code is an RNA code with U(racil) substituted for T(hymine).
One of the stop codons, UGA, is read as tryptophan in mitochondria, and the mitochondrial code includes a few other variations: AGG and AGA are mitochondrial stop codons instead of coding for arginine, and AUA codes for methionine instead of isoleucine. There are 64 possible codons but only 20 (or 21 including selenocysteine) natural amino acids incorporated into protein, and the codes are degenerate, or redundant, in that several codons can specify the same amino acid. Selenocysteine is synthesized on a specialized tRNA that is first charged with serine (Ser-tRNA) and is coded for by an in-frame UGA stop codon. The mechanism by which particular UGA sites are selected for selenocysteine incorporation requires specialized translation factors and a cis-acting selenocysteine insertion sequence (SECIS) on the mRNA.
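The classification of single base changes and their coding consequences can be sketched in a few lines of Python. This is only an illustrative sketch: the names `substitution_class`, `codon_effect`, and `VERT_MITO_CODE` are hypothetical, the mitochondrial table applies only the four differences mentioned in the text, and the “stop-loss” label is added here for completeness and is not a term used in this chapter.

```python
from itertools import product

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T", "U"}

def substitution_class(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single base change."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt or not {ref, alt} <= PURINES | PYRIMIDINES:
        raise ValueError("expected two different bases from A, C, G, T/U")
    # Purine<->purine or pyrimidine<->pyrimidine is a transition.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"

# Standard genetic code as RNA codons; '*' marks a termination codon.
BASES = "UCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
STANDARD_CODE = {"".join(codon): aa
                 for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Assumption: only the mitochondrial differences named in the text are applied.
VERT_MITO_CODE = {**STANDARD_CODE, "UGA": "W", "AGA": "*", "AGG": "*", "AUA": "M"}

def codon_effect(ref_codon: str, alt_codon: str, code=STANDARD_CODE) -> str:
    """Classify a codon change as synonymous, missense, nonsense, or stop-loss."""
    ref_aa, alt_aa = code[ref_codon.upper()], code[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    if ref_aa == "*":
        return "stop-loss"  # label added for this sketch, not from the text
    return "missense"

if __name__ == "__main__":
    print(substitution_class("C", "T"))                 # pyrimidine to pyrimidine
    print(codon_effect("GAA", "GUA"))                   # Glu to Val
    print(codon_effect("UGG", "UGA"))                   # Trp to stop (standard code)
    print(codon_effect("UGG", "UGA", VERT_MITO_CODE))   # Trp to Trp (mitochondria)
```

Passing the code table as a parameter keeps the same classifier usable for the nuclear and mitochondrial codes, where the same nucleotide change can have different consequences.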


Point mutations within genes that do not change the meaning (amino acid coded) of the codon are termed synonymous, or “silent,” as opposed to nonsynonymous changes. “Silent” mutations may actually affect physiology when they are part of splice sites and because of the different availability of various tRNAs. “Synonymous” is probably the better term. Although there was some initial confusion about the necessity of punctuation between the triplet codons, it was realized that if reading of the code began at a fixed site, and if the reading “frame” was designed to read three nucleotides at a time, the correct sequence of amino acids would be automatically produced. This terminology was developed before it was realized that there were large amounts of noncoding DNA. “Frameshift” has no meaning for mutations within such noncoding regions. A more recent term for small insertions or deletions, regardless of their physiological effect, is “indel,” although the frameshift terminology continues to be used when appropriate.
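The distinction between a frameshift and an in-frame indel reduces to whether the net length change is divisible by three. The following is a minimal sketch under that rule; the function name `indel_effect` and its `coding` flag are hypothetical, and the sketch ignores splice sites and other context that a real annotation tool would consider.

```python
def indel_effect(ref: str, alt: str, coding: bool = True) -> str:
    """Classify a small insertion/deletion by its net length change."""
    delta = len(alt) - len(ref)
    if delta == 0:
        raise ValueError("not an indel: ref and alt have the same length")
    kind = "insertion" if delta > 0 else "deletion"
    if not coding:
        # Outside coding sequence there is no reading frame to shift.
        return f"{kind} (indel; reading frame not applicable)"
    if abs(delta) % 3 == 0:
        return f"in-frame {kind}"
    return f"frameshift {kind}"

if __name__ == "__main__":
    print(indel_effect("A", "AG"))                  # +1 nucleotide
    print(indel_effect("ACGT", "A"))                # -3 nucleotides
    print(indel_effect("A", "AG", coding=False))    # noncoding region
```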


Change in the structure of other cellular constituents (e.g., membranes, prions) may also be heritable and alter physiology. Methylation of the DNA base cytosine and modifications of the histones constitute a set of markers that can alter cell physiology and direct patterns of differentiation. Some of these modifications are propagated through mitosis and constitute a mechanism for somatic inheritance and for tissue differentiation. Certain of these markers can survive meiosis, but the situation is complicated in large part because of the massive removal of tissue-specific markers in the period between the formation of the gametes and early differentiation, followed by their replacement. Cytosine methylation in DNA is carried out by specific maintenance and de novo methylation enzymes. The signal for maintenance is the presence of a methyl group on the parental strand of a 5′CpG sequence. Methylated promoters are usually associated with gene inactivation. Environmental influences can affect DNA methylation and gene function, and such environmental effects can persist through several generations. Perhaps the best-known study is the demonstration of low birth weight in offspring of Dutch mothers pregnant during the 1944–1945 famine. The general biological significance of such epigenetic marks (see Table 1.1) is clearly a matter of current interest, since it reintroduces a Lamarckian cast to molecular biology. Epigenetic changes can mimic mutational ones since they affect phenotype. In addition, the rate of somatic mutation can vary as much as 10-fold according to tissue, and the determining factor is apparently the distribution of (epigenetic) markers.
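The maintenance rule stated above, that a methyl group on the parental strand of a CpG is the signal for methylating the new strand, can be illustrated with a deliberately simplified sketch. The function names and the site-level representation (each CpG identified by the index of its C on the top strand) are assumptions made for illustration; de novo methylation and enzyme kinetics are not modeled.

```python
def cpg_sites(top_strand: str) -> list[int]:
    """Return the index of the C in every 5'-CG-3' dinucleotide."""
    seq = top_strand.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]

def maintain_methylation(top_strand: str, parental_methylated: set[int]) -> set[int]:
    """Sites methylated on the newly made strand after maintenance methylation.

    CpG is palindromic, so every site carries a C on both strands. After
    replication the duplex is hemimethylated; the new strand is methylated
    only at CpG sites where the parental strand already carries the mark.
    """
    return {site for site in cpg_sites(top_strand) if site in parental_methylated}

if __name__ == "__main__":
    seq = "ACGTTCGACG"                 # CpG sites at indices 1, 5, and 8
    parental = {1, 8}                  # site 5 is unmethylated in the parent
    print(cpg_sites(seq))
    print(sorted(maintain_methylation(seq, parental)))
```

In this model an unmethylated site stays unmethylated through division, which is the sense in which the pattern, rather than the sequence, is inherited.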




The mechanisms of mutation


Base Pairing and the Action of Mutagenic Agents


It was first assumed that the fidelity of normal replication stemmed from the stability of the A:T and G:C base pairs resulting from hydrogen bonding. The early workers on the molecular nature of mutations accounted for the specificity of a variety of base analogs and other mutagenic agents by drawing acceptable alternative base pairings, resulting from the incorporation of these compounds into DNA or by their reaction with DNA nucleotides. Alkylating agents, such as methylnitrosourea and the chemotherapeutically active mustard gas derivatives, were shown to react with individual nucleotides to produce multiple changes. Production of O⁶-methylguanine by agents such as methylnitrosourea or methylnitronitrosoguanidine was shown to promote mistaken base pairing, making understandable the highly mutagenic characteristics of such compounds. A major development was the discovery that metabolic systems in the host activated ingested compounds, making it possible for them to react with DNA. Carcinogenic polycyclic hydrocarbons, including those present in tobacco smoke, and aflatoxins are converted to epoxide derivatives with the participation of the cytochrome P450 system. These epoxides react directly with DNA, producing mutagenic adducts.


We live in an environment that is not friendly to DNA. We are essentially 55.6 M water. Given the law of mass action, hydrolytic reactions are inevitable. It has been estimated that we lose about 18,000 bases per cell in every 24-h period as a result of spontaneous hydrolysis of the glycosidic bond. The abasic sites so created are targets for base excision repair (see Modifiers of Mutation Not Associated with Replication), but those that survive are mutagenic.
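The two figures in this paragraph can be checked with simple arithmetic. The sketch below assumes a water density of 1000 g/L and a rounded molar mass of 18 g/mol (both assumptions, not values given in the text), and converts the quoted daily base loss into hourly and per-second rates. The function names are hypothetical.

```python
def water_molarity(grams_per_liter: float = 1000.0,
                   grams_per_mole: float = 18.0) -> float:
    """Molar concentration of pure water under the assumed density and molar mass."""
    return grams_per_liter / grams_per_mole

def base_loss_rates(bases_per_cell_per_day: float = 18_000) -> dict[str, float]:
    """Convert the daily spontaneous base loss per cell into finer time units."""
    per_hour = bases_per_cell_per_day / 24
    return {"per_hour": per_hour, "per_second": per_hour / 3600}

if __name__ == "__main__":
    print(f"water: {water_molarity():.1f} M")
    rates = base_loss_rates()
    print(f"base loss: {rates['per_hour']:.0f} per cell per hour, "
          f"{rates['per_second']:.2f} per cell per second")
```

On these assumptions, the quoted estimate corresponds to roughly one abasic site created in each cell every five seconds.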


The most reactive mutagen in our environment is undoubtedly oxygen. Breathing, however unavoidable, is inherently dangerous! The electron transport chain, by which adenosine triphosphate (ATP) is generated, results in the generation of reactive oxygen species (ROS) that produce the hydroxyl radical (•OH). When formed in proximity to DNA, this species produces a variety of oxidation products, of which 8-oxoguanine, a guanine with a saturated imidazole ring and altered hydrogen-bonding properties, is the most important.


We are dependent on the sun both as a primary source of energy and for our feeling of well-being. But sunlight is a major source of DNA damage and results in the production of pyrimidine dimers and altered pyrimidines, all of which may be mutagenic and carcinogenic.


The Role of Enzymes in Mutation


It is not surprising that organisms have developed a set of enzymes to protect themselves from such damage and from mutation. Most mutations, at least in the exome, may have a deleterious effect, but it is essential that they occur at a sufficient rate in germ cells to ensure the variation on which natural selection is based. The problem organisms have had to solve is to adjust the mutation rate to some presumably optimal rate. A group of enzymes accomplishes this purpose.


The free energy differences between correct and incorrect base pairs are very small, at most 0.4 kcal/mol. This means that in a water solution, in which there is much competitive hydrogen bonding, a correct base pair is only about twice as likely to form as is a mismatch. The major contribution to specificity is provided by the replicative polymerases: the structural nature of the pockets into which incoming nucleotides fit and the kinetic interactions between elongation of the chain and reversal of the reaction account for this specificity. Humans (and other organisms) have developed a variety of polymerases with vastly different specificities (Table 1.2). For a reaction catalyzed by a replicative polymerase (Drosophila melanogaster polymerase alpha), the free energy difference between correct and mismatched bases was 4.9 kcal/mol, equivalent to a discrimination factor of about 1 in 3,000. The in vitro measured error frequency of the different polymerases (Table 1.2) varies from a low of about 1 in 100,000 for the different B family replicative polymerases to more than 3.5 per 100 for human polymerase eta. Polymerase iota (pol iota) confronted with a template T will actually incorporate a G three times more frequently than the “correct” A!
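The link between a free energy difference and a discrimination factor follows from the Boltzmann relation, ratio = exp(ΔΔG / RT). The sketch below applies it to the two values quoted above. The temperature of 37 °C (310.15 K) is an assumption, since the text does not state the conditions, and the function name is hypothetical.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3  # gas constant in kcal/(mol*K)

def discrimination_factor(ddg_kcal_per_mol: float, temp_k: float = 310.15) -> float:
    """Ratio of correct to incorrect pairing implied by a free energy difference."""
    return math.exp(ddg_kcal_per_mol / (R_KCAL_PER_MOL_K * temp_k))

if __name__ == "__main__":
    # 0.4 kcal/mol: base pairing in water, without a polymerase.
    print(f"free in solution: about {discrimination_factor(0.4):.1f}-fold")
    # 4.9 kcal/mol: value quoted for Drosophila polymerase alpha.
    print(f"in the polymerase: about {discrimination_factor(4.9):.0f}-fold")
```

Under this assumption the results are roughly 2-fold and roughly 3,000-fold, in line with the figures in the text. Because the dependence is exponential, a difference of a few kcal/mol contributed by the polymerase active site changes selectivity by three orders of magnitude.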



Table 1.2

Eucaryotic DNA Polymerases

Pol | Family | Proofreading exonuclease | Error rate | Function | Reference
Alpha (α) | B | No | 10⁻⁴–10⁻⁵ | Replication | Biochemistry 1991;30:11751–59
Beta (β) | X | No | 5 × 10⁻⁴ | Gap filling BER | Biochemistry 1991;30:11751–59
Gamma (γ) | A | Yes | 1 × 10⁻⁵ | Mitochondrial | J Biol Chem 2001;276:38555–62
Delta (δ) | B | Yes | 10⁻⁵–10⁻⁶ | Replication | Nat Rev Genet 2008;9:594–604
Epsilon (ɛ) | B | Yes | 4.4 × 10⁻⁵ | Replication | Nucleic Acids Res 2011;39:1763–73
Eta (η) | Y | No | 0.26–6 × 10⁻² | TLS (UV) | Nature 2000;404:1011–1013
Iota (ι) | Y | No | Template and metal ion dependent | SHM | J Biol Chem 2007;282:24689–96
Kappa (κ) | Y | No | 5.8 × 10⁻³ | TLS | J Mol Biol 2001;312:335–346
Lambda (λ) | X | No | 9 × 10⁻⁴ | DSBR | J Biol Chem 2002;277:13184–191
Mu (μ) | X | No | 10⁻³–10⁻⁵ | NHEJ | Biochemistry 2004;43:13827–38
Theta (θ) | A | No | 2.4 × 10⁻³ | SHM, TLS | Nucleic Acids Res 2008;36:3847–56
Zeta (ζ) | B | No | 1.3 × 10⁻³ (yeast) | TLS | Nucleic Acids Res 2006;34:4731–42
Rev1 | Y | No | dCMP transferase | TLS accessory to pol zeta | Nucleic Acids Res 2010;38:5036–48
Nu (ν) | A | No | 2.4 × 10⁻³ | T opposite template G, TLS, SHM | DNA Repair 2007;6:213–223


Sep 7, 2019 | Posted by in ENDOCRINOLOGY | Comments Off on Mechanisms of Mutation
