Molecular and Genetic Basis of Childhood Cancer





Peter D. Aplan

Jack F. Shern

Javed Khan



INTRODUCTION

The biological behavior of every mammalian cell is determined by the pattern of gene expression within that cell, and the hallmark of cancer is the progressive accumulation of genetic and epigenetic alterations. At the molecular level, cancer is a genetic disease, caused by a combination of inherited (germline) and acquired (somatic) aberrations of the genome. During the process of malignant transformation, the genetic material of a cancer cell may acquire a wide array of mutations, including single nucleotide substitutions (point mutations), small insertions and deletions (indels), as well as a spectrum of larger structural variations, including translocations and more complex rearrangements. These aberrations, as well as epigenetic changes, lead to alterations of the cell’s gene expression profile, which in turn leads to phenotypic abnormalities including uncontrolled growth, failure of differentiation, and reduced apoptosis. Depending on the genetic locus involved and the mechanism of its disruption, some of these changes may make small or incremental contributions to malignant transformation. Others may be cataclysmic in their unraveling of ordered and regulated growth processes. Since the previous edition of this chapter, there has been a virtual explosion of information related to somatic alterations in pediatric cancers, which has largely been enabled by next-generation sequencing (NGS) strategies.

Novel findings made with these technologies are discussed in this chapter, along with the principles of the molecular and genetic basis of pediatric cancers. The first sections focus on the technologies employed for comprehensive analysis of the cancer genome. Next, we summarize the pertinent inherited (germline) and acquired (somatic) mutations associated with cancer, and the mechanisms that generate these aberrations. It should be stressed that many of the chromosomal changes and molecular genetic aberrations associated with childhood malignancies overlap with those in adult malignancies, so lessons learned from the study of cancers that are more common in adults than in children (e.g., acute myelogenous leukemia) are also discussed here.


GENERAL NATURE OF CANCER-ASSOCIATED GENETIC ABERRATIONS

The association of a particular chromosomal abnormality with a specific type of malignancy was first demonstrated in 1960 with the identification of the Philadelphia chromosome in the malignant cells of patients with chronic myelogenous leukemia. Over the past 55 years, cytogeneticists have described an increasing number of specific chromosomal abnormalities, each associated with a particular cell type or a histologically distinct malignancy. These karyotypic abnormalities have provided both a rich source of biological insights into the process of malignant transformation and a means to stratify patients for prognostic and therapeutic purposes. Figure 3.1 summarizes the broad classification of the alterations that are commonly found within the cancer genome. These alterations can be either germline or somatic. Germline alterations may occur as a result of whole chromosomal changes, most commonly chromosomal gains, such as trisomy 21. Segmental gains or losses, or copy number variations (CNVs), of germline DNA have increasingly been linked to an elevated risk of certain forms of cancer.1 There are case reports of cancer arising in patients with constitutional chromosomal rearrangements such as translocations.2 Other constitutional alterations that predispose to cancer include the paternal segmental isodisomy associated with Beckwith-Wiedemann syndrome, in which the paternal region around the IGF2 gene becomes duplicated, leading to overexpression of this growth factor (see Table 3.1). Genome-wide association studies (GWAS) and other genetic linkage studies have identified many genes in which single nucleotide variations (SNVs) or single nucleotide polymorphisms (SNPs) are associated with an increased predisposition to cancer. These variations may lie in regulatory regions, leading to aberrant gene expression, or within coding regions, where they alter the function of the protein. These latter mutations may result in loss of function, as in Li-Fraumeni syndrome with TP53 mutations, or in gain of function, as with the ALK mutations described in familial neuroblastoma.3,4


COMPREHENSIVE ANALYSIS OF THE CANCER GENOME

The following section will review the significant impact that the study of the whole genome, defined as the total genetic information of a cell, including both DNA and RNA, has had on the understanding of the molecular and genetic basis of pediatric cancers. The central dogma of molecular biology describes the flow of genetic information from genomic DNA to RNA to protein within a biological system (Fig. 3.2). The traditional approach employed by molecular biologists has been the investigation of one gene at a time. However, it has become increasingly clear that the biology of a cell is driven by the simultaneous expression of a large number of genes acting in concert and that there is a complex interaction of each of the components of the genome and proteome. Hence a comprehensive analysis of the genome will include DNA nucleotide and structural variation, RNA expression profile, and global protein expression analysis (Fig. 3.2).

The Human Genome Project (HGP) estimated 23,000 protein-coding genes at the DNA level; with alternative splicing of RNA, this increases to an estimate of >100,000 proteins coded within the human genome. However, only approximately 1.2% to 2% of the human genome5,6 is protein coding, and the remaining 98% was largely unexplored until the Encyclopedia of DNA Elements (ENCODE) project reported in 2012 that >60% of the human genome is reproducibly represented in RNA molecules (size >200 nucleotides) on the basis of a conservative threshold.7 It has been estimated that there are thousands of transcribed but nontranslated RNAs within the human genome; these have been termed noncoding RNAs (ncRNAs).8

The term proteomics refers to the study of all proteins within the cell. Although there are estimated to be approximately 23,000 mRNAs in the genome, this number continues to increase,9 and it is likely that the NGS of human transcriptomes (see later) will further increase this number.10,11 As discussed above, alternative mRNA splicing leads to additional protein isoforms, and posttranslational modifications such as phosphorylation further increase the number of individual proteins. There are powerful emerging technologies to measure global protein expression profiles, including protein arrays, mass spectroscopy, isotope-coded affinity tags (ICAT), isobaric tags for relative and absolute quantitation (iTRAQ), and stable isotope labeling by amino acids in cell culture (SILAC), that are able to generate quantitative global profiles of the proteome and phospho-proteome.12 Although the application of molecular technologies for proteomic analysis of clinical samples has lagged behind genomics in terms of sensitivity, robustness, and reproducibility, these methods are likely to become significantly more useful in the future for biologically characterizing cancer and providing clinically relevant biomarkers.






Figure 3.1 Summary of the common genomic and genetic alterations found in cancer. These alterations can be either germline (constitutional) or somatic (acquired), and the color scheme reflects similar aberrations that occur in either group. Germline alterations may be a result of copy number alterations, such as whole chromosomal gains (losses are usually incompatible with prolonged life), or segmental changes. Constitutional chromosomal rearrangements have been observed in patients with cancer. These rearrangements can result in truncation of a protein, expression of a gene under the control of an alternative promoter, or the production of a chimeric or novel fusion protein not normally found in nature. Single nucleotide variations (SNVs) or polymorphisms (SNPs) have been associated with an increased predisposition to cancer. These variations may be in the gene regulatory regions, leading to overexpression or suppression, or may be within the protein coding regions, leading to expression of mutant proteins. Other constitutional alterations that predispose to cancer include parental segmental isodisomy. Additional somatic alterations include copy-neutral loss of heterozygosity, in which there is no net loss of DNA but rather uniparental disomy, with only one parental chromosome or region present in two copies. Epigenetic alterations, such as silencing of genes by methylation, are an increasingly important mechanism of oncogenesis.


DNA Microarrays

Microarrays are fabricated with DNA from bacterial artificial chromosome (BAC) or cDNA clones, or with oligonucleotides. The targets used in DNA microarray experiments are prepared from DNA or RNA that is fluorescently labeled. Some microarray studies utilize two fluorescent dyes; in these studies, the test (e.g., tumor) sample is labeled with one dye, and the control (e.g., normal) sample is labeled with a different-colored dye. The two fluorescently labeled samples are then simultaneously hybridized to the microarray slide. Alternatively, Affymetrix and other arrays utilize a single color, in which only the test sample is labeled and hybridized. In either case, the labeled targets are hybridized to the microarray slides and imaged. The fluorescence intensity of each spot on the array is quantified and corresponds to the amount of DNA or RNA in the fluorescently labeled test sample. In this manner, a single experiment, using multiple arrays, can easily generate millions of data points. The analysis of these data is greatly aided by computational biology techniques and the accessibility of large, publicly available data sets. Although NGS methods (see below) have superseded DNA microarrays, the data generated by historical studies are of high quality and are increasingly used by the scientific community for hypothesis generation and in silico validation. Databases such as Oncogenomics (http://pob.abcc.ncifcrf.gov/cgi-bin/JK), R2 (http://hgserver1.amc.nl/cgi-bin/r2/main.cgi), and Oncomine (www.oncomine.org) provide large collections of publicly available genomic and proteomic data for pediatric tumor samples, many of which are clinically annotated and remain of great utility.
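The two-dye comparison described above is commonly summarized as a log2 ratio of test to control intensity for each spot. The following Python sketch illustrates that calculation only; the function name, the intensity floor, and the example values are assumptions introduced for illustration, and real pipelines also apply background subtraction and dye normalization that are omitted here.

```python
import math


def log2_ratios(test_intensities, control_intensities, floor=1.0):
    """Return the per-spot log2(test / control) ratio for a two-color array.

    `floor` is an assumed minimum intensity, used only to avoid taking the
    logarithm of zero for spots at background level.
    """
    if len(test_intensities) != len(control_intensities):
        raise ValueError("both channels must cover the same spots")
    ratios = []
    for test, control in zip(test_intensities, control_intensities):
        ratios.append(math.log2(max(test, floor) / max(control, floor)))
    return ratios


# Illustrative intensities for three spots (not real data).
tumor = [2000.0, 500.0, 1000.0]
normal = [1000.0, 1000.0, 1000.0]
ratios = log2_ratios(tumor, normal)
print(ratios)  # positive = more signal in tumor, negative = less
```

A ratio of +1 corresponds to a twofold excess of signal in the test sample, and a ratio of -1 to a twofold deficit.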


DNA-Based Genomic Studies

Genome-wide scanning of germline DNA has been facilitated by the HGP, and the International HapMap Project was built upon the sequence of the human genome produced by the International Human Genome Sequencing Consortium. The International HapMap Project began with the observation that there are sites within the genome that differ by a single nucleotide across different individuals. If these SNVs occur at a frequency of greater than or equal to 1% in the population, they are referred to as single nucleotide polymorphisms (SNPs). This information has been used to discover associations of particular SNPs with disease in order to identify disease loci and genes. Fortunately, genetic variation among individuals is organized in “DNA neighborhoods,” called haplotype blocks. SNP variants that lie close to each other along the DNA molecule form a haplotype block and tend to be inherited together. SNP variants that are far from each other along the DNA molecule tend to be in different haplotype blocks and are less likely to be inherited together. The International HapMap Project has parsed the genome into heritable haplotype units, each of which may contain 10 or more SNPs. Only a few so-called “tag” SNPs are needed to identify each unique block of the genome, and these tag SNPs represent all of the SNPs associated with that segment of genomic DNA.
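The idea that a few tag SNPs can stand in for a whole haplotype block can be illustrated with a toy example. The Python sketch below searches for the smallest set of SNP positions whose alleles distinguish every distinct haplotype in a block. It is a brute-force illustration under simplifying assumptions (phased haplotypes of equal length, no genotyping error); the function name and the example haplotypes are hypothetical, and practical tag SNP selection relies on linkage disequilibrium statistics rather than exhaustive search.

```python
from itertools import combinations


def find_tag_snps(haplotypes):
    """Return the smallest set of SNP positions that tells all haplotypes apart.

    `haplotypes` is a list of equal-length strings, one allele per SNP site.
    """
    distinct = sorted(set(haplotypes))
    n_sites = len(distinct[0])
    for size in range(1, n_sites + 1):
        for sites in combinations(range(n_sites), size):
            # The alleles observed at the candidate sites form a signature.
            signatures = {tuple(hap[i] for i in sites) for hap in distinct}
            if len(signatures) == len(distinct):
                return list(sites)
    return list(range(n_sites))


# Three hypothetical haplotypes spanning six SNP sites in one block.
block = ["ACGTAC", "GTGTAT", "GTACGC"]
tags = find_tag_snps(block)
print(tags)  # two of the six sites are enough to identify each haplotype
```

In this example, genotyping two sites identifies which of the three haplotypes is present, so the remaining four sites need not be assayed directly.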









TABLE 3.1 Inherited Predisposition to Cancer

Syndrome | Gene | Types of Inactivation | Neoplasms | Reference
Ataxia-telangiectasia | ATM | Point mutation (biallelic); haploinsufficiency | Leukemia, lymphoma, breast, ovarian | 228
Basal cell nevus syndrome (Gorlin syndrome) | PTCH | Point mutation, deletion | Basal cell, medulloblastoma | 229
Beckwith-Wiedemann syndrome | Multiple | Paternal segmental isodisomy; deletions | Wilms tumor, neuroblastoma, hepatoblastoma, rhabdomyosarcoma | 230
Birt-Hogg-Dube syndrome | FLCN | Small insertions/deletions | Renal | 231
Bloom syndrome | BLM | Point mutations; deletions | Leukemia, lymphoma, Wilms tumor, colon, breast, cervix | 232,233
Hereditary breast/ovarian cancer | BRCA1, BRCA2 | Point mutations, deletions | Breast, ovarian, prostate, pancreatic | 232,234
Hereditary nonpolyposis colon cancer (Lynch syndrome) | MLH1, MSH2, PMS2, MSH6 | Point mutations | Colon, uterine, gastric, endometrial, small bowel, sebaceous gland | 235,236
Cowden syndrome | PTEN | Point mutations, deletions, promoter mutations | Breast, thyroid, renal, glioblastoma |
Dyskeratosis congenita | DKC1, TERC, TERT | Point mutations | Leukemia, esophagus | 237,238
Fanconi anemia | Many (FANCA-FANCN) | | Leukemia, hepatocellular, esophagus, head and neck, cervix | 232,239
Familial acute myeloid leukemia | RUNX1, others | Point mutations | Leukemia |
Li-Fraumeni syndrome | TP53 | Point mutations | Leukemia, lymphoma, breast, osteosarcoma, brain tumors | 240
Dysplastic nevus syndrome | CDKN2A, others | Point mutations, deletions, insertions | Melanoma, pancreatic | 241
Multiple endocrine neoplasia type 1 | MEN1 | Point mutations, insertions, deletions | Parathyroid, pancreas, gastrinomas, insulinoma, carcinoid | 242
Multiple endocrine neoplasia types 2A and 2B | RET | Point mutations | Thyroid medulla, pheochromocytoma | 243
Neurofibromatosis type 1 | NF1 | Point mutations, deletions, translocations | MPNST, pheochromocytoma, astrocytoma, glioma, leukemia | 244
Neurofibromatosis type 2 | NF2 | Point mutations | Astrocytoma, melanoma, meningioma |
Nijmegen breakage syndrome | NBS1 | Point mutations | Lymphoma, leukemia | 245,246
Peutz-Jeghers syndrome | LKB1 | Point mutations | Stomach, small intestine, colon, pancreas, uterine, breast |
Familial adenomatous polyposis | APC | Point mutations leading to truncation | Colon, small intestine, thyroid, pancreas, hepatoblastoma, medulloblastoma | 247,248
Retinoblastoma | RB | Deletions, point mutations | Retinoblastoma, osteosarcoma, melanoma, pinealoblastoma, lung | 249,250
Von Hippel-Lindau | VHL | Point mutations, deletions | Renal cell carcinoma, pancreatic islet cell, pheochromocytoma | 251
Werner syndrome | WRN | Point mutations | Leukemia, melanoma, osteosarcoma, thyroid | 252
Xeroderma pigmentosum | Many | Point mutations | Basal cell, melanoma, stomach, leukemia | 253
WAGR syndrome | WT1 | Deletions | Wilms tumor |
Wiskott-Aldrich syndrome | WASP | Point mutations | Leukemia, lymphoma | 254








Figure 3.2 Whole-genome and proteome investigation of cancer. The human genome is contained within 23 chromosomal pairs comprising 3.2 billion pairs of the four nucleotides (adenosine [A], cytidine [C], guanosine [G], and thymidine [T]). Two percent of the genome is transcribed into 20 to 25,000 protein-coding messenger RNAs (mRNA). The genomic sequence contains a promoter region, exons (containing the coding regions), and introns. The introns are spliced out following transcription, and alternate splicing can generate several different mRNAs and protein products. Many regions of the genome are also transcribed into noncoding RNA molecules (>10,000), including microRNAs (˜700). Shown on the right are some of the methodologies in genomics, including low-resolution chromosomal structure analysis such as cytogenetics and molecular cytogenetics (fluorescent in-situ hybridization [FISH], comparative genomic hybridization [CGH], and spectral karyotyping [SKY]). Genetic mapping uses DNA markers to find linkage of a genomic region to an inherited disease to eventually identify the causal gene. Physical mapping uses clones containing human nucleic acids (e.g., bacterial artificial chromosome) to define physical locations of genes and markers within each chromosome. In parallel to the sequencing of the human genome, many single nucleotide variations (SNVs) or single nucleotide polymorphisms (SNPs) are being detected and cataloged and may contribute to the phenotypic differences found in many patients and their cancers. Increasingly, “next generation” sequencing methods for profiling and discovery of novel genes are replacing traditional genomics methods. Finally, several methods are available for detecting which genes are actively being transcribed and translated into proteins, including protein arrays, mass spectroscopy, and isotope-coded affinity tags (ICAT).225 (Figure modified from references.225,226)

Genome-wide scanning of tumor DNA using comparative genomic hybridization (CGH) is an established technique for detecting gain or loss of chromosomal regions. Originally, CGH was performed by hybridizing differentially labeled tumor and normal DNA onto a metaphase preparation of normal human chromosomes. However, this approach has been progressively replaced by higher resolution methods including array-based comparative genomic hybridization (A-CGH), using arrays based on bacterial artificial chromosome (BAC), cDNA, oligonucleotides, SNPs, and, most recently, NGS methods.

The International HapMap Project has enabled investigators to perform GWAS, which are not driven by candidate genes but instead use whole-genome SNP-based approaches to identify association of an SNP with traits such as disease, drug response, and anthropometric measures. Multiple studies examining up to 5 million tag SNPs for association with human disease have identified germline alterations associated with cancer.13,14,15,16 At the time of this writing, there have been 1,751 curated GWAS17 that collectively have identified 1,491 regions associated with more than 600 complex diseases or traits, and in cancer this strategy has identified 165 regions. GWAS in pediatric cancers have been hampered by the relative scarcity of childhood cancers and the consequent lack of sufficient sample numbers to adequately power these studies. Notable exceptions include the investigation of acute lymphoblastic leukemia (ALL),18,19,20,21 neuroblastoma,22,23 osteosarcoma,24 and Ewing sarcoma.25 Complementing these GWAS efforts are global sequencing projects such as the 1000 Genomes (http://www.1000genomes.org/), Exome Sequencing Project (ESP; http://evs.gs.washington.edu/EVS/), and International Cancer Genome Consortium (ICGC; http://www.icgc.org/), in which base pair variations are being mapped and associated with diseases.
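At its simplest, testing one SNP for association with disease amounts to comparing allele counts between cases and controls. The Python sketch below shows one common way to do this, a chi-square test on a 2 x 2 table of allele counts with one degree of freedom. The function names and the counts are illustrative assumptions rather than data from any study, and real GWAS analyses add quality control, covariate adjustment, and correction for the very large number of SNPs tested (a threshold near 5 x 10^-8 is conventional).

```python
import math


def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Chi-square statistic for a 2 x 2 table of allele counts (1 df)."""
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    row_totals = [sum(row) for row in table]
    col_totals = [case_alt + control_alt, case_ref + control_ref]
    total = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2


def p_value_1df(chi2):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical allele counts: 100 cases and 100 controls, two alleles each.
chi2 = allelic_chi_square(case_alt=60, case_ref=140, control_alt=40, control_ref=160)
p = p_value_1df(chi2)
print(round(chi2, 3), round(p, 4))
```

With these counts, the variant allele is more frequent in cases, but the evidence falls far short of what would be required after testing millions of SNPs, which illustrates why small pediatric cohorts are often underpowered.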

The Cancer Genome Atlas (TCGA; http://cancergenome.nih.gov/) is a nationwide pilot project, jointly supported and led by the National Human Genome Research Institute (NHGRI) and the National Cancer Institute (NCI), to perform large-scale multidimensional analysis of the molecular characteristics of human cancer and to release the data to the research community. To date, 37 different cancer types have been or will be analyzed by this effort. The pediatric arm of TCGA is known as Therapeutically Applicable Research to Generate Effective Treatments (TARGET; http://target.cancer.gov/). The goal of the TARGET initiative is to perform extensive genomic profiling of pediatric cancers in order to identify therapeutic targets. The types of cancer currently under study include ALL, acute myeloid leukemia (AML), kidney tumors, neuroblastoma, and osteosarcoma (detailed in subsequent sections).


NGS Methods

Despite the wide utilization of microarrays in genomic research, this technology has its limitations. First, due to the high background caused by cross-hybridization, it is difficult to detect SNVs or structural alterations such as balanced translocations. Second, prior knowledge of the targeted DNA sequences is required for designing probes on microarrays. Third, it is technically very challenging to detect every mutation in a given tumor sample using microarray-based strategies. Sanger-based large-scale sequencing projects have identified several genetic alterations that had not previously been associated with neoplasia. For example, Parsons et al. sequenced 20,610 genes from 22 glioblastoma (GBM) primary tumors or xenografts and identified 44 candidate cancer genes (CAN genes) that may be important for GBM. Of the top 10 alterations predicted in this study, 9 genes already had established roles in the pathogenesis of gliomas; however, one gene, IDH1, had not typically been viewed as an important component of malignant transformation.26,27 Other more recent Sanger-based sequencing approaches have identified FGFR4 to be mutated in rhabdomyosarcomas.28

NGS (also referred to as massively parallel sequencing or deep sequencing) technology directly identifies millions or even billions of nucleic acid species in parallel in a single experiment and represents a marked advance over traditional, Sanger-based methods. Unlike the Sanger method of sequencing, massively parallel DNA sequencing not only generates sequence information for each nucleic acid strand but also determines the abundance of each nucleic acid species, resulting in a digital readout of abundance for any sequence, even those at levels below the detection sensitivity of hybridization-based technologies. Additionally, NGS allows for the detection of mutations in a heterogeneous sample, including those heavily contaminated with normal tissue. With the advent of NGS technologies, whole genomes from multiple samples can be readily determined. Thus, this technology has wide-ranging applications for the investigation of both DNA and RNA (Fig. 3.3).29,30,31 The method first involves fractionating the DNA or RNA to sizes of 200 bp to 3 kb for DNA, 200 bp for mRNA, and less than 200 bp for microRNA. For RNA, the molecules are reverse transcribed into DNA. Adaptors are then ligated onto the DNA, which is then amplified by PCR either in bead-based emulsions or on a solid substrate. The beads are deposited onto glass slides, and by innovative fluorescent-based methods, each molecule is sequenced, with current read lengths of 50 to 300 bp. By this method, millions of short fragment reads are generated that are then mapped back to the reference genome using powerful computer clusters (Fig. 3.4).
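The step of mapping short reads back to a reference, and the resulting "digital readout" of how often each position is covered, can be sketched in a few lines of Python. This toy example uses exact string matching on a 14-base reference; the function names and sequences are hypothetical, and production aligners differ substantially (they index the genome, tolerate mismatches and gaps, and consider both strands).

```python
def map_reads(reference, reads):
    """Return (read, position) pairs; position is None if the read is unmapped."""
    placements = []
    for read in reads:
        pos = reference.find(read)  # first exact match, or -1 if absent
        placements.append((read, pos if pos >= 0 else None))
    return placements


def coverage(reference, placements):
    """Count how many mapped reads overlap each reference position."""
    depth = [0] * len(reference)
    for read, pos in placements:
        if pos is None:
            continue
        for i in range(pos, pos + len(read)):
            depth[i] += 1
    return depth


# Hypothetical reference and 5-base reads (real reads are 50 to 300 bp).
reference = "ACGTTGCAAGGCTA"
reads = ["ACGTT", "GTTGC", "TTGCA", "GGCTA", "CCCCC"]
placements = map_reads(reference, reads)
depth = coverage(reference, placements)
print([pos for _, pos in placements])
print(depth)
```

The per-position depth is the quantity from which abundance, and later copy number, is inferred; the read that does not match anywhere is simply left unmapped.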

With these NGS techniques, it is possible to sequence an entire cancer genome within 1 week, a staggeringly short time considering that it took 13 years to sequence a handful of human genomes by the HGP. There are also methods for isolating the protein coding exons (the “exome”) or a defined genomic region (termed genome-partitioning) using either microarray solid-phase hybridization,32,33 solution-based methods, or multiplex PCR.34 In addition to sequence information, these methods allow investigators to determine inherited or acquired CNVs on the basis of the relative numbers of sequence “reads” that are generated for a particular genomic region. These methods are also being modified to perform global methylation scans either by sequencing methylated DNA fragments that have been immunoprecipitated by an antibody or protein that recognizes methylated DNA (methylated DNA immunoprecipitation, or methyl-CpG-binding protein) or by sequencing bisulfite-modified DNA. Bisulfite genomic sequencing is a widely used technique for analyzing cytosine methylation of DNA. Treatment of genomic DNA with bisulfite deaminates cytosine residues, which become uracil residues, whereas bisulfite treatment has no effect on 5-methylcytosine residues, and these nucleotide differences can be detected by NGS techniques. Finally, it is possible to identify DNA bound by protein, including transcription factors and modified histones, and perform NGS in a process known as chromatin immunoprecipitation and sequencing (ChIPSeq).
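The logic of bisulfite sequencing can be made concrete with a small simulation. In the Python sketch below, unmethylated cytosines are converted (uracil is read as thymine after amplification), methylated cytosines are left unchanged, and methylation is then called by comparing the converted read with the reference. The function names, the sequence, and the methylated positions are hypothetical, and the sketch assumes complete conversion and a single strand, which real data do not guarantee.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Simulate bisulfite treatment: unmethylated C reads out as T."""
    converted = []
    for i, base in enumerate(sequence):
        if base == "C" and i not in methylated_positions:
            converted.append("T")  # C -> U, sequenced as T after PCR
        else:
            converted.append(base)  # 5-methylcytosine is protected
    return "".join(converted)


def call_methylation(reference, converted_read):
    """Return reference cytosine positions that remained C after treatment."""
    return [
        i
        for i, (ref_base, read_base) in enumerate(zip(reference, converted_read))
        if ref_base == "C" and read_base == "C"
    ]


# Hypothetical 8-base fragment with cytosines at positions 1, 4, and 5.
reference = "ACGTCCGA"
converted = bisulfite_convert(reference, methylated_positions={1, 5})
methylated = call_methylation(reference, converted)
print(converted, methylated)
```

Cytosines that still read as C after treatment are inferred to be methylated, whereas those that read as T are inferred to be unmethylated.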

In a landmark study, the entire genome of malignant cells from a patient with M1 AML was sequenced, and 10 heterozygous, nonsynonymous SNVs were identified. Among these, two were previously implicated in the pathogenesis of AML (FLT3 and NPM1), while eight were alterations in genes not known to be involved in AML pathogenesis, underscoring the limitation of targeted sequencing and validating the importance of unbiased whole-genome sequencing.29 It seems reasonable to predict that use of these methods will allow investigators to identify every mutation or SNV in an individual tumor as well as every gene rearrangement with the precise breakpoint. NGS can also be applied to evaluate RNA (RNAseq), giving an unbiased survey of the cancer cell transcriptome. The generated RNA sequence data can be used for gene expression profiling based on counting the number of sequencing reads generated for each transcript. Because RNAseq can detect novel transcripts, it allows the transcriptome to be studied in unprecedented detail, including but not limited to the detection of noncoding RNAs and fusion genes, allele-imbalanced gene expression, viral gene integration, and pseudogene expression. The following section discusses strategies for applying RNAseq data to the systematic study of these areas, which were challenging in the microarray era.
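Expression profiling by read counting can be sketched as follows. Given a list recording which gene each mapped read was assigned to, the Python example below tallies reads per gene and scales the counts to counts per million (CPM) so that samples sequenced to different depths can be compared. The function name and gene labels are placeholders introduced for illustration, and the sketch deliberately omits transcript-length normalization and the statistical modeling used in practice.

```python
from collections import Counter


def counts_per_million(read_assignments):
    """Scale per-gene read counts to counts per million mapped reads."""
    counts = Counter(read_assignments)
    total = sum(counts.values())
    return {gene: n * 1_000_000 / total for gene, n in counts.items()}


# Ten hypothetical reads assigned to three placeholder genes.
assignments = ["GENE_A"] * 6 + ["GENE_B"] * 3 + ["GENE_C"] * 1
cpm = counts_per_million(assignments)
print(cpm)
```

Because the readout is a count rather than a fluorescence intensity, a transcript that was never placed on an array can still be quantified as long as its reads can be assigned.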







Figure 3.3 The application of next-generation sequencing for the comprehensive genomic investigation of cancers. For DNA, it is possible to sequence the entire cancer genome, or the DNA of the whole expressed genome, or a chromosomal region that has been identified by GWAS studies. These methods can also be used to perform global methylation scans either by sequencing bisulfite-treated DNA or DNA fragments that have been precipitated by an antibody (methylated DNA immunoprecipitation [MeDIP]) or protein (methyl-CpG-binding protein [MBD]) that binds methylated DNA. The power of these techniques is the ability to determine the copy number of every DNA and RNA molecule in the cell, including single nucleotide variants and mutations, and novel transcripts including splice variants. It will also detect all chromosomal rearrangements and novel transcripts produced by these rearrangements, including chimeric fusion oncogenes as a result of translocations. With slight modifications, the technique can identify all methylated regions of DNA. In this way, it will be possible to identify diagnostic and prognostic biomarkers, biologically relevant genes, and importantly therapeutic targets, such as kinases activated by mutation.


Noncoding RNAs

Noncoding RNAs (ncRNAs), including microRNAs, transfer RNAs (tRNAs), ribosomal RNAs (rRNAs), small nucleolar RNAs (snoRNAs), small nuclear RNAs (snRNAs), siRNAs, Piwi-interacting RNAs (piRNAs), and the long ncRNAs (lncRNAs; >200 nucleotides), were once thought to be “housekeeping” RNAs. However, ncRNAs have increasingly been shown to have important functions in the control of transcription and the biological function of a cell.35,36 Currently there are 32,183 annotated human lncRNAs (http://www.lncipedia.org/), with more being discovered by RNAseq experiments. Given that the function of many lncRNAs has not been explored or even annotated, RNAseq is an ideal tool to systematically capture transcribed lncRNAs. For example, one study discovered 121 novel prostate cancer-associated ncRNAs and found one of them, PCAT-1, to be implicated in prostate cancer progression.37 This report showed that lncRNAs are expressed in a tissue- and disease-specific manner; hence, systematically profiling lncRNAs is important for understanding disease biology and developing lncRNA-based biomarkers.

Similarly, RNAseq can be used to profile small noncoding RNAs such as microRNAs (miRNAs). MiRNAs are short (20 to 24 nucleotides), noncoding RNAs that modulate protein translation and mRNA stability in a tissue-specific manner and regulate proliferation, apoptosis, differentiation, growth, migration, and metabolism in cancers.38 Therefore, microRNAs can be used for early detection, diagnosis, prognosis and, potentially, therapy.39 Compared with standard RNAseq, microRNA sequencing captures small RNAs enriched by size selection. Like mRNA sequencing, microRNA sequencing has advantages over hybridization-based arrays or RT-PCR in identifying novel microRNAs and detecting microRNA variants. Currently, 2,578 human microRNAs are annotated in the microRNA database (http://www.mirbase.org/), with many of them identified by microRNA sequencing.40


Fusion Genes

Abnormal fusion genes caused by chromosomal rearrangements often play key roles in tumor initiation and progression. Historically, recurrent fusion genes were identified in cancers using cytogenetic techniques.41 Because of the poor resolution of these cytogenetic techniques, identification of the fusion partners and their fusion products is not a trivial task. RNAseq can be used to systematically discover genome-wide fusion events with single nucleotide resolution of the fusion breakpoints, which can be used in diagnosis or targeted clinical intervention for cancers.31,42,43,44,45

In addition to fusion gene events caused by genomic rearrangement, RNAseq can also detect trans-splicing and read-through transcripts, which were recently found to be important in cancer development.46,47 A combination of whole-genome sequencing and RNAseq was recently used to identify a novel PAX3-INO80D fusion gene in an area of extensive chromosomal rearrangement in rhabdomyosarcoma.48 Such events may not be detected by traditional cytogenetic techniques or by DNA sequencing alone; therefore, RNAseq is an effective method to identify gene fusions, including novel or rare events.






Figure 3.4 The types of genomic alterations that can be detected by next-generation sequencing methods. The flexibility of next-generation sequencing allows for the detection of multiple different genomic alterations. DNA or cDNA generated from RNA is fragmented, and sequence reads are produced. The reads are composed of sequenced regions (depicted here as the colored bars) and spacer regions (depicted here in grey). These short sequences are aligned to a reference genome; in this case, the reads align to chromosomes 2 and 13. From the alignment of many short reads, a consensus sequence emerges in which multiple genomic variations can be detected, including point mutations (reference A changes to a C in this example); small insertions and deletions; deletions, including homozygous regions (no reads are generated) and heterozygous regions (only half the expected reads are generated); amplifications (more reads are generated); and translocations, in which the generated sequences align to different genomic locations (in this case, chromosomes 2 and 13). (Figure adapted from Meyerson 2010.227)
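The read-depth reasoning in Figure 3.4 (no reads for a homozygous deletion, roughly half the expected reads for a heterozygous deletion, excess reads for an amplification) can be expressed as a simple comparison of tumor and matched normal read counts per region. The Python sketch below is illustrative only: the function name and the ratio cutoffs are assumptions chosen for the example, and it presumes a diploid normal sample, equal sequencing depth in both samples, and a tumor sample free of normal cell contamination.

```python
def classify_copy_number(tumor_reads, normal_reads):
    """Classify a region from the tumor/normal read-count ratio.

    The cutoffs (0.1, 0.75, 1.25) are illustrative assumptions, not
    established thresholds.
    """
    if normal_reads <= 0:
        raise ValueError("normal sample must have reads in the region")
    ratio = tumor_reads / normal_reads
    if ratio < 0.1:
        return "homozygous deletion"
    if ratio < 0.75:
        return "heterozygous deletion"
    if ratio <= 1.25:
        return "no copy number change"
    return "gain or amplification"


# Hypothetical (tumor, normal) read counts for four genomic regions.
regions = {
    "region_1": (0, 100),
    "region_2": (52, 100),
    "region_3": (98, 100),
    "region_4": (410, 100),
}
calls = {name: classify_copy_number(t, n) for name, (t, n) in regions.items()}
print(calls)
```

In practice, normal cell contamination and tumor heterogeneity shift these ratios toward 1, so the cutoffs must be adapted to the estimated tumor purity.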


Alternative Splicing

Alternative splicing is the most prevalent posttranscriptional RNA processing event, affecting 95% of all human multiexon genes.49 Spliceosomes catalyze the splicing process by recognizing splice sites located at exon-intron boundaries. In some cancers, genomic mutations disrupt important splice sites, resulting in aberrant splicing that can cause insertions, deletions, or frame shifts in the amino acid sequence. Moreover, splicing factors that bind to specific pre-mRNA-binding motifs regulate splicing patterns in a tissue- and disease-specific manner.50 Changes in splicing factors have been shown to be important contributors to normal tissue development, such as epithelial-mesenchymal transition,51 as well as to aspects of tumorigenesis such as tumor cell motility,52 metastasis,53 metabolism,54 and proliferation.55 RNAseq allows the splicing program to be examined at three different levels. First, splicing boundaries can be elucidated by the RNAseq reads at single nucleotide resolution, and expression of splice variants can be quantified by counting the reads aligning to exons specific to the particular variant. Second, genomic mutations at particular splice sites and splicing factor-binding motifs can result in aberrant splicing in tumors, and RNAseq read sequences can be analyzed for such potential mutations. Third, the expression levels of splicing machinery genes, as well as splicing factor genes, can be quantified to study the control of splicing in tumors.


GERMLINE MUTATIONS

When chromosomal deletions, translocations, amplifications, point mutations, or nondisjunction events occur in a gamete, the abnormality exists in the germline, and the entire organism that develops after conception bears the alteration in each and every cell. A variation on this theme can occur when the alteration or nondisjunction event occurs in a somatic cell early in its lineage development, leading to mosaicism for the entire organism or for particular cell types. A large number of familial cancer syndromes have been identified (see Lindor et al.56 for a detailed list), and the more common ones affecting childhood cancer are listed in Table 3.1. Although most of these syndromes are rare, it is important to note that they highlight pathways and themes that are common to many types of cancer.

Constitutional deletions can predispose an individual to the development of cancer; the classic example of this phenomenon is the chromosomal abnormality associated with the hereditary form of retinoblastoma. Hereditary and sporadic forms of retinoblastoma have been distinguished on the basis of clinical and epidemiologic presentation. The hereditary form (i.e., familial or de novo germline mutation) is estimated to account for about 40% of affected persons. These patients often have early-onset, bilateral disease with a positive family history for retinoblastoma, leading Knudson to propose a “two-hit” mechanism of carcinogenesis in which the first genetic defect, already present in the germline, must be complemented by an additional spontaneous mutation before a tumor can arise. In contrast, the sporadic form results when two spontaneous mutations take place in the same cell.57
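The arithmetic behind the two-hit idea can be made concrete with a small sketch. It assumes that hits are independent and that each susceptible cell has the same chance of acquiring a hit. The cell number and hit probability are hypothetical values chosen only to show the contrast; they are not estimates from the literature.

```python
import math


def expected_tumor_foci(target_cells, hit_probability, hits_needed):
    """Expected number of cells that acquire every remaining hit, assuming independent hits."""
    return target_cells * hit_probability ** hits_needed


def probability_of_any_tumor(mean_foci):
    """Poisson probability that at least one cell completes the required hits."""
    return 1 - math.exp(-mean_foci)


# Hypothetical values chosen only to illustrate the contrast
TARGET_CELLS = 2_000_000  # susceptible retinal cells across both eyes (assumed)
HIT_PROBABILITY = 1e-6    # chance that one allele is inactivated in a given cell (assumed)

# Germline carrier: the first hit is already present in every cell, so one more is needed
hereditary = expected_tumor_foci(TARGET_CELLS, HIT_PROBABILITY, hits_needed=1)
# Noncarrier: both hits must occur in the same cell
sporadic = expected_tumor_foci(TARGET_CELLS, HIT_PROBABILITY, hits_needed=2)
```

With these assumed numbers, a germline carrier is expected to develop about two independent tumor foci, which is consistent with early, bilateral disease. The expected number in a noncarrier is about a millionfold lower, so sporadic tumors are rare and unifocal.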

The involvement, and often the deletion, of one of the two alleles from this region of chromosome 13 in patients with retinoblastoma was proven by molecular analysis. The retinoblastoma cells had undergone a “reduction to homozygosity,” also known as loss of heterozygosity (LOH), consistent with an acquired monosomy in the tumor tissue. These kinds of analyses led to the successful cloning of the first tumor suppressor gene, RB.58 Subsequent studies demonstrated that almost all patients with bilateral disease carry a germline mutation of the RB gene, which may have been inherited from a parent or generated de novo.

Another example of a constitutional chromosomal aberration is the rare syndrome of Wilms tumor characterized by aniridia, genitourinary defects, and mental retardation (WAGR). This variant was found to be correlated with a constitutional deletion of chromosome 11p13, from which a Wilms tumor susceptibility gene, WT1, has been cloned. WT1 encodes a DNA-binding transcription factor whose expression in fetal kidney and embryonic structures suggests its involvement in genitourinary development.59 Patients with Beckwith-Wiedemann syndrome, which is associated with a chromosomal abnormality distinct from that of WAGR syndrome, also show an increased susceptibility to the development of Wilms tumor, adrenal cortical carcinoma, neuroblastoma, and rhabdomyosarcoma, in the context of macroglossia, somatic gigantism, visceromegaly, hypoglycemia, and abdominal wall defects. These patients often have constitutional duplications of the 11p15 region, and their Wilms tumors show LOH in this region. This region is imprinted (see below), and some patients have paternal uniparental disomy (UPD) of chromosome 11, in which the maternal copy of this chromosome is replaced with an extra paternal copy. Differential loss of the silent allele and duplication of the active allele may contribute to tumorigenesis.


Incidental Germline Findings in NGS Studies

NGS technologies are being used with increasing frequency in pediatric research and clinical care. The ability to generate nucleotide sequence data gives an unprecedented ability to identify novel disease-associated genetic variants; however, because the entire protein-coding genome is being interrogated, potentially significant mutations may be identified that are not related to the disease being studied. For instance, germline sequencing could identify mutations associated with disorders like Huntington disease, cancer predisposition syndromes (e.g., breast and ovarian cancer [BRCA1, BRCA2] or colon cancer [MSH2, MSH6, MLH1]), metabolic disorders (e.g., cystic fibrosis [CFTR], hemochromatosis [HFE], phenylketonuria [PAH]), or other conditions, including pharmacogenomic variants that predict toxicity to drugs. The probability of identifying these kinds of mutations is estimated to be approximately 5%, which raises the question of when to disclose these findings. The current consensus is to disclose, in concert with expert genetic counseling, when the following criteria are met. First, the genetic change must be known or predicted to be of urgent clinical significance to either the subject or a first-degree relative. Second, knowledge of the finding must have a clear direct benefit that would substantially alter medical or reproductive decision making in the short term. Third, for recessive conditions in which the carrier frequency for mutations in that specific gene is greater than 1% (corresponding to disorders with a disease incidence of more than 1/40,000), the syndrome results in significant morbidity, and early diagnosis and intervention would have significant benefit for an affected child. The topic of disclosure of these so-called incidental findings is under intense debate, and while guidelines are still evolving, a consensus approach has been published by the American College of Medical Genetics and Genomics.60
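The correspondence between a 1% carrier frequency and a disease incidence of 1/40,000 follows from the Hardy-Weinberg relationship for an autosomal recessive disorder. The short calculation below is a sketch that assumes Hardy-Weinberg equilibrium (random mating, a single disease allele of frequency q), an assumption that the text does not state explicitly; the function name is illustrative.

```python
import math


def carrier_frequency(incidence):
    """Heterozygote (carrier) frequency 2pq for a recessive disorder with incidence q**2."""
    q = math.sqrt(incidence)  # frequency of the disease allele
    p = 1 - q                 # frequency of the normal allele
    return 2 * p * q


# Incidence of 1 in 40,000 gives q = 0.005 and a carrier frequency of 0.00995 (about 1%)
freq = carrier_frequency(1 / 40_000)
```

Because the carrier frequency rises with incidence, any recessive disorder more common than about 1 in 40,000 has a carrier frequency above roughly 1% under this model.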


Genomic Imprinting

Genomic imprinting, which results in the differential expression of a gene or locus based on parental origin of the allele, can explain a variety of phenomena that appear to violate a simple Mendelian model of inheritance. One can appreciate how this effect can complicate cancer genetics because a particular predisposition to develop a specific kind of cancer (e.g., Wilms tumor) may not always show linkage with a particular aberrant gene but rather may be related to whether the gene was derived from the patient’s mother or father. This situation seems to be relevant to Wilms tumor, osteosarcoma, and embryonal rhabdomyosarcoma (RMS). In a recent study of RMS, a reduction to homozygosity was seen in 50% of all RMS tumors and 65% of all fusion-negative embryonal RMS. Invariably, the paternally derived chromosome 11 is retained and sometimes duplicated in the tumor tissue. The Beckwith-Wiedemann syndrome is another example of the impact of genomic imprinting where there is a correlation between the presence of two paternal copies of the 11p15.5 region and the development of this disorder.61


Trisomy

Trisomy is the presence of an extra copy of a chromosome. Trisomies in the germline result in dramatic deviation from normal growth and development and are, for the most part, incompatible with life; only trisomies of chromosomes 13, 18, and 21 occur with any frequency in the germline of humans, and each is associated with a defined syndrome. The classic constitutional aneuploidy that demonstrates a predisposition to certain forms of cancer is trisomy 21, or Down syndrome. The incidence of both ALL and acute megakaryoblastic leukemia (AMKL) is increased in patients with Down syndrome.

In addition, acquired trisomy 21 is a relatively frequent chromosomal abnormality found in the acute leukemias. A related hematopoietic condition, transient myeloproliferative disorder (TMD), is also associated with Down syndrome or trisomy 21 mosaicism. TMD classically manifests in newborns as a myeloproliferative disorder that can include hepatosplenomegaly, leukocytosis, and circulating myeloblasts. The morphologic picture is consistent with congenital leukemia except that spontaneous remission occurs. However, this condition is not unequivocally benign since respiratory complications due to massive hepatosplenomegaly may require chemotherapeutic intervention and hepatic fibrosis caused by the megakaryocytic infiltrate may lead to liver failure. Moreover, 20% to 30% of Down syndrome patients with TMD develop AMKL at 1 to 3 years of age.

TMD or AMKL in Down syndrome patients is invariably associated with mutations of the GATA1 gene that lead to expression of a truncated GATA1 protein.62 Analysis of umbilical cord blood samples and neonatal blood spots (Guthrie cards) from patients with Down syndrome revealed the following: (1) GATA1 mutations are always present in TMD patients and usually can be detected in the neonatal blood spots of patients who subsequently developed AMKL; (2) some AMKL patients have multiple GATA1 mutations; (3) GATA1 mutations are only rarely found in AMKL patients who do not have Down syndrome; and (4) GATA1 mutations are not found in control cord blood samples from individuals without Down syndrome.63 Taken together, these observations suggest that although mutations of GATA1 in the presence of trisomy 21 may be sufficient to cause TMD, additional genetic events are required for TMD to progress to AMKL.



SOMATICALLY ACQUIRED CHROMOSOMAL ABERRATIONS AND MUTATION

Somatic alterations differ from germline mutations in that they have been acquired by somatic cells. These acquired aberrations are confined to the malignant clone of a cancer patient, are not found in the normal tissues of that individual, and therefore cannot be transmitted from generation to generation. Moreover, in contrast to constitutional chromosomal abnormalities, cell type-specific or cancer-specific abnormalities often correlate directly with a phenotypic effect.


GROSS CHROMOSOMAL REARRANGEMENTS

Gross chromosomal rearrangements (GCRs) can be recognized on a metaphase spread, using a light microscope. Using traditional G-banding techniques, the level of resolution is approximately 5 Mb; using various fluorescent hybridization techniques, the level of resolution can be improved to approximately 100 kb. As discussed above, the advent of molecular genetic techniques, especially NGS, has led to a marked increase in the number of known, recurrent somatic mutations. GCRs, or structural variations, can lead to interstitial deletions, amplifications, inversions, and translocations, resulting in the unscheduled expression of proto-oncogenes, generation of oncogenic fusion genes, and deletion of tumor suppressor genes. Given that these GCRs are causal events in malignant transformation, a clearer understanding of their genesis is important in understanding the root causes of childhood cancers. It is not known, for example, whether the regions frequently involved by GCRs represent sites within the genome that are “fragile” and highly susceptible to breakage and re-ligation, or simply sites near growth-promoting proto-oncogenes, whose deregulation gives cells a growth advantage. The following discussion draws heavily from investigations of the childhood leukemias, which have proved particularly amenable to cytogenetic and molecular analyses.






Figure 3.5 Chromosomal rearrangements caused by illegitimate V(D)J recombination. The top panel depicts normal V(D)J recombination, with one of many V segments (light blue) recombining to a D segment (red) followed by a J segment (orange). Discrete V, D, and J segments are flanked by heptamer/nonamer sequences (blue triangles). Splicing of the recombined VDJ segment to the C segment occurs at the RNA level, as depicted. Transcription is regulated by an enhancer region (green). The middle panel shows a chromosomal translocation mediated by V(D)J recombination. In this case, a cryptic heptamer/nonamer sequence within the SCL locus (exons 1 to 6 depicted; the cryptic heptamer/nonamer is represented by a triangle in exon 6) mediates fusion with a TCRD D region, resulting in an interchromosomal rearrangement and subsequent production of an SCL-TCRD fusion mRNA. The bottom panel shows a V(D)J recombinase-mediated intrachromosomal rearrangement between SIL (only exons 1, 2, and 18 are shown for clarity) and SCL. Cryptic heptamers within the SIL and SCL loci are depicted by blue triangles. The reconfigured genomic DNA results in a SIL-SCL fusion mRNA, controlled by SIL regulatory elements.


Inherited Predisposition to Gross Chromosomal Rearrangements

Several heritable gene defects (see Table 3.1) predispose individuals to development of myeloid and/or lymphoid leukemias; some of these clearly lead to an increased incidence of chromosomal rearrangements. Patients with ataxia-telangiectasia (AT) have mutations of the ATM gene and are prone to the acquisition of chromosomal translocations, most commonly involving antigen receptor genes (IG or TCR). In addition, mice deficient for the ATM protein develop T-cell malignancies, which show chromosomal translocations involving antigen receptor genes. Patients with Nijmegen breakage syndrome are also predisposed to the development of leukemia as well as chromosomal translocations affecting antigen receptor genes. It seems likely that the leukemogenic chromosomal translocations associated with these conditions are caused by a specific recombination defect in V(D)J recombination (see Fig. 3.5).

Individuals with inherited DNA repair defects, such as Bloom syndrome, Fanconi anemia, and Li-Fraumeni syndrome, are predisposed to a spectrum of malignancies, including leukemias, lymphomas, and early onset adult carcinomas. In contrast to the conditions described above, in which the patients develop a malignancy that harbors chromosomal aberrations involving an antigen receptor gene, persons with Bloom, Fanconi, or Li-Fraumeni syndromes have less specific chromosomal aberrations. Although lymphocytes from patients with Bloom syndrome and Fanconi anemia are clearly susceptible to chromosomal breakage in vitro, the leukemias developing in patients with Bloom syndrome and Fanconi anemia sometimes, but not always, show clonal chromosomal translocations.


Molecular Mechanisms Leading to Gross Chromosomal Rearrangements

Some chromosomal translocations seem to be the result of mistakes in normal V(D)J recombination (Fig. 3.5).64 Typically, translocations attributed to illegitimate V(D)J recombination juxtapose a proto-oncogene to a locus that codes for an antigen receptor (either an IG or TCR) gene. The proto-oncogene present on the translocated chromosome then becomes activated via the regulatory region of the antigen receptor gene. The notion that these translocations are the result of illegitimate V(D)J recombinase activity is strengthened by the presence of features associated with normal V(D)J recombinase action, such as site-specific DNA cleavage at cryptic heptamer sequences and the addition of nontemplated (“N” region) nucleotides at the translocation breakpoints.64 Interestingly, there are now a number of examples of fusions between nonantigen receptor genes that show all of the aforementioned hallmarks of normal V(D)J recombinase activity.65
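A search for cryptic heptamers near a breakpoint can be sketched as a simple motif scan. The example below assumes the canonical recombination signal heptamer CACAGTG, which is not spelled out in this chapter, and requires the conserved first three bases (CAC) while tolerating a limited number of mismatches elsewhere. The function names and the mismatch rule are illustrative assumptions; dedicated tools also score the nonamer and the spacer length.

```python
CONSENSUS_HEPTAMER = "CACAGTG"  # canonical heptamer (assumed; not given in the text)


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_cryptic_heptamers(seq, max_mismatches=1):
    """Return (position, strand, heptamer) for heptamer-like sites on either strand.

    A site must begin with CAC and differ from the consensus at no more than
    max_mismatches positions. Positions are 0-based starts on the forward strand.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for i in range(len(s) - 6):
            window = s[i:i + 7]
            if not window.startswith("CAC"):
                continue
            mismatches = sum(a != b for a, b in zip(window, CONSENSUS_HEPTAMER))
            if mismatches <= max_mismatches:
                pos = i if strand == "+" else len(s) - i - 7
                hits.append((pos, strand, window))
    return hits


# Made-up sequence with one perfect heptamer on the forward strand
perfect_hits = find_cryptic_heptamers("TTTCACAGTGAAA", max_mismatches=0)
```

Because the reverse complement of CACAGTG (CACTGTG) differs from the consensus at only one position, scanning with one mismatch allowed reports many sites on both strands, which is one reason such scans are only a first filter.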

Additional mechanisms implicated in the generation of GCRs include homologous recombination at Alu elements and DNA double-strand breaks (DSB) at or near extended tracts of alternating purine and pyrimidine residues (Pu/Py tracts), which can form an alternative left-handed helical structure termed Z-DNA.
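Pu/Py tracts are straightforward to locate computationally. The sketch below finds the longest stretch of a sequence in which purines (A, G) and pyrimidines (C, T) strictly alternate. The function name and the example sequence are hypothetical, and no length threshold for Z-DNA formation is implied.

```python
PURINES = set("AG")
PYRIMIDINES = set("CT")


def longest_alternating_tract(seq):
    """Return (start, length) of the longest run of strictly alternating purines and pyrimidines."""
    seq = seq.upper()
    best_start, best_len = 0, 0
    run_start = 0
    for i, base in enumerate(seq):
        if base not in PURINES and base not in PYRIMIDINES:
            run_start = i + 1  # an ambiguous base (e.g., N) breaks any tract
            continue
        if i > run_start and (base in PURINES) == (seq[i - 1] in PURINES):
            run_start = i      # two purines or two pyrimidines in a row: restart here
        length = i - run_start + 1
        if length > best_len:
            best_start, best_len = run_start, length
    return best_start, best_len


# Made-up sequence containing a 12-base alternating tract (GTGTGTGCACAC)
start, length = longest_alternating_tract("AAGTGTGTGCACACCC")
```

In this example the tract begins at position 2 (0-based) and spans 12 bases.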

Finally, defective repair of DNA double-strand breaks (DSB) has been implicated as a cause of oncogenic translocations, specifically in patients with therapy-related AML (t-AML). Some of these t-AML cases are associated with topoisomerase (topo) II inhibitors,66 while others are linked to alkylating agents. t-AML associated with alkylating agent chemotherapy often demonstrates clonal deletions of 7q or 5q, whereas t-AML developing after treatment with topo II inhibitors is characterized by balanced chromosomal translocations often involving MLL, RUNX1, or PML.67

A general hypothesis to account for t-AML induction by topo II inhibitors predicts that DNA DSB induced by the drug are repaired improperly, with re-ligation of strands from two distinct chromosomes, resulting in a chromosomal translocation. A subset of these translocations will lead to the production of oncogenic fusion proteins that give the cell a growth advantage; these will eventually be recognized clinically as leukemias. DNA topoisomerase II functions as a homodimer and catalyzes a three-step reaction consisting of double-strand DNA cleavage, strand passage, and DNA re-ligation. During this reaction, a normally short-lived intermediate, consisting of topo II monomers covalently bound to the DNA phosphodiester backbone, is stabilized by topo II inhibitors. The stabilized intermediates are recognized as damaged DNA and trigger apoptotic cell death. It has been proposed that a topo II “subunit exchange,” in which topo II monomers (subunits) that are covalently bound to DNA exchange partners, might lead to a chromosomal translocation. Several chromosomal translocations consistent with this type of subunit exchange mechanism, involving the MLL or NUP98 genes, have been identified in patients with t-AML or t-MDS.68,69

An alternative model to account for chromosomal translocations induced by topo II inhibitors suggests that such translocations are initiated by a DNA DSB, followed by processing of the DNA ends and anomalous joining of these ends to nonhomologous chromosomes via nonhomologous end joining (NHEJ). In support of this mechanism, analysis of translocation breakpoints from patients with t(4;11) translocations shows that DNA sequences flanking the breakpoints have been duplicated, deleted, and inverted during the translocation process, suggesting that the chromosome ends have undergone processing typical of NHEJ.


Timing of Leukemogenic Chromosomal Translocations

Several lines of investigation have demonstrated that many of the common chromosomal translocations occur in utero, although the leukemia associated with the translocation may not become evident for 10 or more years. Monochorionic twins have been shown to harbor identical MLL or TEL translocations, indicating that the translocation had arisen in one twin in utero and “metastasized” to the unaffected twin through the shared placenta. Moreover, the analysis of neonatal screening blood spots (“Guthrie cards”) has demonstrated that the clonotypic TEL-AML1 fusions found in the leukemic cells of affected children were already present at birth.70


ONCOGENIC CONSEQUENCES OF GROSS CHROMOSOMAL REARRANGEMENTS—GENERAL THEMES

The study of gross chromosomal rearrangements, especially chromosomal translocations, associated with childhood cancer has revealed several common themes. First, specific translocations are associated with specific classes of cancer. For instance, the t(15;17)(q22;q12) is exclusively associated with acute promyelocytic leukemia (AML M3) and not other forms of AML.71 Similarly, the t(1;19) is found only in patients with pre-B ALL and not other forms of leukemia. Although there are numerous exceptions to this generalization (e.g., the t(4;11)(q21;q23) is associated with both AML and B-cell precursor [BCP] ALL), the recurrent association of specific translocations with specific forms of leukemia indicates that these translocations are causal events for malignant transformation.

A second theme is that the recurrent chromosomal translocations typically lead to one of two abnormalities. A translocation that takes place within the introns of two distinct genes can lead to generation of a novel chimeric protein. For instance, the RUNX1-ETO1 fusion joins the DNA-binding domain of RUNX1 to effector domains of ETO1, producing a chimeric protein with altered function. Alternatively, a translocation can lead to dysregulated expression of an intact gene, caused by its relocation to sites near the promoter/enhancer elements, such as those of TCR or IG genes.

A third theme is that the genes affected by chromosomal translocations often encode either tyrosine kinases, involved in signal transduction, or transcription factors. Transcription factors bind to regulatory elements in DNA, such as promoters and enhancers, where they regulate gene transcription. Many of these proteins can be classified on the basis of recurring structural motifs within their DNA- and protein-binding domains, designated as basic region/helix-loop-helix (bHLH), basic region/leucine zipper (bZIP), zinc finger, and homeodomain. The modular organization of transcription factors provides an ideal framework for their multiple functions, particularly binding to DNA in heterodimeric complexes. It also explains why disruption and rearrangement of transcriptional control genes by chromosomal translocations can produce functional hybrid proteins rather than inert peptides. Tyrosine kinase genes can be aberrantly activated through a variety of mechanisms, such as truncation of the ligand-binding domain of growth factor receptors, and loss or replacement of carboxyl-terminal regulatory tyrosine residues. The transcription factors involved in leukemia and sarcoma pathogenesis often have unique transforming properties that are specific for the different types of progenitors within these distinct cell types.


CHROMOSOMAL TRANSLOCATIONS LEAD TO ACTIVATION OF PROTO-ONCOGENES AND GENERATION OF ONCOGENIC FUSION GENES


B-Lineage ALLs


TEL-AML1 Fusion Gene in Pro-B Leukemia

Although the most common cytogenetic abnormality found in children with ALL is the t(12;21)(p13;q22) (Table 3.2), this translocation is not easily detected by conventional methods because the rearranged chromosomal fragments closely resemble normal chromosomes. When analyzed by molecular approaches, the t(12;21) is found in about one-fourth of pediatric B-cell precursor (BCP) ALL cases, but only 3% to 4% of adult ALL cases. This rearrangement results in fusion of TEL (ETV6) on chromosome 12 to AML1 (RUNX1) on chromosome 21 (Table 3.2).72 Both AML1 and TEL are also involved in variant translocations associated with both lymphoid and myeloid malignancies. TEL contains a dimerization motif conserved in the ETS family of proteins and has been identified in fusion with many different partners, such as TEL-PDGFRβ in CMML; TEL-MN1, TEL-ABL, and TEL-EVI1 in AML; and TEL-JAK2 in ALL. AML1 is also involved in the pathogenesis of AML through its fusion with the ETO gene in AML cases with the t(8;21).










TABLE 3.2 Recurrent Chromosomal Translocations Associated with Hematologic Malignancies

| Leukemia Type | Chromosomal Abnormality | Genes Involved | Mechanism of Activation | Structural Motif in Chimeric Protein (a) | Estimated Frequency, % (b) | References |
| --- | --- | --- | --- | --- | --- | --- |
| Lymphoid | | | | | | |
| B-cell ALL/Burkitt lymphoma | t(8;14)(q24;q32) | MYC | Relocation to IgH locus | bHLHzip | 5 | |
| | t(2;8)(p12;q24) | MYC | Relocation to IgL locus | bHLHzip | <1 | |
| | t(8;22)(q24;q11) | MYC | Relocation to IgL locus | bHLHzip | <1 | |
| B-cell NHL | t(3;11)(q27;q23.1) | BCL6 | Gene fusion | Zinc finger | 1 | |
| Early-B-cell ALL | t(12;21)(p12;q22) | TEL-AML1 | Gene fusion | Runt-homology | 25 | |
| Pre-B-cell ALL | t(1;19)(q23;p13) | E2A-PBX1 | Gene fusion | Homeodomain | 5 | |
| Pro-B-cell ALL | t(17;19)(q22;p13) | E2A-HLF | Gene fusion | bZIP | 1 | |
| | t(4;11)(q21;q23) | MLL-AF4 | Gene fusion | A-T hook | 4 | |
| T-cell ALL | t(8;14)(q24;q11) | MYC | Relocation to TCRα/δ locus | bHLHzip | <1 | |
| | t(7;19)(q35;p13) | LYL1 | Relocation to TCRβ locus | bHLH | <1 | |
| | t(1;14)(p32;q11) | SCL(TAL1) | Relocation to TCRα/δ locus | bHLH | <1 | |
| | t(7;9)(q35;q34) | TAL2 | Relocation to TCRβ locus | bHLH | <1 | |
| | t(14;21)(q11;q22) | BHLHB1 | Relocation to TCRα locus | bHLH | <1 | 255 |
| | t(11;14)(p15;q11) | LMO1(RBTN1) | Relocation to TCRα/δ locus | Cysteine-rich LIM | <1 | |
| | t(11;14)(p13;q11) | LMO2(RBTN2) | Relocation to TCRα/δ locus | Cysteine-rich LIM | 1 | |
| | t(7;11)(q35;p13) | LMO2(RBTN2) | Relocation to TCRβ locus | Cysteine-rich LIM | <1 | |
| | t(10;14)(q24;q11) | HOX11 | Relocation to TCRα/δ locus | Homeodomain | <1 | |
| | t(7;10)(q35;q24) | HOX11 | Relocation to TCRβ locus | Homeodomain | <1 | |
| | t(5;14)(q35;q32) | HOX11L2 | Relocation to TCR14q32 locus | Homeodomain | 3 | 75 |
| | inv(7)(p15;q34) | HOXA7/9/10 | Relocation to TCRβ locus | Homeodomain | 3 | 78 |
| | t(10;11)(p13;q21) | CALM-AF10 | Gene fusion | Clathrin assembly | 2 | 80 |
| | t(4;11)(q21;p15) | NUP98-RAP1GDS1 | Gene fusion | Nucleoporin | <1 | |
| | t(9;12)(p24;p13) | TEL-JAK2 | Gene fusion | Tyrosine kinase | <1 | |
| ALCL | t(2;5)(p23;q35) | NPM1-ALK | Gene fusion | Tyrosine kinase | 90 | |
| Myeloid | | | | | | |
| AML (granulocytic) | t(8;21)(q22;q22) | AML1-ETO | Gene fusion | Runt homology | 12 | |
| Myelodysplasia | t(3;21)(q26;q22) | AML1-EAP | Gene fusion | Runt homology | 1 | |
| CML, blast crisis | t(3;21)(q26;q22) | AML1-EVI1 | Gene fusion | Runt homology | 1 | |
| AML (undifferentiated) | t(3;v)(q26;v) | EVI1 | Gene activation | Zinc finger | 3 | |
| AML (myelomonocytic) | inv(16)(p13;q22) | CBF-MYH11 | Gene fusion | Complex with AML1 | 12 | |
| AML (monocytic) | t(9;11)(p21;q23) | MLL-AF9 | Gene fusion | A-T hook | 7 | |
| AML (promyelocytic) | t(15;17)(q21;q21) | PML-RARa | Gene fusion | Zinc finger | 7 | |
| | t(11;17)(q23;q21) | PLZF-RARa | Gene fusion | Zinc finger | <1 | |
| AML (undifferentiated) | t(16;21)(p11;q22) | FUS-ERG | Gene fusion | Ets-like | <1 | |
| AML (undifferentiated) | t(6;11)(q21;q23) | MLL-AF6q21 | Gene fusion | Forkhead | 1 | |
| CMML | t(5;12)(q33;p13) | TEL-PDGFRB | Gene fusion | Ets | 1 | |
| AML-M4Eo | t(1;12)(q25;p13) | ETV6-ARG | Gene fusion | Ets | 1 | |
| AML, CML | t(7;11)(p15;p15) | NUP98-HOXA9 | Gene fusion | Homeobox | 1.5 | |
| AML, MDS | t(2;11)(q31;p15) | NUP98-HOXD13 | Gene fusion | Homeobox | 1 | |
| AML | t(5;14)(q33;q32) | CEV14-PDGFRB | Gene fusion | Tyrosine kinase | 1 | |
| AML-M5 | t(8;22)(p11;q13) | P300-MOZ | Gene fusion | Zinc finger | 1 | 256 |
| AML-M5 | t(10;11)(p12;q23) | MLL-AF10 | Gene fusion | ZIP, A-T Hook | 1 | |
| AML-M5 | t(3;11)complex | MLL-NRIP3 | Gene fusion | | | |
| AML | t(6;9)(p23;q34) | DEK, NUP214 | Gene fusion | Nucleoporin | <1 | |
| AML M7 | t(1;22)(p13;q13) | RBM15, MKL | Gene fusion | RNA binding | 1 | 257 |
| AML | t(11;v)(q23;v) (c) | MLL | Gene fusion | A-T Hook | 5 | 95 |
| AML, CMML | t(12;v)(p13;v) (c) | ETV6 | Gene fusion | Ets | 1 | |
| AML, MDS | t(11;v)(p15;v) (c) | NUP98 | Gene fusion | Nucleoporin | 1 | 258 |
| Ph+CML, ALL | t(9;22) | BCR-ABL | Gene fusion | Tyrosine kinase | 100 | 259 |


ALCL, anaplastic large cell lymphoma; AML, acute myeloid leukemia; ALL, acute lymphoblastic leukemia; APML, acute promyelocytic leukemia; CML, chronic myelogenous leukemia; bHLHzip, basic region/helix-loop-helix/leucine zipper domain; bZIP, basic region/leucine zipper domain; PDGFβR, platelet-derived growth factor beta receptor; ABL, v-abl Abelson murine leukemia viral oncogene homolog 1; JAK2, Janus kinase 2; ALK, anaplastic lymphoma kinase.


a Based on analysis of DNA-binding/protein interaction domain.

b Percentage of total cases with childhood lymphoid or myeloid acute leukemia.


c “v” represents any of more than 10 chromosomal loci.



Loss of the normal TEL allele is frequently observed in BCP ALL patients with a t(12;21), suggesting that TEL loss of function may contribute to leukemic transformation. BCP ALL patients with the t(12;21) have a good prognosis independent of clinical risk factors, such as age and WBC at presentation, with relapse-free survival rates approaching 90% in studies employing a variety of drug regimens.


E2A-PBX1 Fusion Gene in Pre-B Leukemia

The E2A gene, located at 19p13.3, encodes a transcription factor that contains a bHLH (for basic domain, helix-loop-helix) DNA-binding and dimerization motif. Its fusion with the PBX1 homeobox gene as a result of the t(1;19)(q23;p13) occurs in approximately 5% of childhood ALL cases. E2A-PBX1 hybrids retain the amino-terminal trans-activation domain of E2A but not its DNA-binding region, which is replaced by the homeobox DNA-binding and protein-protein interaction domain of PBX1. Thus, the gene targets of E2A-PBX1 are probably those specified by the homeobox of PBX1.

Reports of E2A-PBX1 involvement in human disease have been restricted to ALLs with a pre-B-cell phenotype. However, lethally irradiated mice repopulated with bone marrow cells expressing E2A-PBX1 fusion genes developed AML, and thymic lymphomas developed in transgenic mice that expressed an E2A-PBX1 fusion gene. Taken together, these results suggest that an E2A-PBX1 fusion can be oncogenic in a wide variety of hematopoietic cells.


MYC Activation in B-cell ALL

Patients with B-cell ALL or Burkitt lymphoma commonly display a t(8;14)(q24;q32) translocation that juxtaposes one allele of MYC, a bHLH/leucine zipper gene located on chromosome 8, with the IGH locus on chromosome 14q32.73 This juxtaposition of MYC coding sequences to IG enhancer elements results in dysregulated expression of the MYC protein. Although the t(8;14) accounts for most B-cell ALL cases with rearranged MYC loci, two variants are also capable of activating MYC. In cells with the t(2;8) or the t(8;22), the MYC gene remains on chromosome 8, and portions of the κ or λ light-chain genes on chromosome 2 or 22, respectively, are translocated to a site downstream of MYC, leading to aberrant expression of MYC.


T-cell ALL


bHLH, HOX, and Other Developmental Genes

Transcription factor genes are the preferred targets of chromosomal translocations in patients with T-cell ALL (Table 3.2). Notable examples include the bHLH genes MYC, SCL(TAL1), and LYL1. When rearranged near enhancers within the TCRB locus on chromosome 7q34, or the TCRA/D locus on chromosome 14q11, these regulatory genes are aberrantly expressed, and their protein products bind inappropriately to the promoter or enhancer elements of downstream target genes.

A useful model of aberrant transcription factor expression in T-cell ALL is provided by SCL activation due to the t(1;14) or an interstitial deletion upstream of the gene (Fig. 3.5). These chromosomal aberrations characterize 25% of all cases of childhood T-cell ALL and lead to ectopic expression of SCL in the thymus. Because the SCL protein forms a pentameric DNA-binding complex with E2A, LMO2, GATA1, and LDB1, its ectopic expression in T cells might be expected to activate specific sets of target genes that are normally quiescent in T-cell progenitors. Alternatively, SCL might be leukemogenic via a dominant-negative effect, since overexpression of SCL can lead to functional inactivation of E2A homodimers or E2A-HEB heterodimers, presumably by sequestering E2A in the aforementioned pentameric complex. This second model is supported by the observation that E2A-deficient mice develop T-cell ALL. Moreover, mice that express SCL mutant proteins, which are unable to bind DNA or to activate transcription but retain the ability to bind E2A, develop a form of T-cell ALL that is indistinguishable from that produced by the full-length SCL protein.

In addition to genes encoding bHLH proteins, other classes of regulatory genes are activated by chromosomal translocations in patients with T-cell ALL. These include the t(11;14)(p15;q11) or t(11;14)(p13;q11), which juxtapose the coding sequences of LMO1 (formerly known as RBTN1 or TTG1) or LMO2 (formerly known as RBTN2 or TTG2) with regulatory regions of the TCR loci. Although present in high concentrations in the central nervous system, these proteins are expressed only in the most immature T cells74 and, as indicated above, can bind SCL. LMO1 induces thymic lymphomas in transgenic mice, and the age of onset and penetrance of the disease are markedly accelerated by coexpression of SCL in the thymus.

HOX11 and HOX11L2 represent two additional developmental control genes that are inappropriately placed under the control of TCR loci. Located on 10q24, HOX11 encodes a homeodomain transcription factor that can bind DNA and activate specific target genes. A specific homeotic role for HOX11 in mammalian development was demonstrated by ablation of this gene, which blocked the formation of the spleen in mice. Activation of HOX11 by chromosomal translocation in developing T cells is thought to interfere with normal regulatory cascades to promote malignant transformation.

More recently, the HOX11L2 gene, located at 5q35, has been found to be activated by fusion near the BCL11B locus as a result of the t(5;14)(q35;q32), or by fusion to the TCRδ locus as a result of the t(5;14)(q35;q11). Although neither of these translocations is commonly recognized with use of conventional cytogenetic techniques, almost 20% of childhood T-cell ALL patients demonstrated a HOX11L2 gene translocation by fluorescence in situ hybridization (FISH).75 Although some studies have suggested that T-cell ALL patients whose lymphoblasts overexpress HOX11L2 have a poor prognosis, this finding has not been confirmed.76

The chromosomal rearrangements inv(7)(p15q34) and t(7;7)(p15;q34) both lead to a fusion of the HOXA cluster with TCRB.77 The immunophenotypes of T-cell ALL with these translocations were generally negative for cell-surface expression of TCRα/β and TCRγ/δ, reflecting differentiation arrest at a relatively immature stage.77 Patients with this fusion overexpressed many HOXA cluster genes, particularly HOXA7, HOXA9, and HOXA10.78 Of note, the same HOXA cluster genes (HOXA7, HOXA9, and HOXA10) are frequently overexpressed in patients with AML (see the following sections).


Fusion Genes in T-cell ALL

Although most chromosomal translocations in T-cell ALL patients lead to inappropriate activation of normal cellular proto-oncogenes such as MYC, SCL, or LMO2, some can produce fusion genes (Table 3.2). The MLL-ENL fusion results from the t(11;19)(q23;p13) translocation and is associated with AML, BCP ALL, and T-cell ALL. Strikingly, in one series, all 11 T-cell ALL patients with the MLL-ENL fusion became long-term survivors, suggesting that this rearrangement is associated with a good prognosis.

The t(10;11)(p13;q21) translocation leads to formation of a CALM-AF10 fusion gene. This fusion was initially identified in the U937 cell line, which was established from a patient with histiocytic lymphoma and has been shown to differentiate along the macrophage lineage in vitro. Subsequently, this fusion was found in patients with a wide spectrum of hematologic malignancies, but most commonly in patients with T-cell ALL.79 CALM-AF10 fusions were identified in 12 (9%) of 131 consecutive patients with T-cell ALL; all patients with CALM-AF10 fusions had either immature T-cell lymphoblasts that expressed no TCR genes or TCRγ/δ-positive lymphoblasts.80

An unusual fusion gene resulting in ABL kinase activation has been identified in some patients with T-ALL. A small deletion removes approximately 500 kb of chromosome 9, with breakpoints within an intron of the NUP214 gene and within the first intron of ABL.81,82 Remarkably, this deleted fragment becomes ligated as a circular episome that encodes a fusion gene between amino-terminal sequences of NUP214 and the ABL kinase. It is maintained and amplified as an episomal structure and is small enough that it does not appear as a double-minute chromatin body and can only be visualized cytogenetically by FISH analysis for the affected genes. The NUP214-ABL fusion typically occurs in the subset of cases with activated HOX11 or HOX11L2 homeobox transcription factors.


Acute Myeloid Leukemia


Core Binding Factor Fusions (AML1-ETO and CBFB-MYH11)

The same general mechanisms responsible for proto-oncogene activation in ALL are active in AML (Table 3.2). A prime example of a chimeric transcription factor in AML patients is the AML1-ETO protein resulting from the t(8;21)(q22;q22). In this gene fusion, the sequence-specific DNA-binding and protein-protein interaction properties are encoded by a large domain of the AML1 (also known as RUNX1) gene. In normal hematopoietic cells, AML1 forms a stable transcription activation complex with the CBFβ protein. Remarkably, CBFβ is also involved in a common chromosomal rearrangement in AML patients, the inv(16)(p13q22), found in patients with myelomonocytic AML and increased bone marrow eosinophils (designated M4-Eo in the FAB classification). This rearrangement joins the amino-terminal sequences of the CBFB gene to the carboxyl terminus of the heavy-chain gene of smooth muscle myosin (MYH11), resulting in formation of a CBFB-MYH11 fusion protein. The combinatorial versatility of the AML1 locus can be seen from its fusion with sequences from the EVI1 gene in t(3;21)-positive chronic myeloid leukemia in blast crisis or the EAP gene in patients with myelodysplastic syndrome.


PML-RARα Fusions in Promyelocytic Leukemia

Oncologists have long envisioned treatments based on a molecular understanding of oncogenic proteins. Major progress toward this goal has been achieved in patients with acute promyelocytic leukemia (APL; FAB M3) and a t(15;17)(q21;q11-q22) translocation. This translocation leads to a chimeric protein that fuses the ligand- and DNA-binding sequences of the retinoic acid receptor α (RARA) gene on chromosome 17 to the PML gene on chromosome 15 (Table 3.2). In its unaltered form, the RARA protein binds first to the retinoic acid ligand and then to DNA. PML proteins are normally located in macromolecular nuclear organelles, called PML oncogenic domains (PODs). The PML-RARA fusion proteins disrupt these subnuclear structures, causing normal PML, RXR, and other nuclear proteins to disperse throughout the nucleus. They interfere with normal myeloid cell development, leading to a differentiation arrest at the promyelocytic stage. These fundamental observations provide a rationale for use of all-trans-retinoic acid (ATRA) to treat patients with APL. In pharmacologic doses, ATRA binds to the RARA fusion partner, followed by reorganization of PML and its associated proteins into normal-appearing nuclear PODs. Subsequently, the leukemic cells differentiate into mature neutrophils. However, retinoic acid treatment of APL does not result in permanent remissions, limiting the agent’s therapeutic role to the remission-induction period and to combination therapy with cytotoxic drugs or arsenic.83


NUP98 Fusion Genes in the Myeloid Leukemias

The NUP98 locus encodes a 98-kD component of the nuclear pore complex, which mediates nucleo-cytoplasmic transport of RNA and protein and, more recently, has been recognized to serve as a transcriptional “scaffold.”84 NUP98 has been identified in fusion transcripts with more than 25 different partner genes, predominantly in patients with myeloid leukemias and myelodysplastic syndrome (MDS).84 About half of the NUP98 partner genes encode homeobox proteins, predominantly of the abd-b type (HOXA7, A9, A11, HOXC11, C13, HOXD11, D13).84 The remaining NUP98 partner genes belong to no recognized gene family, but several are predicted to form coiled-coil structures.85 NUP98 gene fusions are associated with a wide spectrum of malignant diseases. Although MDS and AML are the most common diagnoses associated with these fusions, the NUP98-RAP1GDS1 fusion is seen exclusively in T-cell ALL patients. Many of the NUP98 fusions, including NUP98-HOXD13, NUP98-TOP1, and NUP98-DDX10, have been identified in patients with therapy-related AML or MDS following multiagent chemotherapy. In addition to acute leukemias, NUP98 translocations have also been recognized in patients with CML, most commonly during evolution to blast crisis.86 This observation suggests that the products of NUP98 fusion genes might cooperate with activated tyrosine kinases, such as BCR-ABL, during the course of malignant transformation. This hypothesis is supported by a report showing that NUP98-HOXA9 and the BCR-ABL fusion kinase cooperate to induce acute leukemia in a mouse model.87

Aug 25, 2016 | Posted by in ONCOLOGY | Comments Off on Molecular and Genetic Basis of Childhood Cancer

Full access? Get Clinical Tree

Get Clinical Tree app for offline access