Next-Generation and Third-Generation Sequencing of Lung Cancer Biomarkers




Fig. 10.1 Correlation of EGFR mutations and predicted TKI response





Next-Generation Sequencing (NGS) Background


The clinical use of NGS has significant benefits for diagnostic biomarker discovery and clinical screening capacity compared to traditional molecular assays such as single-gene Sanger-based sequencing (also referred to as first-generation sequencing) (Table 10.1). For clarification, the term “next-generation sequencing” or “NGS” refers to sequencing methodologies other than traditional first-generation Sanger dideoxy sequencing. NGS broadly encompasses both currently utilized methods referred to as “second-generation sequencing” and newer advancements known as “third-generation sequencing” technologies. While debate over the exact categories for second- and third-generation sequencing exists, in general, second-generation sequencing represents methods that amplify DNA via emulsion PCR (e.g., Ion Torrent) or solid-phase amplification (e.g., Illumina). These methods contrast with third-generation sequencing, which is performed on non-amplified single molecules (e.g., Pacific Biosciences and Oxford Nanopore). Regardless of classification as “second-” or “third-”generation sequencing, both are encompassed by the term “next-generation sequencing,” in which “next” refers to any non-Sanger-based sequencing methodology.



Table 10.1 Summary of NGS testing benefits and potential barriers to clinical adoption

Benefits of NGS vs. Sanger sequencing:
Screen multiple genes/samples at one time (massive parallel)
Low input DNA/RNA required
Sensitivity to detect mutant allele at 2–10% (Sanger 15–25%)
Sensitive variant detection in samples with heterogeneity
Quantitative assay
Lower cost per sample/target
Ability to detect copy number alterations and gene fusions

Barriers to NGS adoption:
Multiple platforms and rapidly evolving technology
Upfront cost of instrument and training
Lack of guidelines and unclear on process of LDT validation
Complex workflow with need for bioinformatic expertise
Dedicated hardware for analysis and long-term data storage

The advantage of NGS (second and third generation) over Sanger sequencing is the ability to perform massively parallel sequencing. In essence, massively parallel sequencing involves interrogation of numerous samples and numerous alterations with speed and accuracy. Ultimately, this results in higher throughput, which reduces cost per sample. NGS is also highly flexible, with specific applications that can be tailored to the clinical question [9]. The clinical use of NGS has fundamentally improved our understanding of lung cancer biology and has revolutionized clinical molecular diagnostic testing. As diagnostic lung tissue is often limited, the ultralow sample input requirement of NGS allows interrogation of numerous targets from a small amount of material. It is also capable of detecting mutations at mutant allele frequencies of 2–10% (Table 10.1), whereas Sanger sequencing requires a mutant allele frequency of 15–25% [10].

Input material for NGS can be either DNA or RNA. Multiple sequencing methods exist and include whole-genome sequencing (WGS, for DNA), whole-exome sequencing (WES, for DNA), whole-transcriptome sequencing (RNA-Seq, for RNA), and targeted sequencing (TS, for either DNA or RNA). Each method (WGS, WES, RNA-Seq, or TS) has specific strengths and weaknesses. In general, DNA-based methods identify small base pair alterations, insertions/deletions, and potential copy number changes. One significant difference between WGS, WES, and TS is the depth of sequencing reads generated per target, which is higher for TS assays because they focus on a selection of targets that typically represents a small fraction of the exome or genome. For instance, in lung cancer, TS-based NGS assays could focus on known genomic alterations in key biomarkers. RNA-based sequencing is utilized for detection of alternatively spliced transcripts, posttranscriptional modifications, gene fusions, mutations/single-nucleotide polymorphisms, small and long noncoding RNAs, or changes in gene expression. These methods are described in more detail below.


NGS Methodology



Whole-Genome Sequencing (WGS)


Currently, WGS represents one of the highest cost NGS methods and is not routinely utilized in clinical screening or monitoring of lung cancer. However, like most technologies, the cost of WGS is declining with improvements in NGS technologies [9]. WGS can detect a wide range of genomic alterations, including known disease-associated and novel variants, a feature that makes this technique well suited for research. Barriers to routine clinical lung cancer screening include the cost, the large volume of data produced, and the expertise and tools needed for data mining. Data analysis is a significant challenge for WGS, and a streamlined process needs to be developed before this method can fill the gaps in personalized medicine [11]. Clinical strengths of WGS include the ability to determine breakpoints in balanced chromosome translocations and inversions and to detect genomic alterations outside of coding regions [12]. WGS allows full interrogation of promoters, enhancers, introns, noncoding RNAs (e.g., miRNAs), and unannotated regions [13, 14]. This full view of the genomic landscape is well suited for research applications or driver pathway discovery where a comprehensive profile of point mutations, complex rearrangements, indels, and copy number alterations is required [12]. For example, The Cancer Genome Atlas (TCGA) Research Network utilized WGS for lung adenocarcinomas and identified 25 significantly mutated genes, including both known mutations, TP53 (50%), KRAS (27%), EGFR (17%), STK11 (15%), KEAP1 (12%), ATM, NF1 (11%), BRAF (8%), and SMAD4 (3%), and unknown (never previously reported) mutations, SMARCA4, ARID1A, RBM10, SETD2, PIK3CA, CBL, FBXW7, PPP2R1A, RB1, CTNNB1, U2AF1, KIAA0427, PTEN, BRD3, FGFR3, and GOPC [15]. The trade-off for such complete genomic landscape analysis is low sequencing coverage. This one feature greatly limits the clinical application for routine lung cancer screening. WGS coverage varies depending on methodology but on average is below 100-fold, whereas targeted sequencing assays routinely achieve greater than 1000-fold coverage. Fold coverage is directly correlated with the ability to identify mutations present at a low mutant allele burden, which is especially problematic in tumors that are not clearly separated from non-tumor stroma, because the stroma dilutes the mutant allele burden [10].
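The link between fold coverage and the ability to see a diluted mutant allele can be illustrated with a simple binomial model. The sketch below is not taken from the cited studies. It assumes a heterozygous mutation at a diploid locus, independent reads, no sequencing error, and a hypothetical calling rule that requires a minimum number of mutant reads. The function names and the example values (20% tumor content, 30-fold vs. 1000-fold coverage, five supporting reads) are illustrative assumptions only.

```python
from math import comb


def expected_vaf(tumor_fraction: float, mutant_copies: int = 1, ploidy: int = 2) -> float:
    """Expected mutant allele fraction when tumor cells are mixed with normal cells.

    Assumes tumor and normal cells both carry `ploidy` copies of the locus,
    and that only tumor cells carry `mutant_copies` mutant copies.
    """
    return tumor_fraction * mutant_copies / ploidy


def prob_detect(depth: int, vaf: float, min_alt_reads: int) -> float:
    """Probability of observing at least `min_alt_reads` mutant reads.

    Read counts are modeled as Binomial(depth, vaf); sequencing error is ignored.
    """
    p_below = sum(
        comb(depth, k) * vaf**k * (1 - vaf) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_below


if __name__ == "__main__":
    vaf = expected_vaf(tumor_fraction=0.20)  # hypothetical: 20% tumor cells in the sample
    for depth in (30, 100, 1000):            # hypothetical fold-coverage values
        p = prob_detect(depth, vaf, min_alt_reads=5)
        print(f"depth={depth:>4}  expected VAF={vaf:.2f}  P(>=5 mutant reads)={p:.3f}")
```

Under these assumptions, the probability of collecting enough mutant reads rises steeply with depth, which is the intuition behind preferring high-coverage targeted assays for specimens with low tumor content. Real variant callers also model base quality and error rates, which this sketch omits.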


RNA Sequencing (RNA-Seq)


RNA-Seq is a specialized form of NGS which can be utilized to interrogate the lung cancer transcriptome (which represents up to ~4% of the human genome) [16]. Following the central dogma of molecular biology, DNA is transcribed to messenger RNA (mRNA), and mRNA is translated into protein. While the human genome contains approximately 25,000 genes, not all genes will be transcribed and translated into protein. Moreover, not every gene will be transcribed into a single canonical transcript, due to alternative splicing. Therefore, sequencing RNA (specifically mRNA) allows one to address questions including which genes are being expressed and at what level. RNA-Seq can generate a comprehensive profile of the complete transcriptome or be utilized for a more focused targeted sequencing application. RNA-Seq allows mapping of exon and intron boundaries for identification of splice variants, as well as identification of gene translocations, posttranscriptional modifications, mutations, and noncoding RNAs such as miRNAs [9, 12, 17]. It also offers a highly sensitive assay for quantifying transcript abundance, with sensitivity even higher than that of comparable microarray technology [18]. While RNA-Seq offers several options not available with DNA-based NGS, it has its own inherent challenges, which include library construction (more difficult because the RNA molecule is labile), data mining (a high number of low-abundance transcripts, with potential false-positive calls), and obtaining complete transcript coverage [19].


Whole-Exome Sequencing (WES)


WES is utilized to specifically sequence the coding exons (~2.5% of the human genome), the portion of genes that forms the template for mRNA and subsequent protein production. This methodology specifically ignores noncoding regions such as promoters, enhancers, introns, and noncoding RNAs. Eliminating these regions decreases the number of sequencing targets and thereby allows for improved fold coverage. WES focuses solely on coding exons in annotated genes and therefore only allows variant detection in known coding genes. WES can also be designed to include selected noncoding regions, such as exon-flanking regions and potentially select miRNAs. Similar to WGS, the amount of sequencing data can be extensive for each sample, and the total number of variants detected by WES can be high (in the 20,000–30,000 range) depending on the tumor sample and the NGS methods and bioinformatics utilized. This large number of variants makes detecting actionable activating mutations a challenge. While more focused than WGS, application of WES to lung carcinoma is still currently best suited for research rather than routine clinical practice. Improvements in NGS such as decreased cost, faster analysis time, increased coverage, and improved accuracy could drive increased adoption of WES into routine clinical practice [10].
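The trade-off between target size and fold coverage follows from simple arithmetic: for a fixed sequencing yield, mean coverage is the number of usable sequenced bases divided by the size of the target. The sketch below illustrates this under stated assumptions. The genome size of about 3.1 billion bases, the sequencing yield, the on-target fractions, and the 0.5 Mb panel size are assumptions chosen for illustration; only the ~2.5% exome fraction comes from the text above.

```python
GENOME_SIZE_BP = 3.1e9   # assumed approximate size of the human genome
EXOME_FRACTION = 0.025   # coding exons as a fraction of the genome (from the text)


def mean_coverage(total_sequenced_bases: float,
                  target_size_bp: float,
                  on_target_fraction: float = 1.0) -> float:
    """Mean fold coverage = usable sequenced bases / target size.

    `on_target_fraction` models capture efficiency (bases that actually
    fall on the intended target); 1.0 means every base is on target.
    """
    return total_sequenced_bases * on_target_fraction / target_size_bp


if __name__ == "__main__":
    run_yield = 90e9  # hypothetical sequencing yield in bases for one sample
    targets = {
        "WGS": (GENOME_SIZE_BP, 1.0),
        "WES": (GENOME_SIZE_BP * EXOME_FRACTION, 0.6),  # hypothetical capture efficiency
        "TS (0.5 Mb panel)": (0.5e6, 0.6),              # hypothetical panel size
    }
    for name, (size_bp, on_target) in targets.items():
        cov = mean_coverage(run_yield, size_bp, on_target)
        print(f"{name:<18} target={size_bp:>14,.0f} bp  mean coverage ~{cov:,.0f}x")
```

In practice a laboratory would not spend a whole-genome yield on a small panel; the smaller target is instead used to multiplex many samples per run while still reaching high depth on each.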


Targeted Sequencing (TS)


TS is currently the most widely used NGS assay for clinical lung cancer diagnostic testing. This method focuses specifically on interrogation of known genomic regions of interest. TS limits the sequencing to a small number of targeted regions, ultimately decreasing the amount of sequencing time and data generated, while also making the assay highly cost-effective by increasing the number of samples that can be analyzed simultaneously (multiplexed). Limiting TS to known cancer-relevant alterations makes this assay well suited for clinical use, which requires detecting known alterations such as point mutations and deletions in EGFR or even translocations in ALK or ROS1. However, being so highly targeted, this method may miss variants that are present but not located in regions interrogated by the assay. The adoption of TS via NGS into clinical practice for lung cancer has resulted in the availability of a highly sensitive method for detecting actionable alterations in lung cancer specimens [20–22]. A recent report showed NGS-based TS was able to identify EGFR/KRAS/ALK alterations in up to 58% of patients that were called wild type by standard testing, which translated into improved opportunities for therapeutic intervention [23]. Since most NSCLCs are detected once they are locally advanced and/or inoperable, often only fine needle aspirate (FNA) cytology samples of metastases are available for molecular testing. FNA tumor cell content may be very limited, and therefore testing by traditional Sanger sequencing would not be possible. However, TS via NGS can utilize nanogram quantities of DNA, and FNA/cytology samples have been shown to be sufficient for TS NGS analysis [24–26].


NGS Translocation Detection


Currently, the list of routinely tested and actionable translocations specific for lung cancer is small and includes ALK, RET, and ROS1. Other kinase gene fusions have been detected by NGS from isolated lung adenocarcinoma DNA and RNA and include MPRIP-NTRK1, AXL-MBIP, SCAF11-PDGFRA, and EZR-ERBB4 [27–29]. Regardless of the molecular methodology utilized for detection, accurate identification of translocations can be challenging. In situ hybridization (ISH) is the current gold standard, but immunohistochemistry (IHC) is often performed as it offers a faster and less burdensome screening/detection methodology. However, IHC does not actually identify the translocation; rather, it identifies overexpression of a protein that occurs secondary to the translocation. Therefore, the IHC approach is applicable for ALK, which lacks endogenous expression in the lung, but is not a viable option for identification of RET translocations due to endogenous RET expression [30] and is potentially not useful for ROS1 due to false-positive staining and poor correlation with FISH [31]. Unlike ISH and IHC, NGS can be applied to identify both known and de novo translocations. In addition, NGS allows the simultaneous screening of actionable gene fusions in a single assay with high specificity and low input requirements (sample preservation). The inherent difficulty in identifying translocations via NGS is the high variability of translocation partners and breakpoints along with the low incidence of translocations in lung cancer. While the canonical EML4–ALK fusion consists of EML4 exons 1–13 fused to ALK exons 20–29, over 20 different ALK translocation partners have been identified [32]. NGS is gaining clinical utilization for translocation detection in lung carcinoma due to its comprehensive screening of multiple low-incidence translocations, paired with high sensitivity for detection, rapid assay run time, and lower cost compared to single-assay/single-translocation testing options such as ISH [33]. Ultimately, the goal of utilizing NGS for translocation detection is to rapidly stratify patients to the most appropriate personalized targeted therapy (sunitinib, sorafenib, or vandetanib) [28, 34].


NGS Utilizing Liquid Biopsy


The overarching trend in molecular diagnostics is to do more with less. NGS is well suited for this task, as very little material is required for testing and the methodology is flexible enough to allow full mutation profiling or translocation screening. However, this is only applicable when tissue or cytology samples are available, which is not the case for routine follow-up or disease management. In these cases, minimally invasive blood draws (liquid biopsies) are often performed. Recently, much interest has focused on nucleic acid isolation from liquid biopsies by capturing rare circulating tumor cells (CTCs) or cell-free DNA (CF-DNA). A detailed discussion of the advantages and disadvantages of CTCs vs. CF-DNA is outside the scope of this article; however, a good summary was recently published [35]. Both CTCs and CF-DNA have been successfully applied to capture starting material for clinical NGS testing. CTCs have already shown utility for NGS-based EGFR mutation testing, with one study showing an 84% match between the CTC EGFR mutation profile and that of the tissue biopsy; in addition, multiple EGFR mutations were identified, demonstrating the possibility of detecting tumor heterogeneity [36]. Likewise, CF-DNA has been successfully utilized for NGS-based lung cancer diagnostic testing, both for general mutation screening and for focused identification of EGFR mutations that confer acquired tyrosine kinase inhibitor (TKI) resistance [37, 38]. The difficulty with CTC or CF-DNA applications is the very limited amount of DNA and the mixture of genomic and tumor nucleic acid. To overcome these challenges, NGS methodologies have been developed such as Tagged Amplicon Deep Sequencing (TAm-Seq), Safe Sequencing System (Safe-SeqS), and Cancer Personalized Profiling by deep sequencing (CAPP-seq), which have demonstrated up to 92% sensitivity and >99.99% specificity for EGFR mutation detection at the variant level [39–42]. These novel NGS methods improve the sensitivity of standard NGS by performing highly targeted hybrid capture, high-throughput deep sequencing, and utilizing bioinformatic tools to remove artifacts and discover rare mutations and potentially translocations [43].
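For readers less familiar with the performance figures quoted above, sensitivity and specificity are computed from counts of true and false calls. The sketch below shows the standard definitions. The counts are hypothetical and were chosen only so that the results resemble the reported magnitudes; they are not data from the cited studies.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of truly present variants that the assay detects."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of truly absent variants that the assay correctly reports as absent."""
    return true_neg / (true_neg + false_pos)


if __name__ == "__main__":
    # Hypothetical variant-level counts, for illustration only.
    tp, fn = 46, 4        # known variants detected / missed
    tn, fp = 99_995, 5    # variant-free positions called correctly / incorrectly
    print(f"sensitivity = {sensitivity(tp, fn):.2%}")
    print(f"specificity = {specificity(tn, fp):.3%}")
```

Because variant-level specificity is calculated over every interrogated position, the denominator is very large, so even a handful of false-positive calls still corresponds to a specificity above 99.99%.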


Barriers to Adoption of Clinical NGS for Lung Cancer


While NGS has gained widespread use as a research tool, it has only been in the last few years that it has started to gain acceptance and utilization in the highly regulated clinical CAP/CLIA laboratory-based environment. Several barriers exist for widespread clinical adoption including cost, rapid technology change, lack of regulatory guidance, and complex bioinformatic data interpretation challenges (Table 10.1). These items will be discussed in detail below.


Cost of Clinical NGS Testing


Like most new technologies, NGS instrumentation and reagents can represent a high cost burden for labs interested in starting NGS testing. Instrument prices vary from benchtop sequencers costing under 100,000 US dollars to over 1,000,000 US dollars for high-throughput instrumentation. On top of the instrument capital purchase cost, there is an annual service contract (the price is highly variable). There are also costs for reagents, assay validations, personnel, and data analysis. NGS has high upfront and operating costs relative to other molecular diagnostic equipment such as real-time PCR or Sanger-based assays. Cost per sample or test can be greatly reduced by the high degree of multiplexing that is possible, but lab volume and in-house expertise should be considered before initiating an NGS assay in the clinical setting. An additional variable that should be considered is the amount of reimbursement that NGS testing will generate. Current Procedural Terminology (CPT) codes are continually updated, and as of 2017 CPT codes for NGS-based testing exist [44]. However, the rate of successful reimbursement and the amount of reimbursement can be highly variable depending on geographic location and payer. This uncertainty in financial return is a direct barrier to widespread clinical adoption.
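The effect of multiplexing and lab volume on cost per sample can be sketched with a simple amortization model. This is a rough illustration, not a costing method from the chapter: the function name, the five-year amortization period, and all dollar figures are hypothetical, and personnel, validation, and data storage costs are left out for brevity.

```python
def cost_per_sample(run_reagent_cost: float,
                    samples_per_run: int,
                    instrument_cost: float,
                    service_per_year: float,
                    runs_per_year: int,
                    amortization_years: int = 5) -> float:
    """Approximate cost per sample: reagents plus amortized fixed costs.

    Fixed costs (instrument purchase spread over `amortization_years`,
    plus the annual service contract) are divided evenly across runs,
    then each run's total is divided across its multiplexed samples.
    """
    fixed_per_year = instrument_cost / amortization_years + service_per_year
    fixed_per_run = fixed_per_year / runs_per_year
    return (run_reagent_cost + fixed_per_run) / samples_per_run


if __name__ == "__main__":
    # All values are hypothetical and in US dollars.
    for samples in (1, 8, 24):
        cost = cost_per_sample(
            run_reagent_cost=1_000,
            samples_per_run=samples,
            instrument_cost=100_000,
            service_per_year=10_000,
            runs_per_year=100,
        )
        print(f"{samples:>2} samples/run -> ~${cost:,.0f} per sample")
```

The model makes the point in the text concrete: a lab with low test volume runs fewer samples per run and fewer runs per year, so the fixed costs are spread over fewer samples and the per-sample cost rises.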


Posted Jan 18, 2018 in ONCOLOGY
