Whole genome sequencing is not at present cost-effective in most settings. But today’s molecular genetic laboratory can effectively sequence panels of several hundred genes, and even entire exomes without a core facility. Diagnostic and research genetics laboratories use identical instruments and methods, but the clinical diagnostic laboratory is subject to an array of regulations and quality assurance practices, all designed to minimize error and maximize the likelihood of a diagnostically useful result. This chapter outlines major areas and points to useful resources for setting up a laboratory. The most important advice is that the accuracy and value of a genetic research finding or of a genetic diagnosis is critically dependent on communication among the ordering clinician, the clinical laboratorian or researcher, and the genetic counselor.
Keywords: endocrinology, genetics, variant, mutation, laboratory, instrumentation, sequencing, next-generation sequencing
Calvin : Hmph. Where are the flying cars? …You call this the future?? HA!
Hobbes : I’m not sure people have the brains to manage the technology they’ve got.
In the first edition, this chapter noted that Calvin would find neither flying cars nor the fabled $1,000 genome at that time. Even with this edition, neither flying cars nor a $1,000 genome is a reality, but today $1,000 can procure a “research grade” exome. More importantly, tens of thousands of whole genomes, exomes, and transcriptomes have been completed using next-generation sequencing (NGS). The genetic causes for many rare inherited diseases have been discovered; for some there had been no clue to the causative gene. The ability to sequence large panels of genes or exomes is now within the capability of many clinical genetics laboratories.
It is also increasingly clear that many diseases with a genetic component will require more than “merely” sequencing the entire genome. Some “pseudogenes” are now known to be expressed; this requires RNA analysis. A variation leading to increased expression of a noncoding element like lncRNA can have dramatic effect. Synonymous (“silent”) single nucleotide variants can alter RNA splice sites or miRNA binding sites; the latter can have numerous subtly ramifying effects. Some epigenetic changes appear to be transgenerational. NGS can identify variation between genotypically identical twins. All this is without considering multiple-gene interactions or gene-environment interactions. Determining the clinical significance of thousands of newly identified genetic variants now makes attaining the sequence look straightforward.
Regulations for diagnostic genetic laboratories
The equipment and methods ( Table 29.1 ) are identical for diagnostic and research genetic laboratories; however, the former must comply with a thicket of federal and state regulations. Physician-scientists, accustomed to federal regulation in healthcare, might still be surprised to learn that federal regulations specify the acceptable variation for measurement of serum sodium. Many laboratory requirements seem onerous but all share a goal: minimizing errors. For example, for a research laboratory it is good practice to regularly calibrate pipettes; for a clinical laboratory accredited by CAP, it is mandatory.
|GENERAL DNA LABORATORY AND SANGER SEQUENCING|
|Capillary electrophoresis analyzer||DNA sequencing, amplicon size analysis|
|Real-time PCR||Genotyping, copy number, gene expression|
|Thermocyclers||General PCR, sequencing, NGS reactions|
|Thermistor array||Verify PCR temperature control|
|Gel electrophoresis rigs||Consider microfluidic capillary analyzers|
|UV camera||For imaging DNA gels. Consider “safelight” transilluminator|
|Desktop centrifuge||Spin down blood samples|
|Refrigerated high speed microfuges||General DNA protocols|
|Compact scanning spectrophotometer OR fluorometer||DNA quantitation|
|DNA analysis software||Sequencing, fragment size analysis|
|Tissue culture (optional)||For cell line control samples|
|Sonicator (optional)||Used for randomly shearing DNA|
|Computers – analysis, data storage||Consider cloud services, at least for backup|
|NGS analysis software suite||Commercial or shareware or cloud-based|
|Data and lab management software||Consider cloud services|
The Sources of Regulation
Clinical laboratory practice is covered by the Clinical Laboratory Improvement Amendments of 1988 (CLIA ’88) (authorized under the Public Health Service Act: Section 353, Subpart 2), published in 2003 (Code of Federal Regulations Title 42, Section 493). CLIA regulations vary in detail. Most provide general goals such as requiring that each assay run includes positive and negative controls, but leave it to the laboratory to determine appropriate details.
Enforcement of Regulations
To oversee compliance with CLIA, the Department of Health and Human Services grants “deemed” status to several organizations and public health departments. The College of American Pathologists (CAP) is among the most active. CAP’s Laboratory Accreditation Program (LAP) is based upon a series of standards rooted in CLIA that help to ensure accredited laboratories provide testing that meets the needs of patients, physicians, and other healthcare practitioners. CAP maintains accreditation “checklists,” which are detailed lists of requirements the clinical laboratory should use to maintain good practices and compliance with CLIA requirements. The Molecular Genetics Checklist includes items for NGS. All checklists are available online but access requires participation with CAP (see Table 29.2).
|www.cap.org||College of American Pathologists||Definitive source of clinical checklists|
|www.clsi.org||Clinical and Laboratory Standards Institute||Source of clinical laboratory guidelines|
|www.acmg.net||American College of Medical Genetics||Source of clinical laboratory guidelines|
|www.amp.org||Association for Molecular Pathology||Source of clinical laboratory guidelines, active listserv|
|genome.ucsc.edu/cgi-bin/hgGateway||University of California Santa Cruz||Indispensable browser/portal for genome, transcriptome, chromatin, data across species|
|www.ensembl.org/index.html||European Molecular Biology Laboratory||Browser especially notable for visualization tools for exons and transcripts|
|www.ncbi.nlm.nih.gov/||National Institutes of Health (NIH)|
|GENERAL GENETICS AND CLINICAL GENETIC INFORMATION|
|www.genenames.org/||Human Gene Nomenclature Committee|
|ONLINE SOFTWARE SOURCES – GENERAL DNA UTILITIES|
|www.wageningenur.nl/en/Expertise-Services/Chair-groups/Plant-Sciences/Bioinformatics.htm||Wageningen University Laboratory of Bioinformatics||Primer design|
|www.ncbi.nlm.nih.gov/tools/primer-blast/||National Center for Biotechnology Information||Complements Primer 3 program|
|biologylabs.utah.edu/jorgensen/wayned/ape/||Shareware Mac and PC||Superb general DNA utilities|
|www.mbio.ncsu.edu/bioedit/bioedit.html||Shareware for PC||General DNA utilities|
|genome.unipro.ru/||Shareware Mac and PC, “Ugene”||General DNA utilities, including some for NGS|
|www.ebi.ac.uk/services||European Bioinformatics Institute of EMBL||Numerous DNA software utilities|
|www.geospiza.com/Products/finchtv.shtml||Freeware||Sanger chromatogram viewer|
|WEB-BASED OR OPENSOURCE SOFTWARE FOR NGS ANALYSIS|
|galaxyproject.org/||Freeware||Comprehensive NGS software|
|www.broadinstitute.org/gatk/||Freeware||Standard NGS variant calling suite|
|www.bioconductor.org/||Freeware/opensource||Comprehensive NGS software|
|www.broadinstitute.org/igv/home||Freeware||NGS data visualization|
|VARIANT ANALYSIS/INTERPRETATION SOFTWARE|
|genetics.bwh.harvard.edu/pph2/||Polyphen-2||Variant pathogenicity prediction|
|sift.bii.a-star.edu.sg/index.html||Sorting Intolerant From Tolerant (SIFT)||Variant pathogenicity prediction|
|mendel.stanford.edu/SidowLab/downloads/gerp/||Genomic Evolutionary Rate Profiling (GERP)||Variant pathogenicity prediction|
|www.mutationtaster.org/||Variant pathogenicity prediction|
|cadd.gs.washington.edu/||Variant pathogenicity prediction|
|VARIANT ONLINE DATABASES|
|www.ncbi.nlm.nih.gov/snp/||NIH||Comprehensive SNP database|
|www.ncbi.nlm.nih.gov/dbvar||NIH||Structural variant database|
|www.ncbi.nlm.nih.gov/clinvar/||NIH||Variants with clinical associations|
|www.ncbi.nlm.nih.gov/medgen||NIH||Database of genetic diseases|
|www.1000genomes.org/||1000 Genome Project||Variation database|
|huvariome.erasmusmc.nl/||Erasmus Medical Center||Variation database|
|www.hgmd.org/||Human Gene Mutation Database||Database for published associations of mutations with inherited disease|
|www.openbioinformatics.org/annovar/||Annovar||Variant database search tool|
A clinical lab must undergo an unannounced inspection at least every 2 years with a documented self-inspection in intervening years. CAP inspector training emphasizes that inspection is an educational rather than an adversarial proceeding, but failure can have significant consequences. Cited deficiencies must be appealed or remedied.
The laboratory must engage in proficiency testing, at least twice per year, from a Health and Human Services-approved proficiency testing provider for every analyte tested, when proficiency testing is available. CAP offers many proficiency tests but few for endocrine genetics. If there is no formal proficiency testing available, the laboratory must have a written policy detailing an alternative. A sample exchange with one or more clinical laboratories is a common approach; it must be carried out with a defined assessment plan.
The FDA recognizes several categories of diagnostic “kits” and reagents: FDA-approved, FDA-cleared, research use only, investigational use only, and analyte specific reagent. Endocrine genetic disorders are uncommon, so testing will probably involve what the FDA designates as a “laboratory developed test” (LDT) rather than an FDA-approved commercial kit. LDTs have been called “home-brews,” a term which should be deprecated. An LDT may combine diverse reagent kits and instruments, for example, a kit for DNA preparation and a kit for real-time PCR DNA amplification from different manufacturers. The results report for an LDT must carry the following specific disclaimer (the font size is not specified):
“This test was developed and its performance characteristics determined by ( the name of the laboratory ). It has not been cleared or approved by the U.S. Food and Drug Administration.”
The Health Insurance Portability and Accountability Act (HIPAA) includes provisions for ensuring privacy of all patient information, including test results. Clinical endocrinology laboratories are already required to comply with HIPAA regulations. Storage of NGS information raises logistical concerns – data files can run to hundreds of gigabytes per patient. In a specific uncommon circumstance, access to whole genome data was used to identify research subjects. Although “hacking” such data requires IT resources and bioinformatics skill much greater than illegally accessing the final report, hospital IT services are obligated to treat genetic sequence like any other protected health data. Given the size of the data files, this is an issue that must be addressed by any laboratory doing large-scale NGS.
A clinical genetics laboratory must be attentive to intellectual property rights, such as the patent for real-time PCR (expires November 2016). Patents are often waived for research. Roughly 20% of genes have been patented. A typical patent claim encompassed any method (DNA, RNA, or protein based) that detected a variant, known or newly discovered, pathogenic or not. Whether or not a gene sequence or a mutation should be patentable has been controversial. The 2013 Supreme Court decision in Association for Molecular Pathology v. Myriad Genetics held that “genes and the information they encode” are not patentable, but made a curious exception for “cDNA.”
The preanalytic phase
The Testing Process Overview
Based on CLIA (Section 493.1200), clinical laboratorians conceptualize the testing process into three distinct phases:
Preanalytical – all steps prior to testing such as patient preparation, sample collection, sample preparation, etc.
Analytical – the actual analytical process.
Postanalytical – steps after completion of the test such as interpretation and results reporting.
The Clinical and Laboratory Standards Institute (CLSI) publishes guidelines for many clinical laboratory activities, among them nucleic acid extraction and DNA sequencing, including NGS (see Table 29.2). Chen et al. provide a concise overview of general quality assurance measures; several recent documents specifically address NGS.
For clinical work, a properly labeled sample must be accompanied by a testing requisition completed by a licensed provider, which in most states means a physician. The sample must be labeled with patient name and unique identifier, while the requisition must include a minimum amount of information:
First and last name
Unique patient identifier (e.g., hospital medical record number)
Ordering physician name
Test request date
Date of specimen collection
Type of specimen if other than blood
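The minimum requisition content listed above lends itself to an automated completeness check at specimen log-in. The following Python sketch is illustrative only: the field names, the `is_blood` flag, and the `missing_requisition_fields` helper are hypothetical and are not taken from CLIA, CAP, or any particular laboratory information system.

```python
# Hypothetical field names mirroring the minimum requisition content listed above.
REQUIRED_FIELDS = (
    "first_name",
    "last_name",
    "patient_id",          # e.g., hospital medical record number
    "ordering_physician",
    "request_date",
    "collection_date",
)


def missing_requisition_fields(requisition: dict) -> list:
    """Return the required fields that are absent or blank, in checklist order."""
    missing = [
        field for field in REQUIRED_FIELDS
        if not str(requisition.get(field) or "").strip()
    ]
    # Specimen type is needed only when the specimen is something other than blood;
    # the "is_blood" flag is an assumption made for this sketch.
    if not requisition.get("is_blood", True):
        if not str(requisition.get("specimen_type") or "").strip():
            missing.append("specimen_type")
    return missing
```

A requisition that returns an empty list is complete with respect to this checklist; any returned field names could be used to prompt a call to the ordering provider before the sample is processed.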
There is no federal requirement for informed consent; however, hospitals and laboratories may require it. The process of consenting offers an opportunity to discuss potential results, including possible incidental findings, with the patient.
Peripheral blood is the standard sample type for genetic testing:
Blood : Collection tubes that contain ethylenediaminetetraacetic acid (EDTA), which inhibits coagulation by chelating cations, are acceptable. Use of heparinized tubes can lead to interference; not all DNA preparative methods remove the heparin, an inhibitor of PCR. If a “serum” tube was used, which has no anticoagulant, some DNA can still be extracted from serum. DNA can be recovered from EDTA tubes after at least four weeks of refrigeration.
Tissues : Tissue is important for assessing tissue-limited mosaicism. Fresh tissue provides dramatically higher quality DNA and RNA than does the standard surgical pathology specimen, which is formalin fixed and paraffin embedded (FFPE). Cells from needle biopsies in alcohol-based fixatives are satisfactory. DNA can be routinely recovered from FFPE; in our experience, amplicons from 200 to 400 base pairs are routinely recoverable.
Chorionic villus sampling, amniotic fluid : These are critical sources for evaluating a fetus for an inherited disorder. Contamination by maternal cells is always a concern and must be assessed by analysis of “identity” markers, such as short tandem repeats (“DNA fingerprinting”) from the sample and from the mother.
Circulating fetal DNA : The level of cell-free fetal DNA in the maternal circulation increases as pregnancy progresses. In some settings, routine genotyping techniques can detect significant variants, such as an Rh-positive allele from the father in an Rh-negative mother. NGS methods can detect fetal aneuploidy by testing maternal blood.
Specimen Identification and Log-in
The laboratory must have a system for tracking primary samples including the time of receipt. The log number should be uniquely associated with the sample. A label printer, which can print small adhesive labels with the patient’s name, log number, and date, is important for labeling derived sample tubes (such as DNA preparations).
Nucleated cells can be isolated from whole blood by a variety of methods. Our laboratory osmotically lyses red blood cells in a whole blood sample, spins down the remaining cells, and washes the pellet with saline several times before proceeding with nucleic acid preparation. Fresh tissue must be minced or homogenized (ultrasonic or mechanical) before being incubated in lysis buffer. FFPE sections must be dewaxed in serial xylene baths, rehydrated with graded ethanol washes, and then incubated up to several days in a protease-containing lysis buffer.
“Home-brew” procedures are inexpensive but often include toxic organic chemicals (phenol, chloroform, guanidinium isothiocyanate). Most types of RNA other than miRNA are dramatically more susceptible to degradation than is DNA. This is attributed to the ubiquity and sturdiness of several RNases (some renature after boiling). RNA is not usually analyzed for genetic diagnosis; however, in cases where variants (including methylation) are predicted to affect splicing or transcription, confirmation by analysis of RNA could be sought. Tissue specificity of expression will dictate the sample source. Silencing of one allele can be demonstrated if there is a “coding” single nucleotide polymorphism (SNP) distinguishing alleles.
Commercial kits typically forgo toxic chemicals in favor of small spin columns or suspensions of charged magnetic particles, which reversibly bind nucleic acids. The reproducibility and labor saving provided by commercial kits can outweigh the additional cost relative to “lab-developed” methods. Our laboratory uses an instrument that employs a spin column for purification. It is neither the smallest nor the fastest system, but an identical manual method is available that we could use in the event of an instrument problem. Reagent costs for automated extractors are higher per extraction than for the corresponding manual commercial methods. Overall yields are also typically lower because of the smaller input volume, but a single extraction from several hundred microliters should be sufficient for most purposes. Many automated procedures do not routinely include an RNase step for DNA purification or a DNase step for RNA purification; this can lead to an overestimate of the concentration.
DNA (RNA) Quantitation
UV spectrophotometry : UV spectrophotometry is the traditional method for quantification of DNA (and RNA) and demonstration of purity. The absorption at 260 nm correlates with the concentration of nucleotides; it does not distinguish long double-stranded DNA from free nucleotides. Absorption at 280 nm gives a measure of residual protein. The A260/A280 ratio is a measure of purification; it should fall in the range 1.8–2.0. An A260/A280 ratio greater than 2.0 does not indicate extra-high quality DNA; it most often reflects residual contaminants such as phenol. The A260 can be used to quantitate DNA or RNA. Our laboratory uses the Nanodrop™ spectrophotometer. Each reading requires 1 μL of sample applied directly to the analysis surface (no cuvette). Preparation for the next sample consists of wiping the reading surface with a tissue to remove the prior sample. The instrument performs scanning spectrophotometry over a broad range, including A230, A260, and A280.
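As a worked illustration of A260-based quantitation, the sketch below converts an absorbance reading to a concentration and checks the A260/A280 ratio against the 1.8–2.0 range given above. The conversion factors (about 50 ng/μL per A260 unit for double-stranded DNA and about 40 ng/μL for RNA, at a 1 cm path length) are the conventional approximations and are an assumption here, not values stated in this chapter; the function names are hypothetical.

```python
# Conventional approximate conversion factors (ng/uL per A260 unit, 1 cm path length).
# These are assumptions for this sketch, not values taken from the chapter.
NG_PER_UL_PER_A260 = {"dsDNA": 50.0, "RNA": 40.0}


def nucleic_acid_concentration(a260, kind="dsDNA", dilution_factor=1.0):
    """Estimate concentration in ng/uL from an A260 normalized to a 1 cm path."""
    return a260 * NG_PER_UL_PER_A260[kind] * dilution_factor


def purity_ratio(a260, a280):
    """A260/A280 ratio used as a measure of purification."""
    if a280 <= 0:
        raise ValueError("A280 must be positive to compute a ratio")
    return a260 / a280


def ratio_in_expected_range(a260, a280, low=1.8, high=2.0):
    """True when the A260/A280 ratio falls within the range given in the text."""
    return low <= purity_ratio(a260, a280) <= high
```

Because A260 does not distinguish intact DNA from free nucleotides or co-purified RNA, a concentration computed this way should be read as an upper estimate of usable template.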
Dye-binding : Picogreen™ and Ribogreen™ are representative fluorescent DNA-specific and RNA-specific dyes, respectively. The fluorescence is proportional to nucleic acid concentration. These assays are an order of magnitude more sensitive than UV spectrophotometric analysis but do not detect free nucleotides or short duplexes. The signal can be measured with UV ELISA plate readers, real-time PCR instruments, or readers specifically designed for DNA/RNA quantitation such as the Qubit™ or Fluorodrop™.
Nucleic Acid Integrity
High molecular weight (intact) DNA, intact RNA, degraded DNA and RNA, and nucleotides all absorb at 260 nm with similar efficiency, so absorbance alone cannot establish integrity and a separate check is needed. For routine preparation from fresh blood this check is not necessary except as a troubleshooting measure.
Agarose gel electrophoresis : High quality genomic DNA should show only high molecular weight material (at least 10–20 kb), which has barely migrated out of the sample well. DNA from FFPE routinely shows a diffuse smear in the sample lane. Badly degraded DNA might not show staining above a few hundred base pairs.
Microfluidic (chip or capillary) analyzers : These perform the equivalent of gel electrophoresis, sizing the DNA or RNA products with high resolution, and measuring the concentration, using only microliter samples and run times of a few minutes. The chip is more expensive than agarose gel electrophoresis but saves time, sample, and labor. They can be used to assess quality of NGS library preps before an expensive sequencing run.
Primary samples : The remaining sample should be retained until analysis is complete. Consider spotting aliquots on filter paper or using commercial systems such as DNAStable Blood™ (BioMatrica) for storage at room temperature indefinitely.
DNA samples : All analytes (DNA/RNA) should be stored in buffer, most commonly 10 mM or 1 mM Tris pH 8.0 supplemented with 0.1 mM or 1 mM EDTA. DNA can be stored at 4°C for at least months and indefinitely at –20°C. RNA should be stored at –80°C. DNA samples for clinical genetic testing must be stored for at least 20 years (consult state and local authorities as well).
Validation of a new assay is a requirement for diagnostic labs. An assay should be shown (validated) to be able to detect all frequent mutations (“frequent” is at the discretion of the director). This requires testing at least one independently confirmed “positive” sample for every “common” mutation. The designation of a validation sample as “positive” (or negative) is based on prior analysis at a separate diagnostic laboratory or by an independent method. Cell lines carrying a mutation are acceptable. Another option which might be acceptable is synthetic DNA. DNA greater than 1,000 base pairs can be designed to match the region of interest and include one or more mutations at specified allelic frequencies. The synthetic DNA should include defined sequences at the 5′ and 3′ end for PCR amplification. It is advisable to include several mutations and several deoxyuridine (dU) residues in such a control to limit the risk of contaminating patient samples.
For the diagnostic lab, a written procedure must be in place not only for each assay, but also for all phases of testing such as specimen login and sample storage. CLSI provides an excellent guideline. All staff trained to perform a given assay must read, sign, and date the protocol and be documented to show competence performing the assay.
The diagnostic lab should track all reagents by lot including the date when opened and the expiration date. Assay worksheets should indicate when a new lot of a reagent has been introduced. Prior to introducing any new reagent lot, crossover validation should be performed and documented with at least one positive sample.
Ideally every run of every assay should include “positive” and “negative” controls; how to apply this rule is not always obvious. Consider a real-time PCR assay to detect a specific point mutation. The run should include a known normal (negative) sample, a sample known to carry the mutation (positive control), and a “no template control” (NTC) with water or buffer substituting for the DNA sample; this is another “negative” control. The NTC is intended to detect contamination with amplicons from previous reactions. If one is looking for a somatic mosaic, the “positive” control should be diluted by mixing with normal DNA to the level desired as the lower limit of detection. For a target gene that could have any one of numerous mutations a common approach is to have a set of samples with different mutations and rotate usage as controls.
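The dilution of a “positive” control to the desired lower limit of detection is simple arithmetic, sketched below. The sketch assumes that the positive-control and normal DNA stocks are at the same concentration and that the positive control carries the variant at a known allele fraction (0.5 for a heterozygous sample); the function name `control_dilution` is hypothetical.

```python
def control_dilution(target_vaf, total_volume_ul, source_vaf=0.5):
    """Volumes of positive-control and normal DNA that give target_vaf.

    Assumes both DNA stocks are at the same concentration and that the
    positive control carries the variant at source_vaf (0.5 = heterozygous).
    Returns (positive_control_ul, normal_dna_ul).
    """
    if not 0 < target_vaf <= source_vaf <= 1:
        raise ValueError("need 0 < target_vaf <= source_vaf <= 1")
    positive_ul = total_volume_ul * target_vaf / source_vaf
    return positive_ul, total_volume_ul - positive_ul
```

For example, under these assumptions a 5% variant allele fraction in 100 μL requires about 10 μL of heterozygous control DNA and about 90 μL of normal DNA. If the two stocks differ in concentration, they would first need to be normalized.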
Our laboratory processes 10–20 samples at a time for a clinically significant point mutation in the factor V gene. Typically one sample shows a mutation. A “positive” result often leads to long-term anticoagulation. All tested patients have histories of coagulopathy compatible with a mutation. For any patient who tests “positive” our policy is to process and test a second aliquot of blood from the original stock tube, confirming that the mutation is present in that subject; this is intended to catch labeling errors. For multiexon sequencing, it would be sufficient to retest only the exon(s) showing a mutation. This is not a required policy, but it is recommended in American College of Medical Genetics (ACMG) guidelines.
Data Retention and Storage
“Data” and reports must be kept for 10 years. This includes primary data (computer) files. An NGS sample generates several large data files; which of these must be kept by the diagnostic lab remains to be clarified, but the set certainly includes the variant file.
Methods – general PCR
Thermocyclers vary in ramping speed, which can affect efficiency, with more expensive metal blocks giving better performance, but for most purposes most thermocyclers are adequate.
Thermal cycling profile validation : For the clinical laboratory every well of every thermocycler must be shown to have the expected thermal cycle profile. Our laboratory uses a thermistor array that sends data by wire to a laptop. Validating real-time thermocyclers is more challenging because of the difficulty accommodating the array, although wireless thermistor arrays exist. Demonstrating a reproducible Ct (threshold crossing point) for the amplification curve of a real-time PCR assay in every well can be used as evidence of reproducible performance even if the “true” temperature is unknown.
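One way to document per-well Ct reproducibility is to run the same reaction in every well and summarize the spread. The sketch below is a minimal illustration; the function name and the `max_deviation` acceptance limit are placeholders chosen for the example, not regulatory or manufacturer values, and each laboratory would define and document its own criterion.

```python
from statistics import mean, stdev


def ct_uniformity(ct_by_well, max_deviation=0.5):
    """Summarize per-well Ct values from one identical reaction run in every well.

    max_deviation (in cycles) is a placeholder acceptance limit, not a
    regulatory value; each laboratory would set and document its own.
    """
    values = list(ct_by_well.values())
    if len(values) < 2:
        raise ValueError("need Ct values from at least two wells")
    plate_mean = mean(values)
    # Wells whose Ct differs from the plate mean by more than the limit.
    outliers = sorted(
        well for well, ct in ct_by_well.items()
        if abs(ct - plate_mean) > max_deviation
    )
    return {"mean": plate_mean, "sd": stdev(values), "outlier_wells": outliers}
```

Wells reported in `outlier_wells` would be candidates for follow-up, for example repeating the run or checking the block with a thermistor array where one can be accommodated.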
Automation : Flexible programmable instruments are available, which can set up PCR and DNA sequencing reactions. Although automated set-up is slower and more expensive than manual set-up for a small number of samples, its reproducibility is excellent.
Minimizing PCR Amplicon Contamination
Contamination of the working environment by PCR amplicons from an earlier test is an ever present concern. Several measures can reduce the risk.
Spatial separation : PCR involves three phases, which, ideally, are spatially separated: processing the specimen, setting up the PCR reaction mastermixes, and running the assay. The PCR mastermixes are brought into the specimen processing area, the samples added, and the completed reaction mixtures taken to the instrumentation room with the thermocycler. Post-PCR steps such as DNA sequencing should only be performed in the instrumentation room. The laboratory should have separate sets of pipettes, filtered pipette tips, PCR tubes, and gloves for each of the three work areas. Ideally, the instrumentation area should have a ventilation system separate from that for the sample prep and PCR mastermix prep areas, but in practice, this is expensive and uncommon.
Although one does not want to undercut an argument to have at least three rooms, if necessary, setup of the master mixes and addition of samples can be performed in the same room but preferably on separate dedicated benches. The use of PCR hoods (“dead boxes” with UV lights for “sterilizing” rogue PCR amplicons) is a relatively inexpensive but important additional precaution. A laminar flow hood is not necessary.
Unidirectional workflow : PCR mastermixes may only go from the PCR set-up room to the sample setup room; reactions with samples added may only go to the PCR instrumentation room. Color-coded labcoats are helpful so that coats from the analysis room are never worn into the setup areas. Amplicons probably adhere well to labcoats, but this remains a common policy. The completed PCR reaction and instruments like pipetters never leave the PCR instrumentation room.
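The one-way rule can be expressed as an ordering of work areas in which items may only move forward. The sketch below is illustrative; the room names and the `move_allowed` helper are hypothetical labels for the three areas described above.

```python
# Room order follows the one-way flow described above; the names are illustrative.
ROOM_ORDER = {"mastermix_setup": 0, "sample_setup": 1, "instrumentation": 2}


def move_allowed(origin, destination):
    """True only when an item moves forward along the one-way workflow."""
    return ROOM_ORDER[destination] > ROOM_ORDER[origin]
```

Under this model, a mastermix may move from `mastermix_setup` to `sample_setup`, while nothing in `instrumentation` may move to either upstream area.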
Use of dUTP and uracil-N-glycosylase (UNG) : Substitution of dUTP for a proportion of deoxythymidine triphosphate in the PCR deoxynucleotide triphosphate mix makes the resulting amplicons susceptible to cleavage by the enzyme UNG. Any prior amplicon contaminating a new reaction would be destroyed when the UNG is activated during the preincubation of the new PCR reaction. “Regular” UNG retains activity despite multiple PCR thermal cycles and can destroy new PCR amplicons. Use of the more expensive heat-labile UNG is strongly recommended.