Genome-wide association studies (GWAS) have now been performed in nearly all common malignancies and have identified more than 100 common genetic risk variants that confer a modest increased risk of cancer. For most discovered germline risk variants, the per allele effect size is small (<1.5) and the biologic mechanism of the detected association remains unexplained. Exceptions are the risk variants identified in JAK2 in myeloproliferative neoplasm and in the KITLG gene in testicular cancer, which are each associated with nearly a 3-fold increased risk of disease. GWAS have provided an efficient approach to identifying common, low-penetrance risk variants, and have implicated several novel cancer susceptibility loci. However, the identified low-penetrance risk variants explain only a small fraction of the heritability of cancer and the clinical usefulness of using these variants for cancer-risk prediction is to date limited. Studies involving more heterogeneous populations, determination of the causal variants, and functional studies are now necessary to further elucidate the potential biologic and clinical significance of the observed associations.
Rare, high-penetrance cancer predisposition genes account for only a small component of the overall familial risk of cancer. Recently, it has been recognized that polygenic inheritance, wherein heritability is determined by the joint action of multiple genes, probably better characterizes the complex genetic architecture of diseases such as cancer. With technological advances in the interrogation of the human genome, genome-wide association studies (GWAS) have helped to identify multiple germline genetic risk variants or susceptibility loci for most common cancer types. Using a hypothesis-neutral genome-based approach, GWAS are able to compare DNA variations, in the form of single-nucleotide polymorphisms (SNPs), in a large set of unrelated cases and controls and pinpoint genetic variants associated with cancer risk.
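The case-control comparison described above can be illustrated with a minimal Python sketch, assuming a single biallelic SNP summarized as allele counts in cases and controls. The counts are hypothetical and are not taken from any study discussed in this review, and the function names are illustrative. The sketch computes the allelic odds ratio and an approximate 95% confidence interval from the standard error of the log odds ratio.

```python
import math


def allelic_odds_ratio(case_risk, case_other, ctrl_risk, ctrl_other):
    """Odds of carrying the risk allele in cases divided by the odds in controls."""
    return (case_risk * ctrl_other) / (case_other * ctrl_risk)


def odds_ratio_ci(case_risk, case_other, ctrl_risk, ctrl_other, z=1.96):
    """Approximate 95% CI from the standard error of log(OR) (Woolf method)."""
    log_or = math.log(allelic_odds_ratio(case_risk, case_other, ctrl_risk, ctrl_other))
    se = math.sqrt(1 / case_risk + 1 / case_other + 1 / ctrl_risk + 1 / ctrl_other)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


# Hypothetical allele counts (2 alleles per person); illustrative only.
counts = {"case_risk": 900, "case_other": 1100, "ctrl_risk": 800, "ctrl_other": 1200}

odds_ratio = allelic_odds_ratio(**counts)
ci_low, ci_high = odds_ratio_ci(**counts)
print(f"OR = {odds_ratio:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

In an actual GWAS this comparison is repeated across hundreds of thousands of SNPs, so a far stricter significance threshold than the single-test interval shown here is needed.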
To date, more than 50 cancer GWAS incorporating more than 15 different malignancies have been reported identifying over 100 genomic cancer susceptibility regions. The rapid discovery of common genetic variants associated with cancer risk generated excitement that such markers may prove useful for cancer risk prediction, improve our understanding of carcinogenesis, and possibly result in the development of targeted treatments for patients. However, as the effect size of most genetic variants is less than 1.5 and the biologic mechanisms underpinning most associations are unknown, significant scientific barriers must be overcome before GWAS results can be meaningfully translated into patient care. This review of reported cancer GWAS summarizes what has been learned regarding the loci mapped, the frequency and magnitude of the cancer risk observed, and the clinical role, if any, for individualized testing for these variants. A more detailed description of population genetics methodology, as well as a more detailed version of the data from which Tables 1–5 of this paper are adapted, are provided in a separate report.
Breast cancer
Genetics of Breast Cancer
Mutations in the well-characterized high-penetrance BRCA1 and BRCA2 cancer susceptibility genes account for less than 20% of the familial risk of breast cancer, with other rarely mutated genes ( TP53 , STK11 , PTEN ) accounting for only a small additional fraction of the risk. Several intermediate-penetrance cancer predisposition genes ( ATM , CHEK2 , BRIP1 , and PALB2 ) have also been described with modest ∼2.0-fold increases in relative risk of breast cancer. Together with the high-penetrance genes, known breast cancer predisposition genes account for only ∼25% of the familial risk of breast cancer.
Breast Cancer GWAS
Breast cancer has been at the forefront of cancer GWAS with 10 published studies and at least 13 independent loci implicated in disease risk (see Table 1). The most strongly associated SNP, with an odds ratio of 1.26, is rs2981582 in intron 2 of FGFR2. The protein encoded by this gene is a member of the fibroblast growth factor receptor (FGFR) family, whose members share evolutionarily highly conserved amino acid sequences and gene structure. FGFR2 is overexpressed and amplified in 5% to 10% of breast tumors and the FGF-signaling pathway has been implicated in early mammary gland development in murine models. Although the precise mechanism(s) of FGFR2 deregulation in breast cancer etiology remains unknown, fine mapping of the region suggests that the causative variants lie in intron 2 of FGFR2; by protein-DNA interaction analysis, 2 cis-regulatory SNPs that alter binding affinity for transcription factors have been identified. Since the initial study, the 10q26 locus mapping to FGFR2 has been implicated in several breast cancer GWAS using different patient populations, and the association seems to be strongest in estrogen receptor–positive breast cancer.
Locus | Implicated Gene | SNP | Per Allele OR Ranges a | References |
---|---|---|---|---|
1p11.2 | Pericentric | rs11249433 | 1.16 | |
2q35 | Intergenic | rs13387042 | 1.2–1.25 | |
3p24.1 | SLC4A7 | rs4973768 | 1.11 | |
5p12 | Intergenic ( MRPS30 ) | rs4415084 rs10941679 | 1.16–1.19 | |
5q11.2 | MAP3K1 , MIER3 , C5orf35 | rs889312 | 1.13 | |
6q22.33 | ECHDC1 , RNF146 | rs2180341 | 1.41 | |
6q25.1 | ESR1 | rs2046210 | 1.29 | |
8q24 | Intergenic | rs13281615 | 1.08 | |
10q26 | FGFR2 | rs2981582 rs1219648 rs1078806 | 1.20–1.29 | |
11p15.5 | LSP1 | rs3817198 | 1.07 | |
14q24.1 | RAD51L1 | rs999737 | 1.06 | |
16q12 | TNRC9 (TOX3) , LOC643714 | rs3803662 | 1.16–1.28 | |
17q23 | STXBP4 | rs6504950 | 1.05 |
a ORs <1 in the original publication have been converted to ORs >1 for the alternate allele.
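The conversion described in the table footnote is a simple reciprocal. The short Python sketch below illustrates it, assuming a biallelic SNP and a per-allele odds ratio; the function name and the example value are hypothetical and do not come from any of the tabulated studies.

```python
def or_for_alternate_allele(odds_ratio):
    """Re-express a protective OR (<1) as the OR (>1) for the alternate allele.

    For a biallelic SNP, the per-allele OR for one allele is the reciprocal
    of the per-allele OR for the other allele.
    """
    if odds_ratio <= 0:
        raise ValueError("an odds ratio must be positive")
    return 1.0 / odds_ratio if odds_ratio < 1 else odds_ratio


# Hypothetical reported value for a protective allele; illustrative only.
reported_or = 0.80
converted_or = or_for_alternate_allele(reported_or)
print(f"reported OR = {reported_or:.2f}, alternate-allele OR = {converted_or:.2f}")
```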
The other susceptibility loci identified in gene-containing regions have not been implicated in cancer previously. The 16q12 locus contains the TOX3 gene, which encodes an HMG-box protein that may be involved in bending and unwinding DNA and altering chromatin structure. The 5q11.2 locus contains 3 genes: MIER3 and MGC33648, whose functions are unknown, and MAP3K1, a component of a protein kinase signal transduction cascade involved in activating the Erk/Jnk and NFκB pathways. For the MAP3K1 variant, an association was found with estrogen and progesterone receptor–positive, HER2/Neu–negative tumors in African American, but not European American, women. The 11p15.5 locus contains LSP1, a cytoskeletal targeting protein for the ERK/MAP kinase pathway expressed in lymphocytes and endothelial cells. A fifth locus at 8q24, a gene desert, has also been associated in GWAS with prostate, colorectal, and urinary bladder cancers.
Using combined data from the Cancer Genetic Markers of Susceptibility group (CGEMS) and Breast Cancer Association Consortium (BCAC), 2 additional SNPs associated with breast cancer were identified. The 3p24.1 locus maps to SLC4A7, a sodium- and bicarbonate-dependent cotransporter that regulates intracellular pH. Expression analysis showed this gene was down-regulated in most breast cancer cell lines and tumors. The 17q23 locus maps to intron 1 of STXBP4, a mediator of insulin’s role in glucose transport not previously associated with neoplasms.
A separate GWAS by CGEMS confirmed the FGFR2 locus, and analyses of additional stages of the original study identified 2 additional SNPs: an SNP at 1p11.2 resides in a linkage disequilibrium block neighboring NOTCH2, which plays a role in epithelial-mesenchymal transition; an SNP at 14q24.1 localizes to RAD51L1, a gene that is evolutionarily conserved and essential for DNA repair by homologous recombination.
The DeCode group in Iceland, in 2 separate GWAS, detected an association with an SNP near TOX3 and found additional loci at 2q35 (an intergenic region with no known nearby genes) and 5p12, containing MRPS30 (which codes for an evolutionarily highly conserved mitochondrial ribosomal protein).
A GWAS of Ashkenazi Jewish breast cancer cases by Gold and colleagues confirmed susceptibility loci mapping to FGFR2 and identified an additional locus at 6q22.33, which contains ECHDC1, encoding a protein involved in mitochondrial fatty acid oxidation, and RNF146, encoding a ubiquitin protein ligase. This locus showed significant but weaker association in non-Ashkenazi Jewish whites, and, like most GWAS-based associations to date, was correlated with estrogen receptor–positive breast cancer. A Chinese GWAS found an association at 6q25.1, ∼60 kb upstream of ESR1, which codes for an estrogen receptor. This locus showed significant but weaker association with breast cancer in a European cohort.
Clinical Correlations
The magnitude of risk associated with each of the loci identified is modest, with odds ratios largely ranging from 1.1 to 1.4. Although the established independent breast cancer loci are believed to result in a joint population attributable risk (PAR) of more than 60%, the contribution of these loci to the familial risk of cancer is no more than ∼8%, thereby leaving most of the familial risk of breast cancer unexplained. There has been significant interest in determining whether the presence of risk variants predicts a particular clinical outcome. In a study evaluating 5 breast cancer susceptibility loci, only 1 SNP was associated with overall survival after diagnosis; however, after adjusting for known prognostic factors, this association was no longer significant.
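One way to see how a large joint PAR can coexist with small per-locus effects is the Python sketch below. It is an illustration under stated assumptions, not the calculation used in the studies summarized here: genotypes are assumed to follow Hardy-Weinberg proportions, each risk allele multiplies risk by the same factor, loci are independent, and odds ratios are treated as relative risks. The allele frequencies and effect sizes are hypothetical.

```python
def par_single_locus(risk_allele_freq, per_allele_rr):
    """PAR for one locus under Hardy-Weinberg proportions and a multiplicative
    per-allele model (relative risks 1, r, r^2 for 0, 1, 2 risk alleles)."""
    # Population mean relative risk: (1-p)^2 + 2p(1-p)r + p^2 r^2 = (1 - p + p*r)^2
    mean_rr = (1 - risk_allele_freq + risk_allele_freq * per_allele_rr) ** 2
    return 1 - 1 / mean_rr


def joint_par(loci):
    """Joint PAR for independent loci that act multiplicatively."""
    remaining = 1.0
    for freq, rr in loci:
        remaining *= 1 - par_single_locus(freq, rr)
    return 1 - remaining


# Hypothetical loci: (risk allele frequency, per-allele relative risk).
hypothetical_loci = [(0.40, 1.20)] * 7

single = par_single_locus(0.40, 1.20)
combined = joint_par(hypothetical_loci)
print(f"single-locus PAR = {single:.1%}, joint PAR = {combined:.1%}")
```

Because PAR depends on allele frequency as well as effect size, common alleles with small effects can produce a large joint PAR while still explaining little of the familial clustering of disease.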
To date, 2 modeling studies predict that together the 7 most common breast cancer–associated SNPs would add little in terms of improved discriminatory accuracy when compared with, or when used in conjunction with, a standard clinical breast cancer risk model (eg, the Gail model). In the first clinical study, based on more than 5000 breast cancer cases and nearly 6000 controls, the addition of 10 breast cancer SNPs to a standard clinical breast cancer risk model predicted the risk of breast cancer only slightly better than the clinical model alone, suggesting that risk prediction based on currently identified risk SNPs is premature. As the strongest associations have been found in estrogen receptor–positive disease, a GWAS for women with triple-negative breast cancer is ongoing and may demonstrate different genetic associations.
Prostate cancer
Genetics of Prostate Cancer
Family studies show strong evidence for a genetic predisposition to prostate cancer, with a 2- to 3-fold increased risk of disease in first-degree relatives of affected men. Germline mutations in genes such as BRCA2 have been found to be associated with prostate cancer risk; however, such mutations explain less than 10% of the familial risk of prostate cancer.
Prostate Cancer GWAS
Prostate cancer GWAS have identified more than 2 dozen SNPs associated with disease risk (see Table 2). Using a genome-wide linkage scan, the DeCode group previously identified the association of the 8q24 locus with prostate cancer risk in Icelandic patients. In African Americans, a group with a high incidence of prostate cancer, a higher minor allele frequency of the associated 8q24 loci was demonstrated. A separate study using admixture mapping, a method that screens through the genome of populations of recently mixed ancestry, also emphasized the importance of the 8q24 region. Subsequently, 2 GWAS, by the CGEMS group and the DeCode group, confirmed the association between the 8q24 region and prostate cancer risk. This association has been replicated in subsequent GWAS and through fine mapping analyses of the region, with allele-specific risks for prostate cancer ranging from 1.4 to 2.0. In a multiethnic study, Haiman and colleagues identified 7 independent risk variants in the 8q24 region and observed that the risk variants were most common in the African American population, possibly suggesting a partial explanation for the higher incidence of prostate cancer in African American men. Analysis by Ghoussaini and colleagues identified at least 5 loci at 8q24.21 independently separated by recombination hotspots. The gene nearest this 8q24.21 region, mapping at least 116 kb distally, is MYC, aberrations of which have been linked to multiple cancers, with evidence suggesting that the 8q24 predisposition locus may be involved in MYC regulation.
Locus | Implicated Gene | SNP | Per Allele OR Ranges a | References |
---|---|---|---|---|
2p15 | EHBP1 | rs721048 | 1.15 | |
2p21 | THADA | rs1465618 | 1.08 | |
2q31 | ITGA6 | rs12621278 | 1.33 | |
3p12 | Intergenic | rs2260753 | 1.18 | |
3q21 | Intergenic | rs10934853 | 1.12 | |
4q22 | PDLIM5 | rs17021918 | 1.11 | |
4q22 | PDLIM5 | rs12500426 | 1.08 | |
4q24 | TET2 | rs7679673 | 1.10 | |
6q25 | SLC22A3 | rs9364554 | 1.17 | |
7p15 | Intergenic | rs12155172 | 1.05 | |
7p15.2-15.1 | JAZF1 | rs10486567 | 1.12–1.35 | |
7q21.3 | LMTK2 | rs6465657 | 1.12 | |
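8p21 | NKX3-1 | rs2928679 | 1.05 | |
8p21 | NKX3-1 | rs1512268 | 1.18 | |
8q24 | Intergenic | HapC 14 SNPs | 2.10 | |
8q24 | Intergenic | rs16901979 | 1.79–1.80 | |
8q24 | Intergenic | DG8S737 | 1.64 | |
8q24 | Intergenic | rs1447295 | 1.36–1.60 | |
8q24 | Intergenic | rs1016343 | 1.37 | |
8q24 | Intergenic | rs6983267 | 1.26–1.42 | |
8q24 | Intergenic | rs4242382 | 1.41–1.87 | |
8q24 | Intergenic | rs1006908 | 1.15 | |
8q24 | Intergenic | rs620861 | 1.17 | |
8q24 | Intergenic | rs16902094 | 1.14 | |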
10q11.2 | MSMB | rs10993994 | 1.16–1.25 | |
10q26.13 | CTBP2 | rs4962416 | 1.17–1.20 | |
11p15 | IGF2, IGF2AS, INS, TH | rs7127900 | 1.22 | |
11q13.2 | Intergenic | rs10896449 | 1.10–1.28 | |
11q13.2 | Intergenic | rs7931342 | 1.19 | |
17q12 | TCF2 ( HNF1B ) | rs4430796 | 1.18–1.38 | |
17q12 | TCF2 ( HNF1B ) | rs7501939 | 1.41 | |
17q24.3 | Intergenic | rs1859962 | 1.20–1.26 | |
19q13.2 | PPP1R14A | rs8102476 | 1.12 | |
19q13.41 | KLK2 , KLK3 | rs2735839 | 1.20 | |
22q13 | TTLL1, BIK, MCAT, PACSIN2 | rs5759167 | 1.20 | |
22q13 | TNRC6B | rs9623117 | 1.18 | |
Xp11.23-p11.22 | NUDT10 , NUDT11 | rs5945619 | 1.19 | |
Xp11.23-p11.22 | LOC340602 , GSPT2 , MAGED1 | rs5945572 | 1.23 |
a ORs <1 in the original publication have been converted to ORs >1 for the alternate allele.
Two distinct loci on chromosome 17 have been implicated in prostate cancer risk. The 17q12 locus containing the HNF1B/TCF2 gene was identified in multiple GWAS. HNF1B encodes a member of the transcription factor superfamily and is involved in nephrogenesis. Heterozygous germline mutations in HNF1B cause maturity-onset diabetes of the young. One of the SNPs at the 17q12 locus seems to be protective for type 2 diabetes, consistent with epidemiologic data demonstrating an inverse relationship between diabetes and prostate cancer risk. Family-based studies confirmed an association of HNF1B with increased risk for prostate cancer among Hispanic men diagnosed at less than 50 years of age, and subsequent fine mapping uncovered evidence for 2 independent prostate cancer loci in HNF1B.
An SNP in 10q11.2, near the MSMB gene, has been associated with prostate cancer in 2 separate GWAS. MSMB codes for prostate secretory protein of 94 amino acids (PSP94), which is synthesized by prostatic epithelia and is underexpressed in prostate tumors. Decreased serum levels of PSP94 have been associated with increased prostate cancer risk. Prostate cancer was also associated with an SNP at 19q13.41 between KLK3 , which codes for prostate-specific antigen protein, and KLK2 , which is amplified and overexpressed in prostate carcinoma tissue. The same study identified the Xp11.23 locus, which falls between NUDT10 and NUDT11 , 2 genes believed to play a role in signal transduction and found to be highly expressed in prostate and testis tissue. This association was confirmed in another GWAS by Gudmundsson and colleagues. Sun and colleagues used combined data from their previous study and CGEMS public data to identify a locus at 22q13 associated with aggressive prostate cancer cases.
Clinical Correlations
As with breast cancer, the magnitude of risk associated with each of the prostate cancer risk loci is modest, with odds ratios ranging from 1.2 to 2.0. The joint contribution of identified loci to the familial risk of prostate cancer approaches 20%. Unfortunately, none of the prostate cancer risk SNPs consistently distinguish risk for more or less aggressive cancer, nor are they associated with cancer-specific mortality. In addition, a family history of prostate cancer still confers a greater risk than the presence of any individual risk allele, thereby providing no evidence that changing screening recommendations in men carrying a prostate cancer–associated risk SNP would be warranted. The effect of carrying multiple risk alleles on prostate cancer risk has also been assessed with results demonstrating that men who carried 4 or more of 5 possible risk alleles had a 4.5-fold increased risk of disease. There was no evidence that the risk alleles were associated with disease aggressiveness, earlier age at diagnosis, or presence or absence of family history. A subsequent analysis demonstrated that these 5 risk alleles do not improve prediction models for disease risk or disease-specific mortality once known risk factors (age, prostate-specific antigen [PSA], family history) or prognostic factors (Gleason score, diagnostic PSA, stage, age, primary treatment) are taken into account. Thus, the clinical usefulness of using risk SNPs as a tool for risk stratification has remained limited. As an alternative to the case-control study design, a recent GWAS used a case-case design of more or less aggressive prostate cancer to identify a genetic variant that predisposes to aggressive but not indolent disease. It is feasible that additional similar studies identifying genetic variants predisposing to more aggressive disease may help to risk stratify populations appropriate for screening, prevention, and more aggressive treatment.
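The cumulative effect of carrying several risk alleles can be sketched in Python as follows. This is only an illustration under a simple assumption, namely that loci are independent and combine multiplicatively (a log-additive model); it is not the method used in the study described above, and the per-allele odds ratios are hypothetical.

```python
from math import prod


def combined_odds_ratio(per_allele_ors):
    """Combined OR for carrying one risk allele at each listed locus,
    assuming independent loci that act multiplicatively (log-additive model)."""
    return prod(per_allele_ors)


# Hypothetical per-allele ORs for 5 loci; illustrative values only.
hypothetical_ors = [1.2, 1.2, 1.3, 1.4, 1.5]

four_loci = combined_odds_ratio(hypothetical_ors[:4])
five_loci = combined_odds_ratio(hypothetical_ors)
print(f"4 loci: OR = {four_loci:.2f}; 5 loci: OR = {five_loci:.2f}")
```

Under this assumption, individually modest odds ratios compound into a several-fold difference in risk between carriers of many risk alleles and carriers of none, although, as noted above, such a difference has not translated into improved prediction once established risk factors are considered.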
Colorectal cancer
Genetics of Colorectal Cancer
Analysis of phenotype concordance in monozygotic twins of cases suggests that inherited susceptibility is responsible for ∼35% of all colorectal cancers (CRCs). However, only ∼6% of CRCs occur in the setting of a known high-penetrance cancer predisposition syndrome, such as familial adenomatous polyposis or Lynch syndrome. Therefore, most of the genetic risk of CRC remains unexplained.
CRC GWAS
There have been 7 GWAS in CRC (see Table 3 ). The first GWAS of CRC identified the 8q24 locus, containing the rs6983267 SNP, with an associated ∼1.2-fold increased risk of disease. This same SNP was also associated with about a 1.2- to 1.4-fold increase in prostate cancer risk. In addition to the risk of CRC, the rs6983267 SNP at 8q24 was also found to be associated with adenoma risk with an odds ratio of 1.16. The nearby pseudogene, POU5F1P1 , expressed in several human malignancies shows 95% homology to POU5F1 , a candidate stem cell gene that encodes a transcription factor, but despite close mapping, the causative variant has not yet been identified. Two recent publications suggest that, at least in CRC predisposition, the rs6983267 SNP at 8q24 may be connected to enhanced Wnt signaling and subsequent MYC regulation. Two additional SNPs in the 8q24 region have been implicated with similarly modest risks of CRC.
Locus | Implicated Gene | SNP | Per Allele OR Ranges a | References |
---|---|---|---|---|
8q23.3 | EIF3H | rs16892766 | 1.25 | |
8q24.21 | LOC727677 , POU5F1P1 | rs10505477 | 1.17 | |
8q24.21 | LOC727677 , POU5F1P1 | rs6983267 | 1.17–1.27 | |
8q24.21 | LOC727677 , POU5F1P1 | rs7014346 | 1.19 | |
10p14 | Intergenic | rs10795668 | 1.11 | |
11q23 | Intergenic | rs3802842 | 1.12 | |
14q22-q23 | BMP4 | rs4444235 | 1.11 | |
15q13 | Intergenic | rs4779584 | 1.23–1.26 | |
15q13 | GREM1 | rs10318 | 1.19 | |
16q22.11 | CDH1 | rs9929218 | 1.10 | |
18q21.1 | SMAD7 | rs4939827 | 1.16–1.20 | |
19q13.11 | RHPN2 | rs10411210 | 1.15 | |
20p12.3 | Intergenic | rs961253 | 1.12 |
a ORs <1 in the original publication have been converted to ORs >1 for the alternate allele.
Nearly half of the susceptibility loci in CRC are in linkage disequilibrium with, or lie near, genes of the transforming growth factor beta (TGF-β) signaling pathway, which has previously been implicated in carcinogenesis. Increased TGF-β1 expression has been linked to tumor progression and recurrence in CRC, and germline mutations in components of the TGF-β signaling pathway, namely SMAD4 and BMPR1A, are responsible for juvenile polyposis, a high-penetrance CRC susceptibility syndrome. The rs4939827 SNP lying in an intron of SMAD7 at 18q21 was associated with CRC risk in 2 GWAS, with a third study indicating an opposite effect. The COGENT study performed a meta-analysis of 2 prior GWAS in CRC and followed up with replication analyses in 8 case-control series totaling more than 20,000 cases and controls. This study implicated 2 other components of the TGF-β signaling pathway: an SNP in 19q13.11 maps to RHPN2, a gene involved in regulating actin cytoskeleton organization and gene expression responses to TGF-β signaling, and an SNP in 14q22 is near the transcription start site of BMP4, a member of the TGF-β family that is overexpressed in colon cancer cells. Other genes possibly implicated in the TGF-β signaling pathway are BMP2 and GREM1. In addition, an SNP at 16q22 maps to an intron of CDH1, a gene with a well-established role in CRC etiology and in which germline mutations cause hereditary diffuse gastric cancer. Risks at each of these loci were modest, in the range of 1.1 to 1.2.
Clinical Correlations
Overall, the 10 risk loci identified account for only ∼6% of the excess familial risk of CRC. There is currently no evidence that individual SNPs or panels of SNPs add to the discriminatory accuracy of current clinical criteria based on age, personal and family history of adenomas or CRC, and preexisting inflammatory bowel disease. Nor is there convincing evidence that these SNPs correlate with survival, early age at onset, site of tumor, or a histologically more aggressive subset of disease. By comparison, the relative risk for CRC for an individual carrying the 8q24 variant is ∼1.2 versus a 1.8-fold increased risk for the first-degree relatives of individuals with an adenoma and a 2.5-fold increased risk for individuals with a first-degree relative with CRC. Thus, at the current time, recommendations for CRC screening would not be altered from those for the general population based solely on the presence of a CRC-associated risk SNP.
Gastrointestinal (noncolorectal) cancers
An estimated 10% of patients with pancreatic cancer have an inherited form of the disease. However, only a small fraction of the familial risk of pancreatic cancer is explained by mutations in BRCA , p16 , STK11 and the mismatch repair genes associated with Lynch syndrome. The first GWAS in pancreatic cancer identified SNPs mapping to the first intron of the ABO blood group gene on chromosome 9q34 to be associated with a 1.2-fold increased risk of pancreatic cancer. Earlier epidemiologic data have pointed to an association between ABO blood type and pancreatic and gastric cancer risk. A second pancreatic cancer GWAS identified 8 additional SNPs mapping to 3 loci (13q22.1, 1q32.1, and 5p15.33). The 2 SNPs at 13q22.1 are in intergenic regions between 2 genes belonging to the family of kruppel-like transcription factors, KLF5 and KLF12 . Somatic deletions in this area of chromosome 13 have been found in a variety of cancers, including pancreatic cancer. The 1q32.1 region harbors the NR5A2 gene, which encodes a nuclear receptor of the fushi tarazu subfamily and is predominantly expressed in the exocrine gland of the pancreas, liver, intestines, and ovaries. The third locus at 5p15.33 is in intron 13 of CLPTM1L , and is part of the CLPTM1L-TERT locus. CLPTM1L has been implicated in carcinogenesis and the 5p15.33 region has been identified in several cancer GWAS including brain tumors, lung cancer, basal cell cancer, and melanoma.
A GWAS in Japanese patients with esophageal squamous cell carcinoma identified the 12q24 and 4q21-23 susceptibility regions with odds ratios of 1.67 and 1.79, respectively. The 4q21-23 region includes 7 members of the alcohol dehydrogenase (ADH) family involved in alcohol metabolism. The 12q24 region is in linkage disequilibrium with ALDH2, a gene that encodes a member of the aldehyde dehydrogenase family whose product is 1 of the key enzymes in alcohol metabolism. Previous candidate gene studies of esophageal squamous cell cancer have identified risk variants at both the ADH1B and ALDH2 genes.
A gastric cancer GWAS in Japanese patients identified an SNP at 8q24.3 mapping to the PSCA gene and conferring an allele-specific risk of 1.62 specifically for diffuse-type gastric cancer. PSCA was originally identified as a prostate-specific stem cell antigen but has been reported in bladder, esophageal, and stomach cancers, as well as in a recent GWAS of bladder cancer.