Genetics of Coagulation





Rodney M. Camire

Paris Margaritis



The determination of the gene, complementary deoxyribonucleic acid (cDNA), and amino acid sequences of all of the proteins known to be involved in the hemostatic process has been completed. This task, begun in the early 1980s, has allowed for (a) deduction of amino acid sequences from the cloned cDNAs of several species, (b) characterization of gene structure and organization, (c) characterization of mutations responsible for inherited abnormalities, (d) characterization of polymorphisms useful for gene-tracking studies, (e) production of large quantities of recombinant protein for research and for therapeutic purposes, and (f) the development of animal models of hemostatic disease through transgenic and gene knockout technologies. It has also made possible the contemplation of novel gene-based approaches for the treatment of bleeding diatheses. This chapter focuses primarily on various genetic aspects of procoagulant, anticoagulant, and fibrinolytic factors involved in human blood coagulation, with emphasis on gene structure and mutations that predispose to bleeding or thrombosis. We have also focused on mutational mechanisms and how new genomic information will impact genetic analysis.


GENOMIC INFORMATION

The completion of the Human Genome Project in April 2003 was an enormous scientific achievement,1,2 providing substantial advances in the understanding of gene structure and organization, genetic variation, comparative genomics, genomic medicine, and gene interaction with environmental factors. There are currently three major Web portals that serve as central hubs for genome-related resources.3 The first is the National Center for Biotechnology Information (NCBI) run by the National Institutes of Health (NIH) (http://www.ncbi.nlm.nih.gov/); the second is Ensembl (http://www.ensembl.org/index.html), which is a collaborative effort between the Wellcome Trust Sanger Institute and European Molecular Biology Laboratory’s (EMBL’s) European Bioinformatics Institute; and the third is the University of California, Santa Cruz (UCSC) Genome Browser (http://www.genome.ucsc.edu) developed by an academic research group at the UCSC.

An immediate application of genomic sequence information is to accelerate the identification of genes that are associated with human disease. There are several strategies to accomplish this, the most common being genetic linkage analysis (genome-wide linkage scan or homozygosity mapping),4 by which the chromosomal location of a disease-causing gene can be identified. Genome-wide linkage analysis usually requires families in which several individuals are affected with the disease of interest. Evenly spaced DNA markers are used to trace the inheritance of each copy of each chromosome. Chromosomal segments that do not influence disease segregate randomly, whereas the particular copy that, in each family, carries the disease-causing mutation will be shared among affected family members more often than would be predicted by chance.5 The DNA markers that are typically used are termed microsatellites, which are di-, tri-, or tetranucleotide tandem repeats in DNA sequences and can be readily followed by simple polymerase chain reaction (PCR) techniques. Typically, the segment of the genome between identified microsatellite markers that influences the disease state is quite large: millions to tens of millions of base pairs in length, spanning dozens to hundreds of genes. However, the Human Genome Project allows researchers to move rapidly from large chromosomal regions to individual candidate genes because the sequences of all genes identified within any linkage region are immediately available.

The development of a haplotype map of the human genome will facilitate the identification of genes that are affected by environmental factors. A haplotype block of DNA, consisting of a group of nearby alleles or genetic markers that are inherited together, is roughly consistent among all humans, but different individuals have different versions of the blocks, allowing one to identify the relationship between haplotype blocks and diseases.6 The International Haplotype Map Project (HapMap; http://hapmap.ncbi.nlm.nih.gov/), which aims to define the haplotype structure of human populations, should greatly facilitate genetic linkage studies by allowing researchers to find genes that affect health, disease, and individual responses to medications and environmental factors.


MUTATIONS: MECHANISMS AND DATABASES

Alterations in a gene sequence can be neutral, detrimental, or favorable, and are semantically referred to as a mutation when the variation leads to a phenotypic change, a polymorphism when the genetic change is present in >5% of alleles in a population, and a rare sequence variant when the change occurs in <5% of alleles. Therefore, distinct populations show variations not only in disease prevalence but also in the frequency of polymorphic changes.
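The working definitions above can be written as a small decision rule. The following Python sketch is illustrative only: the function name is hypothetical, the precedence given to a phenotypic effect over allele frequency is an assumption, and a frequency of exactly 5% (which the definitions leave open) is treated here as a rare sequence variant.

```python
def classify_variant(allele_frequency: float, causes_phenotype: bool) -> str:
    """Label a sequence change using the chapter's working definitions.

    allele_frequency: fraction of alleles in the population carrying the change (0 to 1).
    causes_phenotype: True when the change leads to a phenotypic change.
    """
    if not 0.0 <= allele_frequency <= 1.0:
        raise ValueError("allele_frequency must be between 0 and 1")
    # Assumption: a phenotypic effect takes precedence over frequency.
    if causes_phenotype:
        return "mutation"
    # Present in more than 5% of alleles: polymorphism.
    if allele_frequency > 0.05:
        return "polymorphism"
    # Otherwise (including exactly 5%, by assumption): rare sequence variant.
    return "rare sequence variant"
```

Because the classification depends on the allele frequency in a given population, the same change may be labeled differently in two populations.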


Types of Genetic Mutations


Point Mutations

A point mutation is defined as a change of a single base pair of a DNA sequence in a gene caused by the substitution of one nucleotide for another. A change within a single codon in frame (missense mutation) may predict replacement of a single amino acid in an otherwise normal protein. When the change results in a stop codon (nonsense mutation), a truncated protein product results. Sometimes, the mutated proteins are not properly produced, secreted into the blood, or localized to the physiologic site. When variant proteins do circulate in substantial amounts, they often have impaired function. When a single base change is phenotypically silent, this is commonly referred to as a single nucleotide polymorphism (SNP). Outside the actual coding region, it is even more likely that nucleotide substitutions are silent. Notable exceptions are mutations in binding sites for transcription factors that regulate gene activity, mutations that affect splicing of the immature RNA, and mutations that may alter the stability of the RNA.
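As a rough illustration of how a single base substitution within a codon is sorted into these categories, the Python sketch below translates the codon before and after the change using the standard genetic code. The function name and the labels it returns are hypothetical, and the sketch considers only sense codons within the coding region; it says nothing about splicing, regulatory, or RNA-stability effects.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def classify_substitution(codon: str, position: int, new_base: str) -> str:
    """Classify a single-base change at `position` (0, 1, or 2) of a sense codon."""
    codon = codon.upper()
    if CODON_TABLE[codon] == "*":
        raise ValueError("this sketch handles sense codons only")
    mutant = codon[:position] + new_base.upper() + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutant]
    if before == after:
        return "silent"      # same amino acid encoded
    if after == "*":
        return "nonsense"    # new stop codon, truncated product
    return "missense"        # single amino acid replaced


# Example: the arginine codon CGA
print(classify_substitution("CGA", 0, "T"))  # TGA is a stop codon
print(classify_substitution("CGA", 1, "A"))  # CAA encodes glutamine
print(classify_substitution("CGA", 2, "G"))  # CGG still encodes arginine
```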

Up to approximately 50% of the single nucleotide substitutions occur as transitions within cytosine-phosphoguanine (CpG) dinucleotides, the most common hot spot for mutation.7,8 This finding is all the more remarkable when one considers that CpG dinucleotides are underrepresented in most genes, presumably because of the evolutionary loss of these dinucleotides. CpG dinucleotides are hot spots for mutation because they are sites of methylation of the cytosine residue (i.e., a CpG becomes a TpG or a CpA).7 Methylcytosine readily deaminates to thymine spontaneously. After the C-to-T transition occurs in the reverse or “antisense” DNA strand, the sequence in its complementary coding strand is changed from G to A. Because the common (but not exclusive) codon for Arg is CGx, many CpG dinucleotide mutations are found within Arg codons, producing Arg-to-stop, Gln, His, Cys, or Trp mutations. When the same CpG transition occurs among different families, it often is possible to distinguish independent mutational events from founder effects through haplotype or pedigree analyses. The founder effect is the establishment of a new, typically isolated, population by a few original “founders” that carry only a fraction of the total genetic variation of the parental population.
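The list of Arg replacements that arise from CpG transitions can be checked by enumeration. The Python sketch below is an illustration, not part of the cited studies: it takes the four CGx arginine codons, applies a C-to-T transition on the coding strand (CGx becomes TGx) or on the antisense strand (read as G to A on the coding strand, so CGx becomes CAx), and translates the result with the standard genetic code. It assumes the CpG occupies the first two positions of the codon; the function name is hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def cpg_transition_outcomes() -> dict:
    """Map each CGx Arg codon to (coding-strand C>T product, antisense C>T product)."""
    outcomes = {}
    for third in BASES:
        codon = "CG" + third
        assert CODON_TABLE[codon] == "R"      # all four CGx codons encode Arg
        coding_strand = "TG" + third          # C>T on the coding strand
        antisense_strand = "CA" + third       # C>T on the antisense strand reads as G>A
        outcomes[codon] = (CODON_TABLE[coding_strand], CODON_TABLE[antisense_strand])
    return outcomes


for codon, (sense_aa, antisense_aa) in cpg_transition_outcomes().items():
    print(codon, "->", sense_aa, "or", antisense_aa)
```

The enumeration yields Cys or His for CGT and CGC, stop or Gln for CGA, and Trp or Gln for CGG, which matches the set of replacements named in the text.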


Insertion/deletion

Insertion/deletion is a type of variation that is defined as a change in the DNA sequence created by the addition or subtraction of nucleotides. When the insertion or deletion is within the triplet open reading frame and the change is a number of nucleotides that is not a multiple of three, this disruption results in a frameshift mutation. During protein translation, a frameshift usually results in a premature stop codon producing a grossly abnormal protein with impaired function and viability.
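The multiple-of-three rule can be stated as a one-line check. The Python sketch below is a minimal illustration with a hypothetical function name; it considers only the net change in length within the open reading frame and ignores effects on splicing or regulatory sequences.

```python
def is_frameshift(inserted: int = 0, deleted: int = 0) -> bool:
    """Return True when the net change in length disrupts the triplet reading frame."""
    if inserted < 0 or deleted < 0:
        raise ValueError("nucleotide counts must be non-negative")
    net_change = inserted - deleted
    # The frame is preserved only when the net change is a multiple of three.
    return net_change % 3 != 0
```

For example, a 2-bp deletion gives `is_frameshift(deleted=2) == True`, whereas a 3-bp insertion removes or adds whole codons only and gives `is_frameshift(inserted=3) == False`.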


Categorization of Mutations

Strictly speaking, a recessive mutation implies that heterozygotes (one copy of the mutation) have a normal phenotype and that only homozygotes (two copies of the mutation) are phenotypically affected. Recessive-negative mutations display a predominantly recessive inheritance with a nearly infinite number of different mutations. Most mutations in the genes encoding the coagulation and fibrinolytic proteins are recessive negative and come to clinical attention when present in homozygous (or doubly or compound heterozygous) or hemizygous (for X-linked disorders) form.

A dominant mutation refers to a genetic change that lacks phenotypic differences between affected heterozygous and homozygous individuals. Dominant-negative mutations show a dominant inheritance with a narrow spectrum of mutations. From the perspective of family studies, this distinction is informative in that dominant mutations should be suspected after successive generations are affected; recessive mutations normally occur only within one generation. Less commonly, dominant (incomplete) inheritance is seen when there is a mutation that causes a gain of function, the best known example of which is factor V Leiden. Genetically, gain-of-function mutations are clearly less heterogeneous than loss-of-function mutations, as there are infinite ways to destroy gene function, whereas only a few mutations lead to improvement of gene or protein function.

In reality, mutations are not always strictly dominant or recessive, as the penetrance of a mutant phenotype can vary dramatically, largely because of concomitant genetic changes. Therefore, recessive-negative mutations are not always entirely recessive. For example, although homozygous protein C deficiency produces severe thrombotic disease and purpura fulminans at birth,9 heterozygotes may either experience thromboembolic disease later in life or never develop venous thrombosis.10,11


Normal Variants and Polymorphisms

Normal variants and polymorphisms are particularly useful in gene-tracking studies of families with a severe bleeding or thrombotic disorder. In such studies, a polymorphism is used as a marker of a defective gene, followed throughout a family when the actual gene mutation is not known. This approach has been particularly useful in genetic counseling for hemophilia A or B, and polymorphisms are increasingly applied as markers of risk factors for venous thrombosis and cardiovascular disease.12 In the case of protein C, fibrinogen, factor VII, plasminogen activator inhibitor-1 (PAI-1), and the prothrombin 20210G to A variant, a particular relation exists between polymorphisms and plasma levels of specific clotting factors.


Epigenetics

In addition to mutations, epigenetic changes within the genome can also have a direct effect on phenotype. Epigenetics refers to those mechanisms that control gene expression in a potentially heritable way yet do not involve changes to the underlying DNA sequence.13 These traits exist on top of or in addition to the traditional molecular basis for inheritance. Driving these changes are covalent modifications to cytosine bases and histones (e.g., DNA methylation and acetylation of histones) and changes in positioning of nucleosomes.14 Epigenetic changes to DNA play an important role in the normal way cells respond to their environment and contribute to the regulation of many cellular processes. It is not surprising then that aberrant placement of epigenetic marks or defects in the epigenetic machinery lead to human disease. This is now well established in cancer and is becoming more appreciated in cardiovascular and autoimmune disease as well as neurologic disorders.15 Thus, epigenetic factors constitute a key link between genetics, disease, and environment, playing a decisive role in the underpinnings of human pathology. The Human Epigenome Project (HEP; http://www.epigenome.org/index.php), with the stated goal of cataloging and interpreting genome-wide DNA methylation patterns of all human genes in all major tissues, should allow researchers to make these important links.


Multifactorial Disease

Contrasting with rare monogenetic diseases are the milder and much more common multifactorial or complex diseases, such as type 2 diabetes mellitus, hypertension, coronary artery disease, and venous thrombosis. In venous thrombotic disease, thrombotic episodes typically occur relatively late in life, and in up to 50% of patients, genetic risk factors such as heterozygosity for mutations in coagulation inhibitors or cofactors are found.16


Databases of Mutations

Mutations in the genes for essentially all of the coagulation proteins can be found at the Human Gene Mutation Database (HGMD) at the Institute of Medical Genetics (http://www.hgmd.cf.ac.uk/ac/index.php) by searching with the relevant gene symbol (Table 10.1). Additionally, a coagulation sequence and structure database (CoagBase; www.isth.org/default/index.cfm/standards/coagbase/) is maintained by the International Society on Thrombosis and Haemostasis, and ClotBase (http://www.clotbase.bicnirrh.res.in) provides a wide range of useful information including a catalog of known mutations for many coagulation factors.17








Table 10.1 Characteristics of human coagulation factor genes

| Genes | Gene Symbol | Gene ID | Location | Exons | Gene | mRNA | cDNA |
|---|---|---|---|---|---|---|---|
| Prothrombin | F2 | 2147 | 11p11-q12 | 14 | 20.3 kb | 1,997 bp | 1,866 bp |
| Factor X | F10 | 2159 | 13q34-qter | 8 | 26.7 kb | 1,502 bp | 1,467 bp |
| Factor VII | F7 | 2155 | 13q34-qter | 8 | 13.1 kb | 2,478 bp | 1,335 bp |
| Factor IX | F9 | 2158 | Xq27.1-q27.2 | 8 | 31.3 kb | 2,804 bp | 1,386 bp |
| Factor XI | F11 | 2160 | 4q34-q35 | 15 | 22.6 kb | 2,217 bp | 1,878 bp |
| Tissue factor | F3 | 2152 | 1p22-p21 | 6 | 11.6 kb | 2,153 bp | 888 bp |
| Factor VIII | F8 | 2157 | Xq28 | 26 | ~186 kb | 9,030 bp | 7,056 bp |
| Factor V | F5 | 2153 | 1q23 | 25 | ~80 kb | 6,914 bp | 6,675 bp |
| Protein C | PROC | 5624 | 2q13-q14 | 9 | 10.8 kb | 1,843 bp | 1,386 bp |
| Thrombomodulin | THBD | 7056 | 20p12-cen | 1 | 4.2 kb | 4,048 bp | 1,728 bp |
| Protein S | PROS1 | 5627 | 3p11-q11.2 | 15 | 80 kb | 3,309 bp | 2,031 bp |
| EPCR | PROCR | 10544 | 20q11.2 | 4 | 6 kb | 1,483 bp | 717 bp |
| ATIII | SERPINC1 | 462 | 1q23-q24 | 7 | 13.5 kb | 1,395 bp | 1,395 bp |
| TFPI | TFPI | 7035 | 2q31-q32.1 | 9 | ~85 kb | 1,431 bp | 915 bp |
| Heparin cofactor II | SERPIND1 | 3053 | 22q11.21 | 5 | 13.6 kb | 2,217 bp | 1,500 bp |
| Factor XII | F12 | 2161 | 5q33-qter | 14 | 12 kb | 2,048 bp | 1,848 bp |
| Prekallikrein | KLKB1 | 3818 | 4q34-q35 | 15 | 31 kb | 2,245 bp | 1,917 bp |
| HMW kininogen | KNG1 | 3827 | 3q26-qter | 11 | 27 kb | ~3,200 bp | 1,935 bp |
| tPA | PLAT | 5327 | 8p12-p11 | 14 | 32.7 kb | ~2,600 bp | 1,689 bp |
| Plasminogen | PLG | 5340 | 6q26-q27 | 19 | 53.5 kb | ~2,900 bp | 2,433 bp |
| Plasmin inhibitor | SERPINF2 | 5345 | 17p13 | 10 | ~16 kb | ~2,300 bp | 1,476 bp |
| PAI-1 | SERPINE1 | 5054 | 7q21.3-q22 | 9 | 12.3 kb | ~2,800 bp | 1,209 bp |
| TAFI | CPB2 | 1361 | 13q14.11 | 11 | ~48 kb | ~1,700 bp | 1,272 bp |
| Fibrinogen Aα chain | FGA | 2243 | 4q28 | 5 | 5.4 kb | 2,182 bp | 1,935 bp |
| Fibrinogen Bβ chain | FGB | 2244 | 4q28 | 8 | 8.2 kb | 1,918 bp | 1,476 bp |
| Fibrinogen γ chain | FGG | 2266 | 4q28 | 10 | 8.4 kb | 1,559 bp | 1,314 bp |
| Factor XIII A chain | F13A1 | 2162 | 6p25.3-p24.3 | 15 | >160 kb | ~4,000 bp | 2,199 bp |
| Factor XIII B chain | F13B | 2165 | 1q31-q31.2 | 12 | ~28 kb | ~2,200 bp | 1,986 bp |
| vWF | vWF | 7450 | 12p13.3 | 52 | ~178 kb | 8,923 bp | 8,442 bp |
| γ-Glutamylcarboxylase | GGCX | 2677 | 2p12 | 15 | 13 kb | 3,245 bp | 2,277 bp |
| Epoxide reductase I | VKORC1 | 79001 | 16p11.2 | 3 | ~5 kb | 1,003 bp | 492 bp |


The size of the cDNAs is defined as the number of nucleotides from the start to the stop codons.


EPCR, endothelial cell protein C receptor; ATIII, antithrombin III; TFPI, tissue-factor pathway inhibitor; HMW, high-molecular-weight; tPA, tissue-type plasminogen activator; PAI-1, plasminogen activator inhibitor-1; TAFI, thrombin-activatable fibrinolysis inhibitor; vWF, von Willebrand factor.


Genomic and protein information for each gene can be obtained by using the Gene ID and searching “All Databases” at http://ncbi.nlm.nih.gov/.


Mutations can be found by searching for Gene Symbol at http://www.hgmd.cf.ac.uk/ac/index.php. Some coagulation factors have their own mutation databases. These can be found within each section.
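For readers who script their lookups, the Gene IDs in Table 10.1 can be turned into links to the corresponding NCBI Gene records. The Python sketch below is a convenience illustration only: the dictionary holds a subset of the Gene IDs from the table, the function name is hypothetical, and the `https://www.ncbi.nlm.nih.gov/gene/<Gene ID>` address pattern is an assumption about the current NCBI site rather than something stated in this chapter. No network request is made.

```python
# Gene symbols and Gene IDs copied from Table 10.1 (subset shown for brevity).
GENE_IDS = {
    "F2": 2147,    # prothrombin
    "F7": 2155,    # factor VII
    "F8": 2157,    # factor VIII
    "F9": 2158,    # factor IX
    "F10": 2159,   # factor X
}


def ncbi_gene_url(symbol: str) -> str:
    """Build a link to the NCBI Gene record for a symbol listed in GENE_IDS."""
    gene_id = GENE_IDS[symbol.upper()]
    # Assumed URL pattern for NCBI Gene records.
    return f"https://www.ncbi.nlm.nih.gov/gene/{gene_id}"


print(ncbi_gene_url("F9"))
```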




MOLECULAR GENETICS OF HEMOSTATIC PROTEINS

Hemostatic proteins can be divided into procoagulant serine proteases, procoagulant cofactors, anticoagulants, contact activators, and fibrinolytic factors, plus several other relevant proteins: fibrinogen, factor XIII, and von Willebrand factor (vWF). While not considered coagulation factors per se, both γ-glutamyl carboxylase and vitamin K 2,3-epoxide reductase I (VKOR) play an essential role in hemostasis. These enzymes are involved in the vitamin K-dependent posttranslational carboxylation of specific glutamic acid residues within the amino-terminal Gla domains of vitamin K-dependent proteins (prothrombin; factors VII, IX, and X; and proteins C, S, and Z).18,19 Warfarin inhibits VKOR, and a deficiency of either of these enzymes is classified as vitamin K-dependent coagulation factor deficiency, typically characterized by hemorrhagic manifestations.20 Knockout of the γ-glutamyl carboxylase gene in mice results in offspring surviving to term but dying uniformly at birth of massive intra-abdominal hemorrhage.21








Table 10.2 Genetic models of thrombosis and hemostasis

| Gene Knockout | Strain | Viable | Mortality Observed (Days of Gestation) | Phenotype | References |
|---|---|---|---|---|---|
| Prothrombin | C57BL/6 | No | 9-11 | Fatal hemorrhage/yolk sac defect | 31 |
| Factor X | Swiss | No | 11-13 | Partial embryonic lethal/intra-abdominal bleeding | 54 |
| Factor VII | C57BL6/J | No | At birth | Fatal perinatal bleeding | 77 |
| Factor IX | C57BL/6 | Yes | | Severe bleeding following trauma | 93,94,95 |
| Factor XI | C57BL/6 | Yes | | Normal/prolonged aPTT | 114 |
| Tissue factor | Several | No | 9.5-12 | Fatal embryonic bleeding | 125,126,127 |
| Factor VIII | 129sv | Yes | | Severe bleeding following trauma | 137 |
| Factor V | C57BL/6J | No | 9.5-10 | Fatal hemorrhage/yolk sac defect | 166 |
| Protein C | Swiss | No | 18.5-19 | Consumptive coagulopathy | 195 |
| Protein S | 129sv | No | 15.5-17.5 | Consumptive coagulopathy | 229,230 |
| Thrombomodulin | 129sv | No | 8.5-10 | Embryonic lethal/yolk sac defect | 203,205 |
| EPCR | Swiss | No | 9-10 | Embryonic lethal/placental thrombosis | 453 |
| ATIII | C57BL/6J | No | 15-16 | Consumptive coagulopathy/severe hemorrhage | 454 |
| TFPI | C57BL/6 | No | 10-13 | Intrauterine lethality | 266 |
| Heparin cofactor II | C57BL/6 | Yes | | Normal; differences in thrombosis model | 285 |
| tPA | C57BL/6 | Yes | | Develop normally | 330 |
| Plasminogen | C57BL/6 | Yes | | Predispose to thrombosis; ligneous conjunctivitis | 346,347,348 |
| Plasmin inhibitor | C57BL6/J | Yes | | Develop normally; enhanced fibrinolytic potential | 367 |
| PAI-1 | C57BL/6 | Yes | | Develop normally; mild hyperfibrinolytic state | 397,398 |
| TAFI | C57BL/6 | Yes | | Normal | 410 |
| Fibrinogen A chain | C57BL/6 | Yes | | Wound healing defect/failure of pregnancy | 424 |
| Factor XIII A chain | C57BL/6 | Yes | | Develop normally; severe uterine bleeding; miscarriages | 439,440 |
| vWF | C57BL/6J | Yes | | Defects in hemostasis and thrombosis; model type 3 vWD | 451 |
| γ-Carboxylase | C57BL/6J | No | 9.5-18 | Massive intra-abdominal hemorrhage | 21 |


EPCR, endothelial cell protein C receptor; ATIII, antithrombin III; TFPI, tissue factor pathway inhibitor; tPA, tissue-type plasminogen activator; PAI-1, plasminogen activator inhibitor-1; TAFI, thrombin-activatable fibrinolysis inhibitor; vWF, von Willebrand factor.


The proteins considered in this chapter and their gene symbols, chromosomal locations, and the sizes of their gene, mRNA, and cDNA sequences are shown in Table 10.1. Most of the hemostatic proteins have been experimentally knocked out in mouse models, and where appropriate, these studies are summarized in Table 10.2.



Procoagulant Serine Proteases


Prothrombin

The vitamin K-dependent serine protease zymogen prothrombin plays a central role in the coagulation pathway. Following its activation by the prothrombinase complex (i.e., factor Xa, factor Va, calcium ions, and anionic membranes), the resulting product thrombin participates in numerous biologic processes, including the cleavage of fibrinogen to fibrin, which leads to clot formation (see Chapter 8). The gene encoding prothrombin is on chromosome 11p11-q12 and is approximately 20 kb in length with 14 exons.22 The mRNA for human prothrombin was cloned in 1983 and is approximately 2 kb in length and includes a 1,866-bp cDNA.23 The major site of synthesis of prothrombin is in the liver,24 but minor sites are present in the central nervous system (CNS), skeletal and smooth muscle cells, and the kidney.25,26,27,28 Prothrombin is expressed as a single polypeptide chain of 622 amino acids. Following removal of a 43-amino acid leader sequence, it is secreted into plasma as a 72-kDa protein containing 579 amino acids at a concentration of 1.4 µM.23,29

Homozygosity or compound heterozygosity for loss-of-function mutations in the prothrombin gene leads to a bleeding diathesis. There are no known cases of a total deficiency of prothrombin, consistent with the studies in mice showing lethality in utero with a complete absence of the protein.30,31 Prothrombin deficiency has a prevalence of approximately 1:1,000,000 to 1:2,000,000.32 More than 60 different mutations are known, most of which lead to a variant protein,32 and a number of these have been subjected to detailed biochemical analysis. For example, thrombin Quick I (Arg to Cys at 382) has near normal activity with thrombin-specific low molecular weight (LMW) substrates, but about 100 times lower activity with fibrinogen than that observed with thrombin.33,34

A common polymorphism that constitutes a major genetic risk factor for thrombosis is the 20210G to A mutation.35 Position 20210 in the prothrombin mRNA is where prothrombin pre-mRNA undergoes 3′ cleavage and polyadenylation. This polymorphism was originally found in probands with a strong personal and familial history of venous thrombosis and is associated with slightly elevated plasma levels of prothrombin (~25%).35 Further epidemiologic data confirmed that the DNA variant confers a three- to fourfold increased risk of venous thrombosis.36,37,38,39 The prevalence of the 20210A allele is between 1% and 2% and is dependent on the geographic origin of the subjects.40 Homozygotes are relatively rare.39

Initial reports indicated that the mutation, located 20 nucleotides downstream of the poly A signal, increases the efficiency of posttranscriptional 3′ end processing, thereby leading to greater accumulation of mRNA.41 Additional studies showed that the mutant allele influences mRNA stability, increasing cytoplasmic half-life.42 However, these studies have been called into question because prothrombin 20210G and 20210A mRNAs were found at equal levels in fresh liver tissue from a 20210G/A heterozygote.43 This study was able to demonstrate that the 20210A mutation affects the position of the 3′-cleavage/polyadenylation reaction, an event that may lead to its abnormal mRNA function. Additionally, in vitro studies indicate that mRNAs carrying the G or A allele at 20210 have equal stability, further implicating mRNA processing and/or translation, not cytoplasmic half-life, in regulating prothrombin expression.44


Factor X

Factor X is a vitamin K-dependent plasma protein that plays a central role in blood coagulation. In the presence of its cofactor protein factor Va, activated factor X cleaves two peptide bonds in prothrombin to form thrombin. Factor X is synthesized primarily in the liver, but other tissues also may contribute, such as lung, heart, ovary, and small intestine.45 The gene for factor X has been cloned and is approximately 27 kb in length, contains 8 exons, and is located on chromosome 13q34-qter, 2.8 kb from the factor VII gene.46,47 The cDNA has an open reading frame of 1,467 nucleotides and encodes a protein of 488 amino acids (preprofactor X).48,49,50 Following proteolytic processing, mature factor X (59 kDa) is a two-chain serine protease zymogen of 445 amino acids, covalently associated through a single disulfide bond, circulating at a concentration of 170 nM.51,52
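The relation between the length of the open reading frame and the length of the precursor protein is simple arithmetic, sketched below in Python. This is an illustration under one stated assumption: the quoted nucleotide count includes the terminal stop codon, as the footnote to Table 10.1 suggests for the cDNA sizes. The function name is hypothetical, and not every entry in the literature necessarily follows this convention.

```python
def residues_from_orf(orf_nucleotides: int) -> int:
    """Number of amino acids encoded by an open reading frame.

    Assumes the nucleotide count runs from the start codon through the
    stop codon, so one codon does not contribute an amino acid.
    """
    if orf_nucleotides % 3 != 0:
        raise ValueError("an open reading frame must be a multiple of three")
    return orf_nucleotides // 3 - 1


prepro_factor_x = residues_from_orf(1467)   # 489 codons, 488 amino acids
removed_in_processing = prepro_factor_x - 445  # residues absent from mature factor X
print(prepro_factor_x, removed_in_processing)  # prints: 488 43
```

The same calculation reproduces the precursor lengths quoted in this chapter for factor VII (1,335 nucleotides, 444 amino acids) and factor IX (1,386 nucleotides, 461 amino acids).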

Factor X deficiency is a rare (1:1,000,000) autosomal recessive bleeding disorder.53 Heterozygotes are usually free of symptoms but may exhibit abnormal bleeding during surgery. Over 50 factor X-deficient families are reported and over 100 different mutations have been identified.53 Large deletions have been identified but most defects are missense mutations often associated with detectable factor X antigen levels and mild-to-asymptomatic disease.53 Mice rendered experimentally deficient in factor X show frequent embryonic lethality, and those that survive die shortly after birth from massive intra-abdominal bleeding.54,55
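To give a sense of how many heterozygotes a prevalence of 1:1,000,000 implies, the following Python sketch applies the Hardy-Weinberg relation. This is an added illustration, not a calculation from the cited studies: it assumes random mating and treats all deficiency alleles as a single class, so consanguinity, population structure, and incomplete ascertainment would all change the estimate. The function name is hypothetical.

```python
import math


def carrier_frequency(disease_prevalence: float) -> float:
    """Estimate heterozygote frequency for an autosomal recessive disorder.

    Under Hardy-Weinberg assumptions, affected individuals occur at q**2,
    so q = sqrt(prevalence) and carriers occur at 2 * p * q with p = 1 - q.
    """
    if not 0.0 < disease_prevalence < 1.0:
        raise ValueError("prevalence must be between 0 and 1")
    q = math.sqrt(disease_prevalence)
    p = 1.0 - q
    return 2.0 * p * q


carriers = carrier_frequency(1 / 1_000_000)
print(f"estimated carrier frequency: {carriers:.6f}")
```

Under these assumptions the deficiency allele frequency is 0.001, and roughly 1 person in 500 would be expected to be a heterozygote, far more than the number of affected individuals.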

A collection of factor X mutations has been described that have differential responses to the intrinsic and extrinsic pathways. For example, factor X Vorarlberg is associated with a marked reduction of activity in the extrinsic system and a much more modest reduction of activity in the intrinsic system.56,57 These types of mutations imply that factor VIIa/tissue factor (TF) and factor IXa/factor VIIIa likely have different sites of interaction with factor X, despite the fact that they cleave the same bond within the zymogen. At least eight naturally occurring factor X variants display marked differences in activity in the intrinsic and extrinsic systems.56,57,58,59,60,61,62,63,64

Factor X was originally called Stuart-Prower factor after the first individuals in whom factor X deficiency was discovered.65,66 One of the index cases of the Stuart family is homozygous for a replacement of Val by Met at position 298.67 A descendant of the Prower case was shown to be a compound heterozygote for an Asp282 to Asn and an Arg287 to Trp mutation.68 An informative historic sketch on the discovery of factor X has been published.69


Factor VII

Factor VII is a vitamin K-dependent serine protease zymogen that circulates in plasma as a single-chain polypeptide (50 kDa) of 406 amino acids at a concentration of 10 nM.70,71,72 Like most clotting factors, it is synthesized primarily in the liver, as a 444-amino acid precursor protein. Activated factor VII, in concert with its cofactor, TF, initiates blood coagulation following vascular injury by activating factors IX and X.73 The factor VII gene is relatively small (~13 kb), contains eight exons, and is located on chromosome 13q34-qter, only 2.8 kb upstream of the factor X gene.74,75 The full-length mRNA is almost 2.5 kb, whereas the cDNA sequence is 1,335 nucleotides long.

Factor VII deficiency is a rare autosomal disorder with a prevalence of approximately 1:500,000.76 Individuals with factor VII antigen and activity levels <1% display a moderate-to-severe bleeding phenotype. Mutations associated with this phenotype include frameshift, splice site, promoter, and missense mutations, with the total number of mutations now reaching more than 110.76 The situation is less clear for mutations associated with measurable levels of factor VII activity, as many are asymptomatic.

It is not entirely clear whether a complete absence of factor VII is incompatible with life. For example, the factor VII knockout mouse develops normally to term, but dies shortly after birth from major abdominal and intracranial hemorrhage.77 In addition, McVey et al.78 identified a splice site mutation in the factor VII gene that was associated with a complete absence of factor VII and that resulted in fatal intracerebral hemorrhage at the age of 14 days. In contrast, Peyvandi et al.79 have reported a homozygous 2-bp deletion in the factor VII gene that was associated with a complete absence of factor VII and was nonlethal. Although it is likely that some compensatory mechanism is affecting the phenotype in this individual, this mechanism has not yet been identified.

Several mutations have been identified in the promoter region of the factor VII gene; one of the mutations appears to disrupt the Sp1 binding site, whereas the other two interfere with binding of the transcription factor hepatocyte nuclear factor (HNF)-4 or the binding of an unidentified protein near the transcription start site.80,81,82 Additional transcription factors (early growth response [EGR]-1, cAMP response element binding [CREB] protein, and upstream stimulatory factor-1) also bind within the proximal factor VII promoter region.83 A mutation at -39 resulting in reduced plasma levels and activity may disrupt binding of CREB.84

Interest in the relation between elevated factor VII levels and an increased risk of cardiovascular disease was stimulated by the findings of the Northwick Park Heart Study (NPHS I).78,85 However, results from a second prospective study (NPHS II) did not show a clear correlation between elevated factor VII levels and myocardial infarction risk or the inverse correlation with a factor VII lowering haplotype,86 and further case-control studies showed that this polymorphism is associated with a lower risk of peripheral arterial disease.87

Since cardiovascular disease is a multifactorial outcome, it is likely that the relationship between plasma factor VII levels and arterial thrombosis is influenced by additional thrombotic risk factors such as diabetes, high body mass index, and elevated cholesterol levels.


Factor IX

Factor IX is a vitamin K-dependent serine protease zymogen that plays a critical role in the middle phase of blood coagulation. Activated factor IX is the serine protease component of the intrinsic Xase enzymatic complex, which is also composed of factor VIIIa, anionic membranes, and calcium ions (see Chapters 9 and 14). Factor IX is synthesized primarily in the liver and is composed of 461 amino acids. Following proteolytic processing and removal of the preproregion, it circulates in plasma as a single-chain 415-amino acid protein (55 kDa) at a concentration of 90 nM. The factor IX gene was cloned between 1982 and 1984 and lies on the long arm of the X chromosome, band q27.1.88,89,90,91,92 The full gene sequence is known and is approximately 31 kb in length and has eight exons.92 The mRNA is approximately 2.8 kb in length, whereas the cDNA is 1,386 bp long.89,92

Loss-of-function mutations in the factor IX gene result in hemophilia B. The prevalence of this X-linked disorder is 1 in 25,000 men, whereas women are rarely affected. The fivefold difference in prevalence between hemophilia A and B is roughly equivalent to the difference in size of the coding portions of the factor VIII and IX genes. The clinical manifestation of hemophilia B is almost identical to that for hemophilia A. Several mouse models of hemophilia B have proved useful in the development of new therapies.93,94,95
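The size comparison behind this statement can be checked directly from the cDNA lengths listed in Table 10.1. The short Python sketch below is a worked illustration only; the constant and function names are hypothetical, and the comparison says nothing about other factors that influence mutation rates.

```python
F8_CDNA_BP = 7056  # factor VIII cDNA length from Table 10.1
F9_CDNA_BP = 1386  # factor IX cDNA length from Table 10.1


def coding_size_ratio(larger_bp: int, smaller_bp: int) -> float:
    """Ratio of two coding-sequence lengths given in base pairs."""
    return larger_bp / smaller_bp


ratio = coding_size_ratio(F8_CDNA_BP, F9_CDNA_BP)
print(f"factor VIII / factor IX coding size: {ratio:.1f}")  # prints 5.1
```

A ratio of about 5.1 is in line with the roughly fivefold difference in prevalence noted above.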

The first mutation in the factor IX gene, described in 1983,96 was a deletion that showed up in a screening using Southern blotting of genomic DNA. The first point mutation in factor IX, factor IX Chapel Hill, was discovered by protein sequencing, not by DNA analysis.97 At present, over 375 different mutations are known in the factor IX gene, many of which can be found in an international database (http://www.kcl.ac.uk/ip/petergreen/haemBdatabase.html). Some mutations occur independently in patients from diverse geographic regions, mostly mutations at CpG dinucleotides, accounting for approximately 50% of the independent missense mutations found in different families.

The numerous mutations in the 5′-untranslated region of the factor IX gene are associated with a particular form of hemophilia B (hemophilia B Leyden), arguably the most fascinating of all promoter mutations.98,99,100 The unique feature of these variants is that individuals exhibit “recovery” from severe hemophilia with the onset of puberty. The disease is therefore characterized by the absence of factor IX expression in childhood (factor IX levels <2%) and a gradual rise in factor IX levels after the onset of puberty (see Figure 10.1). Studies using transgenic mice that robustly mimic the hemophilia B Leyden phenotype have uncovered the elusive molecular mechanism underlying the puberty-onset spontaneous amelioration of factor IX levels. This process involves a puberty-onset gene switch (age-related stability element/age-related increase element), and growth hormone is directly responsible for the puberty-onset recovery of factor IX production.101

Three factor IX mutations (i.e., Ala10 to Val, Ala10 to Thr, and Asn9 to Lys) are associated with a unique type of genetic predisposition to bleeding during oral anticoagulation therapy that is attributable to increased warfarin sensitivity.102,103,104 The Ala10 and Asn9 residues are within the factor IX propeptide region and are critical for interaction with γ-glutamyl carboxylase and, therefore, γ-carboxylation; mutations at these sites impair this reaction. When receiving coumarins, patients with these variants showed a disproportionate decrease of factor IX activity approaching that of severe hemophilia B. As a consequence, these patients may have bleeding at the very beginning of oral anticoagulation. After discontinuation of treatment, the factor IX levels return to normal.

While mutations in factor IX are generally not associated with thrombosis, elevated levels of factor IX have been associated with venous thrombosis.105 A patient has been described with a missense mutation in factor IX (Arg338 to Leu; factor IX-Padua) that leads to thrombophilia.106 The levels of the mutant protein were normal in the patient, but the coagulant activity was approximately eight times the normal level. To date, this is the only reported gain-of-function mutation resulting in a hyperfunctional factor IX protein predisposing to thrombosis.







FIGURE 10.1 Factor IX levels measured over time in eight individuals with Leyden variant hemophilia B. These mutations in the promoter result in severe factor IX deficiency until puberty, when factor IX levels rise gradually to approximately 30%. (Data from Briet E, Bertina RM, van Tilburg NH, et al. Hemophilia B Leyden: a sex-linked hereditary disorder that improves after puberty. N Engl J Med 1982;306:788-790.)


Factor XI

Factor XI is a serine protease zymogen that contributes to hemostasis by activating factor IX (see Chapter 15A). Factor XI is converted to factor XIa through proteolytic cleavage, either by factor XIIa or by thrombin. A near full-length cDNA encoding factor XI was cloned in 1986, and the structure of the factor XI gene was determined in 1987.107,108 The gene consists of 15 exons spread over 23 kb of DNA. The factor XI gene is on chromosome 4q34-q35, adjacent to the prekallikrein gene.108 Factor XI is synthesized in the liver and circulates in plasma at a concentration of 30 nM.109,110 It is a disulfide-bond-linked dimer of two 80-kDa polypeptides of 608 amino acids each.109
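Plasma concentrations in this chapter are given in molar units, and converting them to mass units is a one-line calculation. The sketch below assumes a molecular mass of 160 kDa for the factor XI dimer (two 80-kDa polypeptides, as stated above) and uses the factor IX values quoted earlier in the chapter (90 nM, 55 kDa); the function name is illustrative.

```python
def nanomolar_to_ug_per_ml(concentration_nm: float, mass_kda: float) -> float:
    """Convert a molar concentration (nM) to a mass concentration (ug/mL).

    nmol/L x kg/mol-scaled mass (kDa = 1,000 g/mol) gives ug/L; dividing by
    1,000 gives ug/mL.
    """
    return concentration_nm * mass_kda / 1000.0


# Factor XI: 30 nM, assumed dimer mass of 2 x 80 kDa = 160 kDa.
factor_xi_ug_per_ml = nanomolar_to_ug_per_ml(30, 2 * 80)

# Factor IX: 90 nM, 55 kDa (values quoted in the Factor IX section).
factor_ix_ug_per_ml = nanomolar_to_ug_per_ml(90, 55)

print(f"factor XI: {factor_xi_ug_per_ml:.2f} ug/mL")
print(f"factor IX: {factor_ix_ug_per_ml:.2f} ug/mL")
```

Both proteins therefore circulate at roughly 5 µg/mL under these assumptions, despite the threefold difference in molar concentration.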

Factor XI deficiency is an autosomal disorder characterized by trauma- or surgery-induced hemorrhage and is only rarely characterized by spontaneous bleeds.111 The disorder is rare in most ethnic groups but is highly prevalent among Ashkenazi Jews,112 with the frequency of mutated factor XI genes being as high as 13%.112,113 Homozygosity for factor XI deficiency is associated with a very mild bleeding tendency, even when factor XI levels are quite low, consistent with factor XI-deficient mice that do not show any increased tendency to bleed.114 A mutation database specific for factor XI can be found at http://www.factorxi.org/.

The recurrence of mutations in Jews is thought to be the result of founder effects.113 This interpretation is supported by the fact that in other ethnic groups the mutations are more diverse. Well over 150 different mutations are known in factor XI, many of which are associated with decreased antigen levels. The mutations vary from splice site defects and amino acid replacements to frameshift mutations.


Procoagulant Cofactors


Tissue Factor

TF is a cell-associated cofactor protein for activated factor VII, and the enzyme complex is considered to be the physiologic trigger of normal hemostasis and thrombotic disease (see Chapter 11).73 The gene for TF resides on the short arm of chromosome 1 (p21-p22), is approximately 12 kb long, and contains six exons.115,116 The size of the predominant TF mRNA in cells is approximately 2.2 kb, with the cDNA being 888 bp long.115,117,118,119 TF is an integral membrane protein of 263 amino acids and is initially synthesized with a 32-amino acid signal peptide. TF is expressed constitutively by epithelial cells and adventitial fibroblasts surrounding blood vessels as well as by cardiomyocytes and astrocytes in the brain and is exposed to the blood following injury.120,121 TF is also present in the circulation, associated with hematopoietic cells and their microparticles.122 Patients with cardiovascular disease, sepsis, and cancer have increased “circulating TF” levels, which potentially contribute to pathologic thrombosis.123

Several studies have identified binding sites for transcription factors that regulate basal and inducible TF gene expression in different cell types. Basal expression appears to be regulated by the Sp1 transcription factor, whereas inducible expression is regulated by c-Fos/c-Jun, c-Rel/p65, and EGR-1.124 Regulation of TF expression by inflammatory mediators and angiogenic factors suggests that it may contribute to both inflammation and angiogenesis.

At present, there are no published missense mutations in the TF gene and no known TF-deficient patients, suggesting that TF deficiency may be lethal. Support for this concept comes from TF knockout mice that die in utero between days 8.5 and 10.5.125,126,127 Promoter polymorphisms in the TF gene, however, have been identified. Six novel promoter polymorphisms have been found that are distributed over two haplotypes with equal frequencies, one of which (TF-603A) has been linked to venous thromboembolic disease, suggesting that it might protect against thrombosis by reducing the level of circulating TF, although no link with myocardial infarction has been found.128 These polymorphisms significantly influence constitutive TF gene expression in human monocytes, but have no major effect on whole blood clotting time.129


Factor VIII

Factor VIII is a metal ion-dependent procofactor that circulates in plasma as a heterodimer of a heavy chain and light chain (see Chapter 12). Thrombin-activated factor VIII is the cofactor protein for factor IXa and plays a key role in the activation of factor X.130 The essential role of factor VIII in blood coagulation is evidenced by the severe bleeding diathesis associated with its deficiency (hemophilia A). The factor VIII gene was originally cloned in 1984; it is approximately 186 kb in length, has 26 exons, and lies on the X chromosome (q28).131,132,133 The gene encodes a mature mRNA of approximately 9 kb, and the cDNA is 7,056 nucleotides long.134,135 Preprofactor VIII has 2,351 amino acids, whereas mature factor VIII has 2,332 amino acids and circulates in plasma at a concentration of approximately 0.7 nM.134,135,136 A mouse model of hemophilia A was described in 1995.137 Accounts describing the discovery of the factor VIII protein and gene have been published.138,139

The first description of mutations in the factor VIII gene dates to 1985, when two nonsense mutations in the coding sequence and two partial deletions of the factor VIII gene were described.140 The finding of four distinctly different mutations in the factor VIII gene of otherwise similar patients was the first hint that a large variety of mutations exist in the factor VIII gene in patients with hemophilia.

There is also an unusually common inversion in the factor VIII gene that is highly prevalent among patients with severe hemophilia A.141,142 This rearrangement was discovered in 1993 and is present in approximately 40% to 50% of all severe hemophilia A cases.141 Because it is so prevalent and relatively easy to detect, its discovery has revolutionized genetic counseling for hemophilia A. The inversion results from an unequal crossover event between a sequence in intron 22 of the factor VIII gene and one of two extragenic copies of this sequence that lie distal on the X chromosome. Inversion of the factor VIII gene by homologous recombination leads to complete disruption of the factor VIII coding sequence and consequently to a severe form of hemophilia A (Figure 10.2). Another factor VIII inversion occurs in intron 1 and breaks the factor VIII gene, resulting in production of two chimeric mRNAs143,144; the prevalence of this inversion in patients with severe hemophilia A is approximately 4% to 5%.144

The molecular pathology of the factor VIII gene is updated and reviewed in an electronic version of the hemophilia A database: http://hadb.org.uk/. As of 2010, more than 1,400 unique mutations in factor VIII had been reported, including some 900 different point mutations. Missense mutations in factor VIII span the entire molecule with a relatively equal distribution, except for the central B-domain (Figure 10.3), consistent with data indicating that this region is not needed for biologic activity.145 The effect of these mutations on gene function is diverse. Severe dysfunction is produced by point mutations that introduce premature stop codons and by mutations that destroy splice junctions between introns and exons. On the other hand, missense mutations, which predict replacement of the normal amino acid by another amino acid, account for most, if not all, of the mild and moderately severe forms of hemophilia A.






FIGURE 10.2 Factor VIII intron 22 inversion. The left panel depicts the normal alignment of the factor VIII gene on the X chromosome. Intron 22 (white box) of the factor VIII gene harbors a copy of the so-called A1 gene, whereas highly homologous A2 and A3 genes are located toward the tip of the X chromosome (indicated as qter). The middle panel shows alignment of the A1 gene with the A3 gene as it may occur during spermatogenesis. When this alignment of A1 with A2 (or alternatively A3) results in a homologous recombination, a rearrangement results as depicted in the right panel. In this rearranged X chromosome, the integrity of the factor VIII gene is completely disrupted because it is cut into two parts that are now oriented in opposite directions.

Gene deletions ranging from <100 bp to several hundred kilobases have been found in many severely affected patients, and gene deletions may be present in up to 5% of such patients.146 Insertions are relatively uncommon in the factor VIII gene. As with deletions, it is important to distinguish in-frame from frameshift insertions. Among the large insertions are the so-called LINE (long interspersed element) retrotransposons.147,148 These DNA elements comprise approximately 5% of the human genome. Intact LINEs encode a protein with reverse transcriptase-like activity and have the capacity to spread through the genome through an RNA intermediate. The LINE first identified in patients with hemophilia A was derived from an element on chromosome 22. Subsequent to the original description in hemophilia A, insertion of LINEs has been found to disrupt various other genes.







FIGURE 10.3 Graphic representation of missense mutations within factor VIII. Mutations in factor VIII leading to hemophilia A are presented in 40-residue blocks spanning the molecule. Mutations were derived from the hemophilia A mutation database (http://hadb.org.uk) with nonsense mutations excluded. (J Coagul Disord 2010;2:19-27, with permission.)


Factor V

Activated factor V is the cofactor protein for the serine protease factor Xa in the prothrombinase complex. The precursor of factor Va is factor V, a large, heavily glycosylated, single-chain protein that is homologous to factor VIII (see Chapter 12).149 The factor V gene is located on the long arm of chromosome 1 (q23), not far from the antithrombin gene; it is approximately 80 kb in length and contains 25 exons.150 The first factor V cDNA sequences were published in 1986 to 1987.151,152,153 The mRNA is approximately 7 kb, whereas the cDNA sequence is 6,675 nucleotides long. Factor V is synthesized in the liver as a 2,224-amino acid protein, is secreted after preproprocessing as a 2,196-amino acid single-chain protein, and circulates in plasma at a concentration of 20 nM.154,155,156 In addition to plasma factor V, approximately 20% of the total factor V in whole blood is contained in the α-granules of platelets.156 Although it was originally thought that megakaryocytes synthesize factor V,157,158 studies in humans indicate that most of the platelet-derived factor V originates from plasma through an endocytotic mechanism.159,160,161,162

Factor V deficiency is a rare recessively inherited disorder associated with a mild-to-severe bleeding tendency caused by loss-of-function mutations in the factor V gene.163 It was first described in the 1940s by Owren164 in Norway. Its incidence is about 1:1,000,000 and there have been more than 200 cases described so far. Because factor V deficiency is so rare and the gene is large, relatively few patients have been characterized genetically, with over 50 mutations reported, including missense mutations, insertions, deletions, and splice-site mutations.
