Evolution of the Immune System

Martin F. Flajnik

Louis Du Pasquier


Defense mechanisms are found in all living things, even bacteria, where they are surprisingly elaborate. Although new adaptive or adaptive-like (somatically generated) immune systems have been discovered in invertebrates and the jawless fish, adaptive immunity based upon immunoglobulin (Ig), T-cell receptors (TCRs), and the major histocompatibility complex (MHC) is only present in jawed vertebrates (gnathostomes). Because of clonal selection of lymphocytes, positive and negative selection in the thymus, MHC-regulated initiation of all adaptive responses, etc., the major elements of the adaptive immune system in gnathostomes are locked in a coevolving unit that arose in concert over a short period of evolutionary time.1,2,3 In addition, a large cast of supporting players, including a large array of cytokines and chemokines, adhesion molecules, costimulatory molecules, and well-defined primary and secondary lymphoid tissues, evolved in the jawed vertebrates as well. This scheme was superimposed onto an innate system inherited from invertebrates, from which many innate molecules and mechanisms were coopted for the initial phase of the adaptive response and others for effector mechanisms at the completion of adaptive responses. Over the last 10 years, we have learned that various components of the innate immune system are also remarkably complex and are likewise locked in a coevolving unit.4,5

In each group of organisms, one can detect a basic set of immune functions, and these are employed in different ways in representative species. For example, in the jawed vertebrates we observe fine tuning, adaptation, or even degeneration of molecules/mechanisms in each group (taxon), and not a steady progression from fish to mammals as is documented for most other physiologic systems.6 Given that all the canonical adaptive immune system features are present in cartilaginous fish and apparently none were lost (except in particular groups of organisms that will be discussed), differential utilization of defense molecules rather than sequential installation of new elements is observed. As this chapter will show, there are only isolated cases of increasing complexity in immune systems, but many examples of contractions/expansions of existing gene families; thus, “more or less of the same” rather than “more and more new features” is the rule. We observe a bush growing from a short stem rather than a tall tree with well-defined branches, and thus deducing the primitive, ancestral traits is not clear cut.


The Common Ancestor Hypothesis

Figure 4.1 displays the extant animal phyla ranging from the single-celled protozoa to the metazoan protostome and deuterostome lineages. It is often suggested or assumed by those unfamiliar with thinking in evolutionary terms that molecules or mechanisms found in living protostomes, like the well-studied arthropod Drosophila, are ancestral to similar molecules/mechanisms in mouse and human. While this is true in some cases, one must realize that Drosophila and humans have taken just as long (over 900 million years) to evolve from a common ancestral triploblastic coelomate (an animal with three germ layers and a mesoderm-lined body cavity, features shared by protostomes and deuterostomes) that looked nothing like a fly or a human, and thus Drosophila is not our ancestor (ie, the manner by which flies and humans utilize certain families of defense molecules may be quite different and both may be disparate from the common ancestor). Thus, understanding of how model invertebrates and vertebrates perform certain immune tasks is an important first step in our understanding of a particular mechanism, but we only deduce what is primordial or derived when we have examined similar immune mechanisms/molecules in species from a wide range of phyla. We will touch upon each of the defense molecule families and will emphasize those which have been conserved evolutionarily and those that have evolved rapidly.

Rapid Evolution of Defense Mechanisms

Immune systems are often compared to the Red Queen in Lewis Carroll’s Through the Looking-Glass (the Red Queen hypothesis7,8), who must continually keep moving just to avoid falling behind. Because of the perpetual conflict with pathogens, the immune system is in constant flux. This is exemplified by great differences in the immune systems of animals even within the same phylogenetic group (eg, mosquito [Anopheles] and fruitfly [Drosophila], both arthropods, or human and mouse, both mammals). In fact, in contrast to what was believed previously, defense mechanisms are extremely diverse throughout the invertebrate phyla, and Kepler et al. have aptly and succinctly described the situation in the title of a recent review: “not homogeneous, not simple, not well understood.”9 In the jawed vertebrates (gnathostomes), the most rapidly evolving system is an innate system, natural killer (NK) cell recognition, which is governed by different classes (superfamilies) of receptors in mice and humans and is extremely plastic even within the same species, as exemplified by the large number of killer immunoglobulin superfamily receptor (KIR) haplotypes found in humans.10 Studies of Ig superfamily (SF) genes expressed in the nervous system and immune system showed definitively that the immune system molecules evolve at a faster rate.11 Finally, rapid evolution of immune system molecules and mechanisms is the general rule, but molecules functioning at different levels of immune defense (recognition, signaling, or effector) can evolve at widely varying rates.

FIG. 4.1. Major Animal Groups and Immune Mechanisms/Molecules Described to Date in Each Group. The first box on the left in each row describes the animal taxon and the approximate number of species in that group. The next box shows specific examples of species or subgroups. The third box lists molecules/mechanisms found in each group: underlined terms indicate somatic changes to antigen receptors or secreted molecules. Figure modified from Hibino et al.48 and Flajnik and Du Pasquier.61 See Table 4.1 for definition of the acronyms.

Conservation of Defense Mechanisms

While the immune system is the most rapidly evolving physiologic system, there is nevertheless also deep conservation of defense families and mechanisms. Klein11a has compared this dichotomy to the two-headed god Janus, the major idea being that certain basic mechanisms/functions are obligatory for immune systems to function, but they still must evolve rapidly to avoid pathogen subterfuge. For example, MHC class I molecules have similar structure/function/features in all gnathostomes, but even within groups of primates class I genes are not orthologous (ie, they can be derived from totally different ancestral class I genes10). So, the idea is to preserve vital immune functions but rapidly modify the gene or pathway to outwit the pathogen. Additionally, certain features are conserved (eg, development and function of conventional αβ T cells), but a second, similar system can be exploited in very different ways in closely related species (eg, the function[s] of γδ T cells). The “Janus paradigm,” therefore, can be quite useful when examining any pathway in the immune system.

Convergent Evolution

Early on in the comparative study of immunity, it was assumed that the same features appearing in different taxa proved that they were present in the common ancestor as well (ie, they were subject to divergent evolution). While this dictum still holds true and establishes one of the dogmas of comparative immunology, later we discovered that, because of the aforementioned rapid evolution of immune systems, convergence of similar functions has also occurred in evolution (ie, the same function or even molecular conformation has arisen independently in different organisms, sometimes in species that are relatively closely related). While we will discuss several cases of convergent evolution throughout the chapter, for frame of reference, the NK cells, which use different receptor families in primates and rodents to achieve precisely the same ends of recognizing polymorphic MHC class I molecules, are a striking example of convergence.12 Additionally, the emergence of a lymphocyte-based somatic generation of two entirely different receptor families for the same function in jawless and jawed vertebrates is another remarkable illustration of convergent evolution.13 Finally, in innate immunity, the cytosolic nucleotide-binding domain leucine-rich repeat (NLR) proteins, despite their striking similarity in deuterostomes and plants, arose (at least) twice in evolution.14

Multigene Families

Genes involved in immunity are often found in clusters, with extensive contraction and expansion via so-called birth and death processes.15 It is well known that such gene clusters can change rapidly over evolutionary time due to unequal crossing over and gene conversion (not only in the immune system, but in any cis-duplicating gene family). Often, families of related immune genes, especially those involved in recognition events, are found near the telomeres of chromosomes, presumably because this further promotes gene-shuffling events. Nonclassical MHC class I loci, NK receptors, and NLRs are conspicuous examples of this phenomenon, again believed to be a consequence of the race against pathogens. We will discuss many examples of how such multigene families have been exploited in different species throughout the chapter.

Gene duplication, either in the clusters mentioned above or as a consequence of en bloc duplications, certainly has been a major feature of immune system diversity and plasticity. The two types of duplications are not equivalent, the former being more taxon-specific and the latter (en bloc) having a lasting impact on the entire system. It is now universally accepted that two genome-wide duplications (the so-called 2R hypothesis2,16; see Fig. 4.13) occurred early in vertebrate history, tracking very well with the emergence of the Ig/TCR/MHC-based adaptive immune system.2,17 This theory forms the basis for much that will be discussed concerning the evolution of the vertebrate adaptive immune response, as one can track the emergence of new immune mechanisms, as well as fine tuning of old ones, by examining the paralogous syntenic groups of genes. Our view is that these genome-wide duplications were as crucial as the “RAG transposon”18,19,20 in the development of the Ig/TCR/MHC-based adaptive immunity. In addition, the common ancestor of bony fish (teleosts) underwent a third round of genome-wide duplication, which many believe to have played the major role in these fishes’ unique outlier status regarding immune system genetics and physiology.2,21
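The arithmetic behind the paralogous groups expected from genome-wide duplication can be sketched in a few lines of Python. This is only an illustrative upper bound under a simplifying assumption (every round doubles every gene, and no copy is lost or tandemly duplicated afterward); the function name `max_paralogs` is hypothetical and not taken from the chapter or its references.

```python
def max_paralogs(wgd_rounds: int) -> int:
    """Upper bound on copies of one ancestral gene after whole-genome duplications."""
    if wgd_rounds < 0:
        raise ValueError("number of duplication rounds cannot be negative")
    # Each round doubles every gene; later gene loss can only lower the count.
    return 2 ** wgd_rounds


# Two rounds (the 2R hypothesis): at most four paralogs per ancestral gene.
print(max_paralogs(2))
# A third round, as in the teleost ancestor: at most eight.
print(max_paralogs(3))
```

In practice most duplicated genes are lost, so observed paralogous groups usually contain fewer members than this bound.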


In addition to gene duplications, polymorphism also augments the diversity of immune recognition within a population. It can be generated any time during the history of a gene family of either receptor or effector molecules: MHC, toll-like receptors (TLRs), Ig, TCRs, NK receptors (NKRs) and related molecules, and antimicrobial peptides (AMPs) are just a few examples. Polymorphism, either within the gene itself or in its regulatory elements, provides populations with flexibility in the face of the changing pathogenic environment. This subject, central to the studies on MHC, leukocyte receptor complex (LRC), and NK cell complex (NKC), is becoming well documented for immunity-related genes in insects as well. In Drosophila, polymorphism in regulatory networks is indeed expected as parasites often target their elements.22 In humans, there are two major NK cell haplotypes found in all subpopulations, which are under “balancing selection.” In such a case, the polymorphisms presumably adopt a division of labor required for the maintenance of the species: one haplotype is believed to be involved in protection from viruses and the other perhaps in promoting reproduction.23

Somatic Generation of Diversity

Somatic modifications can take place at multiple levels to generate immune system diversity. Long believed to be the sole domain of jawed vertebrates, modifications at the deoxyribonucleic acid (DNA) level via somatic hypermutation, gene conversion, and rearrangement (primary and secondary [eg, receptor editing]) irreversibly modify genes within an individual. The well-known V(D)J joining, class switch recombination (CSR), and somatic hypermutation (SHM) are examples of these processes in the IgSF receptors of jawed vertebrates, but modifications to genomic DNA can also occur in the jawless fish and some invertebrates.7,13 The list of organisms undergoing such diversification of germline immune genes will only grow as more organisms are examined and more genome and expressed sequence tag (EST) sequencing projects are undertaken (see Fig. 4.1).

Alternative splicing can be a source of tremendous diversity in some gene families encoding receptors involved in immunity in insects and crustaceans. The Down syndrome cell adhesion molecule (DSCAM) gene in several arthropods (described in detail in the following) was shown to generate enormous diversity via ribonucleic acid (RNA) processing.7,24 In the vertebrates, this mechanism is important in determining the function of different molecules, best known for the Igs (transmembrane [TM] versus secreted forms, as well as inflammatory versus neutralizing forms in nonmammalian vertebrates). Further diversity can be obtained by the assembly of multichain receptors in which different components are combined. The classical example in the jawed vertebrate adaptive immune system is that of the Ig light (L) and heavy (H) chains of antibodies, but similar combinations can occur with insect peptidoglycan-recognition proteins (PGRPs), vertebrate TLRs, and many others.
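Both mechanisms just described are combinatorial, and a short Python sketch makes the arithmetic explicit. It assumes that one alternative is chosen independently from each cluster of alternative exons and that any chain of one type can pair with any chain of the other; real systems may not satisfy either assumption. The function names and all numbers are hypothetical and chosen only for illustration.

```python
from math import prod


def splice_isoforms(alternatives_per_cluster: list[int]) -> int:
    """Isoform count when one exon is picked independently from each cluster."""
    return prod(alternatives_per_cluster)


def chain_pairings(n_first_chain: int, n_second_chain: int) -> int:
    """Receptor count when every first chain can pair with every second chain."""
    return n_first_chain * n_second_chain


# Hypothetical gene with three clusters of 3, 4, and 2 alternative exons.
print(splice_isoforms([3, 4, 2]))  # 3 * 4 * 2 = 24 isoforms

# Hypothetical repertoire of 5 heavy-type and 4 light-type chains.
print(chain_pairings(5, 4))  # 5 * 4 = 20 pairings
```

Because the counts multiply rather than add, modest numbers of alternative exons or chains are enough to yield large repertoires.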

The study of the evolution of immunity has resulted in a fundamental appreciation of the heart of immunity, both innate and adaptive. Especially now, with studies in many plants and in both invertebrate and vertebrate animals, we can see what features have been conserved and when they arose. “Simple” genetic models such as Drosophila and Caenorhabditis elegans provide a glimpse into these elemental mechanisms and also allow us to remove the clouds that surround studies of mouse and human, with so many interconnected pathways. As mentioned, examination of the well-studied mammalian models in combination with studies of invertebrates allows us to deduce the condition of the common ancestor. Interestingly, major pathways of defense known to all immunologists, such as the TLR, JAK-signal transducer and activator of transcription (STAT), NOTCH, and tumor necrosis factor (TNF) pathways, clearly arose in an early animal ancestor and have been perpetuated in derived fashions in all major taxa. We shall see how these pathways are manipulated in different animals, always drawing upon the best-known mammalian model as a foundation (whenever possible).


Defense molecules can be composed of a very large number of protein folds, some of which are clearly used to a large extent.25,26,27 Some of the most common families (Fig. 4.2) are IgSF, leucine-rich repeats (LRRs), C-type lectins, and the TNF family, along with certain other domains used in immune recognition (eg, scavenger receptor cysteine-rich [SRCR]). All of the domains discussed in this chapter are listed in Table 4.1 and a few are displayed in Figure 4.2. By way of introduction, two of these families, which constitute the “top two” quantitatively, are described in the following.

FIG. 4.2. Major Molecular Families Described in the Text and Representatives of Each Family: Leucine-Rich Repeats (LRRs), Immunoglobulin Superfamily (IgSF), Peptidoglycan-Recognition Protein (PGRP), β1-3 Glucan Recognition Protein (β1-3GRP), and C-Type Lectins. Representative structures are shown above for LRR (toll-like receptor, nucleotide-binding domain leucine-rich repeat [NLR], variable lymphocyte receptor), IgSF, PGRP, β1-3GRP, and C-type lectin. Other acronyms are defined in Table 4.1. For the NLR model, the echinoderms have N-terminal death domains, whereas all other animals have caspase-recruitment domains. Figure modified from Hibino et al.48

Leucine-Rich Repeats

LRRs consist of 2 to 45 motifs of 20 to 30 amino acids in length (XLXXLXLXXNXHXXHXXXXFXXLX) that fold into an arc shape (see Fig. 4.2).28 Both the concave and convex parts of the domain have been shown to interact with ligands. Molecular modeling suggests that the conserved pattern LxxLxL is sufficient to impart the characteristic horseshoe curvature to proteins with 20- to 30-residue repeats. LRRs are often flanked by cysteine-rich domains. LRRs occur in proteins of organisms ranging from viruses to eukaryotes and are found most famously in the toll/TLRs, as well as in tyrosine kinase receptors, cell-adhesion molecules, resistance (R) factors in plants (found at the cell surface and in the cytosol), and extracellular matrix (ECM)-binding glycoproteins (eg, peroxidasin). They are involved in a variety of protein-protein interactions underlying signal transduction, cell adhesion, DNA repair, recombination, transcription, RNA processing, disease resistance, apoptosis, and the immune response. LRR-containing proteins can be associated with a variety of other domains, whether they are extracellular (LRR associated with IgSF or fibronectin [FN] type III) or intracellular (CATERPILLER family, in which LRRs are associated with a variety of effector domains; see subsequent discussion). In these chimeric molecules, the LRR moiety is involved in recognition, most likely due to its extraordinarily malleable structure. There are at least six families of LRR proteins, characterized by different lengths and consensus sequences of the repeats.29 Repeats from different LRR subfamilies never occur together in the same protein and have most probably evolved independently in different organisms.
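To make the LxxLxL core of the consensus concrete, the following Python sketch scans a protein sequence for that pattern with a regular expression. The function name `find_lrr_cores` and the toy sequence are hypothetical, and a bare pattern match of this kind is far looser than the profile-based methods used to annotate real LRR proteins; it only illustrates what the consensus means at the sequence level.

```python
import re

# LxxLxL core of the LRR consensus: leucine at positions 1, 4, and 6.
# Wrapping the pattern in a lookahead lets overlapping cores be reported.
LRR_CORE = re.compile(r"(?=(L..L.L))")


def find_lrr_cores(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched hexamer) for every LxxLxL core."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in LRR_CORE.finditer(seq)]


# Toy sequence made up for illustration; it is not a real protein.
toy = "MKLSNLQLSNNAAALTELDLSG"
for start, core in find_lrr_cores(toy):
    print(start, core)
```

A candidate LRR protein would be expected to show such cores recurring at roughly 20- to 30-residue intervals, which a fuller analysis would check before calling a repeat.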

TABLE 4.1 Molecules and Abbreviations Found Throughout the Text

| Acronym/Defense Molecule | Full Name | Function |
|  | Activation-induced cytidine deaminase | SHM/gene conversion/CSR |
|  | Apolipoprotein B mRNA editing enzyme catalytic polypeptide | Innate immunity (antiviral) |
|  |  | Intraembryonic origin of hematopoietic cells |
|  | Antimicrobial peptide | Innate immunity (eg, defensins) |
|  | Agnathan paired antigen receptor | Similarities to Ig/TCR and NKRs |
|  | Avirulence protein | Pathogen effector recognized by plant NLR |
|  | Factor B | Enzyme of C′ cascade |
|  | Beta 1-3 glucan-recognizing protein | Binds to gram-negative bacteria |
|  |  | Innate/adaptive immunity |
|  | Caspase-recruitment domain | Domain in intracellular defense molecules |
|  | CARD, transcription enhancer, R(purine)-binding, pyrin, lots of leucine repeats |  |
|  | Complementarity-determining region | Portion of Ig/TCR that binds to antigen |
|  | Class switch recombination | Adaptive humoral immunity modification |
|  | Death domain | Cytosolic interacting domain |
|  | Down syndrome cell adhesion molecule | Insect immune (adaptive?) defense and neuron specification |
|  | Extracellular matrix |  |
|  | Effector-triggered immunity | Immunity in plants triggered by NLR |
|  | F box-associated domain | Intracellular domain |
|  | Fc receptor neonatal | MHC-like FcR |
|  | Fibronectin type III repeat | Domain found in many innate molecules |
|  | Fibrinogen-related protein | Mollusk (adaptive?) defense |
|  | Fusion histocompatibility | Histocompatibility locus in tunicates |
|  | Gut-associated lymphoid tissue |  |
| GPI |  | Lipid linkage to cell membrane (eg, VLR) |
| Hemolysin |  | Cell lysis |
|  | Interleukin-converting enzyme | IL-1β processing |
|  |  | Adaptive immunity |
|  | Immunoglobulin superfamily | Innate/adaptive immunity |
|  |  | Innate (type I)/adaptive (type II) immunity |
|  | Immune deficiency | Insect innate defense |
|  | Interferon regulatory factor | Innate (transcription factor) |
|  | Immunity-related GTPases | Innate immunity |
|  | Immunoreceptor tyrosine-based activation motif | Signaling motif for NK and antigen receptors |
|  | Immunoreceptor tyrosine-based inhibitory motif | Signaling motif for NK and antigen receptors |
|  | Janus kinase | Signaling molecule associated with cytokine receptors |
|  | Killer IgSF receptor | NK cell receptor |
|  | For example, galectin, C-type, S-type | Many (eg, NKRs, selectins) |
|  | Leukocyte immune-type receptors | Fish NK-like receptors of the IgSF |
|  | Low-molecular-weight protein | Proteasome subunit |
|  | Leukocyte receptor complex | Gene complex containing KIR and many IgSF molecules |
|  | Leucine-rich repeat | Innate/adaptive immunity module |
|  | Membrane-attack complex | C′, pore-forming |
|  | MAC-perforin domain | Potential pore former |
|  | MBP-associated serine protease | Lectin C′ pathway |
| MBP (or MBL) | Mannose-binding protein (lectin) | Lectin C′ pathway |
|  | Mollusk defense molecule | IgSF defense molecule |
|  | Major histocompatibility complex | T-cell recognition; innate immunity |
|  | Macrophage inhibitory factor | Innate immunity; inflammation |
| MyD88 (also dMyD88) | (Drosophila) Myeloid differentiation primary response gene 88 | TLR adaptor |
|  | Novel immune-type receptors | Teleost fish NK-like receptors of the IgSF |
| NK cell | Natural killer cell | Vertebrate innate cellular immunity |
|  | Natural killer cell complex | Gene complex with many C-type lectin genes (especially NK cells) |
|  | Natural killer cell receptor | Receptor on NK cells |
|  | NACHT leucine-rich repeat and PYD-containing protein | Intracellular PRR |
|  | Nucleotide-binding domain LRR | Motif of intracellular defense molecules |
|  | Nuclear factor-κB (Rel homology domain) | Evolutionarily conserved transcription factor |
|  | NACHT leucine-rich repeat protein | Intracellular PRR |
|  | Nucleotide oligomerization domain protein | Intracellular PRR |
|  | Nitric oxide synthase | Intracellular killing innate defense molecule |
|  | Pathogen-associated molecular pattern | Conserved target epitopes on pathogens |
|  | Programmed cell death | Many pathways |
|  |  | Defense molecule in shrimp |
|  | Peptidoglycan-recognition protein | Gram-positive bacteria defense family; receptor and effector |
|  | Propolyphenol oxidase | Plant/invertebrate defense (melanization) |
|  | Pattern-recognition receptor | Recognize PAMP, innate/adaptive immunity |
|  | Proteasome subunit beta subunit | Proteolytic member of 20S proteasome |
|  | DNA polymerase µ | Error-prone polymerase (related to TdT) |
|  |  | Invertebrate defense molecule |
|  | Pyrin domain | Domain in intracellular defense molecules |
|  | Recombination-activating gene | Ig/TCR rearrangement |
|  | Restriction fragment polymorphism-Y | Chicken nonclassical MHC gene cluster |
|  | Regulatory factor X | Transcription factor, class I regulation |
|  | Retinoic acid-inducible gene | Intracellular double-stranded RNA recognition |
|  | Recombination signal sequence | DNA element next to Ig/TCR gene segments necessary for RAG-mediated rearrangement |
|  | Retinoid X receptor | Transcription factor encoded in MHC |
|  | Somatic hypermutation | Adaptive humoral immunity |
|  | Spaetzle-processing enzyme | Insect defense molecule in toll cascade |
|  | Scavenger receptor cysteine-rich | Innate immunity recognition molecule |
|  | TGF-β activated kinase | Ubiquitin-dependent kinase of innate pathways |
| TAP (and TAP-L) | Transporter associated with antigen processing | Transports peptides from cytosol to ER lumen |
|  | TAP-binding protein | Tethers TAP to class I |
|  | T-cell receptor | Adaptive defense |
|  | Terminal deoxynucleotidyl transferase | Involved in Ig/TCR rearrangement |
|  | Thioester-containing protein | Opsonization (like C3) |
|  | Transforming growth factor | Immunosuppressive cytokine |
|  | Tumor necrosis factor |  |
|  |  | Protostome cytokine induced by viral infection |
|  | Toll-like receptor | Innate receptor on the cell surface or in endosomes |
|  | Tumor necrosis factor | Proinflammatory cytokine (and family) |
|  | Tripartite motif-containing proteins | Large family of cytosolic innate defense molecules |
| V-, C1-, C2-, I- | Variable, constant 1 and 2, intermediate IgSF domain | IgSF domain types |
|  | Guanine exchange factor, the “onc F” proto-oncogene | Encoded in MHC, involved in adaptive signaling pathways |
|  | Variable domain chitin binding | Amphioxus defense molecule |
|  | Variable lymphocyte receptor | Agnathan adaptive defense molecule |
|  |  | Plant transcription factor used to upregulate defense genes (analog of NF-κB) |
|  | Xenopus MHC-linked IgSF V region | Xenopus MHC-linked NKR-like genes |
|  | Xenopus nonclassical | Xenopus class Ib cluster |
|  | Sea urchin defense molecule | (Adaptive?) Defense |

DNA, deoxyribonucleic acid; ER, endoplasmic reticulum; IL, interleukin; mRNA, messenger ribonucleic acid; RNA, ribonucleic acid.

LRR-containing proteins are involved in immunity from plants to animals. Their functions in the immune system range from control of motility of hemocytes and lymphocytes30 to specific recognition of antigens via a novel system of gene rearrangement (the variable lymphocyte receptors [VLR] described in the following; see Fig. 4.9). LRRs can occur in soluble forms, in the ECM, in the cytosol, or as TM forms, either integral membrane proteins or glycophosphatidylinositol (GPI)-anchored. The bottom line is that because of its basic structure and malleability, the LRR module was locked in early in evolution as an ideal motif for recognition of essentially any ligand.

Immunoglobulin Superfamily

IgSF domains are encountered in a very large number of molecules in the animal kingdom (see Fig. 4.2).31 They are found intracellularly (eg, connectin) or as cell adhesion molecules, many of which are in the nervous system (eg, the neural cell adhesion molecule, NCAM), coreceptors and costimulatory molecules of the immune system (eg, cluster of differentiation [CD]79, CD80), molecules involved in antigen presentation to lymphocytes (eg, class I molecules), certain classes of cytokine receptors (eg, interleukin [IL]-1R), and of course Ig (and TCR), where they were first characterized and from which they take their name. They can be associated with other domains such as FN (eg, titin and FREP) and LRRs,32 or they can be the sole constitutive elements of the polypeptide chain, often associated with a transmembrane segment and a cytoplasmic tail (or GPI-linked). The β barrel IgSF structure was adopted independently in other families such as cadherins, calycins, lipocalin, etc., and the super (or über) family has hundreds of members and has been selected for several different functions. These functions are related in that almost all involve protein-ligand interactions. The vertebrate lymphocyte surface can express 30 different IgSF members simultaneously.

IgSF domains are commonly classified according to the different constitution of their β strands and loops.31,33 All conform to the stable shape of a β barrel consisting of two interfacing β sheets, usually linked by a disulfide bridge. There are three types of domains: variable (V), and two types of constant (C1 and C2); the so-called I set domain is intermediate between the C1 and C2. The V domain is most complex with more strands (C′ and C″), which make up complementarity determining region (CDR)2 in conventional Igs and TCRs. C1 domains lack these strands entirely, and C2/I domains have varying sizes in the C′/C″ region. V domains, either alone (eg, the new antigen receptor [NAR]) or in association with another V domain (eg, Ig H/L), recognize the antigenic epitope and are therefore the most important elements for recognition. Domains with the typical V fold, whether belonging to the true V-set or the I-set, have been found from sponges to insects (eg, amalgam, lachesin, and fasciclin) and even in bacteria. The mollusk fibrinogen-related proteins (FREPs, described in the following) have one or two V-like domains at their distal end, associated with a fibrinogen-like domain. For V domains, the interface between dimers is the β sheet bearing the C, C′, C″, F, and G strands, so that in Igs the CDR3s are in the center of the binding site; for C domains, the other β sheet, bearing the A, B, D, and E strands, forms the interface.

The binding capacities of V domains in molecules besides Ig/TCR can reside in different areas of the molecule (interstrand loops, A-A′ strand, F strand), while in Ig/TCR, CD8, and certain NKR, these regions are the targets for variation in shape and charge. The binding capacities can be modulated whether one domain acts as a single receptor unit (eg, IgNAR) or whether it is associated with a contiguous domain (eg, KIR, variable domain chitin-binding protein [VCBP]) or with another polypeptide chain (eg, TCR, Ig). In the case of a dimer, the binding capacity can again be modulated by the presence or the absence in the G strand of a diglycine bulge, which can modify the space between the faces of the Ig domain. In several cases, the sites responsible for binding are known (KIR, Ig, TCR); in many other cases, they are not known but inferred from crystal structures and/or variability plots (leukocyte immune-type receptor [LITR], chicken Ig-like receptor [CHIR], triggering receptor expressed on myeloid cells [TREMs], DSCAM, hemolin).


Invertebrate Cell Types

Examples of conservation of the fundamental mechanisms of genetic control of developmental pathways between protostomes and deuterostomes, even in the absence of homology of the cells or organs considered, are accumulating: the organization and expression of the homeotic gene clusters, and eye formation through the function of a complex of proteins including Pax-6.34 Aside from direct interactions at the external layer of cells on the skin or external teguments, the cell types involved are specialized cells of mesodermal origin devoted to defense. This is true for all coelomates where effector cells have been identified, but recent data have shown that cnidarian diploblastic organisms that lack mesoderm also have many of the same genetic systems as the coelomates35 (see Fig. 4.1). The cells can be circulating or sessile, and often are found associated with the gut. Several morphologically distinct hemocyte types in insects cooperate in immune responses: they attach to invading organisms and isolate them, trapping larger organisms in nodules or forming large multicellular capsules around them. Indirect evidence for the role of hemocytes in immune responses can be derived by contrasting properties of such cells in healthy and parasitized animals (ie, modifications in adherence and opsonic activity).

All animals show heterogeneity of the free circulating cells, generically called hemocytes (arthropods), coelomocytes, amebocytes (annelids, mollusks, and echinoderms), or leukocytes (sipunculids). However, the repertoire of insect “blood cells” is clearly less heterogeneous than that of vertebrates. Basically, three or four types of cell lineages can be identified in Drosophila (Fig. 4.3)36: plasmatocytes, crystal cells, and lamellocytes, and an equivalent number in Lepidoptera (butterflies). The functional roles they play consist of immune defense, disposing of apoptotic and other debris, contributing to the ECM, and modeling of the nervous system. The immunity role encompasses phagocytosis, encapsulation, and sometimes production of effector molecules (see Fig. 4.3).36 These roles all require recognition of pathogen-associated molecular patterns (PAMPs) or self-derived defense molecules (ie, opsonization) at the cell surface.37,38,39,40

FIG. 4.3. Types of Immune Responses and Cells in Insects, with Drosophila as the Prototype. Secreted defense molecules are made by fat body cells in response to pathogens, which either act as direct effector molecules or feed back on hemocytes to stimulate their defense functions. Unpaired (UPD) is produced after virus infection, stimulating defense molecule upregulation via the JAK/signal transducer and activator of transcription pathway. Not shown is the RNAi pathway, also induced upon virus infection. Details on stimulation of the toll and immune deficiency pathways are found in Figure 4.4. This figure was modified from Lemaitre and Hoffmann.36

Only in a few organisms has the characterization of hemocyte lineages gone beyond morphologic or basic physiologic functions. Among these free circulating cells are always one or more types that can carry out phagocytosis. Different cells participate in encapsulation, pinocytosis, and nodule formation, and can upon stimulation produce a great variety (within an individual and among species) of soluble effector molecules that may eliminate the pathogen. In an attempt to integrate all of the data available in invertebrates, Hartenstein has proposed a unified nomenclature of four basic types: prohemocytes, hyaline hemocytes (plasmatocytes or monocytes), granular hemocytes (granulocytes), and eleocytes (chloragocytes).37 These designations will be found in the following description of the blood cell types.

Earthworm (annelid) coelom-tropic coelomocytes are called eleocytes. They contain glycogen and lipid and are considered of the same lineage as the chloragocytes involved in the production of immune effector molecules such as fetidin or lysenin. The phagocytic cells of annelids are apparently granular “leukocytes” derived from the somatopleura and involved in wound healing, whereas the ones derived from the splanchnopleura participate in immunity. Heterogeneity of annelid coelomocytes is not encountered in primitive oligochaetes or in Hirudinea (leeches). Phagocytic coelomocytes show acid phosphatase and β-glucuronidase activities.41 The large coelomocytes and free chloragocytes (eleocytes) in the typhlosole of Eisenia foetida appear to produce the bacteriolytic and cytolytic factor lysenin.42 From electron microscopy studies, macrophage-like cells seem to be involved in graft rejection. In the closely related sipunculid phylum, two main cell types can be identified in the blood: erythrocytes (a rare occurrence in invertebrates) and granular leukocytes. The latter are capable of cytotoxicity and even have dense granules reminiscent of vertebrate “NK cells.”43

Two developmental series have been described in mollusks, the hyaline and granular cells, but cephalopods seem to have only one lineage. They participate in encapsulation, with hemocytes adhering around the foreign body like Drosophila lamellocytes. Phagocytosis is carried out by the wandering granular cells, which resemble vertebrate monocytes/macrophages. In oysters, electron microscopy revealed different types of circulating hemocytes, including granular hemocytes resembling the granular leukocytes of sipunculids mentioned previously.44,45 In crustaceans, the situation is similar to that in mollusks, with three main populations identified based again on the presence of granules in the cytoplasm.
The hyaline cells are involved in the clotting process and the granular cells in phagocytosis, encapsulation, and the prophenoloxidase (PPO) pathway. The hematopoietic organ is located on the dorsal and dorsolateral regions of the stomach.39 Crustacean hemocytes can now be cultured and their response to virus can be examined,46 and markers of the three hematopoietic lineages are available.47

In insects, the so-called prohemocytes are believed to be stem cells. They are only found in the embryonic head mesoderm and the larval lymph glands but not in the hemolymph. However, prohemocytes are frequent in both the hemolymph and hematopoietic organs of the lepidopteran Bombyx (silkmoth). Plasmatocytes of Drosophila have a phagocytic function. This type of hemocyte is equivalent to the granulocytes of Bombyx, which play a key role in phagocytosis in larvae. Lamellocytes seem to be unique to Drosophila, but they are probably the equivalent of the lepidopteran plasmacytoid cells. Their precursors reside in the larval lymph gland, where they differentiate in response to macroscopic pathogens, following a brief phase of mitosis linked to the presence of the pathogens and under hormonal control via ecdysone. The transcription factors (GATA, Friend-of-GATA, and Runx family proteins) and signal transduction pathways (toll/NF-κB, Serrate/Notch, and JAK/STAT) that are required for specification and proliferation of blood cells during normal hematopoiesis, as well as during hematopoietic proliferation that accompanies immune challenge, have been conserved throughout evolution. The specific differentiation of lamellocytes requires the transcription factor Collier. The mammalian early B-cell factor, an ortholog of Collier, is involved in B-cell differentiation in mice. The Drosophila crystal cells are responsible for melanization through the PPO system (see subsequent discussion). In silkworm oenocytoids, crystal-like inclusions are also found, but they disappear later after bleeding.36,37,40

Echinoderm coelomocytes express a diversity of effector functions, but no studies of lineages have been performed. In echinoderms, the number of different coelomocytes may vary according to the particular family. The sea urchin is endowed with at least four cell types, only one of which is phagocytic and corresponds to the bladder or filiform forms. Another type is described as the round vibrating cell involved in clotting. Pigment cells (red spherule cells) have been detected ingesting bacteria; the morphology of phagocytic cells can vary enormously, precluding any easy classification.48

In tunicates, amoeboid cells circulate in the blood and are involved in a large number of processes, such as clotting, excretion, nutrition, budding, and immunity. Large numbers of blood cells are present (average of 10^7 per mL) in the blood of ascidians such as Ciona. Hemoblasts are considered to be undifferentiated cells, perhaps the equivalent of the prohemocytes of arthropods or the neoblasts of annelids. Blood cells in ascidians proliferate in the connective tissue next to the atrium. The pharyngeal hematopoietic nodule of this animal contains a large number of hyaline and granular cells called “leukocytes” with supposed intermediary forms of differentiation between blast and granular mature types. The granular form is likely to be involved in postphagocytic activity, as in earthworms.49 Adoptive transfer of alloimmunity in the solitary tunicate Styela can be achieved via lymphocyte-like cells.

In Amphioxus, cells with phagocytic capacity have been identified in the coelom with a morphology resembling more the phagocytic echinoderm cells than urochordate blood cells, a fact that is consistent with the new systematic positions of amphioxus and echinoderms.50 Both free cells and the lining of the perivisceral coelom are able to phagocytose bacteria. Cells with the morphologic appearance of lymphocytes and expression of lymphocyte-specific genes were detected in this species, the earliest identification of such cells in phylogeny.51

Hematopoiesis in the Invertebrates

The history of the hemocytes is associated with that of the mesoderm among triploblastic organisms. The bilaterian ancestor was most likely a small acoelomate or pseudocoelomate worm similar to extant platyhelminths (flatworms) (see Fig. 4.1). A specialized vascular system or respiratory system was probably lacking, although cells specialized for transport and excretion were likely present because they exist in most extant bilaterian phyla. One can further assume that groups of mesoderm cells in the bilaterian ancestor could have formed epithelial structures lining internal tubules or cavities (splanchnopleura). In coelomates, the mesoderm transforms into an epithelial sac, the walls of which attach to the ectoderm (somatopleura) and the inner organs (splanchnopleura). Blood vessels are formed by tubular clefts bounded by the splanchnopleura. Excretory nephrocytes are integrated into those vascular walls, which also give rise to blood cells circulating within the blood vessels (the pronephros of anurans and head kidney of teleost fish are important hematopoietic organs in vertebrates). Thus, further evolutionary changes separated the three systems, but there was a close original connection between them.

The origin of hemocytes has been investigated mainly in arthropods. When examining principles that govern hematopoietic pathways, similarities have been observed with vertebrates, raising interesting evolutionary issues.37,40 In jawed vertebrates, the yolk sac or its equivalent gives rise to blood precursors that are primarily erythroid in nature (but see the following: recent data suggest that B1 cells and macrophages are also derived from this embryonic tissue). In succession, definitive hematopoiesis occurs in the aorta/gonad/mesonephros (AGM) region of the embryo, encompassing all of the different cell types and multipotent progenitors (although this is controversial). As in the vertebrates, hematopoiesis in insects is biphasic. One phase occurs in the embryo and the other during larval development. Additionally, these waves occur in distinct locations: the embryonic head mesoderm and the larval lymph gland. In the early embryo, expression of the GATA factor serpent (Srp) can be detected in the head mesoderm. The GATA family of zinc-finger transcription factors is conserved from yeast to vertebrates, where its members are involved in various aspects of hematopoiesis. Blood cell formation in the head follows Srp expression, whereas in the lymph gland there is a long delay between Srp expression and the appearance of the lymph gland-derived hemocytes.38 Hematopoiesis in the head mesoderm and yolk sac may be related evolutionarily. A further similarity occurs at the AGM/lymph gland level in Drosophila. The lymph gland develops from a part of lateral mesoderm that also gives rise to vascular and excretory cells, much like the vertebrate AGM. The conserved relationship between blood precursors and vascular and excretory systems is intriguing.

Hematopoiesis and Transcription Factors in the Vertebrates

As mentioned previously, transcription factors of the family PAX 2/5/8; GATA 1, 2, 3; ets/erg; and runt domain-containing factors have been cloned in several invertebrates. One plausible model to explain the genesis of true lymphocytes in vertebrates is that closely related members of transcription factor families are the result of a relatively late divergence in lineage pathways followed by specialization of duplicated genes.52 These duplications could be those that apparently occurred during the history of chordates (see MHC and “Origins” section2). Within deuterostomes, the generation of true GATA 2 and 3 probably occurred after echinoderms diverged from the chordate branch and the GATA, ets, early B-cell factor, and Pax5-dependent pathways of T-/B-cell differentiation are thus specific to vertebrates. It is already known that lampreys express a member of the purine box 1/spleen focus-forming virus integration-B gene family that is critically and specifically involved in jawed vertebrate lymphocyte differentiation. Expression has been detected in the gut, which may be related to the fundamental nature of “gut-associated lymphoid tissue (GALT)” as a lymphoid cell-producing organ.

In vertebrates, the generation of T-, B-, and NK lymphocyte lineages from pluripotent hematopoietic stem cells depends on the early and tissue-specific expression of Ikaros (and related loci), which by means of alternative splicing produces a variety of zinc-finger DNA-binding transcription factors. The orthologs of Ikaros, Aiolos, Helios, and Eos have been identified in the skate Raja eglanteria, where two of the four Ikaros family members are expressed in their specialized hematopoietic tissues (epigonal and Leydig’s organs; see subsequent discussion) like in mammals.52 In lower deuterostomes, single genes that seem to be related to the ancestor of the Ikaros and Ets family of transcription factors exist, further suggesting that the division of labor between the family members in the jawed vertebrates was a result of en bloc duplications.52,53 The conservation of Ikaros structure and expression reinforces its role as a master switch of hematopoiesis. We discuss this topic further in the lymphoid tissues section.

Responses of Hemocytes

In this section, we simply touch on classical and specific responses in the invertebrates; responses that are more universal are found in the innate immunity section. Proliferation of hemocytes upon stimulation is an unresolved issue in the invertebrates; clearly, clonal selection resulting in extensive proliferation is not the rule. The turnover of cell populations has been the object of numerous, often unconvincing experiments. Still, new data have emerged, and it is clear that in several invertebrates, proliferation occurs in certain cell types following encounters with pathogens. Very little cell proliferation occurs in the circulation of crayfish, but cells in the hematopoietic tissue divide after an injection of the PAMP β1-3-glucan. New cells in the circulation developed into functional semigranular cells (SGCs) and granular cells (GCs) expressing the prophenoloxidase (proPO) transcript. RUNT protein expression was upregulated prior to release of hemocytes. In contrast, proPO was expressed in these cells only after their release into the circulation.54

In contrast to the study of transcription factors that regulate hematopoiesis, relatively little is known about cytokines that drive hematopoiesis among invertebrates. It was reported that differentiation and growth of hematopoietic stem cells in vitro from crayfish required the factor astakine, which contains a prokineticin domain55; prokineticins are involved in vertebrate hematopoiesis, another case of conservation during the evolution of growth factors and blood cell development.

Parasitization of Drosophila by the wasp Leptopilina boulardi leads to an increase in the number of both lamellocytes and crystal cells in the Drosophila larval lymph gland. This is partially due to a limited burst of mitosis, suggesting that both cell division and differentiation of lymph gland hemocytes are required for encapsulation. In genetic backgrounds where ecdysone levels are low (ecdysoneless), the encapsulation response is compromised and mitotic amplification is absent. This ecdysone-dependent regulation of hematopoiesis is similar to the role of mammalian steroid hormones such as glucocorticoids that regulate transcription and influence proliferation and differentiation of hematopoietic cells.56


Phagocytosis at the site of microorganism invasion requires recruitment of cells via chemoattraction. In vertebrates, this can be done by several categories of molecules such as proinflammatory chemokines/cytokines or the complement fragments C3a and C5a (as mentioned in the following section, C3a fragments as we know from mammals may be found in tunicates but not other nonvertebrates; yet, C3 may be cleaved in different ways in the invertebrates). C3b, mannose-binding lectin (MBL), and many other lectins can function as opsonins, and recent studies of the PGRPs, thioester-containing proteins (TEPs), DSCAMs, and eater have added to this repertoire.36 Ingestion follows phagocytosis, and then killing occurs by an oxidative mechanism with the production of reactive oxygen radicals and nitric oxide. These mechanisms are conserved in phylogeny, and other basic mechanisms are being examined in more detail now in protozoan models.57 Signaling pathways in common between vertebrates and the protozoon Dictyostelium include involvement of cyclic AMP, integrins, and perhaps mitogen-activated protein (MAP) kinase cascades. Unique to all jawed vertebrates studied to date, the activation of phagocytes also leads to upregulation of the antigen processing machinery, costimulatory molecules, and proinflammatory cytokines that can enhance adaptive immunity.


Immune responses are often subdivided into recognition, signaling, and effector phases, which are subjected to different pressures, defined by whether orthology is maintained and the relative divergence rates of the genes responsible for the various phases. Recognition molecules are from evolutionarily conserved families, but as described previously, their genes are subjected to rapid duplication/deletion so that orthology is rarely preserved. By contrast, signaling pathways can be conserved (see Fig. 4.5), despite the fact that the genes are often divergent in sequence. Effector molecules can either be extremely conserved (eg, reactive oxygen intermediates) or extremely divergent to the point of being species-specific (eg, antimicrobial peptides [AMPs]). Here, we break the immune response down into these three phases, beginning with the recognition phase.

Initiation of an immune reaction can theoretically involve either the recognition of nonself, altered self, or the absence of self. Nonself-recognition can take place with receptors (pattern recognition receptors [PRRs]) that detect PAMPs, which were originally defined by Janeway and colleagues as evolutionarily conserved epitopes displayed by pathogens but not host cells.58,59 The second mode, altered self, is typified by molecules that are induced in self-cells during infections and recognized by conserved defense molecules, similar to the SOS systems mentioned in the MHC section, or by peptide presentation on MHC molecules. A third mechanism, “am I still myself,” depends upon recognition of self-tags and their changes in expression60 (eg, NK recognition of self-MHC molecules through KIR and C-type lectins). These latter two mechanisms have not been described in the invertebrates for immune defense against pathogens, but it would not be surprising if they were revealed in the future, considering the new features of invertebrate immune systems that have been discovered recently and the usage of this mode of recognition in many invertebrate histocompatibility systems.

Whether the invader is related to its host (cells from individuals of the same species or cells from a parasitoid) or is very distant from the host (fungi and bacteria in metazoa), there are different principles of recognition. Yet PAMP determinants have been identified on very different organisms—sugars such as β1-3 glucan of fungi, lipopolysaccharide (LPS) and peptidoglycans of bacteria, phosphoglycan of some parasites, and especially nucleic acids of bacteria and viruses—and they can trigger similar cascades of events. The foreign ligand can be bound by a molecule in solution that initiates an effector proteolytic cascade (eg, clotting or the complement cascade). On the other hand, a proteolytic cascade can be initiated and result in the production of a self-ligand that interacts with a cell surface, endosomal, or cytosolic receptor. In this way, there need not be a great diversity of cell surface receptors, especially in the absence of clonal selection.

Of the over 1 million described species of animals (see Fig. 4.1), approximately 95% are invertebrates representing 33 phyla, some with one species (Placozoa, Cycliophora) and others with over 1 million (Arthropoda). Because they have major differences in body plans, development, size, habitat, etc., wildly different types of immune systems in diverse species should be expected. Early studies of invertebrate immunology reached no consensus of how immunity should be examined, but because vertebrate cellular adaptive immunity was often defined (indeed, was discovered for T cells) through transplantation reactions, attempts to reveal specific memory by allograft rejection were often used. After many unsuccessful attempts to demonstrate memory of such responses (see the following) and after extensive molecular studies, a consensus was reached that an invertebrate adaptive immune system involving somatic generation of antigen receptors and their clonal expression was highly unlikely. However, the term “innate” is rigid and masks the possibility of other somatic alterations of invertebrate immune system molecules, as will be discussed.61,62 We will categorize the molecules based on their location within the cell.

Intracellular Recognition

Nucleotide-Binding Domain Leucine-Rich Repeat (NLR)

One major group of intracellular sensors in animals and plants is the NLR family (see Figs. 4.2 and 4.4).14,63 Each of the family members has a central NB/NACHT (nucleotide-binding domain), a C-terminal LRR used for recognition, and a unique N-terminal domain. The subfamilies are defined by their N-terminal domains: coiled-coil and toll-IL-1 receptor (TIR) in plants, and baculoviral inhibitory repeat, caspase-recruitment domain (CARD), pyrin domain (PYD), and activation domain in animals (see Fig. 4.4). Thus far, the NLRs have been found in deuterostomes but not protostomes (see Fig. 4.1), which is surprising considering that plants have intracellular defense proteins with a similar structure; the plant and animal families therefore seem to have arisen via convergent evolution.14

FIG. 4.4. Structure and Major Functions of Nucleotide-Binding Domain Leucine-Rich Repeats in Plants and Animals.14 Although the structures are quite similar, and recognition can be analogous, these families seem to have arisen via convergent evolution (see text for more details).

The specificity of plant NLRs depends principally on the LRRs, and these are targets for diversifying selection, as described previously for multigene families. Plant NLRs can recognize pathogen effectors (pathogen-derived avirulence factors), or viral and fungal PAMPs directly via the LRR domains, or via modifications of a host target that interacts with the N-terminal domains, altered self if you will (see Fig. 4.4). This type of activation in plants is termed effector-triggered immunity, which is specific to the NLRs (see Fig. 4.4). A host of downstream effectors are generated, some involved in defense but others activating cell death pathways.14,64

Best described in mammals, the nucleotide oligomerization domain (NOD)/NLR recognizes PAMPs such as peptidoglycan and induces an autophagy-mediated destruction of intracellular pathogens as well as production of proinflammatory cytokines; however, it remains controversial whether there is direct or indirect recognition of the PAMPs (similar to some responses in plants). Polymorphisms in the NOD proteins are associated with inflammatory bowel diseases. The NLRP and NAIP NLRs are activated, in ways that are not well understood, by various PAMPs or danger-associated molecular patterns and form inflammasomes, best known for the activation of caspase 1 and the processing of pro-IL-1beta (or pro-IL-18) for release from cells.14,65 The founding member of the family, CIITA, has long been known to upregulate class II genes (and associated genes, such as cathepsins and invariant chain), and its function is somewhat outside the norm. Another member, NLRC5, has been shown to upregulate MHC class I expression, but the mechanism is unknown.66 While the shuttling of the vertebrate NLRs CIITA and NLRC5 to the nucleus seems to be a derived characteristic, movement of plant NLRs into the nucleus to activate transcription occurs, either directly or after recruitment of transcription factors like WRKY described in the following.

NLRs are expressed by echinoderm coelomocytes, again representing a highly diversified family48 (> 200 members, similar to the TLRs and SRCRs). As mentioned, it is surprising that these genes do not seem to be represented in protostomes, and thus the emergence of the family in plants and deuterostomes apparently occurred through convergent evolution.67 By contrast, in the vertebrates, searches of the Danio rerio (and other teleost) databases have yielded a large number of NLR sequences, more similar to the situation in plants.63 In humans, most NLR genes are encoded in clusters on chromosomes 11p15, 16p12, and 19q13, where six sequences are found in a single telomeric region.

Rig-I-Like Receptors (RLR)

The retinoic acid-inducible gene (RIG)-I is an intracellular defense molecule that is unrelated to the NOD proteins, with N-terminal CARD and C-terminal helicase domains.68 With the helicase domain, RIG-I binds to an uncapped 5′ phosphate group, which is diagnostic of viral RNAs. RIG-I also recognizes short double-stranded RNAs, while a second member of this family, MDA5, recognizes long double-stranded RNAs. These molecules contain two CARD domains at the N-terminus, a DEXDc domain, a helicase domain, and a regulatory domain. Ligands bind to the regulatory domain, inducing a conformational change leading to interaction with the adaptor protein MAVS (or IPS-1) and ultimately to the induction of type I interferons (IFNs). A third member of the RLR family is LGP2, which lacks the CARD domain; this molecule was originally believed to be a negative regulator of RIG-I/MDA5-induced signaling, but that has been called into question.

This family is found in all of the vertebrates and in lower deuterostomes, such as amphioxus and echinoderms.63 Somewhat surprisingly, the RLR family is only mildly expanded in sea urchins (12 members). Although there is no report of bona fide RLR family members in protostomes (even though RLR activity is present69), the cnidarian sea anemone has been reported to have an RLR homologue,70 again showing the importance of studying this taxon for the emergence of immune-related molecules.

Cytosolic Deoxyribonucleic Acid (DNA) Sensors

There are four mechanisms of cytosolic DNA sensing. Three of them (the DNA-dependent activator of IFN-regulatory factor, IFI16, and RNA polymerase III, which converts viral DNA into RNA recognized by RIG-I) induce type I IFN production through the intermediate STING, a protein associated with the endoplasmic reticulum (ER).71 In addition to being an intermediate in IFN upregulation (through IFN regulatory factor-3), STING is also a PRR in its own right, responding to the PAMP cyclic dinucleotides produced by intracellular bacteria like Listeria; this suggests that STING was originally a PRR, and then was co-opted by several other PRR sensors to induce effector functions.72 IFI16 is part of the AIM2-like receptor family; the founding member, AIM2, like the inflammasome, activates caspase-1 to process pro-IL-1beta.

These new molecules/mechanisms have so far only been studied in mammals, but it would be surprising if they were not operative (at least) in other vertebrates as a way to combat DNA viruses. To date, they have not been found in the sea urchin or jawless fish databases.

Tripartite Motifs

Tripartite motif (TRIM) proteins belong to a family induced by type I and II IFNs, with 68 members in the human genome. TRIMs are involved in resistance against pathogens in mammals, especially lentiviruses (eg, human TRIM5α is a retroviral restriction factor with activity against human immunodeficiency virus).73,74 The activity of proteasomes, responsible for cytosolic protein degradation, has been implicated in the TRIM5α-dependent attenuation of retroviral reverse transcription. TRIMs contain an N-terminal moiety composed of three modules, RING (with an E3 ubiquitin ligase activity), B-box, and coiled-coil motif, followed by different C-terminal domains. TRIMs fit into two major categories by the function of their C-terminal domain: Category 1 with PHD, MATH, ARF, FNIII, exoII, or NHL domains, and Category 2 with a B30.2 domain shared with butyrophilins and other proteins and essential for ligand binding.75 The tertiary structure of TRIM21 revealed two binding pockets in the B30.2 domain formed by six variable loops.76

Despite reports to the contrary, the TRIM family is ancient.77 The family has been greatly diversified in vertebrates and in a taxon-specific manner, as observed for many multigenic immune families.77 The zebrafish genome harbors a striking diversity of a subset of Category 2 TRIMs not encountered in mammals, called finTRIM, with 84 genes distributed in clusters on different chromosomes. This subset, specific to teleosts, is overexpressed after virus infection in the trout. In the B30.2 domain, residues under positive selection are concentrated within a viral recognition motif first recognized in mammalian TRIM5α.78

Finally, TRIM genes encoding Category 2 proteins are preferentially located in the vicinity of MHC or MHC gene paralogs in both fish and humans, suggesting that they may have been part of the ancestral MHC.79 The B30.2 domains most closely related to finTRIM are found among NLRs, indicating that the evolution of TRIMs and NLRs was intertwined by exon shuffling.80 Exon shuffling was likely responsible for the presence of the B30.2 domain in butyrophilin and TRIM genes, where it was perhaps favored by the proximity of these genes in the MHC. It has been argued that during evolution the combination of SPRY and PRY motifs that build up the B30.2 domain was selected and maintained for immune defense.81

P47 GTPases

Among IFN-inducible immunity-related genes with an interesting evolutionary history, immunity-related GTPases (IRG/p47 in mouse) function as cell-autonomous resistance factors by disrupting the vacuolar membrane surrounding parasites (eg, toxoplasma).82 The IRG system studied primarily in mice (absent in humans83) is present throughout mammals but the number, type, and diversity of genes differ greatly even between closely related species, one of the common themes in immunity described previously.

Concerning the evolutionary origin of the IRGs, the homologs of zebrafish and pufferfish seem to form two teleost-specific groups, another common theme in this chapter. Their putative promoter regions suggest an expression regulated by an IFN. Homology searches failed to find any convincing ancestral form to the vertebrate IRG proteins in the genomes of invertebrates, but in phylogenetic trees vertebrate IRGs cluster with some families of bacterial GTPases. Thus, IRGs may be derived from a prokaryotic GTPase acquired by a horizontal transfer subsequent to the appearance of eukaryotes.82

Integral Membrane (and Sometimes Secreted) Proteins

C-Type Lectins

Lectins were originally defined by their ability to bind carbohydrates in a calcium-dependent manner (how C-type lectins got their name84) and some have been described previously (and throughout the chapter). They are found in many phyla in both the deuterostome and protostome lineages in membrane and/or secreted forms (eg, MBL described in the following). A large number of C-type lectins have been uncovered in the mosquito genome, and some are involved in bacterial defense through direct binding and others through the melanization reaction.85 Some C-type lectins are encoded in the NKC, including the Ly49 and NKG2 families as well as CD94, and several members of the family are central to NK-cell function in mammals. A molecule resembling CD94 but unlikely to be an ortholog (see the following) has been detected on a subset of hemocytes in Botryllus and Ciona, the functions of which are unknown.86 Another large gene family that is implicated in the response of the sea urchin to immune challenge includes 100 small C-type lectins,48 consistent with the enormous expansion of several immune defense families in this animal. We describe other functions of C-type lectins in the NK cell sections.

Scavenger Receptors

The SRCR superfamily is an ancient (from sponges to chordates) and highly conserved group of cell surface and/or secreted proteins, some of which are involved in the development of the immune system as well as the regulation of both innate and adaptive immune responses; they are especially well known for their function in macrophages.87 Group B SRCR domains usually contain eight regularly spaced cysteines that allow the formation of a well-defined intradomain disulfide-bond pattern. Scavenger receptors are best known for their housekeeping function of taking up lipids modified by oxidation or acetylation, but they have many other functions as well, such as uptake of apoptotic bodies (eg, croquemort in Drosophila of the CD36 subfamily88).

SRCRs have been studied mainly in the coelomocytes of echinoderms. Within a few hours after bacterial injection, sea urchin coelomocytes upregulate a variety of genes including an extremely diverse family of SRCRs.48,89 A very large number of SRCR domains are present (approximately 1,200), but each individual may express different groups of SRCR genes at different levels (and even with differential splicing). To assume that they are all involved in defense is premature, as SRCR genes can be both up- and downregulated after infection with bacteria. As mentioned, this high level of gene duplication is a general rule in the echinoderms.

In mammals, the SRCR family as a whole is also poorly defined but is involved in endocytosis, phagocytosis, and adhesion, and some members act as PRRs that bind to LPS or other bacterial components. SRCRs are widespread in the human genome and participate as domains in the structure of numerous receptors (eg, S4D-SRCRB, CD6, CD5-L, CD163), but without showing the high level of duplication seen in the echinoderm families.87

Down Syndrome Cell Adhesion Molecule

DSCAM in Drosophila and other arthropods was described originally by neurobiologists as an axon-guidance protein, dependent upon a large number of isoforms (> 30,000) generated by alternative splicing for the IgSF domains and the transmembrane segment. DSCAM is also involved in insect immunity: it is expressed in cells of the hematopoietic lineage and is clearly capable of binding to bacteria. As in the nervous system, a large number of splice variants are generated, clearly different from the ones expressed in neurons.24 In Drosophila, the DSCAM gene is composed of 115 exons, 95 of which encode alternative possibilities for splicing of exons 4, 6, 9, and 17. The molecule consists of 10 IgSF domains and 6 FN domains and is present as either a membrane or soluble form, the latter presumably generated by proteolysis of the membrane form. Each cell expresses only a fraction of the isoform repertoire.
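The isoform figure quoted above (> 30,000) follows from simple combinatorics: one alternative is chosen from each variable exon cluster, so the counts multiply. The short sketch below works through that arithmetic. The per-cluster counts (12, 48, 33, and 2) are not given in this chapter; they are the commonly cited values for Drosophila Dscam and are used here as an assumption, as are the variable and function names.

```python
from math import prod

# Assumed number of mutually exclusive alternatives in each variable exon
# cluster of Drosophila Dscam. These values are commonly cited in the
# literature but are NOT stated in the chapter text.
ALTERNATIVES = {
    "exon4": 12,   # part of an IgSF domain
    "exon6": 48,   # part of an IgSF domain
    "exon9": 33,   # an IgSF domain
    "exon17": 2,   # transmembrane segment
}

def isoform_count(alternatives):
    """One alternative is picked per cluster, so the choices multiply."""
    return prod(alternatives.values())

# Total number of alternative exons (compare with the 95 quoted in the text).
total_alternative_exons = sum(ALTERNATIVES.values())

# Distinct ectodomains, ignoring the transmembrane choice.
ectodomain_variants = prod(
    n for name, n in ALTERNATIVES.items() if name != "exon17"
)

print(total_alternative_exons)       # 95
print(ectodomain_variants)           # 19008
print(isoform_count(ALTERNATIVES))   # 38016
```

Under these assumed counts, the 95 alternative exons yield 38,016 potential isoforms, which is consistent with the "> 30,000" stated in the text.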

Knockdown (RNAi) and anti-DSCAM antibody treatment significantly suppress phagocytosis, at least in Drosophila. Soluble DSCAM constructs with different exon combinations were found to have differential pathogen-binding properties.90 In addition, suppression of DSCAM in mosquitoes results in impaired immunity to Plasmodium; exposure of hemocytes to different pathogens in culture gives rise to specific modifications and selection of alternative splicing patterns. A similar finding was made in crustaceans, in which particular DSCAM isoforms were induced in response to different pathogens in one species91 and epitope II was under selection in a study in Daphnia.92 The diversification of DSCAM seems to be specific to arthropods, as neither flatworm nor sea urchin nor vertebrate DSCAM is diversified. The vertebrate DSCAM has only two forms, using two alternate TM exons. The cytoplasmic tail can also be modified by alternative splicing that could change its signaling properties by modulation of tyrosine-based motifs.93 Human DSCAM is duplicated on chromosomes 21 and 11, but does not appear to be involved in immunity.

FIG. 4.5. Comparison of Immune Response Induction, Intracellular Pathways, and Immune Outcome in Insects (eg, Drosophila), Vertebrates (eg, human), and Plants (a Composite of Pathways in Monocots and Dicots). Note that the initiation of the response and the outcome(s) in insects and vertebrates are quite different for both the toll/toll-like receptor pathway (left, A) and the immune deficiency/tumor necrosis factor pathways (right, B), but the intracellular signaling pathways are well conserved evolutionarily in insects and vertebrates (details in the text). Note as well that plants use similar molecules for recognition (leucine-rich repeat-containing molecules) and have similar intracellular pathways with kinase cascades, but all of the molecules of recognition, signaling, and effector function are derived by convergent evolution as compared to animals. This figure was modified from Beutler et al.152

Peptidoglycan-Recognizing Protein and β1-3 Glucan Receptors

PGRPs are found in a wide range of organisms but have been best studied in insects, where they are classified into short (S) and long (L) forms. S forms are soluble and found in the hemolymph, cuticle, and fat-body cells.94 L forms are mainly expressed in hemocytes as integral membrane proteins whose final structure depends on combinatorial association of different isoforms, modulated by alternative splicing. We provide a short description here but delve more deeply into PGRPs in the subsequent discussion of the insect toll and immune deficiency (IMD) pathways (Fig. 4.5).36 The expression of insect PGRPs is often upregulated by exposure to bacteria. PGRPs can activate the toll or IMD signal transduction pathways (see the following) or induce proteolytic cascades that generate AMPs, cause melanization, or induce phagocytosis. PGRPs can also kill bacteria directly by inducing a suicide mechanism, first demonstrated to be activated by a type of unfolded protein (stress) response in prokaryotes.95 Besides their defense functions, insect PGRPs expressed in the gut are believed to promote homeostasis with commensal bacteria (also discussed briefly in the following). Both soluble and transmembrane forms are present in sea urchins, some with potential catalytic function.48

In vertebrates, PGRPs are all secreted and have direct microbicidal activity. Best studied in zebrafish, PGRPs are expressed in many tissues such as gills, skin, and intestine, providing immune defense. They are expressed before the development of adaptive immunity and likely provide an important protective role.96 The human PGRP genes are found on the MHC chromosomal paralogs 1q21 and 19q13/p13. All detected splice-variant isoforms bind to bacteria and peptidoglycan. Like the fish molecules, mammalian PGRPs are also positioned at epithelial surfaces and promote intestinal homeostasis by discriminating somewhat between commensal (eg, lactobacilli) and pathogenic bacteria. Knockout mice have increased pathogenic bacteria on mucosal surfaces that induce colitis after injury in the dextran sulphate sodium autoimmune assay.97

β1-3 glucan receptor proteins (β1-3GRPs, formerly known as gram-negative binding proteins [GNBPs]) are related to bacterial β1-3 glucanases.98 They are found in insects and other arthropods where they bind bacteria, fungal β-1,3-glucans, LPS, and/or bacterial lipoteichoic acid (without necessarily showing glucanase activity). An ortholog is present in the sea urchins, but none has been found in vertebrates to date. Drosophila GNBP1, together with PGRP-SA, is required to activate the toll pathway in response to infection.36

Toll and Toll-like Receptors

The toll receptors were originally described in Drosophila as genes involved in early development, specifically in dorsoventral patterning. Later, they were also shown to be essential sensors of infection, initiating antimicrobial responses.36,99 This family was then revealed to be a major force in innate immunity in the vertebrates as well.100,101 As mentioned previously, across the metazoa structurally closely related members of the toll family range from not being involved in immunity (in C. elegans and apparently in the horseshoe crab), to being the equivalent of a cytokine receptor (in Drosophila), to being PRR in the vertebrates and invertebrates.102 Six spaetzle-like and eight toll-like molecules have been identified in Drosophila, but only one or two of them are clearly immunity-related.36,102 In jawed vertebrates, they belong to a multigene family of PRR specific for diverse PAMPs and exhibiting different tissue distributions and subcellular locations.27 In humans, many are on chromosome 4p and q (TLR 2, 1, 6, 10) but the others are distributed on chromosomes 9, 1, 3, and X.

Ectodomains of TLRs comprise 19 to 25 tandem repeats of LRR motifs made of 20 to 29 aa, capped by characteristic N- and C-terminal sequences. All of the toll receptors are homologous and appear similar in domain constitution among all animals. They also share the TIR domain, the intracellular segment also found in the IL-1/-18/-33 receptors of vertebrates, as well as in other molecules in plants. TIR domains associate with MyD88 to initiate signaling cascades culminating in the activation of NF-κB/Rel (see the following and Fig. 4.5).

In Drosophila, the toll dimer is triggered by an interaction with the unique ligand spaetzle, which is the product of a series of proteolytic cascades, the most critical enzyme of which has been identified (spaetzle-processing enzyme). Activation of the cascades triggers the production of antimicrobial peptides (see Fig. 4.5). The specificity of recognition is not achieved at this receptor level but rather in solution via other intermediates (see the following). C. elegans has only one toll receptor, and rather than inducing antimicrobial responses, it promotes avoidance of a worm pathogen upon engagement; the signaling mechanism for the C. elegans TOL (toll) is not evolutionarily conserved (it clearly does not induce the NF-κB pathway) and is under investigation.67 A toll/TLR gene is present in the sea anemone, a cnidarian, but not in other cnidarians such as hydra or coral, which nevertheless have TIR domains associated with other molecules.35 A TIR domain of the toll-receptor type was detected in sponges, but, like IL-1R in vertebrates, it is associated with a receptor with three IgSF domains.103 Plants do not have toll/TLR per se, but do have LRR-containing transmembrane sensors that function in a similar fashion102 (see Fig. 4.5). In summary, toll/TIR arose before the split of protostomes and deuterostomes, but has been lost in some invertebrate groups and has been recruited to perform multiple functions.

The arsenal of TLRs in vertebrates is endowed with specific and diverse capacities. Each vertebrate TLR has its range of specificities and, in addition, combinations of different TLRs can create different binding specificities (eg, the association of TLR2 with TLR6 or with TLR1,104 or even TLR2 homodimers in regulatory T cells105). This divergence in recognition function is well illustrated by the phylogenetic analysis of the toll and toll-related receptors in different phyla such as arthropods and vertebrates. Toll and related proteins from insects and mammals cluster separately in the analysis, indicating independent generation of the major families in protostomes and vertebrates.102 Consistent with the expansion of SRCR and NLR genes in sea urchins, hundreds of TLRs also were found in this species.48 Sea urchin TLRs of the protostome type are few (3 members), whereas the vertebrate type has been enormously amplified, all within a single family (222 members) and most without introns. Vertebrate TLRs do not diverge rapidly and evolve at about the same rate, and while there have been some duplications in amphibians and fish, they are not greatly expanded as in the echinoderms (no more than approximately 20 genes in any species).

Signaling Through Innate Surface Recognition Molecules

Four pathways of innate immunity triggering have conserved elements in eukaryotes: the toll/TLR, TNF-α/IMD receptor, intracellular NOD, and JAK/STAT pathways. Although toll receptors have been found in almost all triploblastic coelomates, most of the work and the elucidation of pathways have been accomplished in Drosophila and Anopheles.36 The diversity of AMPs that can be produced via the toll/IMD pathways is substantial, and, as described previously, AMPs are classified in several categories depending upon the type of pathogen recognized (eg, gram (+), drosocin; gram (-), diptericin; fungal, drosomycin) with different effector functions (see Fig. 4.5). Insect antimicrobial molecules were originally discovered by the late Hans G. Boman and colleagues in 1981, a seminal finding that heralded the molecular analyses of innate immunity in the invertebrates.106

Toll and Immune Deficiency Pathways

As described, invertebrate toll receptors are homologous to the vertebrate TLRs, in the sense that they are integral membrane LRR-containing proteins (see Fig. 4.5). Drosophila toll is activated after it binds spaetzle, the product of a proteolytic cascade activated in solution after the interaction of molecules produced by fungi or gram-positive bacteria with GNBP and PGRP.107 The TIR cytoplasmic domain of the toll receptor then interacts with MyD88 (itself having a TIR domain) followed by Tube and Pelle, leading to activation of the homologous NF-κB system (Cactus or Dif) that then induces transcription of various defense peptides.36,99,108 This is remarkably similar to the cascade of events following activation of mammalian TLRs: after their interaction with PAMPs at the cell surface, a cascade is induced through the TLR that includes MyD88, IRAK, TRAF, and TAK1 and leads to NF-κB via the IKK signalosome. Thus, infection-induced toll activation in Drosophila and TLR-dependent activation in mammals reveal a common ancestry in primitive coelomates (or earlier), in which defense genes under the control of a common signaling pathway lead to activation of Rel family transactivators.

The Drosophila IMD pathway is employed in responses to gram-negative bacteria109 (see Fig. 4.5). After engagement of the cell-surface receptor PGRP-LC mentioned previously, a cascade similar to the mammalian TNF-αR signaling pathway, involving Drosophila TAK1, an IKK signalosome, and a Relish-mediated (instead of Dif-mediated) NF-κB step, results in transcription of antibacterial peptides such as diptericin. The Drosophila intracellular pathway is similar to the mammalian TNF-α receptor cascade, which also progresses via a death domain, Mekk3, the signalosome, and NF-κB, resulting in cytokine production. In both cases, a link to pathways leading to programmed cell death is possible; overexpression of Drosophila IMD leads to apoptosis. When the activation of either the fly toll or IMD pathway is considered, they are analogous to a mammalian cytokine/cytokine receptor system (eg, TNF-α) in which a soluble self-molecule activates cells via a surface receptor. Fitting with the paradigm put forward on recognition, signaling, and effector phases of the immune response, the diversity of external recognition systems is not matched by an equivalent diversity of intracellular signaling pathways.22 There are conserved signaling cascades coupled to the receptors, giving the impression of conservation of the innate immunity pathways; yet, these pathways are also used in development, so which function is primordial remains an open question.
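As a bookkeeping aid, the four cascades described in this section can be written out as ordered lists of the components named in the text. The sketch below is only a minimal illustration: the dictionary layout, the ASCII spellings, and the helper `shared_steps` are assumptions made here, and matching is by name only, so it does not capture homology between differently named components.

```python
# Ordered components of each cascade, as named in the surrounding text.
# The data layout and the helper below are illustrative assumptions,
# not a published pathway model.
PATHWAYS = {
    "Drosophila toll": [
        "spaetzle", "toll", "MyD88", "Tube", "Pelle",
        "Cactus/Dif", "defense peptides",
    ],
    "mammalian TLR": [
        "PAMP", "TLR", "MyD88", "IRAK", "TRAF", "TAK1",
        "IKK signalosome", "NF-kappaB",
    ],
    "Drosophila IMD": [
        "PGRP-LC", "TAK1", "IKK signalosome", "Relish",
        "antibacterial peptides",
    ],
    "mammalian TNF-alpha receptor": [
        "TNF-alpha receptor", "Mekk3", "IKK signalosome",
        "NF-kappaB", "cytokines",
    ],
}

def shared_steps(a, b):
    """Return components named identically in pathways a and b,
    in the order in which they occur in pathway a."""
    in_b = set(PATHWAYS[b])
    return [step for step in PATHWAYS[a] if step in in_b]

for pair in [("Drosophila toll", "mammalian TLR"),
             ("Drosophila IMD", "mammalian TNF-alpha receptor")]:
    print(pair, "->", shared_steps(*pair))
```

Because the comparison is purely by name, it understates the conservation discussed in the text; it is meant only to make the order of the steps easy to compare side by side.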

Plants do not have toll/TLR, but do have transmembrane LRR-containing microbial sensors, of which FLS2, which binds to flagellin, is the best characterized27 (see Fig. 4.5). These molecules do not have an intracellular TIR domain (note that TIR domains exist in plants, but not associated with the TM sensors), but do recruit a kinase of a similar nature to the toll/TLR kinases (the so-called non-RD kinases) to activate downstream mitogen-activated protein kinase cascades. However, as mentioned previously, the NF-κB transcriptional system arose early in the animal kingdom110; plants employ a different system of transcriptional activators, the WRKY molecules, which are activated directly by the mitogen-activated protein kinase cascades, similar to the animal transcription factor AP1.

Extracellular Soluble Receptor with Effector Cascade

Proteolytic cascades are initiated immediately after foreign material is bound by preformed proteins in solution, and this principle is conserved throughout evolution. Indeed, the proteolytic cascade upstream of production of the toll ligand spaetzle resembles the complement or clotting cascades. The PPO cascade of arthropods, which leads to melanization and the genesis of antibacterial products described in the following, is another example, in which peptidoglycans on microbial surfaces initiate the cascade, resulting in the degranulation of hemocytes.

The Complement System

The best-studied immune proteolytic cascade that is surprisingly well conserved in the animal kingdom is complement111,112 (Fig. 4.6). In contrast to the other defense molecules that we have discussed, orthologous complement genes can be detected in all of the deuterostomes without a great deal of expansion/contractions of the gene family. The three major functions of complement in jawed vertebrates are 1) coating of pathogens to promote uptake by phagocytes (opsonization); 2) initiation of inflammatory responses by stimulating smooth muscle contraction, vasodilatation, and chemoattraction of leukocytes; and 3) lysis of pathogens via membrane disruption. Additionally, in the vertebrates, C′ is vital for the removal of immune complexes as well as elicitation of adaptive humoral immunity. The focal point of complement is C3, which lies at the intersection of the alternative, classical, and lectin pathways of complement activation. It is the only known immune recognition molecule (besides its homologue C4) that makes a covalent bond with biologic surfaces via a thioester linkage. C3 has a nonspecific recognition function, and it interacts with many other proteins, including proteases, opsonic receptors, complement activators, and inhibitors. In the alternative pathway, C3 exposes its thioester bond in solution, and in the presence of cell surfaces lacking regulatory proteins that block C3 activation (by cleaving it into iC3b), it associates with the protease factor B (B or Bf). After binding to C3, B becomes susceptible to cleavage by the spontaneously active factor D, resulting in formation of the active protease Bb that in combination with the covalently attached C3 cleaves many molecules of C3 in an amplification step. Another nonadaptive recognition system, the lectin pathway, starts with the MBL (or the lectin ficolin), which is a PRR of the collectin family that binds mannose residues on the surface of pathogens and can act as an opsonin. 
MBL is analogous to C1q with its high-avidity binding to surfaces by multiple interaction sites through globular C-terminal domains, but apparently it is not homologous to C1q. Like C1q, which associates with the serine proteases C1r and C1s, the MBL-associated serine proteases (MASPs) physically interact with MBL and not only activate the classical pathway of complement by splitting of C4 and C2 (the same function as C1s; MASP-2 appears to be the active protease), but also can activate the alternative pathway in ways that are not understood and thus completely bypass the classical pathway. Indeed, MASP-1 and -2 are homologs of C1r and C1s (see Fig. 4.6). Both C1q and MBL can promote the uptake of apoptotic bodies by phagocytes, via collectin receptors. Another lectin, ficolin, can also initiate the MASP pathway,113 and it would not be surprising if other activators were discovered in the future (eg, the ancient molecule C-reactive protein is also capable of activating C′). Finally, the classical pathway, which is dependent upon antibody molecules bound to a surface, results in the same potential effector outcomes described previously for the alternative pathway. Novel molecules initiating this pathway are C1q, C1r, C1s, C4, and C2, as well as specific negative regulatory proteins such as C4-binding protein.

FIG. 4.6. Evolution of the Complement System. The general pathways and appearance of the various components in the phylogenetic tree are emphasized.

C3 and MBL (and ficolin) are vital players in the immediate innate immune response in vertebrates, and both have been described in nonvertebrate deuterostomes.48,112 Thus far, the best-studied invertebrate systems for investigation of C3 evolution are the sea urchin and the ascidians Halocynthia and Ciona, in which C3 and B molecules and genes have been analyzed in some detail. In contrast to the very high levels of C3 found in the plasma of jawed vertebrates, sea urchin C3 is not expressed at high levels but is induced in coelomocytes in response to infection.114 The C3 opsonic function clearly has been identified, but so far initiation of inflammatory or lytic responses (if they exist) has not been obvious. Receptors involved in opsonization in echinoderms have not been identified, but in the ascidian, gene fragments related to the C3 integrin receptor CR3 were identified, and antisera raised to one of the receptors inhibited C3-dependent opsonization.115

Hagfish and lamprey C3-like genes were thought to be ancestral C3/C4 genes because the sequence predicts two processing sites (leading to a three-chain molecule), like C4, but a C3-like properdin-binding site is clearly present.116 However, like C3 in other animals, the hagfish protein is composed of only two chains of 115 and 72 kDa, and sea urchin and ascidian C3 sequences predict only two chains (one proteolytic processing site). Lamprey C3, but not sea urchin C3, has a recognizable C3a fragment known from gnathostomes to be involved in inflammation, so the role of complement in inflammation may be a vertebrate invention (but see the following).

TEPs have been isolated from Drosophila and the mosquito Anopheles, as well as several other arthropods.117,118 While the insect molecules function in a C3-like fashion (opsonization), phylogenetic analysis shows them to be more related to α2-macroglobulin (note that a few insects actually have molecules more related to C3). TEPs in insects function as opsonins, binding to parasites and promoting their phagocytosis or encapsulation. The multimember TEP families in Drosophila and mosquito followed independent evolutionary paths, perhaps as a result of specific adaptation to distinct ecological environments as described in the introduction. The Drosophila genome encodes six TEPs (whereas there are 15 genes in Anopheles, again consistent with the major expansion of many immune gene families in the mosquito), three of which are upregulated after an immune challenge. Mosquito TEPs are involved in killing of parasites, and the reaction is regulated by LRR-containing molecules to avoid destruction of self-tissues; thus, full-blown complement-like systems complete with inhibitors have arisen independently in protostomes and deuterostomes.119,120

C3-like genes are present in cnidarians121,122 and in the horseshoe crab Limulus.123 Good phylogenetic support was obtained for their relationship to C3, as compared to other members of the thioester-containing family like the TEPs. Thus the emergence of C3 as a defense molecule predates the split between protostomes and deuterostomes. A gene resembling the proteolytic enzyme Bf was discovered in these protostomes as well (and in sea anemones), suggesting that the fundamental system was in place a billion years ago (see Fig. 4.1). The lack of C3 in many other protostomes suggests that the ancestral gene was lost and replaced by the TEPs.117,118

In jawed vertebrates and some lower deuterostomes, certain species express more than one C3 gene, suggesting that the innate system might compensate in animals that do not optimally make use of their adaptive immune system.124 Changes in the amino acid composition of the C3-binding site are found that may somehow regulate the types of surfaces bound by the different isotypes.125 Likewise, in lower chordates such as Ciona, C3 and other complement components can be duplicated.115 Diversification of the carbohydrate recognition domains has been observed also in the Ciona MBP family (nine members).

Like Ig/TCR/MHC, the classical pathway and the terminal pathway membrane-attack complex (MAC) appear first in cartilaginous fish.112 However, as MBL can activate the classical pathway in mammals, it is possible that some portion of this pathway exists in prejawed vertebrates. Nevertheless, C4 and C2 genes have not been detected to date in jawless fish or invertebrates. A bona fide C2 homologue has only been identified to the level of amphibians, although duplicate B genes were isolated from cartilaginous fish and teleost fish that may function both in the classical and alternative pathways. The lytic or MAC pathway, which is initiated by the cleavage of C5 into C5a and C5b, also has not been described in taxa older than cartilaginous fish. Thus, opsonization and perhaps the induction of inflammatory responses were the primordial functions of the lectin/complement pathways. However, a complementary DNA clone for CD59, a molecule that inhibits MAC formation in self-cells, was identified from a hagfish library, and some of the terminal components of the pathway have been detected in lower deuterostomes with no described functions.116 Interestingly, proteins with the MAC/perforin domain have been detected throughout the animal kingdom,48 and some are even involved in cytotoxic reactions; however, it seems that only vertebrates have bona fide terminal C′ components that are highly evolved for targeted destruction of cell membranes. The perforin gene itself also seems to have arisen in gnathostomes, from an ancient MAC/perforin domain-containing gene, macrophage-expressed gene 1 protein, which dates back to sponges; thus, cellular cytotoxic reactions in the invertebrates described in the following must use novel cytotoxic effector molecules.126

C3, C4, C5, and α2m (and TEP) are members of the same small family. A cell-surface-expressed (GPI-linked) member of this family, CD109, has been shown to associate with the transforming growth factor (TGF)-β receptor and modulate its expression.127 The protease inhibitor, α2m, clearly present in invertebrates (protostomes and deuterostomes) and vertebrates, is thought to be the oldest, but obviously this must be viewed with caution considering the data in cnidarians. Along with its ability to bind to and inactivate proteases of all known specificities through a “bait region,” it also has been shown to be opsonic in some situations. α2m, C3, C4, and CD109 (as well as the TEPs) have internal thioester sites, so this feature is primordial; C5 subsequently lost the site. The first divergence probably occurred between α2m and C3, with C5 and then C4 emerging later in the jawed vertebrates.128 Consistent with Ohno’s vertebrate polyploidization scheme is the fact that C3, C4, and C5 genes are located on three of the four previously described paralogous clusters in mammals, and this also fits with the absence of classical (no antibody) and lytic (no MAC) pathways in phyla older than cartilaginous fish.2 α2m is encoded at the border of the NKC in mice and humans, and there are similarities between these regions and the other MHC paralogs (see Fig. 4.13). The C3a and C5a receptors that promote the inflammatory responses upon complement activation have been identified in several vertebrates and (perhaps) some lower deuterostomes; like the chemokine receptors they are G-protein coupled receptors whose genes may also be found on the ohnologs (C3aR, chr 12p13; C5aR, chr 19q13). If indeed such receptors are found in the prejawed vertebrates as suggested by recent pioneering experiments in Ciona and Styela, it will be interesting to determine whether they are involved in some type of inflammation, thought to be the domain of the vertebrates.129

Melanization (Prophenoloxidase Cascade)

A major defense system in invertebrates is the melanization of pathogens and damaged tissues,130 popularized by poor Gregor in Kafka’s Metamorphosis, when the cockroach Gregor undergoes a melanization reaction from an apple thrown into his thorax by his father. The process is controlled by the circulating enzymes PPO and phenol oxidase. The system is activated by β1-3GRP, PGRP, LPS-binding proteins, and other proteins that can bind to various PAMPs (see Fig. 4.3). The complexes launch a cascade of serine protease activities resulting in cleavage of the pro-form of a prophenoloxidase-activating enzyme into the active form that in turn activates the PPO into phenol oxidase. This leads to the production of quinones and finally melanin. Melanization can completely inhibit parasite growth, whereas concomitant with PPO activation, many other immune reactions are initiated, such as the generation of factors with antimicrobial-, cytotoxic-, opsonic-, or encapsulation-promoting activities. The presence of specific proteinase inhibitors (of the serpin family) prevents unnecessary activation of the cascade and overproduction of toxic products. Phenoloxidase is the key enzyme responsible for the catalysis of melanization. It is a marker of the PPO activating system, and it can be an immune effector by itself as demonstrated in ascidians. It is therefore interesting to assess its conservation within all metazoa. A survey of the different organisms revealed the presence of phenoloxidase in many deuterostome and protostome phyla, and related molecules are also present in sponges. In arthropods, several PPO genes are present in the genome (nine in Drosophila and Aedes). Some may have different “immune” functions such as injury repair. 
Several components that would maintain the role of melanization in immunity may be lacking in different phyla even if many elements are conserved, and so far the best examples of melanization associated with immunity are still found almost exclusively among arthropods and to a lesser extent in annelids. Despite the presence of molecules involved in the pathway, the PPO cascade per se does not exist in vertebrates.

Effector Molecules


Among molecules containing LRR motifs, peroxidasin occupies a special place because of its involvement in hemocyte biology in insects and because its LRR motifs are homologous to those in the agnathan VLR and its Ig domains are similar to Ig itself. Drosophila peroxidasin is an assembly of a cysteine-rich motif, six LRR, and four IgSF domains.131 The molecule is conserved in vertebrates, although a role in immunity has not been reported. Another molecule called peroxinectin, with similarity at the level of the peroxidase region, has been described in crustaceans and shown to be associated with immunity via the PPO cascade.132 Its involvement in immunity is unlike any other effector so far described but illustrates the utility of LRR in many different types of molecules and processes. Pathogens bound by AMPs can be phagocytosed or walled off by a barrier of flattened hemocytes and ECM. The ECM forms a basement membrane that becomes stabilized partly through peroxidases that generate tyrosine-tyrosine bonds. The combination of LRR and Ig structures suggests that peroxidasin may precisely mediate adhesion of cells to the ECM.

As mentioned, a large number of LRR-Ig-containing proteins have been discovered, most of them playing roles in embryologic development.32 Many LRR-Ig proteins are encoded in paralogous regions in the vicinity of immune genes, showing an ancient direct connection between the families (see the following).

Fibrinogen-Related Proteins

FREPs are proteins with an IgSF moiety (one or two V-like domains) and a fibrinogen domain that were first discovered in the hemolymph of snails. The fibrinogen domain is found in a large number of defense molecules throughout the animal kingdom (eg, the ficolins). FREP gene expression is upregulated following exposure to molluscan parasites such as schistosomes133; a snail strain resistant to schistosomes shows an upregulation of the FREP 2 and 4 genes of up to 50-fold. The original discovery of FREPs followed the recovery of snail proteins that bound to worm antigens, and thus this is one case in which the correlation between an invertebrate receptor and its ligand is clear. However, it is not known whether the IgSF or fibrinogen domain (or both) binds to the antigen or which effector functions are induced after FREP binding.

FREP diversity is remarkable in that there are many polymorphic genes as well as alternate messenger RNA splicing to generate the diversity. In addition, based on the number of genes and alleles in individual snails, there appears to be a somatic diversification mechanism that modifies FREP genes, either via mutation or gene conversion in the region that encodes the IgSF domains.134 Over 300 unique sequences were found in 22 snails, consistent with a somatic diversification mechanism. Currently, there is no mechanism to account for the mutations, but data accumulate for somatic modifications and the use of FREPs in critical defense against pathogens.135

FREPs are also present in arthropods, where they lack the Ig domains,136 once again with an expansion of genes in the mosquito Anopheles gambiae as compared to Drosophila. RNAi studies have shown that subsets of FREP genes are vital for defense against malarial parasites, and different FREPs bind to bacteria with different affinities.137 Homo- and heterodimers can form between different FREPs, and multimers can be fashioned that likely increase the avidity of binding. In summary, this ancient family of defense molecules has all of the attributes described in the introduction: rapidly evolving multigene family, conservation in divergent protostomic invertebrate phyla, somatic diversification via alternative splicing, and perhaps an unknown mutational/gene conversion mechanism, as well as heterodimeric (and multimeric) association.


There are a number of additional expanded gene families in the sea urchin genome that encode proteins with immune-related functions. The 185/333 genes were first noted because they are highly upregulated in coelomocytes after exposure to LPS, constituting up to 7% of the messenger RNAs in such cells.138 Subsequently, transcripts were shown to be upregulated by many different types of PAMPs. The encoded proteins have no detectable similarity to any other gene family, but they are highly diversified and are produced (at least) by a subset of coelomocytes.

There are estimated to be at least 50 185/333 genes in the sea urchin genome.139 The gene is composed of two exons, one encoding the leader and the other encoding the mature protein. The second exon is made up of so-called elements and repetitive sequences that are quite different from gene to gene, which can to a large extent explain the diversity of expressed loci. However, there are hints of RNA editing, a unique form of alternative splicing, somatic mutation (perhaps targeting cytosine residues), and other (perhaps) novel mechanisms of diversity to explain the incredible number of different isoforms; additionally, like many other defense molecules, there is evidence of protein multimerization. Whatever the mechanism of diversity generation, it would be surprising if this family were not vital for defense in echinoids. Furthermore, the presence of this unique multigene family in sea urchins is consistent with the great expansion of other immune gene families in this group, including TLRs, SRCRs, and NLRs.7

Variable Domain Chitin-Binding Proteins

VCBPs, first discovered in Amphioxus but present in Ciona as well, consist of two Ig domains of the V type, but with a different folding motif when compared to Ig or TCR V domains,140 followed by a chitin-binding domain. The chitin-binding domain resembles chitinases found throughout the animal kingdom, and like dedicated chitinases, VCBPs are usually expressed in the gut. Apparently, there are no cell-surface-expressed forms, and thus all VCBPs are likely to be secreted effector molecules. In Amphioxus, their diversity is enormous, apparently entirely because of polymorphism and polygeny, and not somatic alterations. Each individual can carry up to five genes per haplotype, and in limited studies (11 individuals), no identical haplotype has been encountered.141,142 The general structure of the V domain is like that of the vertebrate rearranging antigen receptors, but with some unusual properties, including packing in a “head-to-tail” dimeric fashion, totally unlike Ig and TCR. VCBP diversity does not reside in the regions corresponding to the Ig/TCR CDRs, but rather in the A, A′, and B strands, as in DSCAM.

In contrast to the Amphioxus VCBPs, there are only a few nonpolymorphic Ciona VCBP genes, which are expressed by gut epithelium and amebocytes. Soluble VCBPs bind to bacteria and induce opsonization in amebocytes. It is hypothesized that these molecules may perform a function similar to mucosal IgA in vertebrates, which provides a “firewall” protecting against invasion by intestinal bacteria and promoting homeostasis.143 If true, this would provide a link to the regulation of commensals in early deuterostomes. It should be noted here, however, that recent data suggest that “tolerance” of commensals occurs in the protostomic invertebrates as well.

Antimicrobial Peptides (Defensins)

Each metazoan taxon produces a variety of molecules with intrinsic antimicrobial activity,144,145 the majority of which fall into three major categories: defensins, cathelicidins, and histatins. Even in species with very small genomes (such as the tunicate Oikopleura [60 Mb genome]), selection pressures have been strong enough to lead to expansion of the phospholipase A2 family, with 128 members.146 Some families are evolutionarily conserved, but generally they diverge rapidly and orthologous relationships are not apparent. The best studied group of AMPs is the defensins, which are amphipathic cationic proteins; their positively charged surface allows them to associate with negatively charged membranes (more common in pathogens), and their hydrophobic surface allows them to disrupt the membranes, either by disordering lipids or by actually forming pores. Most of the molecules are proteins, but an antimicrobial lipid called squalamine, which also is modeled to have hydrophobic and positively charged surfaces, is found at very high levels in dogfish and lamprey body fluids.147 Defensins can either be constitutively expressed (eg, in respiratory epithelia in mammals) or inducible (eg, see the following for Drosophila and see Fig. 4.5). Certain responses that seem systemic, like the production of Drosophila defensins, can also take place locally in the damaged tissues themselves; otherwise, a systemic response is initiated in organs distant from the site of infection, such as the fat body in Drosophila, where induction of bactericidal peptide expression occurs.36 Defensins are the focus of great attention in commercially bred species such as oysters, mussels, and crustaceans. Besides their direct defense functions, in mammals defensins play other roles, such as chemotaxis and immune regulation.

Penaeidins. One set of diverse AMPs is the penaeidins, present in crustaceans (shrimp). Penaeidins are small antimicrobial peptides (5 to 7 kDa) that bind to bacteria and fungi, and consist of a conserved leader peptide followed by an N-terminal proline-rich domain and a C-terminal cysteine-rich domain.7,148 Four classes of penaeidins, PEN2 to 5, are expressed by shrimp hemocytes. A great diversity of isoforms is generated by substitutions and deletions, most of them within the proline-rich domain,149 suggesting that this domain is the most important for ligand recognition; nevertheless, both domains seem to be required for recognition of bacteria and fungi. Like the VCBPs, each penaeidin class seems to be encoded by a unique gene, and isoform diversity is generated by polymorphism. Multiple copies of penaeidin genes are present in different species, and there is rapid expansion and contraction even within closely related organisms.

Responses to Viruses in the Invertebrates

Compared to immunity to extracellular pathogens in the invertebrates, the study of responses to intracellular pathogens like viruses is in its infancy.150 First discovered in plants and in C. elegans, the RNAi pathway of defense against viruses is also operative in Drosophila and Anopheles.151 Double-stranded RNAs (viral or otherwise) are recognized by the enzyme Dicer-2, generating small interfering RNAs that can associate with complementary RNAs and induce their degradation.

Viruses also induce an “IFN-like response” through a cytokine receptor (domeless) that is homologous to the IL-6 receptor and signals through the JAK/STAT pathway.152 After viral infection, unknown signals (RNA, perhaps) induce the production of cytokines of the unpaired family that bind to domeless on neighboring cells and upregulate a large number of genes involved in defense (see Fig. 4.3). Nothing is known about the effector pathways of these responses, but mutation of one of the induced genes results in increased viral load. It should be noted that this pathway is not cell autonomous, inconsistent with the IFN pathway in vertebrates.

In a paradigm-changing paper, foreign antigen was shown to interact directly with Drosophila Toll-7, resulting in the induction of antiviral autophagy and inhibition of viral replication.153 Toll-7 interacted with the glycoprotein from vesicular stomatitis virus at the cell surface to initiate the response. Thus, the dual paradigm of indirect and direct recognition by toll and TLR, respectively, must now be modified. There are several other tolls without known functions in insects that might be involved in immunity, perhaps via such mechanisms.

In summary, arthropods (and presumably other invertebrates) use an RNAi pathway as well as a signaling pathway to combat viruses.150,152 As opposed to the systemic plant RNAi response, the same pathway in protostomes is rather cell autonomous. This response was lost in the vertebrates, presumably because 1) viruses have been able to effectively counter this response and render it ineffective; 2) there have been remarkable evolutionary innovations in the vertebrate innate and adaptive immune systems to combat viruses; and 3) vertebrates use a Dicer pathway extensively to regulate expression of their own genes. The discovery of a viral immune response quite like a type I IFN response in vertebrates demonstrates that the three major signaling pathways of defense in Drosophila, toll, IMD, and JAK/STAT, are similar to the vertebrate TLR/IL-1R, TNF, and IL-6/IFN pathways, respectively (the first two homologous). Finally, other insect tolls may interact directly with PAMPs to promote antiviral defense such as autophagy or apoptosis. This is a field in which rapid progress will be made in the near future.

Natural Killing Activity Across Metazoa

The word “cytotoxicity” encompasses vastly different modes of cell killing by different cell types. It can be an effector function of cells of the adaptive arm (cytotoxic T-lymphocyte [CTL] or NKT cells) or of the innate arm (bona fide NK, see the following) of the jawed vertebrate immune system. Similarly, the term “NK cells” covers different cell types and different functions. NK cells of vertebrates can “recognize” missing self-MHC class I, but also ligands induced on stressed cells following virus infection, transformation, or stress. They can also have an immunoregulatory role through interactions with antigen-presenting cells (APCs). Many of these features obviously profit from a comparative approach.

In natural killing, the common denominator is a spontaneous reaction (ie, it does not require any [known] antigenic priming but only cell contact). Some form of natural killing can be observed from the earliest metazoans onwards. Some marine sponges and corals avoid fusion with one another by means of cytotoxic cells or by inducing apoptosis at the level of the teguments.154 Phenomena more similar to vertebrate NK killing are observed in sipunculid worms, where allorecognition among populations was shown to result in killing of allogeneic erythrocytes by lymphocyte-like cells.155 Similar cases can also be encountered in annelids and mollusks.156 The role of IgSF and lectin receptors known to be involved as NK cell receptors in vertebrates has not been examined in these invertebrates, even though some candidate homologs have been identified. Note that the mechanism(s) cannot be perforin-mediated, based on the recent bioinformatics analysis of MAC/perforin domains described previously in the complement section.126

When comparative morphology or function is not informative, searching for conservation of transcription factors or cell surface markers may be useful. Surveys of IgSF genes across databases have not yielded any promising candidates to date. Despite the presence of polymorphic IgSF members of the receptor tyrosine kinase family in sponges, their role in allorecognition or killing has not been demonstrated.157 As mentioned, similarities were found among the lectin families, especially in prochordates, where cytotoxicity has been reported and associated with a discrete population of hemocytes, the granular amoebocytes. The urochordate genome (Botryllus, Halocynthia, Ciona) encodes many lectins with or without typical carbohydrate recognition signatures. Among them, a putative CD94 homolog has been cloned and its expression followed in Botryllus158; its predicted sequence does not match well to vertebrate CD94. This gene is differentially regulated during allorecognition in Botryllus, and a subpopulation of blood cells has the receptor on their surface in both Botryllus and Ciona. Phagocytosis is inhibited by an antiserum recognizing the Ciona homologue.159

Other C-type lectin homologs of CD209 and CD69 are linked to the CD94/L gene on Ciona chromosome 1 (L.D.P., personal observation). Could they be part of a “pre-NK complex”? Interestingly, all the human homologs of those lectin genes are present either in the NK complex on 12p13 (CD94, CD69) or on an MHC paralog, 19p13 (CD209). Taken together with studies of the chicken MHC, which encodes some C-type lectins (see the following), the data suggest that a conserved MHC-linked region containing several lectin genes was present before the emergence of MHC class I and II genes.2

In addition, a number of genes encoding membrane proteins with extracellular C-type lectin or immunoreceptor tyrosine-based inhibitory motifs (ITIMs) and immunoreceptor tyrosine-based activation motifs (ITAMs) (plus their associated signal transduction molecules) were identified in Ciona, which suggests that activating and inhibitory receptors have an MHC class I- and II-independent function and an early evolutionary origin.160 Identifying the ligands of these Ciona molecules is of great interest.

Natural Killer and Natural Killer-Like Cells

NK cells express both activating and inhibitory cell surface receptors; in fact, the paradigm for positive and negative signaling via such receptors began with these cells60,161; however, activating and inhibitory receptors (often paired) are conserved throughout vertebrates and invertebrate deuterostomes, and are expressed in hematopoietic cells of all types. In NK cells, stimulation of the activating receptors, which associate with proteins having an intracellular ITAM (CD3ζ, DAP12, or DAP10, conserved at least to the level of bony fish162), results in killing of target cells. Inhibitory signaling receptors all possess cytoplasmic ITIMs, which recruit phosphatases and generally are dominant over the activating receptors. These receptors fall into two categories: IgSF and C-type lectin group V(II). In general, NKRs recognize MHC class I molecules of either the classical or nonclassical type, the latter sometimes encoded by viruses.163
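The interplay of activating and inhibitory receptors described here can be caricatured as a simple decision rule. The Python sketch below is a toy model under a strong simplifying assumption, namely that any engaged ITIM-bearing receptor fully overrides activation; the text says only that inhibition is "generally" dominant, so real outcomes depend on the balance and strength of signals. The `Target` type and `nk_kills` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """A candidate target cell, described only by the ligands it displays."""
    activating_ligands: int  # ligands engaging ITAM-coupled activating receptors
    inhibitory_ligands: int  # ligands (eg, MHC class I) engaging ITIM receptors


def nk_kills(target: Target) -> bool:
    """Toy rule: inhibition is treated as fully dominant (an assumption).

    Killing occurs only when at least one activating receptor is engaged
    and no inhibitory receptor is engaged.
    """
    if target.inhibitory_ligands > 0:
        return False
    return target.activating_ligands > 0


healthy_cell = Target(activating_ligands=1, inhibitory_ligands=1)
missing_self_cell = Target(activating_ligands=1, inhibitory_ligands=0)
inert_cell = Target(activating_ligands=0, inhibitory_ligands=0)

for label, cell in [("healthy", healthy_cell),
                    ("missing self", missing_self_cell),
                    ("inert", inert_cell)]:
    print(f"{label}: killed={nk_kills(cell)}")
```

Under this rule a cell that has lost its inhibitory ligand but still displays an activating ligand is killed, which corresponds to the "missing self" recognition mentioned earlier in this section.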

General Evolution of Natural Killer Receptor Families. As mentioned, NK cells in mammals can use different types of receptors, even encoded by different gene families, IgSF (KIR) in primates and C-type lectins (Ly49) in rodents. A few receptors are conserved but most others are highly variable. Very few families show conservation of domains throughout the jawed vertebrates.163 When dealing with the origin of these genes in invertebrates, one has to imagine under what pressures they evolved. The question remains as to whether NK cells, or NK-like cells, preceded the emergence of T- and B-lymphocytes.

As described in the introduction, NKRs are the most rapidly evolving molecular component of the gnathostome immune system. Most ligands for these diverse NKRs are MHC class I molecules, or molecules of host or pathogen origin related to MHC class I. The KIR families are divergent, as very few genes are conserved even between chimpanzees and humans, and there are different numbers of genes in KIR haplotypes within a species. Humans have two KIR haplotypes, A and B, one encoding a large number of activating receptors and the other very few.23 It is speculated that the haplotypes are under balancing selection within the population both for defense against virus and for maternal/fetal interactions. By contrast, CD94 and NKG2 receptors are conserved throughout mammals. So whereas receptors for polymorphic class I molecules are divergent, those for nonpolymorphic or stress-induced class I molecules are relatively conserved (despite the fact that their ligands Qa1 and HLA-E are not orthologous). NK cells also play important roles in other innate immune responses, for example in antiviral immunity. NK cell recognition of virus-infected cells engages the activating KIR and Ly49 receptors and NKG2D. Thus, viruses are hypothesized to supply the evolutionary pressure on diversification of NKRs. In fact, it has been shown in mice that inhibitory receptors can rapidly mutate into activating receptors when viral “decoy” class I molecules evolve to engage inhibitory receptors.164 Generally speaking, inhibitory receptors are older and more conserved, whereas activating receptors evolve more rapidly and can be derived from inhibitory receptors via mutations that result in loss of the ITIM.165

Comparative Studies of Natural Killer Function. NK cells were detected in Xenopus by in vitro 51Cr-release assays. Splenocyte effectors from early thymectomized frogs spontaneously lyse allogeneic thymus tumor cell lines that lack MHC antigen expression.166 This activity is increased after the injection of tumor cells or after treating the splenocytes in vitro with mitogens, suggesting lymphokine activation of the killers. Splenocytes isolated with an anti-NK monoclonal antibody (mAb) revealed large lymphoid cells with distinct pseudopodia. Immunohistology indicated that each anti-NK mAb routinely labeled cells within the gut epithelium but NK cells were difficult to visualize in spleen sections.167
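Results of 51Cr-release assays such as those mentioned above are conventionally reported as percent specific lysis, calculated from the radioactivity released by targets incubated with effectors, by targets alone (spontaneous release), and by fully lysed targets (maximum release). The Python sketch below shows that standard calculation; the counts used in the example are hypothetical and are not taken from the Xenopus studies.

```python
def percent_specific_lysis(experimental_cpm: float,
                           spontaneous_cpm: float,
                           maximum_cpm: float) -> float:
    """Percent specific lysis from a chromium-release assay.

    experimental_cpm: counts released by targets incubated with effector cells
    spontaneous_cpm:  counts released by targets incubated alone
    maximum_cpm:      counts released by completely lysed targets
    """
    if maximum_cpm <= spontaneous_cpm:
        raise ValueError("maximum release must exceed spontaneous release")
    return 100.0 * (experimental_cpm - spontaneous_cpm) / (maximum_cpm - spontaneous_cpm)


# Hypothetical counts per minute, for illustration only
lysis = percent_specific_lysis(experimental_cpm=1500,
                               spontaneous_cpm=500,
                               maximum_cpm=2500)
print(f"specific lysis: {lysis:.1f}%")  # prints "specific lysis: 50.0%"
```

Subtracting spontaneous release from both numerator and denominator corrects for label that leaks from targets in the absence of effectors.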

In amphibians, NK cell studies are especially interesting because of natural experiments (ie, the absence or low levels of classical MHC class I during larval life of some species like Xenopus).168 The Xenopus cells are bona fide NK cells, distinct from T cells, as they fail to express TCR Vβ transcripts. NK cells emerge in late larval life, 7 weeks postfertilization, which is about 2 weeks after the time when cell surface class I can be detected. The proportion of splenic NK cells remains very low until 3 to 4 months of age, but by 1 year there is a sizeable population. Therefore, NK cells fail to develop prior to normal expression of MHC class I protein (at least NK cells of the type that can be measured with these assays and with NK cell-specific mAbs) and do not contribute to the larval immune system, whereas they do provide an important backup for T cells in the adult frog by contributing to antitumor immunity.

NK cells have also been described in a number of teleost fish, with the most in-depth studies in catfish, in which there are clonal lines of cytotoxic cells,167 some of which clearly lack TCR expression.169 A subset of the fish NK cells bears a high-affinity FcR that can be utilized for antibody-dependent cellular cytotoxicity.170 Other subsets of NK cells spontaneously kill allogeneic targets. Further study of these cloned lines may provide much needed information on NK function in phylogeny.

Phylogeny of Natural Killer Lectins

Besides the well-described Ly49 family of receptors in rodents (and horse) and the CD94 and NKG2 families in all mammals, other mammalian NKRs are of interest. Studies in mammals have shown that some NKC-encoded lectin-like receptors in the Nkrp1 family can recognize other lectin-like molecules, termed Clr, also encoded in the NKC.171 Having linked loci encoding receptor-ligand pairs suggests a genetic strategy to preserve this interaction; perhaps the CD94 homologs of invertebrates are genetically linked to genes encoding their ligands. In addition, as described in the following, the close genetic linkage of receptor and ligand genes is a common theme in “histocompatibility reactions” throughout the animal and plant kingdoms.

In chickens, a single gene similar to CD94/NKG2 is encoded in a region syntenic with the mammalian NKC.172 It is linked to CD69, another C-type lectin also encoded in the NKC of mouse/human. The chicken and quail MHC encode two C-type lectin NKRs, B-lec and B-NK, the latter being most similar to NKRP1.173 Other C-type lectins are found in the RFP-y locus, including one that is also most similar to NKRP1.174 These linkages give credence to the idea that the NKC and MHC were syntenic in early jawed vertebrates (see the following). While C-type lectin genes with some similarity to mammalian NKRs have been detected in ectothermic vertebrates, no convincing orthology or synteny to the NKC has been found to date. In mammals including marsupials, NKG2D is conserved, and CD94/NKG2 is found in mammals and birds (as well as NKRP1).163 If such C-type lectin NKRs are found in the future in cold-blooded vertebrates, they will have to be studied in functional assays.

Given the apparent lack of MHC class I and class II in agnathans and their convergently acquired adaptive immune system (see the following), it is difficult to envisage how NK cells with receptors of any type might function in these animals. It should be mentioned, however, that sequence similarity might be difficult to detect for an MHC peptide-binding region (PBR), given the rapid rate of evolution of this gene family. Furthermore, it would not be shocking if there were NK cells with ligands encoded by other gene families—in mammals, ligands for some activating NKRs have not been identified. It would be of interest to study the non-VLR-expressing lymphocytes in agnathans (if such cells exist) for their killing potential or gene expression.

Phylogeny of Immunoglobulin Superfamily Natural Killer Receptors

IgSF-activating receptors have been recognized from cartilaginous fish onwards, with a convincing activating NKp44 homolog first found in carp (called NILT), but with no functional data.175 Subsequently, this family was found in other bony and cartilaginous fish, and definitive orthology to mammalian NKp44 was shown. The activating receptor NKp30 is also conserved, with orthologs found to the level of cartilaginous fish176; in addition, as described in the MHC section, there are V(J) genes within the frog MHC called XMIV that are ancient homologues of NKp30 and may be NKRs of both activating and inhibitory types.177 The ligand for NKp30 was recently identified as the stress-induced molecule B7H6.178 Interestingly, there is a perfect correlation between the presence or absence of NKp30 and B7H6 in the vertebrate line, with both genes lacking in birds and bony fish but present in all other gnathostome classes.176 Furthermore, phylogenetic trees suggest a close relationship between NKp30 and NKp44, consistent with their ancient origins within the vertebrate line.

There are other bony fish-specific IgSF NKR families. One family, the novel immune-type receptors (NITRs),179 has one or two Ig domains with a charged residue in the TM and could therefore associate (by analogy) with an ITAM-containing DAP12 equivalent, and indeed an interaction with mammalian DAP12 was shown.162 Like NKp30, the NITR N-terminal V domain is also of the VJ type. NITRs were originally believed to be part of the LRC, but this was shown to be unlikely upon further analyses. NITRs can be expressed by cells of the hematopoietic lineage, presumably lymphocytes. NITRs have been found in all bony fish, with rapid contraction and expansion of the gene family, and with the majority of proteins predicted to have inhibitory ITIMs in their cytoplasmic tails. In zebrafish, there are many NITR genes that group into 12 distinct families.180 An extreme level of allelic polymorphism is apparent, along with haplotype variation and family-specific isoform complexity. By contrast, only 11 related genes encoding distinct structural forms have been identified in the channel catfish, and the relatively small number of genes allowed functional studies to be performed. Additionally, taking advantage of the ability to grow clonal lines of catfish hematopoietic cells, one granular cell line lacking all markers of B/T cells was shown to express several NITRs. Expressed NITRs were fused to an ITAM-containing motif and transfected into a T-cell hybridoma line with a nuclear factor of activated T cells (NFAT) promoter, and the specificity for particular catfish MHC alleles was maintained.181 Subsequent crystal structure analysis showed the NITR V domains to form dimers, much the same as Ig/TCR heterodimeric V regions. Thus, the sequence analysis (ITAM/ITIM), signaling properties, involvement in cytotoxicity, and recognition of MHC molecules identify the NITRs as excellent candidates for NKRs in bony fish. Furthermore, the work serves as a paradigm for study of potential NKRs when homology and/or conserved synteny are lacking or ambiguous.

IgSF inhibitory receptors usually form larger families of molecules in comparison to activating receptors. This function can be devoted to two distinct families of receptors, giving another example of the extremely rapid evolution of these molecules. There are many ITIM-carrying IgSF integral membrane receptors across the classes of vertebrates, and they seem to have had independent histories, as it is difficult to convincingly detect orthologous genes between species. This is especially true of multigene families with members equipped with possible ITIMs, including the teleost NITR and LITR, and bird CHIR and CD300L.163 Several members of these families can be expressed on NK cells, but expression studies are in their infancy in fish. It is sometimes difficult to distinguish FcR families from NK KIR-like domains, and both FcR and KIR seem to stem from the same lineage. Given the role of the FcR in binding to bona fide antibodies and conferring specificities on cells of the innate arm of the immune system, it is likely that these molecules will be restricted to jawed vertebrates. The KIR activity that can incorporate pathogen and virus recognition may be more primitive, but the ancestry of KIR is not well understood, and the bona fide KIR family seems to be restricted to primates.163

Genes encoding the classical FcRs on the long arm of human chromosome 1 (1q21-23) are linked to other FcR-like genes. A large multigene family, which includes genes encoding the FcγR and the NK cell Ig-like receptors, is located in the LRC (human chromosome 19q13). This region could in fact be paralogous to 1q23 and may even have been originally associated with the MHC (see the following). These families belong to a larger class of activating or inhibitory receptors. Their phylogenetic conservation in birds, amphibians, and bony fish suggests a biologic importance even though the size of the families, their expression pattern, and the specific nature of the receptors vary greatly among species.182 In several cases, a commitment to a task in the immune system may not be conserved among homologous members, and the evolutionary fate of the family will probably be affected. Comparison of key residues in the domains may suggest a possible common involvement in MHC recognition for the two families recently discovered in birds (CHIR)183 and the teleosts (LITR).184 Other families were generated within a single class or even within a single order of vertebrates (eg, the KIR described previously). The relationships of KIR with many other multigene families such as IpLITR or NITR remain to be explored. What was the scenario that led to the present mammalian situation? If the fish observation on potential MHC binding holds true, the IgSF type of receptor seems to be the most primitive NKR.163

Other Immunoglobulin Superfamily Families to Explore Further

In more primitive vertebrates, the physical or genetic linkage of relatively large IgSF families is well documented in the teleost NITR but not yet elucidated in the case of other interesting families in prochordates like the VCBP. In the sea urchin genome, many IgSF genes await a complete analysis and will certainly contribute to a better understanding of the evolution and origin of Ig/TCR.48 Large families of LRR-IgSF in amphioxus185 could perhaps represent interesting intermediaries in the genesis of either VLR in agnathans or Ig/TCR in gnathostomes (see Fig. 4.10). In hagfish, the discovery of the leukocyte-expressed agnathan paired receptors (APARs) revealed what might have been a precursor of Ig or TCR.186 APARs resemble Ag receptors and are predicted to encode a group of membrane glycoproteins with organizations characteristic of paired Ig-like receptors. Based on their transmembrane regions, APAR-A molecules are likely to associate with an adaptor molecule with an ITAM and function as activating receptors. In contrast, APAR-B molecules with an ITIM are likely to function as inhibitory receptors. APAR V domains have a J region and are more closely related to those of TCR/B-cell receptor (BCR) than any other V-type domain identified to date outside of jawed vertebrates. Thus, the extracellular domain of APAR may be descended from a VJ-type domain postulated to have acquired recombination signal sequences (RSS) in a jawed vertebrate lineage (see Fig. 4.10).

In jawed vertebrates, three such receptor families with VJ-type domains have been identified: a small family of mammalian proteins known as signal-regulatory proteins, the large family of teleost NITRs described previously, and the MHC-linked XMIV in Xenopus. These molecules are examined in more detail in the conclusion. Many IgSF proteins expressed in the immune system are also expressed in the nervous system, where the signaling cascades may be conserved. This selection of IgSF domains, in two different systems in which homologies are found in molecules with quite different functions, may reveal adaptation capacities and constraints exerted on surface receptors.187


Not long ago, it was believed that only jawed vertebrates had a true adaptive immune system. From the previous discussion of invertebrate immune responses, clearly mechanisms exist to generate high levels of immune diversity—even at the somatic level—one of the hallmarks of an adaptive response.7 As described in the introduction, many in the field agree that the boundary between innate and adaptive immunity is artificial, and it may not be a useful dichotomy when studying immune responses in diverse organisms.61,62 Despite this reluctance to exclusively classify systems as innate or adaptive, some features clearly fall into the latter category, such as clonal expansion of uncommitted lymphocytes and specific memory. These conditions are not fulfilled for the DSCAM, FREP, 185/333, or any other invertebrate systems described previously, but of course we must be open to new mechanisms besides the conventional outlook of adaptive immunity. Additionally, such adaptive immunity arose in concert with the emergence of lymphocytes in the lower chordates, as these cells clearly are the major players; we will discuss this in the conclusion.


A typical Ig molecule is composed of four polypeptide chains (two heavy [H] and two light [L]) joined into a macromolecular complex via several disulfide bonds (Fig. 4.7). Each chain is composed of a linear combination of IgSF domains, and almost all molecules studied to date can be expressed in secreted or transmembrane forms.

Immunoglobulin Heavy Chain Isotypes

Like all other building blocks of the adaptive immune system, Ig is present in all jawed vertebrates (see Fig. 4.7). Consistent with studies of most molecules of the immune system, the sequences of IgH chain C region genes are not well conserved in evolution and insertions and deletions in loop segments occur more often in C than in V domains. As a consequence, relationships among non-µ isotypes (and even µ isotypes among divergent taxa) are difficult to establish.188 Despite these obstacles, the field has developed a working evolutionary tree among all of the isotypes.

Immunoglobulin M

IgM is present in all jawed vertebrates and has been assumed to be the primordial Ig isotype. It is also the isotype expressed earliest in development in all tetrapods; until recently, it was believed to be the case in fish as well, but this view has changed (see the following). The secretory µ H chain is found in all vertebrates, usually consists of one V and four C1 domains, and is heavily glycosylated. H chains associate with each other and with L chains through disulfide bridges in most species, and IgM subunits form pentamers or hexamers in all vertebrate classes except teleost fish, which form tetramers.189 The µ CH4 domain is the most evolutionarily conserved, especially in its C-terminal region, whereas the CH2 domain evolves at the fastest rate.188 There are several µ-specific residues in each of the four CH domains among vertebrates, suggesting a continuous line of evolution, which is supported by phylogenetic analyses. Like TCR TM regions, µ TM regions are also well conserved among sharks, mammals, and amphibians, but the process by which the Ig TM messenger RNA is assembled varies in different species. In all vertebrate classes except teleosts, the µ TM region is encoded by separate exons that are spliced to a site on µ messenger RNA located approximately 30 basepairs from the end of the CH4-encoding exon. By contrast, splicing of teleost fish µ messenger RNA takes place at the end of the CH3 exon.190 In holostean fish (gar and sturgeon), cryptic splice donor sites are found in the CH4 sequence that could lead to conventional splicing, but in the bowfin there is another cryptic splice donor site in CH3.191 The TM region itself is interesting as it is the only one that does not contain a residue capable of making an ionic bond with the ITAM-containing molecule (in this case, Ig-α and Ig-β). As mentioned, some modifications apparently related to the particular environment were noticed in the Antarctic fish Trematomus bernacchii. There are two remarkable insertions, one at the VH-CH1 boundary and another at the CH2-CH3 boundary; the latter insertion results in a very long CH2-CH3 hinge region. Rates of nonsynonymous substitutions were high in the modified regions, suggesting strong selection for these modifications. These unusual features (also unique glycosylation sites) may permit flexibility of this IgM at very low temperatures.192

FIG. 4.7. Immunoglobulin (Ig) Isotypes in the Jawed Vertebrates. A: Mammalian isotypes and their relationships to Igs in vertebrates from other classes. Each oval represents an Ig superfamily C1 domain. IgD is shown in two forms, mouse (left) and human (right).2 B: Major Ig isotypes in all vertebrates. The bottom panel displays the approximate divergence times of all isotypes. IgM/D/W was found at the inception of adaptive immunity. 1, IgX is in the IgA column because it is preferentially expressed in the intestine, and IgA seems to have been derived from an IgX ancestor; IgX seems to have been derived from both IgM and IgY ancestors; 2, secreted IgM in teleost fish is a tetramer, and the transmembrane (TM) form only has three C domains; 3, the teleost fish IgD H chains incorporate the µC1 domain via alternative splicing in the TM form, and the secretory form does not have a V region and does not associate with L chains; 4, the new bony fish isotype, IgZ/T, may not be found in all fish species; 5, the secreted form of shark/skate IgM is present as a pentamer and monomer at approximately equal levels; 6, the TM form of IgW has four C domains; 7, a major TM form of IgNAR has three C domains, and IgNAR is related to camelid IgG by convergent evolution; 8, no TM form has been found (to date) for IgM1gj.

It has long been known that IgM is present at very high levels in the plasma of all elasmobranchs and that it is found in two forms: multimeric (19S) and monomeric (7S).193 It is unlikely that the two forms are encoded by different gene clusters because 1) peptide maps are identical; 2) early work by Clem found the sequences of the cysteine-containing tail of 19S and 7S H chains to be identical; and 3) all identified germline VH families are represented for the 19S form.194 Although most studies (but not all) reported that 19S and 7S are not differentially regulated during an immune response, a recent study found that the 19S response wanes over time and a stable 7S titer is maintained for periods of up to 2 years after immunization.195 In addition, antigen-specific 7S antibodies observed late in the response have a higher binding strength than those found early, suggesting a maturation of the response, a finding also generally at odds with the previous literature. Finally, when specific antibody titers were allowed to drop, a memory response was observed that was exclusively of the 7S class. This work has shown that a “switch” indeed occurs in the course of an immune response; whether the “switch” is due to an induction of the 19S-producing cells to become 7S producers or whether there are separate lineages of 19S- and 7S-producing B cells is an open question. One working hypothesis is that J chain expression is important for regulating whether a B cell makes 19S or 7S Ig, but of course that could be at the lineage level or the switch level (see the following).

Immunoglobulin M1gj

The nurse shark (Ginglymostoma cirratum) expresses an IgM subclass in neonates.196 Its VH gene underwent V-D-J rearrangement in germ cells (“germline-joined” or “gj,” see the following). Expression of H1gj is detected in primary and secondary lymphoid tissues early in life, but in adults only in the primary lymphoid tissue, the epigonal organ (see the following). H1gj associates covalently with L chains and is most similar in sequence to IgM H chains, but like mammalian IgG it has three rather than the typical four IgM constant domains; deletion of the ancestral IgM second domain thus defines both IgG and IgM1gj. Because sharks are in the oldest vertebrate class known to possess antibodies, unique or specialized antibodies expressed early in ontogeny in sharks and other vertebrates were likely present at the inception of the adaptive immune system. It is suggested that this isotype interacts either with a common determinant on pathogens or a self-waste product.

Immunoglobulin New Antigen Receptor and New Antigen Receptor-T-Cell Receptor

A dimer found in the serum of nurse sharks and so far restricted to elasmobranchs, IgNAR is composed of two H chains, each containing a V domain generated by rearrangement and five constant C1 domains.197 IgNAR was originally found in sera, but TM forms exist as complementary DNA and cell-surface staining is detected with specific mAbs. The single V resembles a fraction of camel/llama (camelid) IgG that binds to antigen in a monovalent fashion with a single V region, but it clearly was derived by convergent evolution. In phylogenetic trees, NAR V domains cluster with TCR and L chain V domains rather than with VH domains. A molecule with similar characteristics has also been reported in ratfish, although it was derived independently from an ancestral Ig, just as the camelid molecule emerged from bona fide IgG.198 IgNAR V region genes accumulate a high frequency of somatic mutations (see the following).

The crystal structure of a Type I IgNAR V region showed that, in contrast to typical V regions, it lacked CDR2 and had a connection between the two IgSF sheets much like an IgSF C domain.199 The domain wraps around its antigen (hen egg lysozyme [HEL]), with the CDR3 penetrating into the active site of the enzyme. The structure of a Type II V region has a disulfide bond between CDR1 and CDR3 that forces the most diverse regions of the molecule to form a raised loop, similar to what has been described for camelid V domains. In total, the differential placement of disulfide bonds forces major changes in the orientations of CDR1 and CDR3, and provides two major conformations for antigen binding.200 The structure of a Type II NAR bound to HEL showed that it also interacted with the active site of the enzyme.

An analysis of the TCRVδ repertoire in nurse sharks detected an entirely new form of this chain, comprising three domains, V-V-C.201 The C is encoded by the single-copy Cδ gene, and the membrane-proximal V is encoded by a Vδ gene that rearranges to the DJδ elements. The membrane-distal V domain is encoded by a gene in the NAR family, found in a rearranging VDJ cluster typical of all cartilaginous fish Ig clusters. The NAR-TCR V genes, unlike IgNAR V genes, which have three D segments, have only a single D region in each cluster. The particular Vδ loci linked to each NAR-TCR gene (called NAR-TCR-supporting Vδ) encode a cysteine in CDR1 that likely makes a disulfide bridge with the NAR-TCR V domain. The J segment of the rearranged NAR-TCR V gene splices at the RNA level directly to the supporting Vδ segment, which has lost its leader exon (Fig. 4.8). This organization likely arose from an IgNAR V cluster that translocated to the TCRδ locus upstream of a Vδ gene segment. After modifications of the supporting Vδ genes, this entire V-V gene set duplicated and diverged several times in different species of sharks. About 25% of the expressed nurse shark TCR δ repertoire is composed of this TCR (encoded by 15 to 20 V-V genes in this species), and we have proposed that the typical γ/δ TCR acts as a scaffold upon which sits the single-chain NAR V.

Our interpretation is that, true to the proposal that γ/δ TCRs interact with free antigen, the NAR V is providing a binding site that can interact with antigen in a different way than conventional heterodimeric Vs. Thus, this is the first case in which a particular V region family has been shown to be associated with both a BCR and a TCR; in the case of the BCR, the function likely resides within the Fc portion of IgNAR, and for the TCR the function (cytokine secretion, killing) lies within the T cell itself. Interestingly, a second TCRδ chain locus has also been described in marsupials and monotremes with properties similar to NAR-TCR.202 In this case, there are also two V domains, but in marsupials one (proximal to the membrane) is germline-joined, and only the membrane-distal domain undergoes rearrangement. In monotremes, both of the V regions undergo rearrangement, as for the NAR-TCR. This new TCRδ locus is preferentially expressed early in development. This type of TCR is described in more detail in the TCR section.
