salomonsen, jan, chattaway, john a, chan, andrew c y

24
This is a peer-reviewed, final published version of the following document and is licensed under Creative Commons: Attribution 4.0 license: Salomonsen, Jan, Chattaway, John A, Chan, Andrew C Y, Parker, Aimée, Huguet, Samuel, Marston, Denise A, Rogers, Sally L, Wu, Zhiguang, Smith, Adrian L, Staines, Karen, Butter, Colin, Riegert, Patricia, Vainio, Olli, Nielsen, Line, Kaspers, Bernd, Griffin, Darren K, Yang, Fengtang, Zoorob, Rima, Guillemot, Francois, Auffray, Charles, Beck, Stephan, Skjødt, Karsten and Kaufman, Jim (2014) Sequence of a complete chicken BG haplotype shows dynamic expansion and contraction of two gene lineages with particular expression patterns. PLoS Genetics, 10 (6). e1004417. doi:10.1371/journal.pgen.1004417 Official URL: http://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1004417 DOI: http://dx.doi.org/10.1371/journal.pgen.1004417 EPrint URI: http://eprints.glos.ac.uk/id/eprint/1999 Disclaimer The University of Gloucestershire has obtained warranties from all depositors as to their title in the material deposited and as to their right to deposit such material. The University of Gloucestershire makes no representation or warranties of commercial utility, title, or fitness for a particular purpose or any other warranty, express or implied in respect of any material deposited. The University of Gloucestershire makes no representation that the use of the materials will not infringe any patent, copyright, trademark or other property or proprietary rights. The University of Gloucestershire accepts no liability for any infringement of intellectual property rights in any material deposited but will remove such material from public view pending investigation in the event of an allegation of any such infringement. PLEASE SCROLL DOWN FOR TEXT.

Upload: others

Post on 28-Oct-2021

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Salomonsen, Jan, Chattaway, John A, Chan, Andrew C Y

This is a peer-reviewed, final published version of the following document and is licensed underCreative Commons: Attribution 4.0 license:

Salomonsen, Jan, Chattaway, John A, Chan, Andrew C Y, Parker, Aimée, Huguet, Samuel, Marston, Denise A, Rogers, Sally L, Wu, Zhiguang, Smith, Adrian L, Staines, Karen, Butter,Colin, Riegert, Patricia, Vainio, Olli, Nielsen, Line, Kaspers, Bernd, Griffin, Darren K, Yang, Fengtang, Zoorob, Rima, Guillemot, Francois, Auffray, Charles, Beck, Stephan, Skjødt, Karsten and Kaufman, Jim (2014) Sequence of a complete chicken BG haplotype shows dynamic expansion and contraction of two gene lineages with particular expression patterns. PLoS Genetics, 10 (6). e1004417. doi:10.1371/journal.pgen.1004417

Official URL: http://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1004417DOI: http://dx.doi.org/10.1371/journal.pgen.1004417EPrint URI: http://eprints.glos.ac.uk/id/eprint/1999

Disclaimer

The University of Gloucestershire has obtained warranties from all depositors as to their title in the material deposited and as to their right to deposit such material.

The University of Gloucestershire makes no representation or warranties of commercial utility, title, or fitness for a particular purpose or any other warranty, express or implied in respect of any material deposited.

The University of Gloucestershire makes no representation that the use of the materials will notinfringe any patent, copyright, trademark or other property or proprietary rights.

The University of Gloucestershire accepts no liability for any infringement of intellectual property rights in any material deposited but will remove such material from public view pending investigation in the event of an allegation of any such infringement.

PLEASE SCROLL DOWN FOR TEXT.

Page 2: Salomonsen, Jan, Chattaway, John A, Chan, Andrew C Y

This is the final published version and is licensed under a Creative Commons Attribution 4.0

International License.

Salomonsen, Jan and Chattaway, John A and Chan, Andrew C

Y and Parker, Aimée and Huguet, Samueland Marston, Denise

A and Rogers, Sally L and Wu, Zhiguang and Smith, Adrian

L and Staines, Karenand Butter, Colin and Riegert,

Patricia and Vainio, Olli and Nielsen, Line and Kaspers,

Bernd and Griffin, Darren K and Yang, Fengtang and Zoorob,

Rima and Guillemot, Francois and Auffray, Charles and Beck,

Stephan and Skjødt, Karsten and Kaufman, Jim (2014). Sequence of

a complete chicken BG haplotype shows dynamic expansion and

contraction of two gene lineages with particular expression

patterns. PLoS genetics, 10 (6). e1004417. ISSN 1553-7404

Published in PLOS ONE, and available online at:

http://journals.plos.org/plosgenetics/article?id=1...

We recommend you cite the published version.

The URL for the published version is

http://journals.plos.org/plosgenetics/article?id=1...

Disclaimer

The University of Gloucestershire has obtained warranties from all depositors as to their title

in the material deposited and as to their right to deposit such material.

The University of Gloucestershire makes no representation or warranties of commercial

utility, title, or fitness for a particular purpose or any other warranty, express or implied in

respect of any material deposited.

The University of Gloucestershire makes no representation that the use of the materials will

not infringe any patent, copyright, trademark or other property or proprietary rights.

The University of Gloucestershire accepts no liability for any infringement of intellectual

property rights in any material deposited but will remove such material from public view

pending investigation in the event of an allegation of any such infringement.

PLEASE SCROLL DOWN FOR TEXT.

Page 3: Salomonsen, Jan, Chattaway, John A, Chan, Andrew C Y

This work is licensed under a Creative Commons Attribution-NonCommerciai-ShareAiike 4.0 International License.

Page 4: Salomonsen, Jan, Chattaway, John A, Chan, Andrew C Y

OPEN@ ACCESS Freely available online ·.<W·PLOS I GENETICS

Sequence of a Complete Chicken BG Haplotype Shows Dynamic Expansion and Contraction of Two Gene Lineages with Particular Expression Patterns Jan Salomonsen 1

'2

'3

'aa, John A. Chattaway4", Andrew C. Y. Chan4

, Aimee Parker4"b, Samuel Huguet4"c,

Denise A. Marston5"d, Sally L. Rogers5"e, Zhiguang Wu 5"r, Adrian L. Smith5

'6

, Karen Staines5,

Colin Butter5, Patricia Riegert1

" 9 , Olli Vainio 1'7

, Line Nielsen2"h, Bernd Kaspers8

, Darren K. Griffin9,

Fengtang Yang 10, Rima Zoorob11

'12

"1, Francois Guillemot12"l, Charles Auffray12

"\ Stephan Beck10"

1,

Karsten Skj0dt 1' 13

, Jim Kaufman 1'4

'5

' 14*

1 Basel Institute for Immunology, Basel, Switzerland, 2 Department of Veterinary Disease Biology, University of Copenhagen, Copenhagen, Denmark. 3 Department of

International Health, Immunology and Microbiology, University of Copenhagen, Copenhagen, Denmark. 4 Department of Pathology, University of Cambridge, Cambridge,

United Kingdom, 5 Pirbright Institute (formerly Institute for Animal Health), Compton, United Kingdom, 6 Department of Zoology, Oxford University, Oxford, United

Kingdom, 7 Department of Medical Microbiology, University of Oulu and Nordlab, Oulu, Finland, Blnstitute for Animal Physiology, Department of Veterinary Sciences,

Ludwig Maximllians University, Munich, Germany, !J School of Biosciences, University of Kent, Canterbury, United Kingdom, 10 Wellcome Trust Sanger Institute, Hinxton,

United Kingdom, 11 1nsti tute for Cellular and Molecular Embryology, CNRS UMR 7128, Nogent-sur-Marne, France, 121nstitute Andre Lwoff, CNRS FRE 2937, Villejuif,

France, 13 Department of Cancer and lnOammatlon, University of South Denmark, Odense, Denmark. 14 Department o f Veterinary Medicine, University of Cambridge,

Cambridge, United Kingdom

Abstract

Many genes important in immunity are found as multigene families. The butyrophilin genes are members of the B7 family, playing diverse roles in co-regulation and perhaps in antigen presentation. In humans, a fixed number of butyrophilin genes are found in and around the major histocompatibility complex (MHC), and show striking association with particular autoimmune diseases. In chickens, BG genes encode homologues with somewhat different domain organisation. Only a few BG genes have been characterised, one involved in actin-myosin interaction in the intestinal brush border, and another implicated in resistance to viral diseases. We characterise all BG genes in 1312 chickens, finding a multigene family organised as tandem repeats in the BG region outside the MHC, a single gene in the MHC (the BF-BL region), and another single gene on a different chromosome. There is a precise cell and t issue expression for each gene, but overall there are two kinds, those expressed by haemopoietic cells and those expressed in tissues (presumably non-haemopoietic cells), correlating with two different kinds of promoters and 5' untranslated regions (S' UTR). However, the mult igene family in the BG region contains many hybrid genes, suggesting recombination and/or deletion as major evolutionary forces. We identify BG genes in the chicken whole genome shotgun sequence, as well as by comparison to other haplotypes by fibre fluorescence in situ hybridisation, confirming dynamic expansion and contraction within the BG region. Thus, the BG genes in chickens are undergoing much more rapid evolution compared to their homologues in mammals, for reasons yet to be understood.

Citat ion: Salomonsen J, Chattaway JA, Chan ACY, Parker A, Huguet S, et al. (2014) Sequence of a Complete Chicken BG Haplotype Shows Dynamic Expansion and Contraction of Two Gene Lineages with Particular Expression Patterns. PLoS Genet 10(6): e1004417. doi:10.1371/journal.pgen.1004417

Editor: Scott Edwards, Harvard University, United States of America

Received May 30, 2013; Accepted April 14, 2014; Published June 5, 2014

Copyright: © 2014 Salomon sen et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which perm~s unrestricted use, d istribution, and reproduction in any medium, provided the original author and source are credited.

Funding: This work was originally supported by core funding to the Basel Institute for Immunology (which was founded and supported by F. Hoffmann-La Roche & Co. Ltd. CH-4005 Basel, Switzerland) and the CNRS, and then by core funding to the Institute for Animal Health (which was sponsored by the Biotechnology and Biological Sciences Research Council (BBSRC) of the UK). More recently, th is work was supported by the Wellcome Trust, through a studentshlp RG49834 filled by JAC and programme grant 089305 to JK. FY is supported by Wellcome Trust core funding (WT0980S 1). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing Interests: The authors have declared that no competing interests exist.

• E-mail: [email protected]

~These authors contributed equally to this work.

aa Current address: Department of International Health, Immunology and Mkrobiology, University of Copenhagen, and Niel s Steensens Gymnasium, Copenhagen, Denmark ab Current address: Insti tute for Food Research, Norwich Research Park, Norwich, Norfolk. United Kingdom ac Current address: Department of Biomedical Science, University of Sheffield, Sheffield, United Kingdom ad Current address: Animal Health and Veterinary laboratories Agency, New Haw, Addlestone, United Kingdom ae Current address: School of Natura l and Social Sciences, University of Gloucestershire, Cheltenham, United Kingdom of Current address: The Roslin Institute, University of Edinburgh, Easter Bush, Midlothian, Scotland, United Kingdom ag Current address: Retired, Helfranzkirch, France oh Current address: Department of Veterinary Disease Biology, University of Copenhagen, Copenhagen, Denmark oi Current address: INSERM UMR-S 945, H6pital Pitit'-SalpoHriere, Paris, France oj Current address: National Institute for Medical Research, London, United Kingdom ok Current address: European Institute for Systems Biology and Medicine, CNRS·ENS·UCBL, Universite de Lyon, Lyon, France of Current address: UCL Capcer Institute, University College London, London, United Kingdom

PLOS Genetics I www.plosgenetics.org June 2014 I Volume 10 I Issue 6 I e1004417

® CrossMark

d,_\~ "'ow..c

Page 5: Salomonsen, Jan, Chattaway, John A, Chan, Andrew C Y

Introduction

i\'lany of the genes involved in immunity are part of multigene f.·unilies. In some f.'lmilics, each gene is conserved for a specific function dedicated to a particular outcome, in others allelic polymoq>hism and copy number vmiation allow rapid evolution in response to new challenges, and in still other f.'!milies both kinds of genes arc found. Some well-characterised examples for adaptive immunity include genes of the m:uor histocompatibility complex (MHC). for example, both the MHC class I and class II genes of humans and other higher apes have been relativcl)' stable 0\·er I 0 million years (i\Iy), whereas these genes have undergone many changes including extreme copy number variation (CNVJ in monkeys [I ,2J. Further examples out of many are the genes encoding natural killer (NK) receptors, which not only undergo enormous CNV, but even use different structural f.'!milies to carry out similar functions [3,4) . Understanding the forces i1woh·ed in this complex inteq>lay of genomic structure, biological function and evolution is one of the challenges of modern genetics, with intense theoretical and experimental interest over many decades [for example: 5- 22].

The regions in and around the manunalian MHC also include genes involved in innate inununity, such as the famil)• of butyrophilin (and butyrophilin-like) genes for which an important role in the immune response is emerging. These genes are members of the 137 gene superf.'lmil)•, many members of which arc involved in immune co-regulation L23- 26J. Some bul)~·ophilin

molecules function as inhibit01y co-regulators, some may be involved in recognition of stress responses by yo T cells, while others seem to have more specialised functions (such as spllhesis of milk f.'lt globules) and the functions of still others are as yet unknown [23- 3+]. Most importantly, bul)•rophilin genes have strong genetic associations with a variety of diseases in humans [35- 49]. These genes encode transmembra ne glycoproteins with two extracellular immunoglobulin (Ig)-like domains (one or four for butyrophilin-like molecules), and a few C)'toplasmic heptad repeats followed by a B30.2 (or PRY-SPRY) domain [23-25,34,50,51). In humans, one of these genes is located in the .MHC and the others in the extended i\HfC region, while in mouse some of these genes have been translocated elsewhere L 23-25,5 1 ,52]. However, within each species the number and kinds of butyrophilin (and butyrophilin-likc) genes seem to be fi.xed.

The SKINT genes are another multigene family within the B7 superf.'!mily for which impo1iant roles in immune responses are emerging [24,25,53- 55]. T hese genes have an extracellular V-like region related to butyrophilins and other members of the B7 superfamily, but have at least three transmembrane regions followed by short cytoplasmic tail. The SKlNTI gene is responsible for selection of a population of yo T cells which become located specifically in mouse skin. Around the SKlNTI gene (located on a non-MHC chromosome) are several other SKINT genes and pseudogenes, the exact number of which varies between mouse strains. The single member of this f.'lmily in humans is a pseudogene. Thus the SKJNT fam ily provides an example of B7 superf.'lmily genes which appear to be evolving more rapidly than the butyrophilins.

Instead of bul)•rophilin genes, a related f.'lmily of BG genes is found in and ncar the chicken MHC on chromosome 16. Indeed the chicken MHC was discm•ered as a serological blood group (the "B locus") determining the highly polymoq>hic BG antigen on el)'throcytes [56-59). It is now clear that there is a multigene f.'!m ily of llG genes, with one gene in the i\IHC (the Bf-BL region of the B locus) and an unknown number ofllG genes in the nearby

PLOS Genetics I www.plosgenetics.org 2

The Dynamically Evolving BG Family

llG region of the B locus. I t is also clear that BG genes are expressed, not only on el)1hroc)'tes, but with a wide tissue distribution and a number of associated immunological phenom­ena [60- 68] . BG genes encode disulfide-linked dime1·s, each chain having a single extracellular Ig-like region (part of the V domain f.-unily) and a long cytoplasmic tail composed of many heptad repeats which presumably fom1 an a lpha helical coiled-coil. However, there is one chicken butyrophilin-like gene, Tvc- 1, in the chicken genome which was described as the receptor for avian leukosis \~rus subgroup C, located on chromosome 28 f69) .

Thus, chicken BG genes might be derived fi·om ancestral but)~·ophilin genes, and perhaps have similarly important func­tions. Despite much speculation conceming associated i.Jnmuno· logical functions (reviewed in [591), there arc only two clear indications of functions for BG genes. One fortuitous discove•y was the "zipper protein", originally dcsCJibed as a soluble cytoplasmic proteu1 which turned out to be the tail of a BG protein, and which has a role in controlling actin-myosin interaction in intestinal epithelial cells L70J. The other important study re-examined chickens that had been used to show that the BF-BL region (and not the BG region) determined resistance to the tumours caused by nla rck's disease \~rus (an oncogenic herpes· virus) and Rous sa rcoma virus (an acutely transforming retrovirus). By single nucleotide polymorphism (SNPJ analysis and resequen­cing of genomic DNA, the authors found that a retroviral insertion into the 3' UTR of the sutgle BG gene of the 131:-BL region, the 8.5 or BG I gene, correlated with resistance to the llnnom-s. Moreover, they presented e\'idence that an inununo-receptor tyrosine-based inhibitOI) ' motif (ITDd) present in the cytoplasmic ta il might be important to BG l function [71]. T hus, the cytoplasmic tail has been identified as i.Jnportant in the two best studied examples of a functional cflcct for any BG gene, opening the question of what role the high level of pol)~norphism in the extracellular region might play.

It has become clear that the BG multigene f.'lmily is quite complex in comparison to butyrophilin genes, and that an understanding of the true functions of particular BG genes will only be possible once a detailed picture of genomics and expression is available. In this paper, we provide the genomic organisation of the BG genes in the B 12 haplol)ve, detemtinc cell and tissue expression for each gene of the B 12 haplot)1>e, compare the B 12 haplotn>e in detail \li th a red junglefowl haplotype used for the whole genome shotgun (WGS) sequence and at less resolution with fh·e other haplot)1>Cs, and then consider what the data may mean in tcnns of multigene f.·unily evolution. The results take us to a new level of understanding, from which more detailed analyses can be launched.

Results

There are 14 BG genes in the B 12 haplotype o f C line chickens: One on chromosome 2, another in the MHC on chromosome 16 along with a cluster of 12 in the BG region

A cosmid librm)' constructed from the genomic DNA of a CB con genic chicken line (1312 haplot)1>e on a CC inbred chicken line background) had previously been used to define contigs, one of which (cluster I) was the BF-BL region (the classical MHC of the chicken) and three othc1-s (clusters II- IV) were later recognised as the Rfi>·Y region (a region of non-classical MHC genes) [72- 74). We screened this libraJ)' with a BG eDNA probe and picked 50 colonies, which were gro\1~1 up. Analysis by Southern blot allowed the cosmid clones to be grouped into several contigs, hut already

June 2014 I Volume 10 I Issue 6 I e1004417

Page 6: Salomonsen, Jan, Chattaway, John A, Chan, Andrew C Y

Author Summary

Many immune genes are multigene families, presumably in response to pathogen variation. Some multigene families undergo expansion and contraction, leading to copy number variation (CNV), presumably due to more intense selection. Recently, the butyrophilin family in humans and other mammals has come under scrutiny, due to genetic associations with autoimmune diseases as well as roles in immune co-regulation and antigen presentation. Butyro­philin genes exhibit allelic polymorphism, but gene number appears stable within a species. We found that the BG homologues in chickens are very different, with great changes between haplotypes. We characterised one haplotype in detail, showing that there are two single BG genes, one on chromosome 2 and the other in the major histocompatibility complex (BF-BL region) on chromosome 16, and a family of BG genes in a tandem array in the BG region nearby. These genes have specific expression in cells and tissues, but overall are expressed in either haemopoietic cells or tissues. The two singletons have relatively stable evolutionary histories, but the BG region undergoes dynamic expansion and contraction, with the production of hybrid genes. Thus, chicken BG genes appear to evolve much more quickly than their closest homologs in mammals, presumably due to increased pressure from pathogens.

many of the cosmids had sufre red deletion of the BG genes, and in retrospect others had lost portions of their sequences.

Three authentic contigs were eventual!)• defined by extensive restriction double digest mapping, subcloning, limited sequencing after PCR, and comparison to genomic DNA by Southern blot (Figure S l). One of these contigs cmTesponded to cluster I (the chicken MHC, or BF-BL region) which we had already shom1 contained a nG gene provisionally named the 8.5 gene Qater renamed nG 1). We fully sequenced the 8.5 gene (accession munber KC963427, [59]), and later the whole of the cluster I contig (accession number AL023516, f73J). The other two contigs (muncd cluster V and VI) each conta ined six BG genes, which were given a variety of provisional names (now renamed llG2 through BG 13). In addition, a related region without BG genes was found in each cluster. Three representative cosmids covering most of clusters V and VI were fully sequenced by standard shotgun techniques, confirrning the presence of the si_x nG genes in each cluster along with a small re&>ion containing genes for a kinesin motor, a C-t}1le lectin-like receptor and an unidentified protein (Figure I , Figure S2, accession number KC955 130).

\ \'e were concerned whether we had cloned a ll of the BG genes from the CB chicken. Screening revealed one additional 1312 13G sequence (accession mm1ber KC955 131) from one of our caecal tonsil eDNA librar ies, which was called CTnG (and which we will now rename nGO). 13LAST analysis of the chicken WGS sequence (www.ensembl.org, release 2.1) showed that this gene is present on chromosome 2 (positions I 00590000- 1 00600000), a different chromosome from the chicken MHC on chromosome I G (Figure S3).

Using the partial sequences of all the genes identified at the time, we had designed potentially universal primers, and petfonned RT-PCR on a variety of cells and tissues (Figure 2, Figure S4). We found all of the genes from the cosmids expressed (except one, BG2, which we later realised had a single nucleotide change compared to the 3' end of one of the primers). In addition, we fo und our supposed!)• universal primers did not amplify n GO,

PLOS Genetics I www.plosgenet ics.org 3

The Dynamically Evolving BG Family

but specific primers showed that it has a wide if not ubiquitous tissue distribution.

In addition, we found several nG sequences fi·mn eDNA isolated from Bl2 haplot)VC chickens of the CB congenic line, but not n 12 haplot)'Pe chickens from the parent c line. The congenic line en (B 12) was derived from the Cline (which contains both B4 and n 12 hapJOt)1JeS) by backcrossing with the highly inbred CC (B4) line. The additional sequences (1, 11, lila and IV) were eventually found to be nG genes fi·mn the n4 haplotn>c (Figure S4) that presumably were acquired during the backcrossing to produce the en congenic chicken line.

Finally, to ascertain the relati\'e location of the clusters, we used some cosmid clones as probes in metaphase and fibre-fluorescence in situ hybrid isation {fibre-FISH) of chromosomes fi·om 13 12 splenocytes stimulated with the mitogen concanavalin A (Figure 3). We found a single large cluster of BG genes defined by hybridisation with cosmids from clusters V and VI, separated from the chicken J\.JJ-IC as detected by a cluster I cosmic!. In some fibres, hybridisation conesponding to a nG gene was fotmd at the end of the cluster I towards the nG cluster, which oriented the end of the 1\H-IC with BG l toward the BG cluster. Comparison of the hybridisation pattern of the cluster V and VI probes with hybridisation expected by relative nucleotide sequence identity showed that cluster V and VI are contiguous, with cluster V closest to cluster l.

This organisation of the BG and i\UIC clusters was confmned at the level of DNA sequence (Figure SS). Several BACs that span the BF-BL region through the TRIM region to some unidentified nG genes have been isolated from e n chickens [75]. Amplification fi·mn these BACs identified four BG genes located at one end of the cluster V contig, with the outermost being nG2, followed by BG3, nG4 and nG.'l. Similar!>•, amplification h" t"'""" BGR of cluster VI and n G7 of cluster V physically linked these two clusters, with 1047 nucleotides of DNA between them.

Thus, we found 14 nG genes present in the Bl2 haplotype (as defined by the parent C line). There are two singleton genes, nGO present on chromosome 2 (in the chicken \ \'GS sequence) and nGl found in the BF-BL region of the chicken MHC on chromosome 16. Upstream of the nGI gene is a region containing TRli\1 genes among others, upstream of which is the nG region, the sequence of which is 99,274 nucleotides long. 'ntere arc 12 BG genes located in this n G region, a ll in the same transcriptional orientation, and split into two clusters by the presence of kinesin, lectin and other genes.

Each BG gene has very specific cell and tissue expression, with one group expressed in haemopoiet ic cells and another g roup expressed in tissues

As mentioned above, we performed re,·erse transct; ptase-PCR (RT-PCR) on a variety of cells ancl Lissues with what we thought at the time were universal pdmcrs (which however turn out not to amplify nGO or BG2). For each cell and tissue, we cloned the PC R products and coun ted the number of clones from several independent amplifications, a method used successfully for assessing the relative expression of MHC genes l76,77]. With this simple assay, we found truly striking patterns of expression for each of the analysed genes, with only a few genes expressed in each cell type and restricted patterns even for tissues (Figure 2). To provide additional support for this approach, we developed specific RT-qPCR assays for two haemopoietic and two tissue BG genes, and found that the results with spleen, bone manow, liver and duodenum confirm our expectations based on the data from the approach of amplifying, cloning and sequencing (Figllt'e S6).

June 2014 I Volume 10 I Issue 6 I e100441 7

Page 7: Salomonsen, Jan, Chattaway, John A, Chan, Andrew C Y

I. ~~12 L~?2.:_ WlTIUdp ITU1111111BJ11)

Kinesin Motor

lm10 L~~~~-~!n . HI~ llr'llll0'110

The Dynamically Evolving 8G Family

~5' End of LOC4255771

I.~G9 I.~G8 or mrr~ wrmrm-rllidb~D

Chromosome 16 ~C-Type Lectin

··· .. .... ............................ .. ......... ......... .............. ... .... ... .. .......... .. ................ ..................

. · ··········· ····························· ·· ······ ························· ·· ·· ······ ································ ········· .· ,.--------------- ~ • • • • • • •I TRH-1reg'on '• • • ............... J I Cluster I

.. -----------------------. • • • q SF /BL reg;on 1

I ·-----------------------· -Chromosome 2

Figure 1. Fourteen BG genes of the B12 haplotype are present as two singletons (BGO on chromosome 2, and BG1 In the BF-BL region or classical MHC on chromosome 16) and a cluster of twelve genes in the 8G region on chromosome 16 (BG2-BG13, all in the same transcriptional orientation but separated into clusters V and VI by a region containing a kine sin motor protein gene, a C-type lectin gene, and an unassigned gene called LOC4255771). The genes are depicted with their introns, exons and intragenic regions to scale (except for regions with dotted lines) and in the orientation as typically shown for the chicken MHC and surrounding regions. The 8GO gene was discovered as a eDNA from a C8 (812) chicken caecal tonsil library, but the sequence of the gene is based on the whole genome shotgun sequence (release 2.1 ), located at positions 100590000-100600000 on chromosome 2. doi: 10. 1371/journal.pgen.1 004417.g00 1

ce:!"-S/t41Uf'S &lntrVI

U1"4!N~ !GU !GU &Gtl !GlO r.G9 065 1.(;1 OG6 i':~m· I J ' 6 • ~tl·"f' .. , 131 "' "' ,. 611 "'' <O<

nr z"f7-' fl1 rues 0 0 " 0 0 8c.ds 0 0 11 1 16 thro.'T~'ttl 0 rt'>M:r"'N'e

,.,."" 0 ,. 16

"'' 0 r , 10 IS ... ,,, II 16 LPS •J.'ifc:

dff'\1!,.-iti( c.trs ,. bonem.JtrO' ..

'"""'' 10 ln(mic: 16 11 u IS

tb-trr>A/5 .t.ot. IS ~tt(!.Qu«U!) s stro."l\J {~rt)

strNU (t"~.b;J'c~) bo..HSJ

"'"'~ '"""" '""""' ..... ) lnt~

~ II tr.t~OCtttS(I)f:Ool'llt)

bf•~ 17 116-..-, 17 0 .... lute htvc ~t:'.AI~

~rtV

OGS [G.( <GI OGI 0 ( ' P.t3 )3.7 X111 >73.

·HF 1 .. "" 0 0 0/lff 0 0/lff

0/lff

0 0/lff 0/lff 0/lff 0/lff 0/lff

Of• 0/lff

0 01• 0 0/lff

0/lff s 0/)ff

• 01• 10 Oliff

II II 01•

• 0 Oliff 0 0/·

0/· 0/· 0/· 0/· 0/·

du~~r l "" OGI f GO .. (ff,(j

fU "<>•

"'' 01• 01•

o1ur C/Jff 0/>ff C/Jff 0/lff

01• 0/lff

r 1 Of• otur 0/IIT 0/lff

01• 0/lff

01• Of• 0/• 01• 01• 01• 01• 01•

cAt.r:rcO.~

I • ... t-: ,

11 u IS

No.

'"' ., '' ., ))

30

" 10 30

)1

61

,.

" 20 17

Sl 10 2J 2S

" "

No. Kit

10

Figure 2. Individual 8G genes of the B12 haplotype have striking expression patterns, as assessed by RT-PCR from cells and tissues using what were expected to be "universal primers" followed by cloning and sequencing. At the top, the heading of columns indicates the genes (with their present names along with alternative names previously used) in the same orientation as in Figure 1, and sequences labelled I and II apparently picked up from the 84 haplotype du ring derivation of C8 congenic line chickens from the 812 haplotype of C line chickens. Also shown are the number of independent PCR reactions, and the number of total 8G clones sequenced. On the left, the labels for rows describe the isolated cells and tissues from which the RNA was derived, along with separation techniques and treatments that were carried out (as described in Materials and Methods). Values in the table indicate the number of sequences found by RT-PCR, cloning and sequencing for each gene. After the work was well underway, it was realised that the primers were not "universal", and therefore presence and absence of 8GO and 8G2 were determined by specific primers (designated by a number for the sequences, followed by a plus or minus); NT indicates not tested. The coloured boxes indicate the results for presumed haemopoietic (blue) and t issue (green) genes. To be clear, complete separation of these expression patterns in tissues is not expected: all tissues contain blood vessels, some t issues contain tissue-resident macrophages and some tissues contain primary or secondary lymphoid tissue. · doi:1 0.1371/journal.pgen.l 004417.g002

PLOS Genetics I \'lww.plosgenetics.org 4 June 2014 I Volume 10 I Issue 6 I e 1004417

Page 8: Salomonsen, Jan, Chattaway, John A, Chan, Andrew C Y

The Dynamically Evolving BG Family

A BG Region BF/BL

Fibre FISH .. ... .... ' . . . . ··- .... .

. ··- ·· . ,. . ..·. . . . . ·-- ..

I----Cluster VI------II---Cluster V----1 B

<t N C) v Cll .0 e 0.. .,..

~ ./ /

d -· <'1 d ., .. <t J C) v

/ / /

Cll J -· .0 e 0.. J "''"

/ / / d

'""' """ """' ""' 5(((it ro>n 1{((1 6)))1 "''" -Fibre FISH . .

. .. .. .. . . . ... . .. . ... .. ... . ... -. . . .

Figure 3. The two cosmid clusters are contiguous with the orientation cluster VI-cluster V, followed by the TRIM and BF·BL regions, as assessed by fibre-FISH and sequence comparison. A. Fibre-FISH of DNA from Con A-stimulated C-812 spleen cells (812 haplotype) with a BF·BL probe (cosmid c4.5 in white), a cluster V probe (cosmid cG43 in red) and a cluster VI probe (cos mid cG24 in green), with the image of red hydridisation shifted above for clarity. Note the single spot of hybridisat ion at the inner edge of the white hybridisation, which indicates the BG1 gene and correctly orients the BG region. B. Detailed comparison of two BG region probes indicates orientation of the two clusters. Upper panel, on top are the gene sequences for BG2·BG13 (as depicted in Figure 1 ), and to the left are the sequences for the two probes (cG43 for cluster V in red and cG24 for cluster VI in green). with a dot plot showing sequence identity (dottup program set to 150 nucleotide word size, as described in Materials and Methods). Lower panel, interpretation of hybridisation patterns expected based on the dot plot, compared to two representative examples of actual fibre-FISH, with cG43 in green and cG24 in red. doi:1 0.1371/journal.pgen.1 00441 7.g003

ln the blood cells of chickens with the II 12 haplotwe, T cells strongly express BG9 and express weakly JIG 12. B cells also express BG9 and BG 12, but in addition strongly express IIG7 and weakly express BGIJ and BG 13. In contrast, thromhoC)1CS do not express BG9, but more evenly c.xpress BG5 and BG 11 along with BG7, BGIJ and BGI3 Qike B cells) and I.IGI2 Qike both B and T cells). Macrophages isolated from blood, whether or not treated with agents like lipopolysaccharide (LPS) and intetferon·gamma (11\/Fy), strongly express BG 7 Qike B cells) and BG9 Qikc T and B cells). Denchitic cells developed in culture with IL4 and Glll-CSF from precursor'S in the blood strongly c:-.vrcss BG9 Qike B cells, T cells and macrophages).

BG molecules were originally discovered as blood group antigens on erythrocytes, which do not contain appropriate

PLOS Genetics I www.plosgenetics.org 5

messenger RNA. However, bone marrow fi·om phenylhydrazine· treated chickens should be enriched compared to nonnal bone marrow in erythroid precursor cells which have Ri~A for the BG molecules found on erythrocytes. On this basis, et)•throcyte BG molecules of the B 12 haplotype may include BG 7 Qikc B cells, thrombocytes and macrophages), BGIJ (like B and thrombocytes), IIG II Qikc thrombOC)'Ies) and BG 12 Qike B cells, T cells and thrombocytcs). Also found but not enriched were BG5 Qike thrombocytes) and BG 13 Q.ike T and B cells).

Primm)' l)~nphoid organs such as th)~nus and bursa are the source of mature peripheral T and B cells, respectively. ln these organs, precursor l)~nphoid cells undergo complex difie rentiation and selection events dependent on a variety of other cell types,

June 2014 I Volume 10 I Issue 6 I e1004417

Page 9: Salomonsen, Jan, Chattaway, John A, Chan, Andrew C Y

including one or more types of so-called stromal cells. These stromal cells, at least in the th}~nus, are known to include both haemopoietic and non-haemopoietic cell tn>cs. As might be expected, the expression of BG genes in these organs is complex. Comparison of the expression of BG genes in thymus and bursa with the small lymphoid cells suggest that (at least some oQ the precursor T cells (th}~nocytes) express I3G9 and BG 12 like T cells. At least some of the precursor B cells in the bursa may express BG9 like B cells, but there was no evidence of expression of UG 7 (which was strong!}' expressed in B cells), BG 12 or BG 13 (more weakly expressed in B cells), so these Jailer three may be difihentiation antigens in the U cell lineage. Comparison of stromal cell populations prepared by gradient centrifugation from precursor cells or by killing most of the rapidly dividing precursor cells with cyclophosphamide suggests tha t in the U 12 haplotn>e the various stromal cells of th}~nus may express DG3, BG7, DGB, BG9, B 12 and BG 13, while the stromal cells of bursa may express BG3, BG4, BG9 and BG 10.

In other tissues, the expression pallems were complex, which may be the result of a single cell t}1>e expressing several BG molecules, or may reflect the presence of multiple cellt}ves each of which expresses ce11ain BG molecules. Notably, we only detected BG I expression in intestine (adult duodenum but not embryonic enteroq•tes}, even though the original 8.5 genomic fi11gment was fo und to hybridise to Ri\!A from chicken th}mus, liver, a T cell line and a B cell line [72J. Also, we found intestinal expression of BG I 0, which has a nearly identical sequence of the C)1oplasmic tail to the prc\~ously identified zipper prote in (Figure S3), a protein described to regulate actin-m yosin interaction in the intestinal epithelium f70l- Duodenum strongly expressed BG I, BG3, BG4, BGG and BG I 0. By contrast, embl)'Onic enterocytes expressed only BG4. Brain strongly expressed BG9 (if macrophage-like microglia arc the source, then they diOcr from peripheral macrophages which strongly express both BG7 and UG9), while kidney strongly expressed BG I 0 Oike stromal cells of th}~nus and bursa).

O ve111ll, we were able to discern two types of expression, one primarily in haemopoietic cells and the other primarily in tissues (presumably from other ccllt}1JCS, at least some of which we expect to be epithelial/stromal cells). Hacmopoietic BG genes include BG5, IlG7, BGB, IlG9, BG II , IlG I2 and BG 13; while the tissue IlG genes include BG I, BG3, IlG4, BGG and IlG I 0. As described below, the assignments of haemopoictic- and tissuc-l)ve BG genes correlate pe,{ectly with the relationship of the presumed promoter and 5' UTR of these genes.

All the BG genes have similar structure, but there are many hybrid genes with the 5' end determining cell and tissue expression

In order to beller define these 14 IlG genes, we compared them with published eDNA clones [3B,44), all of which were from other haplotypes, so we could not be sure whether we were comparing a eDNA with the appropriate gene. Using CLUSTLx, we found the same gcneml organisation for evc1y IlG gene in the B 12 haplot}1>C (Figure 1): a first exon composed of roughly 200 nucleotide 5'UTR followed by a short signal sequence, a second exon encoding the immunoglobulin variable-like (Jg V -like.} extracellular domain, a third exon encoding a connecting peptide and transmemb111nc region, a large number of small exons encoding 7 (or sometimes 8) amino acids which altogether would result in a cytoplasmic region with the potential to produce a coiled-coil, fo llowed by an cxon of the 3'UTR.

We then compared the predicted intron-cxon stmcturc with our authentic cDNAs of BGO isolated fi·01n a Bl 2 library or amplified

PLOS Genetics I \WIW.plosgenetics.org 6

The Dynamically Evolving BG Family

from tmnsfcctants with BG I, BG I 0 and BG II genes of the Il1 2 haplotype (Figure S7). The exons encoding the V-like region, transmembrane region and the cytoplasmic domains were perfectly predicted, with the exceptions due to alternative splicing or read-through in the cytoplasmic c.xons (and one predicted cytoplasmic exon in BG 10 that was not found). Similarly, the end of the first c.xon and the beginning of the last cxon were perfectly predicted. for BGO, the WGS sequence showed two 3' ends, but they tumed out to be due to mis-assembly in the genome sequence (Figure SB). f or BG I, BG I 0 and BG II , the location of the primers for amplification of the eDNA preclude assignment of the very 5' and 3' ends of the genes. However, sequence comparison strongly suggests that 5' of all the BG genes start at roughly the same place. J\·lorcover, the 3' ends seem clear from the conserved location of the single polyadenylation site in all genes.

In order to detennine the relationship of the 14 JJG genes, we pe1fonned ph)·logenetic analysis by neighbour joining (NJ) on the whole genes as well as portions of tllC genes. Dendrogi11lllS of the whole genes show two groups of pa111logues, one for BG genes expressed in hacmopoietic cells and the other for BG genes expressed in tissue BG genes in the BG region (Figure •l). As will become clear below, the topolog}' of these trees depend on the relative amount of sequence from d iOerent parts of the gene.

T he dendrogmms of the presmned promoter (as defined by the sequence of the first 500 nucleotidcs upstream of the 5'UTR) and the 5 'UTR (as defined by sequence si.milaril)• to published eDNA sequences) showed the same topolog}' as the whole genes, a topolog}' with long branches and strongly supported by the bootst'~'P values (Figure 5). Comparison of these trees to the expression patterns in Figure 2 indicates that the promoter region (and possibly the 5'UTR} of a BG gene is the prima1y detenninant(s) of cell and tissue-specific expression.

There is almost no sequence identity between the two groups of promoters out to 1000 nucleotides before the 5'UTR. However, the analyses were carried out with 500 nucleotides corresponding to the proximal promoters, because the distal promoters of two BG genes conta in sequence brought in fi·om neighbouring genes (Figure S9). The distal promoter region of BG I includes a duplication of a p011ion of the promoter region in between the neighbouring BNK and Blec genes, followed by a region of sequence of unknown evolutional)' origin, and finally the proximal promoter region that is similar to other tissue BG genes. Similarly, the distal promoter oft he UG9 gene. in the B 12 haplotn>e is unlike the consensus BG genes, and appears to have been derived from a h)1Jothetical protein gene which is present in the red junglefowl haplot}Ve but which has been deleted in the B 12 haplol)')le (as shown below). This dista l promoter sequence contains seveml brain-specific tmnscription f:<ctor binding sites (Figure S9), consonant \\~th the expression in brain of BG9 in the 1112 haplot)1JC.

T he sequences of the 5'UTR of the BG genes also fall into two groups (Figure G), with a large specific deletion in the haemopoietic BG genes compared to the tissue BG genes. It seems likely that the dificrencc is a true deletion, since two 27 nucleotide direct repeats a rc found in the tissue 5' UTRs which upon recombina tion would yield the deletion found in hacmopoictic 5 'UTRs. The simplest inteq>rctation is that all the hacmopoi.etic llG genes in the B 12 haplot)')le are descended from a single ancestor, but it is also possible that concerted evolution between BG genes could lead to the same result.

In contmst to the unambiguous dendrograms of the 5 ' end, the region corresponding to the signal sequence and the extracellular V-like region fanned dendrograms "~th completely dific rent topologies, which had short branches and were gcnemlly poorly

June2014 l Volume 10 !Issue 6 1 e1004417

Page 10: Salomonsen, Jan, Chattaway, John A, Chan, Andrew C Y

The Dynamically Evolving BG Family

Promoter Signal V I TM 21nt S'UTR Sequence -reg on Domain Repeat Region 3'UTR

BGO

BG13

BG9

BG12

m BG8

BG7

BG11

BGS

BG6

BG3

931 BG4 1000

BGlO

BG2

BGl

Figure 4. Phylogenetic analysis re veals six kinds of BG genes in the 812 haplotype: The two sin gletons each separately, and the twelve BG genes of the BG region in four groups indicating the presence of hybrid genes. left, relationships of whole BG gene sequences (from 500 bp upstream to near the "end of the 3'UTR as determined by the predicted polyadenyiation site) as assessed by phylogenetic ana lysis (numbers at nodes indicate boot strap values determined from 1000 replicates). Right, relationships of different regions of BG genes indicated by colour, as determined by separate phylogenetic analyses in Figure 5. doi:10.1371/journal.pgen.1 004417.g004

supported b)' the bootstrap values (Figure 5). Howe1•er, there are three groups each consisting of vet)' similar V sequences: BG3, BG -! and llG5; llG7 and BG II; and BGS, llG9 and BG 12. The last group of genes alSo share a deletion in intron I, resulting in an intron of 11 3-144 nucleotides for BGS, BG9 and BG 12 compared to 352-35·~ nucleotides for all the other IIG genes. All the signal sequences and V -like regions have the expected sequence features described for BG genes, including the lack of !\'-linked glycosylation sites in the extracellular domain. This means that, contrat)' to almost eve!)' other type I membrane protein, all BG molecules lack !\'-linked glycans (as p reviously shmm for BG molecules from C1)1hroq1es, LGOJ), a curious property that has not yet been explained.

T he dendrogram s of the connecting peptide/transm embrane cxon also yielded a tree (l•igure 5) with short branches and low bootstrap values, but are separated into two broad groups. The sequences of these regions (Joigurc 7) arc virtually identical muong the llG genes, with a helical wheel depiction suggesting a flattened side for interaction between the two chains. In addition, some polar residues arc found in most sequences, which in transmembrane regions can indicate specific interaction with polar residues of other chains. For all but three of these IIG genes, the polar residues include two basic runino acids Qtistidine and lysine) near the start of the

PLOS Genetics I www.plosgenetics.org 7

transmembrrute region, but there is a well-supported group of three BG genes (BGS, BG9 and BG 12) with hydrophobic leucine a nd polar threonine in those positions. All three are haemopoietic genes in which the V-like regions also fonn a group, perhaps indicating rclatil"cly recent duplication Cl'ents. One gene (BG2) has a proline in the transmembrane region, which is most unusual. Fina lly, at the end of the transmembrane there is a trrosine in all of the BG sequences except BG6 and JIG7 which have a cysteine perhaps indicating a palmitylation site, and BGS, BG9 and BG 12 which have a histidine.

The cytoplasmic tail is encoded by sm all exons, 2 1 (or sometimes 24) nucleotide cxons long. The predicted number of such exons varies between BG genes in the B 12 haplot>ve, ranging from 13 in the BG I (8.5) gene to 36 in the BG 10 (zipper protein­like) gene, with a mean number of 26 (Figure S2). O ut of 358 total exons, we identified 57 different groups of nucleotide sequences (Figure S lO). Removing exons present only once among in the 11 BG genes, we could discem clear patterns, particularly if the first roughly 20% of the exons were removed from ana l)osis. The dendrogram (Figure 4, Figure S 10) based on this last 80% (including exollS p resent only once in the 14 IIG genes) shows two groups separated by long b ranches and with strong bootstrap suppm1, along with separate branches for BGO and BG 1.

June 2014 I Volume 10 I Issue 6 I e1004417

Page 11: Salomonsen, Jan, Chattaway, John A, Chan, Andrew C Y

Promoters

...... 0.01 9118

Signal Sequences

.------------------------ BGO

1--l 0.01

Transmembrane Domains

1--l 0.01

BG6

BG7

PLOS Genetics I www.plosgenetics.org

BGO

S'UTRs

.--------- BG4 L_ _______ BGO

BG5

The Dynamically Evolving BG Family

1-i 0,01

Variable Regions .-------------------------- BGO

I----t 0.01

998

998

8

1000

BGl3

3'UTRs

BG7

BGll

BG3

BG4

BG5

BGlO

BGO

BGl

BGll

Bul

June 2014 I Volume 10 I Issue 6 I e100441 7

Page 12: Salomonsen, Jan, Chattaway, John A, Chan, Andrew C Y

The Dynamically Evolving BG Family

Figure 5. Phylogenetic analysis of nucleot ide sequences for different reg ions of BG genes indicates sep arate evolutionary histories, consistent with recombination andfor deletion lead ing to hybrid genes in the BG region. The proximal promoters (500 bp upstream of the presumed transcriptional start si te) and S'UTRs fall into two well-supported groups that correlate with hemopoietic (blue) and tissue (green) expression as determined in Figure 2 (the separation of the BGO, BGT and BG2 are due to short deletions in the S' UTR, as seen by sequence alig nment in Figure 6). In contrast, short branches with generally poor bootstrap support cha racterise the signal sequences and variable lg-like regions. Transmembrane regions fall into two groups (as seen by sequence alignment in Figure 7), except for BGO. 3'UTRs fall into two well-supported groups, except for the two singletons, which include sequence of apparently distinct evolutionary origin at the very 3' end. doi:10.1371/journal.pgen.1 004417.g005

However, the dendrogram has a difle rent topology than the whole gene, the promoter and the 5 ' UTR, and much like that for the 3'UTR (as shown below).

Some cD NAs show unspliced (or retained) in trons for which the sequence remains in fi«me, and we SU!,'b'ested [ 78] that these extra stretches coding for protein have the potential for interesting functions. The original cD NAs fo r BG l (8.5) have a long stretch of contiguous C -ten n inal sequence (accession numbers KC955 132 to KC955 !36, f59]), but our a nalysis shows that this region is in f.1.ct d ue to what was originally an unspliced intron, beca use the flanking 21 nucleotide repeat cxons can be identified which have apparently reasonable splice sites (Figure 2, Figure S l l). Interest­ingly, this region has recently been identified as containing a functional immunoreceptor tyrosine-based inhibito ry motif(ITL\·1) (71], fulfilling our original prediction. We examined all of the sequences for the possibility of unspliced introns with in-fi·ame sequence, and found between one a nd five per gene (Figure S II )­We found !T!Ms in translated intron sequences of six other ge nes (BG3, BG6, BG8, BG9, BG 12 and BG !3), but translation of all of these introns gave stop co dons almost immediately after the !T l"-·1, which would lead to truncated cytoplasmic tails (Figure S l l).

T he 3 'UTRs mnge from 465 to 48 1 nucleotides in length, encoded by BG 13 and BG l l respectively. Dc ndrogmms (Figure 5) show two groups with the same topology as the cytoplasmic exons, with long branches and good bootstmp support.

O vemll, the 5 ' end of the gene clea rly defines two groups that reflect the tissue distribution, the 3' end defines two different groups, and the region in between does not f.1.ll into simple groups. Phylogenetic trees constmcted by Bayesian analysis and by

"- laximum Parsitnony (MPJ give compamble topologies as the i'{J method (I'igurcs S l2 and S l3), a nd AU and SH tests aller MP analysis provide statistical support fo r the presence of the two &"·oups a t the 5 ' end and the two groups at the 3' end, but no clear groups in between (Figu re S 13). T his result is most easily explained by the presence of hybrid genes (in tlte sense used in refe rence [20]) fon ned by recombination between the two ends, in which the m iddle of some (and maybe all) genes has been so randomiscd by recombination tha t no phylogenetic signal is left. Neighbour network analysis by SplitsTrce, a Phi test and an automated parti tioning algorithm all support a histmy of extensive recombi­nation across the BG genes, ";th independent h istories for tlte 5'UT R, the V-like region and the 3 'UTR (Figures S l4 and S l5). Recombination is cet1ainly a plausible explanation for the sequence rela tionships found, since the 12 BG genes in the BG region are all close together in the same tra nscriptional orientation, so hybrid genes could be produced either by unequal crossing-over (through intcrchromosomal recombination, a lso kno\\11 as non-allelic homologous recombination or NAHR) or by deletion (through intrachromosomal recombination) during meiosis. O ne of the com equences of such unequal crossing-over or deletion is expansion a nd contraction of this pat1 of the multigene famil}', leading to copy number variation (Cl\'\0 in the BG region.

Jn this view, haemopoietic genes have either their m;ginal hacmopoietic 3' end o r a tissue 3' end, and tissue genes have either their original tissue 3' end or a haemopoietic 3' end. Unfortuna tely, with the data a t our d isposal, we cannot be absolute!}' sure which is which, so for the tun e being we will refer to the 3' cuds as t}1JC I and 2. T hus, BG8, BG9, BG 12 and BG 13

Repeat '1 ~ v .,, • 'r ,

tissue BG genes

I<" CC rc. I C. CT<C r ACACiCTCCTTCC tGC U A TTC t fCC t C.U.C. t TTTt< U U t CTl'C n fCC AU TCT t C ff((C( A TC HOC t ( ((N( lCC fCC U<lCC.A TC TC.Ct I CCC U .... l ( ICC I CC 1 IOU. HC CC. 1 JCCCCA_.t. TC T

.UI' ,, , , , , ,, ,,( ,,,,, , ,,,, , ,, , , , ,, , , . , ,, , ,G.,.,,,,, ,, ,,,,,,,,,,, •• • •••••••••••• .G ... T . . .. A . . .. ,,,,, . ,, . , , ,, .. ,, . , •· •• · · •••• · • · •• • • , , , , . .. . · ••••• · • • "' ' , , , , , , TT , • • ,, • ••••• T . , . f , ( f . .. , , • • ••• A( A, .A , •••• ,( . C • •• ACA • • • • •••• • , . , •• , ••• , , • • • T • • • • A • • , ,, . ,, ,. , ,, , • • oAT , ,, , ,T,, , T , C, , , T ,C

lo:>l • •• • •• ••• • • • • •••• • ••• • • •••• ••·· · · · • • • G .. .......... . .... . . . .. . .... • •·•• • • • • • • • ··• • • • • II A •• ••••• • • •• • AG • •• • • • • • • • •• •• • • • • G • • • GT •• • • • • •• • •• •• • •• • • 141 A • •••• Tr ••••• •• • • T . T •• • • • c r . c •• • • T. c ... G. T • •••• c.c c • •• •• •• •• •••• • • A • •• Ac • • • • • • • • r • .. • AT • • r . . .. .. . .. ..... . . . . . . A .

1------~ : .. ....... ............................ o . . . ...... . .. . ... . ..• . .. ... . . ..•. . .. . • . ... . •. . . A • •• • ••• ••• • •• • •••••••• • ••••••••• • • • •••• • •• • •••••• • •• • •• • •

hemopoietic BGgenes

lt;jl\ G , • • • TC •• , • • •• • • • CT • • G •• CT •••• • • • • C •• CAT • • • • C • • • CCC • • A •• • •• • ••• • • •• C . , • • ICiS G, , ,,,T f, •••• · • • • T .l • • G • • CT •• • • • • , .(,, (AC ••• , C •• , CCC.,A • ••••• • , ,,, , , ( , , ,, r« G .. ,,, TC, ,,,, . , , , , , l, . G, , CT,, , ,,,, , ( .. CAC, , , , ( , , .CCC, , A,., , ,. , , , •• • , ( • • ,,

IG" G • •• • • TC , • • • • •• • • •• t . . G . • CT •• •• • • • • C • • CAC • •• • C • • , CCC: • • A . .. . . . . . . . .. . c . • •. llll.l o .. T • • t r • •••• • ••• T. tr . o • • c r • • • •• A . • c • • c&c . .. . c ccc .. A . . . . . . .. . ... . c • •••

II:OP G • • • •• TC ••• •• • • • • • C T •• G • • CT •• •• •• •• C •• CAT ••• • C . , . C:CC: •• A • • • , • • , ,. ,,.,( ,, • •

L--------1 IG) G •• • •• TC .. .. ... . .. . T • • G •• CT ••• • •• • • C •• CAC ••• • c ... CCC: •• A ...... . ... , •• ( •• , .

ttssue BG genes

Repeat I I 'r 1i1

I I 1fJ

teo. C< TTCC<CCACCACC TT C TCC TAT< A TC TTC fCTC A TC T T TT .&CCC AT f fT C fA CCC AC AT Tc: T«CCC A TC'JCC TCCATC.A T( TCCt TC TCAGTC J (( ft CC'TC J ( TCT CC TTTCC TTCCCCCCTCC TC f TCTCCAQCACAG ~10 •• • • • r .. . .. . r .... . .. . ..... .. .. . ...... . c . ... .. T ••••••• • ••• • • c. ... . . . . . . . . ...... . . ..... . .. . . .. . ...... . . . .. o • ••• • • • •• ••••••••••• o o•· · · ·· tG-4 o•,. , • • •• To(( , JG •••• • • , •• • , • • • ,, .• •• T • • • , • • • • • , •• , , • • • • • · • • • •• • • • , • •• • • •• • •• • •• • • • • • • ••• • •• • • • • • •••• • • • • •• • • • • • • • • •••••• •• • • • • • • ••• •••

~1 • o · · · ' · · ·· · · ' ·· · ······ · · · ·· ' ··"·· · · ·· · ···· · ··· · ··· · ·· · · · ··· ' • o • o•• • o • • • o• • • • ·• • ••••• • oo o• • ••• o• o• oo o o• •• o• •• o• oo o•• • ooo • •o • • • oo • •• • • •• •········ ~~ . . . r r . .. . . c . . . . .. r .. o . ... ... . .. .. .. ... . . .. . r .trr A • •• • •• ••• c ... . . . . .. . ... . . .. . A • • T •••• • •••• • • • T •• •••• • cr .. .. . c . . cr . . .. . r . ..... •

1-------t ~ ..... r ...... r . . .. T. . .... . . • ....... T .. . A ••••• A ... ........ c .. . .. : : ;: :: : : ' ·: :: : :::: : : ::~:: : ~: · ~~ : ;:: : : : :: : ~:: : ;:~:: ;: :: : :: : ; ::: : :: : : : :: : ::::::: IGU ,,.,,, , ,, . ,,,, ( , ,, . ,, . , . ,, • •• AG, • • , • • ,o• • o· •• · • < • • To• •• • · •o • oo• • • · ••• •••••••• •

hemopoietic BG genes

~Go\ .AG .• •• • T • • ••• C. , •• • •• • • ••• •• . A •• • • • •• • ••• •• ••. C • • T • • • • 1' • • ••• 1' • • • • •• • • ••• T •• • • K4 ••• ••• • • •• • ••• C. G • • • ••••• • •• • AG ••• • • •• o •••• , , • • C • • T ••• • •• •• • o • •• • • • o•o• oooo ooo ..,u "'" ''" ...._ _____ __, ""

. . .. .. ...... .. c.o . . .. r .. . .. . CAO • •• • • • • •• • • •• ••• c • • r •• • •• •• • • • • •• • • ••• • •• • • •• ••

..... . .. r ..•.. c .... . ... . . ... .. .... . . ...... .... .. c •• r .. . . . . . . ..... . . . .... . . . . o • •

.•..•.. . . . •... c . . • . • •• • •.• . • • AG •••••• • ••••• A •• • c •• r • •• • •• . o• ·· o••• • • • o•••• o• o o • o • • • •• • •• o • • • e .G . . • •. •. . •. . •• c . .• ••. . •• . .• . . •• c • . r •• .. or • •• •• • •••• •• • ••••••••

Figure 6. Sequence a lignments fo r the S'UTR of 8 12 BG genes, showing the separa tion into genes expressed in hemopoie tic cells and in tissu es. A large gap in genes expressed in hemopoietic cells was presumably created by deletion between two direct repeats indicated by boxes, and smaller gaps are found in the ge nes expressed in tissues. doi:10.1371/journal.pgen.1 00441 7.g006

PLOS Genetics I www.plosgenetics.org 9 June 2014 I Volume 10 I Issue 6 I e1004417

Page 13: Salomonsen, Jan, Chattaway, John A, Chan, Andrew C Y

Transmembrane domains containin1

lysine

•.•••••:: *·**: ***: ' '"' DPFSQIVHPKKVALAVIY'i'LLV FVIIAFLYRXK ' '" DPFSQILRPKKVALAVIVTLLVGSFVIIIU'LYRXK r<iiO DPFSQIVHPKKVALAVIY'i'ILVGSFVIIIU'LYRXK ' ' " DPFSQI'lRPKKVALAVIV'iiLVGSFVITVFLYRXK EGII DLFSQIVHPHKVALAVVY'i'ILVGSFVIIIU'LYRXK ~ DPFSQIVHPHKVALAVVY'i'ILVGSFVIIVFLCRKK «> DPFSQIVRFHKVALAVVVTILVGSSVIIVFLCRXK £GJ DLFSQIVRPHKVALAVVY'i'LLVGSFVIINFLYRXK ~ DPFSHIVRFHKVALAVVL'lLLFASFVIIVFLHRHQ t<l DPFSQIIHPHKVALAVVIIPLAGSFVItiVFLYRKR eoo DPFSHIIIIPHKVALAVIY'i'LLVGSFVItiVFLYRXK

35 35 35 35 35 35 35 35 35 35 35

o<.• DPPSHIILYII'rYALAVIITLLVGSPVVRVFLHRXK 35 EGI DPFSHIILYHTVALAVIITLLVGSFVVtM'LHRKK 35 «> U DPFS IILYliTVALAVIITLLVGSFVVINFLHRXK 35

1 .•••... 10 .•.•••.. 20 .••••••. 30 •••.•

The Dynamically Evolving BG Family

Figure 7 . Sequence relationships for the connecting peptide to transmembrane region of B12 BG genes show two groups, those which have histidine and lysine near the N-terminus of the transmembrane region, and those with a leucine and threonine (arrows). A helical wheel shows that one side of an alpha helix through transmembrane region is primarily composed of larger residues (F, phenylalanine; I, isoleucine; l,leucine; W, tryptophan) along with a smaller residue (5, serine), while the other side is composed of smaller residues (A, alanine; G glycine; T, threonine; V, valine). This arrangement suggests that one side of the helix forms a flattened surface fo r interaction as a dimer, with the signature charged residue (K, lysine) near the edge of this interaction zone. doi:l 0.1371/journal.pgen.1004417.g007

might be pure haemopoietic genes, IIG5, IIG7 and IIG II might be haemopoietic genes with a tissue 3' end, BG2 and BG 10 might be pure tissue genes, and BG3, BG4 and BG6 might be tissue genes with a haemopoietic 3' end (Figure 8). Altemativcly, BG 7 and BG II might be pure haemopoietic genes, BG5, BG8, BG9, BG 12 and BG 13 might be haemopoietic genes with a tissue 3 ' end, BG3, BG4 and BG6 might be pure tissue genes, and BG2 and BGIO might be tissue genes with a haemopoietic 3' end (Figure S I6).

Definition of BG genes in a red junglefowl haplotype and comparison with the B 12 and other haplotypes shows evidence of expansion and contraction of the multigene family through deletion of genes and swapping of whole BG clusters

T he WGS sequence was created from a chicken of the UCDOOI line, an inbred red junglefowl line with the BQ haplotype, closely related to the standard B21 haplot)1>c in experimental lines of chickens derived fi·om egg layers [79]. Other than BGO, BG I, llG2 and BG I 0 (with fiG I 0 being zipper protein-like), no BG genes were con·ectly identified by ENSEMBL in this genome sequence.

lly using BLAST to probe with a 3'UTR sequence, seven IIG genes arranged in tandem and in the same transcriptional orientation were identified on a supercontig (covering contigs 318.1 to 318.6) in the contiguous sequence for chromosome 16 (Figure S17). 1l1e automatic annotation progranune GENSCA1\I utilised by ENSEi\·ffiL apparent!)' did not recognise the 5' ends of these BG genes, and therefore they were only predicted as producing a single long transcript. llJC position and orientation of this cluster was verified by comparison to a BAC contig from the same chicken [llO), from which th e ftrst two BG genes as well as a lectin-like gene, a kinesin gene and the intetYcning do\\11SU·eam TRIM region had been sequenced (Figure 9). However, in the portion for which there is only the \ VGS sequence, there are gaps in assembly that raise the possibility of an additional two BG genes.

Direct sequence comparison of the red junglefowl sequence from the BACs, the redjunglefowl sequence from the WSG sequence and the B 12 sequence from the cosmids (l'igure 9) shows that there has been a precise replacement of the BG! I gene in the Bl2 haplot)'jJe with at least four genes in the redjunglefowl haplot>ve, with 99.91! and 98.90% sequence identity between the two haplot)'jJCS on the left side and the right side, respectively, of the breakpoints.

PLOS Genetics I www.plosgenetics.org 10

Moreover, the red jungle fowl sequence goes directly into the TRI..\1 region after the lectin-kinesin gene pair, where:tS the fil 2 sequence has two additionalllG genes after the lectin-kinesin gene pair and no indication of the TRi i'vl region. There arc a lso some deletiotlS in the WGS sequence compared to the llAC sequence, which m ay reflect dilferences in the exact haplot)'Pes or in sequence assembly. However, this compatison strong!)' suppot1s the notion that recombination leads to strong difierences between BG haplot)'flCS.

In addition, at least nine red junglcfowl BG genes ammged in tandem were identified in the bin "chromosome 16 mndom", which consists of contigs predicted to be on chromosome 16 but not assembled with the contiguous portiotlS of the WSG sequence (l'igure Sl7). The order of these genes is not known, but on the basis of fibre-FISH they form another cluste r, located next to the fu·st red junglefowl cluster (l'igure I 0). llms, there appears to be in the neighbourhood of 18 BG genes in the llG region of the red junglcfowl haplot)1Je compared to 12 BG genes in the B 12 haplotype, demonstra ting CNV for the BG region.

Phylogenetic comparison of these two BG clusters from the red junglefowl haplot)'Pe with the B 12 haplot)1Je showed tha t the first redjunglefowl cluster is highly related to cluster VI fi·om the Bl 2 haplotn >e, but the second red junglefowl cluster is not closely related to any of the other clusters (l'igure S IB). Fibre-FISH shows that the two red junglcfowl clusters are contiguous, based on their length and hybticlisation to the B 12 clusters (Figure I 0), and the evidence from comparison to the reported BAC sequence locates and orientates the first cluster next to the TRIM region. Thus, the order of the clusters in Bl2 is BG cluster VI-BG cluster V-TRii\1 region-BF/ BL region whereas the order of the clusters in red junglefowl is Second BG cluster-Fit·st BG cluster (related to cluster VI)-TRIM region-BF/ BL region. This remarkable result is most easily explained by large-scale expansion ami contraction events in the BG region, with whole clusters swapping in and out.

To test whether the dificrences between the Bl 2 and red junglefowl sequences were due to one of them being an outl)·ing variant compared to most MHC haplot)1Jes, we pct{onned fibre­FISH on an additional five haplot)1lCS (B2, IH, ll1 5, 1119 and the true B21 haplot)'pe). It is apparent that the order of the BG, TRIM and 131'-llL regions is stable, but that the BG regions vary in size and order of BG genes (Figure 10). Thus it would appear that the expansion and contraction of the BG genes in the BG region is a general phenomenon.

June 2014 I Volume 10 I Issue 6 I e10044 17

Page 14: Salomonsen, Jan, Chattaway, John A, Chan, Andrew C Y

The Dynamically Evolving BG Family

Kinesin Motor .---"'t Protein

~~~~~~~~lrlYfl'l)ln~~~~~~l .. lUvjC-Type Lectin

.... ........ .. ................ .... ... ........... ........... .... ...... ............ ........................ ................... . ··

r •••••••••• •• ••• 1 Kinesin Motor protein :

•. : and C·type le<Uo, shov.n 1, ., ,,, 1 by hybridisation I

·---------------· ··········· ··· ·· ·· ············· ··· ·········· ···· ······· ··· ·· ··· ········· ····· ····· ···· ········ ······· ······ ········· ···· ····

~ Chromosome 2

Figure 8 . The presence of h ybrid 8G genes in the 812 haplotype shows no obvious patte rn, co nsiste nt with a random process of reco mbination in the centre of the genes. The 14 BG genes of the 812 haplotype (as in Figure 1) are depicted with coloured boxes illust rating presumed origin (as in Figure 5). See Figure S 11 for an alternative view. doi: 1 0.1371/jou rnal.pgen.1 004417.g008

Discussion

For the first t ime in the study of BG genes, we have an understanding of the genomic organisat ion of a complete BG haplotr pe, coupled with a comparison to other BG haplot)1JCS and a detennination of cell and tissue expression . Two overarch ing points

emerge among the many new findin!,'S, which together pm1ray BG genes as a m uch more dynamic and complex genetic system than their closest mammalian hom ologucs, the butyrophilin genes.

The first major poin t that we establish in this paper is the very specific cell and tissue expression for each of the RG genes, which overall form two groups (along with one gene that may have a

8Ac-t.OC22S771 8AC·TRl\C 7.2

I---,,/G=S--II',llo<h!.....,,....,.~....,oi-Pr-""'-,n-l-OC4...,-25""n=l--~I\'GS-5<nlar.oTR•!.I7 lOC4finz

I'IGS-NAZ I'.JS-NAJ I'IGS-\IAS I'IGS~ Yo'GS-N\ __ BA_ C_·_BG_J _ ___,.,

~-4 ~ -4 I'IGS-IlAS ~ ~ ~ VIGS·IIA9

-{···/- 0 llroro\'ln$equ<n<ec:I ...W""'"~ h Who'e std.g.n Ge:nome...

Figu re 9. Comp arison of co smid cluster VI fro m the 812 haplotype with the 8Q h aplotype fro m a red j unglefowl, showing re gions o f virtua l identity separated by two large ind els, one in the middle of the seq uences and the other where the red jung lefowl ha plo type (but not th e 812 haplotype) continues into the TRIM re gion. Genomic organisat ion o n bottom line is from cluster VI of this paper (accession number KC955130) compared to two sequences from the BQ haplotype, middle line from the WGS sequence assembly (nucleotides 166492- 252491 on chromosome 16) and top line from the sequence of a BAC from the same individual chicken (accession number AB268588.1). Note that there exist d ifferences betwee n the WGS and BAC sequence s, and further that the WGS assembly has regions of unknown sequence with only approximate length. WGS-NA indicates genes not annotated by ENSEMBL at the time of this analysis. doi:10.13711journal.pgen.1004417.g009

PLOS Genetics I www.plosgenetics.org 11 June 2014 I Volume 10 I Issue 6 I e 1004417

Page 15: Salomonsen, Jan, Chattaway, John A, Chan, Andrew C Y

The Dynamically Evolving 8G Family

B2 . . . ., . . ·--- . -B4 ' ' ~. o o o fl o o ' ' ~ t. I • • o ••• - ---B12

. . . . · ... .. .. ... . . . . ... • . . • . ..:. '.l 0 • • • -· . .... . .

. . . . . . . . - . -..

B15 'o • o ' o I

.. : ., .. . . . ··· ·,·· ' ··.. . . . . . . . . . . . . . . .. .:" .. ... • -

B19 . . oO O O ~ 0 0 ... o .... O M 0 -·--- · · ·

. . . . ..

B21 . . . . . . . . . .

. .. ... . . . . . . ~ . . . . . . . . . . . .: . ' . . ; . . . ·.- .. .. :·· -. ~ . --

Figure 10. The BG regions of six haplotypes are located In the same orientation from the Bf-BL region, but vary In size and composition, as assessed by fibre-FISH using probes corresponding to the cosmids cG43 from BG cluster V (red), cG24 from BG cluster VI (green) and c4.5 from BF-BL cluster I (white). Each panel is representative of several fibre-FISH experiments with genomic DNA from 82 (152 cell line), 84 (identical in 8F-8L region with 813, UGS cell line), 812 (Con A-stimulated spleen cells), 815 (TG15 cell line), 819 (1519 cell line) and 821 (TG21 cell line). doi: 1 0.1371/journal.pgen.1 004417.g0 10

more ubiquitous tissue distribution), strong!)' supported by the phylogenetic analysis of the presumed promoter regions. Although BG molecules were first discovered as a polymorphic antigen on el')1hrocytes, it has been clear for some years that there is a multigene family of BG genes, at least some of which were expressed in o ther cell types, including thrombocytes, B and T cells, bursal and thymic stromal ceUs, and intestina l cells 163,65,78,80-82]. However, there has never been a complete list of IIG genes for a haplotype, nor a comprehensive anal)'Sis of which genes arc expressed in which cells and tissues.

In this paper, we examine all the BG genes of the Bl2 haplotype both by sequence and expression analyses and find that some BG genes are expressed in one or another cell of the haemopoietic lineage while other BG genes are expressed in tissues, likely from non-haemopoietic lineages. These assignments arc strengthened by the fact that the 5' ends (putati\·e promoter and 5'UTR) of the genes from the BG region also r.,u into two groups which fit exactly with the presumed cell and tissue di.stributions (with the exception of the singleton llG genes, discussed below). Interest­ingly, the haemopoietic genes of the B 12 haplot)1>C all have a deletion within the S'UTR, which almost certainly arose by recombination between two 27 nucleotide d irect repeats found in aU tissue BG genes. These data might be interpreted to suggest that all haemopoietic BG genes descended from a single BG gene, with the tissue BG genes being ancestral.

Within these broad categories of haemopoietie and tissue BG genes, the specificit)' of expression of particular BG genes in a

PLOS Genetics I www.plosgenetics.org 12

single cell type is remarkable, with som e genes changing expression dm ing difle rentiation. For instance, only one BG gene in the 1312 haplotype is strongly c:-vressed in T ceUs sorted from peripheral blood . In contrast, two BG genes arc strongly expressed in B cells sorted fi·mn peripheral blood, but one of these was not found in bursa, the primal')' f)111phoid organ for the production of B cells. Changes in expression during differentiation arc also suggested for the IlG3 ge ne, which is strongly expressed in T and B cells, th)1nus and bursa, macrophagcs and dendritic cells, hut not thrombocytes nor bone marrow fi·om which all haemopoietic lineages arc thought to originate. Interestingly, the BG3 gene is also strongly expressed in brain, and at least one transcription f.'lctor binding site specific for neuroncs is found in the putath•c promoter of BG3. Expression of particular genes may also change during activation of a cell type, but for macrophagcs a mm1ber of strong stimuli f.'liled to affect the two strongly-expressed BG genes, llG3 and BG 13. Overall, much more work needs to be done to explore the comtilcx expression pattem s of genes from the BG region.

Of the two genes located outside of the BG region, BGO has an apparently ubiquitous tissue distribution while BG I is expressed in intestine and kidney. 1l1e 5' regions of these two genes are diflcre nt from the other BG genes; the BG I promoter is in f.'lct partly composed of inverted pieces of the promoters of neighbouring genes.

The second major point that we establish in this paper is the presence of BG genes with differe nt evolutionmy histories, some

June 2014 I Volume 10 I Issue 6 I e1004417

Page 16: Salomonsen, Jan, Chattaway, John A, Chan, Andrew C Y

relatively stable and others changing rapidly. As mcntioncc! above, there has long beeu evidence for multiple llG genes, many of which were located to the BG region and one located to the BF-llL region f62- 64,67,78,60J. In this paper, we report two single BG genes with relatively stable evolutionary histories compared to the many BG genes located in a cluster of clusters, for which there is dynamic expansion and contraction and thus a much more complex evolutionaty history.

The singletons located outside of the BG region are the newly described BGO gene on chromosome 2, and the long-studied BG l gene located in the Bf-IlL region. Although they have similar intron- exon stmctures and both have promoters different from other BG genes, man)' other features differ between these genes. \ \'e found BGO as a eDNA fi·om B 12 caecal tonsil (a gut-associated l)1nphoid tissue), and comparison with available sequences showed that it was nearly idcntic<tl to a fragmentary eDNA clone reported <ts isolated from B2 l cmbtyonic erythroc)1es (67]. Specific PC R amplification shows that it has a wide, perhaps even ubiquitous, tissue distribution, and is present in every haplOt)1>e examined as a nc<trly monomoq >hic transcript (Chattaway, Salomonsen and K<tufinan, unpublished). These properlics suggest that BGO has a biological function that is stable and well-conscrYed, perhaps homeostatic. In contrast, llG I was originally described as expressed in li,·er, th)1llus, " T cell line and a ll cell line (72], but our tissue distribution shows that it is well-expressed in thpnus, intestine and kidney. Orthologous sequences nre present in every haplot)pe examined, hut there is allelic sequence variation throughout the gene, including expansion a nd contraction of the cytoplasmic cxons [7 1 ,831(Chnttaway, Salomonsen and Kaufman, unpublished). T he properties of BG I suggest a func tion that is under some selection to chnngc, perhaps in response to changes in pathogens (as has been suggested by genetic evidence, p 1)). H owever, another possibility that has not been mled out is that the genetic variation is due to hitch-hiking on nearby i\ IHC genes which are under strong selection for variation L64-,65].

At a descriptive theoretical level, the presence of llG genes as singletons outside the 13G t'egion is best understood as a consequence of the "birth and death model" of multigcne r.,mily evolution [1 7,21), for which it has been noted that single gene duplicates can arise by replicative translocation Ll5J. ;\foreO\•er, such theoretical considerations suggest that e\·olution of new functions is likelier in such singletons compared to a tightly-linked multigene f.'U1llly (1 5,19]. ;\ lore recent evolutionary dynamics of these BG singletons is more likely to be govcmcd by a model of clivergent e\•olution for a lleles [ 17,211.

In contrast, the multigene family ofllG genes present in the BG region is organised as a cluster of clusters, a nd is undergoing significant expansion and contraction. The identification of clusters is based on the presence of non-BG genes at or near the end of each proposed cluster, a kinesin motor gene, a C -t)1>e lectin-like gene and an unclassified open •·eading frame. 'll1ese genes in a characte ristic order are found ncar the end of each of the 111 2 cosmic! clusters V and VI , and were also identified at the junction of the BG region with the TRl;\1 region in the red junglcfowl haplotype used for the WGS sequence L36,54]. ll1ese genes may be considered as " framework genes" in the sense originall)' proposed by Amadou [66], in which nearly single-copy genes with stable fnt tctions flank regions of expanding and conu·acting multigenc f.,milies of genes with various functions.

In th is paper, we present five p ieces of c\•idcncc consistent with significant ex-pansion and contraction leading to CNV of the BG region by recombination and deletion: presence of eD NA sequences derived fi·01n the 134 haplOt)VC in congenic B 12 chickens, presence of apparent hybrid genes in the n 12 haplotn>c

PLOS Genetics I www.plosgenetics.org 13

The Dynamically Evolving BG Family

by phylogenetic analysis of sequence, deletion of genes evident fi·om sequence comparisons of the proximal clliSter ofi3G genes in the B 12 and red j unglcfowl haplotype, apparent swapping of clusters by comparison of B 12 and red junglefowl hap10t)1JCS by sequence and fibre-FIS H, and diflerences in length and specific hybridisation pattems by fibre-FISH between six BG region haplotwes. Such change with in the BG region is consistent with early biochemical evidence of recombination based on analysis hy two-dimensional gel electrophoresis [60,67].

At a descriptive theoretical level, the appearance of the llG multigene family might be best explained by the "birth and death" model p 7,2 l], in which duplication leads to multiple gene copies out of which some genes may be retnined, while others become nonfunctional. At the moment, there is no obvious evidence for homogenisation of nc genes, so the question docs not arise whether there is birth and death followed by purifying selection or concerted evolution by ongoing sequence exchange resulting in homogcnisation [14, 15,17]. However, one of the hallmarks of the "birth and death" model is considered to be the presence of pseudogenes LJ7,2 1), and there is no evidence that an)' of the 13G genes examined are non-fu nctional, given that they all have long open reading frames and they arc all expressed at the Rl\/A Jc,•el. One possibility is that in other BG haplot>ves there arc pseudoge1ies that have yet to be described. Another possibilit)• is that the maintenance of the BG multige nc r.,mily in the BG region is bitsed on a so-c<tlled "mixed model" fl7,21], perhaps with sequence exchange repairing any pscudogenes. A third possib ility is that a new theoretical model should be considered. In a ny case, the current evidence suggests that changes in BG genes within the BG region occur by unequal crossing-over a nd deletion without the obvious appearance of pseudogenes. 'lltc importance of recombination in the evolution of these fiG genes seems clear, but at the moment there is no evidence to suggest how fast it might he in comparison with other multigene r.,milies [22J, although the appearance of B-1- IIG genes in the congenic CJ3 chicken line during back-crossing suggests that it could occur over a few generations. \ Vhatever the speed of such recombination, without selection the munber of nc genes would like!)' graduall)· reduce to one [91, so th is suggests that selection for fu nction is also an important part of BG evolution.

S uch rapid evolution based on various outcomes of recombi­nation is not easily reconciled with models for simple specified functions of a li the BG molecules encoded by the llG region. We have found that the 5' and 3' ends of BG genes can oficn be switched to make hybrid genes, with the phylogenetic signnl of the middle portion of the gene apparently scrambled. However, we also found that the 5' end of the genes detetmine cell and t issue expression. It has loug been postulated that the extracellular V-likc region in the middle of the gene is involved in some immunological or cell-cell interac tion funct ion, and the cytoplasmic tail at the 3' end in intemctions with the C)'toskelcton [59]. Indeed, there is direct biochemical e\•idencc that the cytoplasmic tail of zipper protein (similar to the llG 10 gene of the B l2 haplot}1>e) can regulate actin-myosin interaction in the intestinal brush border, and an implication that variation in the number of C)10plasmic exons of BG I is involved in resista nce to certain viruses [70, 71 J. The conundrum is how there can be a stable function based on the middle or 3' end of a gene, when expression of this gene can sudden!)' be switched by recombination bringing in a new 5' end to nnother cell or tissue, presumably a random genetic e\•cnt. The resolution o f this apparent conundrum is one of the next important tasks.

One possibilit)• is that there are BG genes within the BG region that are relatively stable with impotiant and specific functions, and

June 2014 I Volume 10 I Issue 6 I e1004417

Page 17: Salomonsen, Jan, Chattaway, John A, Chan, Andrew C Y

that between them there arc expansions and cont ractions of genes whose expression and function can change rapidly. T he existence of such stable "framework genes" nanking regions of genomic change has been particularly well-characterised for the killer inhibitOI)' receptor (Kl R) gene cluster which encodes receptors on natural killer cells l3,20,88,89J. Indeed, the KIR genes show other similarities to the properties of the BG genes, including " tail­swapping" in which the inhibitmy and activating 3' ends of genes with difierent 5' extracellular regions are switched [90]. If this model of genomic evolution of the UG region is correct, we expect to find some 011hologous genes in every BG haplo!)1Je, whose alleles will carry out similar functions and be expressed in similar cells and tissues.

Identifying and characterising such framework genes in the BG region is another of the next important tasks. As an example, the sequence of the zipper protein characterised in an unknown chicken line is near!)' identical with BGIO of the Bl2 haplotype, suggesting that this may be a framework gene with a highly conserved function and cell expression. In fact, the zipper protein was discovered as a protein which can control actin-myosin interactions in intestinal brush borders [70] , like!)' to be a homeostatic control for which rapid evolution would not be achoantageous.

Another possibility is that there is suntcient functional redundancy between different BG proteins to relie\'e the selective pressure to maintain the expression of individual genes. If this was the case, perhaps the same function could be achieved in different haplot)1JeS by structurally and functionally similar but not necessaril)' orthologous genes. At a descriptive level, this kind of evolution might he considered "subfunctio­nalisation", in which newly duplicated genes share and then panition the functions of the original gene fl 4). It has been noted on theoretical grounds that tight linkage increases the probability of subfunctionalisation at the expense of ncofunc­tionalisation [1 5] . Unravelling which genes in dinerent haplo­t)1Jes may serve the same and dific rent functions is a third important task.

At least some of the BG genes that undergo rapid evolutionary change will almost certainly have important functions as well, characterised by the need for diversity and rapid response to changing environmental conditions. 11JC likeliest scenario is a molecular anns race with pathogens, in which the diversity of such fiG genes is dri\•en by the ever-changing variation in (certain) pathogens. As mentioned above, resistance and susceptibili!)• to two oncogenic viruses hm·e recently been ascribed to a retroviral inset1ion in the 3 'UTR of the BG I gene [71] . A fmuth important task is thus to understand the mechanisms of how such BG genes enable an encctive response to a pathogen.

Materials and Methods

Animals Samples were taken from experimental chicken lines kept at

the Basel Institute for Immunology in Switzerland, the Institute for Animal Health in the UK, the Ludwig Maximilians University in Germany, and the University of Cambridge in the UK. T he origins and derivations of the chicken lines used in this work are described in some detail [77 and references cited therein]. All animal work was conducted according to the relevant national guidelines in place at the time of the research. Most recent!)', approval for animal research was through UK Home Ofike Licenses and local Ethics Committees at the Institute for Animal Health at Compton, and the Universi!)• of Cambridge.

PLOS Genetics I www.plosgenetics.org 14

The Dynamically Evolving BG Family

Isolation and sequencing of BG genes (Figures 1, 51 and 52)

Standard molecular biology techniques used in this paper are described in detail [91 ,92J, which t)vically arc referenced along with modifications in the accompanying citations given below. Southem blotting using BG eDNA probes [78] identified the transcribed fragment 8.5 !i·mn various cosmic! clones of the line CB (B 12) chicken [72J as a BG gene (subsequently named BG I). A 6.5 kll!i·agment from cosmid cP12 [72] cut with Nm I ami Hind III was isolated and cloned into Bluescript (cPI2NK6.5BS2a clone), subclones were sequenced as described [93) by clideox­ynucleotide technology using 32P-labelled ATP, some portions after isolation of single stmnds, and the sequence was deposited in Genbank (accession mm1ber KC963<!27).

A cosmid libraty from a line CB (B 12) chicken [72] was screened with eDNA clones Gl , G5, G7 and G8 [53) labelled with 32P by nick-translation. The 50 positive plaques were picked nnd gJ'0\\~1 up for isolation of DNA, which was analysed by Southern blot using various BG eDNA probes isolated from a H.B 19 (B 19) chicken spleen libra•y [78). Double-digestion restriction maps and limited sequencing were used to group the cosmids into clusters [59], with extraneous sequence of chimeric inserts determined by genomic Southem blots. DNA from cosmids cG<!3, eG3 and cG24 was sheared, cloned and sequenced using conunercial Ouorescent dye react ion kits followed by capillaty electrophoresis at the Sanger Institute. T he clusters were linked using the data from Fig~•re S5C in Cambridge, and the whole sequence deposited in Genbank (accession number KC955130).

Polymerase chain reaction Amplification from DNA and eDNA was carried out using

different commercial kits at difie rent times during the work described in this paper. T)1>ically, the amplifications were carried out with commercial kits using manuf.'lcturer's instructions (sometimes optimised for Mg, dimethylsulfoxide or polyethylene­glycol concentration), with 2- 5 min at 9+-96°C for initial ~enaturation , followed by 30 cycles of 0.6 l min at 94-96°C for denaturation, 1--2 min at a lower temperature (depending on the primers) for annealing and 1-5 min (depending on amplicon length) at 72°C for extension, and ending with 10 min at 72°C for final e..xtension followed by storage at 4°C.

Isolation of BG eDNA for comparison to genomic sequences (Figures 53 and 57)

For BGO, clones were isolated from a eDNA lilmuy in Basel. Briefi y, RNA was isolated from caecal tonsil of a CB chicken and cloned into /,ZAP vector to make the CT-2 libraty (much as described in [94]), which was screened with BG eDNA probes as above. Phage clones were picked and converted by plasmid clones b)' in vivo excision with a helper phage VCS~H3 in BB4 cells. DNA was prepared by the CTAB miniprcp method, and the length of insert was determined by digestion and Southern blotting using BG eDNA [78] and genomic clones from the BG I gene as probes, with clones 45A, +7B and 47C eventually picked for full sequencing using a '1'3 thennocycler (Biomctra), a fluorescent dye tenninator kit and a 373A DNA sequencer ~Joth Applied 13iosystems).

For UGI , BGIO and BGII , eDNA clones were isolated from transfectants in Basel. Briefly, mouse L cells with inactivated thymidine kinase gene (Ltk- cells) were transfccted by standard calcium phosphate procedures [91 ,92] with clones cP 12NK6.5BS2a, cG I3 and cG222• Cells expressing BG genes were selected by neomycin, then enriched by fluorescence-activated

June 2014 I Volume 10 I Issue 6 I e1004417

Page 18: Salomonsen, Jan, Chattaway, John A, Chan, Andrew C Y

cell sorting and cloned by limiting dilution followed by screening using flow cytomctty, in both cases using a pool of mAb to BG including 13G2, 3, 4, 5, 6 and 9 [65]. Total RNA was prepared from clones 1.4 and l. 7 (cp l2NK6.5BS2a containing BG l gene), 6N.39 (cG 13 containing BG l 0 and BG!l) and 8N.37 (cG222 containing llG!O and BG!l) using the FastTrack kit (Life Technologies) and eDNA was prepared using Superscript reverse transcriptase (Stratagcnc). BG eDNA was amplified using a T3 themwcyclcr, a commercial Taq polymerase kit (R oche), with an annealing temperature of 45°C. BG l cDNAs were amplified using reverse primer 2773 containing a Not I site (ATATATgcggccgcCTI'TGGTIT­CTITCTI"CCAAT1:"GG) based on the eDNA clones G3 and G8, and a series of forward primers each containing a Sal I site, 2960 (TATATgtcgacTGGCAGAATTACTGGTGC:C), 2961 (TATATgtcgacC:TGGTGATAGC:AGAGAC:C:C) and 2962 (TA­TATgtcgacGGTAGAAGC:TGGGC) designed based on the genomic sequence (accession number KC963427) to establish the 5' end of the BG I transcript. BG l 0 and BG II cDNAs were amplified with forward primer 8081 containing an Nru I site (CAC:GTtcgcgaCCATGSNSTI"YNYATYRRGMTGC) and re­verse primer 2774 (sequence above). Amplified fragments were cut with appropriate restriction enzpnes, gel-purified and cloned into Bluescript plasmid, with BG l clones PCRX l , 3 and 5 from transfectant l. 7 and PC~'{ ll, 29, 31 and 35 from transfectant 1.4, BG!O clones 3+, 42, 47 and 50, and BGJJ clones JK29, 32, 37, 45 and 49 eventually chosen for full sequencing using a T3 thermocycler (Biometra), fluorescent dye tenninators kit and a 373A DNA sequencer QJoth Applied Biosystems).

Preparation, amplification and sequencing of eDNA from cells and tissues (Figures 2 and 54)

The eDNA preparations from T cell, B cell and thrombocyte populations purified by flow cytometry from blood ofC:B chickens in Basel were described in previous publications [76, 77].

The eDNA preparations fi·orn macrophages were prepared from blood rnonocytes isolated from C:B chickens in Mtmich, essentially as described [95]. Briefly, leucocytes fi·orn heparinised blood were separated by centtifugation through Ficoii-Paque. Cells at the interf.1.ce were washed twice in PBS, a<ljusted to 2 xI 07

cells/ml in RPl.\ll 1640 supplemented with 10% FBS, 100 U/ml penicillin and 100 ).lg/ml streptomycin, and then incubated on 90-mm cell culture Petri dishes at 5% C02 and 40°C:. After 48 hours non-adherent cells were removed by vigorous washing. The remaining cells were over 98% positive for KULOI, a macro­phage-specific monoclonal antibody [96) , and were fttrther incubated in the same medium under identical conditions with or \\~thout activation. Cells were either stimulated with LPS from E. coli (0127,B8; Sigma) at a final concentration of 10 ftg/ml or with cell culture supernatant of ehickeniNFy expressing COS cells [97] at a final dilution of l :500 or with a combination of both. After 24 hours cells were washed with PBS and han'CSted into peqGOLD TriFast (Peqlab, Erlangcn, Gcnnany) and RNA extracted according to the mmmf.'lcturer's protocol. In order to confitm macrophage activation cell nitric oxide release into cell culture supematant was quantified by Gt·iess reaction [98].

All other cells and tissues were from C -BI 2 chickens in Compton, with total Rl\'A isolated in TRizol and eDNA made with Superscript III (Invitrogen). Bone mmTow from untreated and chickens rendered anaemic with phenylhydrazine was isolated as described [78]. Dendritic cells were derived from bone man·ow and grown in chicken IL4 and GM-C:SF for 7 days as described [991- Intestinal enteroq1es from embtyonic day 14-15 embt)'OS were isolated from the PBS-45% Percoll interf."lcc as described

PLOS Genetics I www.plosgenetics.org 15

The Dynamically Evolving BG Family

[100, l 0 !]. Lpnphoc>1e depletion was achieved by daily intra­muscular ittiection of 3 mg cyclophosphamide each of the first 4 days after hatching fl 02]. Thymus and bursa tissues were disrupted by passing through a nylon sieve, and l}~ltphocytcs

separated from stromal cells by 70%~15% discontinuous Pcrcoll gradient \\~th a PBS overlay [97].

Amplifications in Copenhagen were performed using a GeneAmp PC:R System 2700 thermal cycler (Applied Biosystems) using either the High Fidelity PC:R Enz>~ne kit (Fenncntas #K0192) or Long PC:R Enz>me kit (Fermentas #K0!82). Samples of eDNA (usually I ftl of a standard prep) were amplified using the commercial bufiers with l.\lg concentration optimised, with 20 ·-30 nmol of each primer, ss•c for annealing and I min for extension. For the data in Figure 2, the primers used for most BG genes were LP Fl (C:C:A GWT TC:R CCC TYC: C:C:T GGA GGA C), LP F2 (CTC: C:TG C:C:T TAT C:TC: RTG GCT CTG C:AC), TM Rl (GAC: ARA TGA CCC AMC: SAG AWK TGT G) and TM R2 (CAC AGC CAG AGC: CAC: YKT C:CA G), used in all four forward and reverse primer combinations. For BG2, the primers used were LP Fl and Ll' f2 as above, 43A Rl (GAC:AAATGAC:C:C:AGC:C:AGAGGAA·n ·ATG) and 43 R2 (CAC:AGC:C:AGAGC:C:AC:C:TTC:CAAG). For BGO, the primers used were LP 1' 1 as above, C:TBG F2 (C:TC: C:TG GC:T TAC: C:TC: GTG GC:T CTC: AAC). C:TBG Rl (C:GA ATG AC:G C:AA AC:A AAA GTG TGA G) and C:TBG R2 (C:C:A C:AG C:CA GAG C:C:A C:C"J" TCC AGG), used in all four forward and reverse primer combinations. Amplicons were isolated from TBE agarose gels using QIAquick Gel Extraction kit (QlAGEN 28706) and cloned using TOPO TA cloning kit (Im~trogcn). Plasmid minipreps with right insert size after Eco RJ digestion were sequenced using Big Dye Tenninator reagent and an ABI automatic sequencer (Applied Biosystems).

Reverse transcriptase-quantitative polymerase chain reaction (RT-qPCR) of eDNA from tissues (Figure 56)

Approximately l 00 mg of spleen, liver, duod enum and bone marrow were collected from two 10 week old female C-Bl2 chickens from the Institute for Animal Health. Tissues were collected into 500 J.l] RNA-Later (Ambion) and stored at -20°C. Rl\/A was isolated from - 100 mg homogenised tissue using the Nucleospin RNA II kit {Machet)·-Nagel), and I ftg RNA was converted to eDNA using oligo-ciT primer and the Verso eDNA synthesis kit (Thenno Scientific), both according to manuf.1.cturer's instructions.

Fom'ard and reverse primers for qPC:R were T GT G­C:TGTGC:AAGATGAT and '!TCCAGGGAT GGATGATG for BG4, TGTGTTGTGC:AAGATGAC: and GAAAAGC:AAT­GATGAC:AA for BG7, TGTGC:TGTGC:AAGATGGT and AC:GATC:TGGGAAAAGGGG for llG 10, and GGAC:­GATC:TGGGAAAAGA and TATGC:AGAAGCTGTGGTGA for BG II. Targets were amplified over 40 cycles (Initial enzpne activation 15 min 95°C, then 40 cycles of 15 s at 9-t•c:, 30 s at 60°C: and 30 s at 72°C) using l 0 pmol each primer in AbsoluteBlue qPC:R Mix (Thermo Scientific). Samples were compared to a 5-point standard curve, and normalised to C)'clophilin A and a reference spleen sample. Fluorescence data were collected and analysed using an iC:ycler (BioRad) with subsequent analyses prefonned in ~,licrosoft Excel.

Further PCR, cloning and sequencing Amplifications in Cambridge were performed using a DNA

Engine Tetrad 2 Peltier thennocycler (BioRad) using three commercial kits with bufiers supplied and according to manuf.1.c­tm·er's instructions.

June 2014 I Volume 10 I Issue 6 I e1004417

Page 19: Salomonsen, Jan, Chattaway, John A, Chan, Andrew C Y

For the experiment in Figu re S5B, fragments were amplified from BACs Pl(2G)F6, 34 and Pl (IBG)BG l52J using 2 ng DNA, 0.6 mM each of the p•imers uc74 (CTCCT GCCT1'ATCTCRTG­GCTC fGCAc:), and uc76 (CACAGCCAGAGCCACYKTC­CAG), with 1 U Velocity pol}111cmse (Bioline BI0-2 1098) in I x GC 1ich butTer containing 2 mi\ I i\ IgC12 and 0.04 mM of each dNTP, with the annealing step being 2 min at 53°C .

For the experiment in Figure SSC, genomic DNA was isolated from erythrocytes using a salting-out procedure, as desc.-ibed Ll 03]. Fragments were amplified fi·01n 2 ng line C-1312 genomic DNA using 0.2 mi\1 each primer uc244 (Fl O: T l'GGGGAAA­TAGTGTGACCG) with uc250 (RI B: GGAGGGATCAG­GAGGGAGC) or uc248 (Rll: GGGGGGAAGAATITAGG­GATJ with 0.5 U recombinant Taq DNA pol}•me1-ase (Invitrogen 10342020) in I x Invitrogen PCR reaction bufier, 2 mM added .MgC12 and 0.25 mM of each dNTP. After initial d enaturation, there were 5 cycles of 0.75 min at 95•c, 0.5 min at 6o•c and 1.5 min at 72•c, followed by 30 cycles of 0.75 min at 95•c, I sec at 60°C and 1.5 min at n •c, followed by a final extension step.

For the experiment in Figure SB, fmgments were amplified from I ~tl (concentmtion unknown) line N genomic DNA using 0.4 mM of each primer LPair 1: uc51 1 (BGO_PlF, T GCCCAGGGAT­GATTGTGAGGCTJ and uc512 (BGO_PIR, T GCAGAA­CTGGGTGAGTCG'lTCC); Pair 2: uc5 13 (BGO_P2F, TGC­CCAGGGAT GA'IT GTGAGGC) and uc514 (BGO_ P2R, TGC­AGAACT GGGT GAGTCGTJ'CCT); Pair 3: uc5 15 (BGO_ P3F, GCCCAGGGATGATTGT GAGGCTJ and uc5 16 (BGO_p3R, GCAGAACTGGGTGAGTCGTTCCTJl , with 20 U Phusion polymerase (New England Biolabs # M0530S) in I x HiFi buller (conta ining 1.5 mi\ I MgC12), L,5% polyethylene glycol and 0.04 mM of each dt'ITP. After initial denatumtion, there were 5 cycles of 0.75 min at 95°C, 0.5 min at 59.8°C and 1.5 min at n •c, followed by 30 cycles of0.75 min a t 95•c, I sec at 59.a•c and 1.5 min at n •c, followed by a final extension step.

T he amplicons were cloned using the Clonc;)ET kit (Fenn entas) following manuf.'!cturer's instructions, and sequenced using commercial fluorescent dye kits followed by capillary electropho­resis at the DNA Sequencing Facility of the Department of Biochemistry, University of Cambridge (www2.bioc.cam.ac.uk/ ~pflgroup/DNA_Facility) .

Sequence data was anal}"Led using C LC DNA workbench (version 5. 7 .I , www.clcbio.com/). Alignments were pe1fonned using Clusta lX (www.clustal.org/) and MAFfT Qlttp:/ / mafli. cbrc.jp/alignment/software/). Some phylogenetic trees were created using the neighbour joining (NJ) method implemented by ClustaLX with 1000 bootstrap seeds. Other trees were created using a Bayesian approach implemented by i\·lrllaycs L I 04] (version 3. 1.2, http:/ / mrbayes.sourceforge.net/), with the GTR substitution model used with gamma-distributed rate variation across sites, and with the i\,ICMC analysis using one chain, 20000 generations and sampling every I 00 generations. Still other trees were created maximum parsimony (i\·IP) phylogenies, using PAUP 4.0b l 0 for Unix (105] (11ww.paup.csit.fsu.edu/). AU and SH tests [ 1 06] on M P trees were perfon ned in CONSEL version 0.2 (www. is.titech.ac.jp/~shimo/prog/consel/) . Finall}', other trees were created using neighbour network analysis implemented by SplitsTree4 [107, 108] Q1ttp:/ / www.splitstree.org/), and tested fo r significance using a Phi test for recombination [I 09]. Phylogenetic trees were visualized using Dendroscopc Qmp:/ I a b. inf. uni-tuebingen.de/software/ dendroscope/). After removal of gaps in the alignment of all 14 BG genes by G-blocks using default stringent parameters [II 0, Ill], an automated partitioning analysis performed using SAGUARO fll 2) (www.sourceforge. net/proj ects/saguarogw/) gave cacti which were handled with

PLOS Genetics I 1'1\WI .plosgenetics.org 16

The Dynamically Evolving BG Family

PHYLIP version I :3.68-2 fl l 3] from the Ubuntu repositories (www. launchpad.net/ ubuntu/ lucid/ +source/ph)•lip/ I :3.68-2). Dotplots were created with a wordsize of 150, using dottup fi·om the EMBOSS package (version 6. 1.0-5, http:/ /emboss. sourceforge.nct/). Helical wheels were created using pepwheel from the Ei\UlOSS package (version 6.1.0-5, http:/ /emboss. sourceforge.net/). Read-through exons were found using an in­house Visual BASIC script, and the J'J'Ji\·ls by an in-house PERL script, both written by J Chattaway.

The 2 1 nucleotide repeats were grouped using an in-house mle based clustering algoritlun (J . ChattaWa}'). T he clustering algo­rithm first created a distance matrix between the nucleotide sequences of all the 21 nucleotide repeats. Then, the algorithm created a list of 2 1 nucleotide repeals that were at least 80% identical to each repeat, and compared the lists and placed repeats tha t were similar to the same sets of repeats into the same group, again with an 80% identity threshold. The number of resulting groups was small enough to be checked manually; some groups were merged or split, and each group was given a number as a label, in order from the largest to the smallest group.

Fibre-FISH (Figures 3 and 1 0) The cell lines IS2 (MHC haplotype B2), UG5 (Bl3), TGI5 (Bl5),

1SJ9 (B l9), and TG21 (B21) arc rcticuloendotheliosis virus (RE\0 u·ansfonned cell lines l l14,1 15,1 16J. Prima•y BJ2 splcnocytcs were isolated from the spleen of a line C-B 12 chicken (IAH , Compton; now rebranded as The Pirbright Institute) and stimulated with concana1•alin A (Con A, Sigma) for 48 h. All cells were cultured at 37•c, 5% C0 2 in RPi\11 1640 supplemented with 10% fCS, L­glutamine and kanamycin (all from GIIICO/Invitrogen).

Fibre-fiSH was performed at Sanger Institute as described [I 17J, except the fibres were not treated "ith acetic acid and pepsin. Approximately I 0 ng of each cos mid was used for amplifica tion by the Genomel'lcx \\'hole Genome Amplifica tion (WGA) 2 kit (Sigma) according to manuf.'!cturer 's inslructions, and probes created by amplification ming a modified WGA3 kit (Sigma) [J IB), with cosmid c4.5 labelled with digoxigenin-11 -dUTP (Roche); cG24 with biotin-16-d UT P (Roche), and cG43 with dinitrophenyl (DNP)-11 -d UTP (Perkin-Elmer). Hybridisation was carried out as described 111 9] except that the probes were allowed to bind 01•crnight. Detection and imaging was canicd out as described L 118J with the DNP-11-dUTP probe detected using a 1:200 dilution of rabbit anti-DNP IgG and a 1:200 dilution of Alexafluor 488 donkey anti-rabbit IgG (both from Molecular Probes, pa11 of Invitrogen).

Supporting Information

Figut·e Sl The cosmids identified by screening with BG probes were characterised by double restriction enZ}1n e digest and Southem blot, and could be organised into the previous!}' described cluster I (now kno\111 as the BF-BL region), and two novel clusters named cluster V and cluster VI. T hick lines represent clusters, with restriction sites (C, Cia I; K, Kpn I; i\1, i\·llu I; N, Nru I; n, Not I; S, Stu I; P, Pl'll II; X, Xho I), open boxes indicating preswned BG genes based on hyblidisation, and closed boxes indicating similar regions based on hydridisation (now knmm to contain kinesin and lectin-like genes). T hin lines represent indi1•idual cosmids, with dotted lines indicating sequence apparent!}' from outside of the BG region found in chimeric cosmids, and arrows indicating fmiher sequence. Bar indicates approximately 5 kb. (PDf)

Figure S2 Alignment of the genomic seq uence fi·om all of the Bl 2 BG genes, showing the locations of the coding sequence

June 201 4 I Volume 10 I Issue 6 I e 10044 17

Page 20: Salomonsen, Jan, Chattaway, John A, Chan, Andrew C Y

(yellow background), UTRs (green background) ;mel the introns/ intergcnic sequence (white background). Residues which diOer from a m;Uority consensus sequence (not shm\11) are red; exons are highlighted in g1·een; dashes are gaps incluclecl to maximise alig11111ent. Each Genomic FraglHCnt (G I•) was cut 76 nucleotides after the prcclictccl end of the 3'UTR, with the start of the 5' end just after the breakpoint of the previous 3' end (except for IIG 10, preceded by the h)1JOthetical protein, which was trimmed to match the 5' ends of the other genes). All features of the BG genes were initially predicted by reference to known BG cDNAs (as shown in Fig11rc S5), and most have been validated by comparison to cDNAs fi·om the particular gene. While not shown in this fig11re, the features of the C-type lectin-like, kinesin motor protein and hypothetical genes were established by reference to ESTs which span at least two exons and arc over 90% identical to the genomic sequence Qectin: CD21610 1,CN21769+; kinesin: BU·~71750, BU333462, BU251676, BUII2055, BUI08596, A.M068009, AM063 157, CV8928+3, CD2l7632, BU487705, BU330950, 1\)442873, 1\)398++6, AJ398368, DT659179, DT655924, DR419449, C0507363, CN232628, CD217592, BU386334, BU354226, Bl\1491516, Bl\1490485, Bl\1489471, Bl\1425939, BG62542+, AJ453838, AJ444817, DT657232, CD738123, CD72835•1, CD215461 , BUI99828, BUI08016, BI393544, BI391565, t\)452709, 1\)398223, 1\}393429, BU328290, AJ399432, 1\)393414, 1\)393212 and the h)1JOthetical protein: BU207055, BU209867, CN232293, BU374683, AJ442408, Cll017432, AJ445065, AJ399 199, BU45141 7, BU299883, BU357505, BU333629. (PDF)

FigUl'e S3 BGO is identical in two haplot)lJCS, and BG 10 is most similar to the zipper protein, particularly in the cytoplasmic tail. A. Phylogenetic tree of nucleotide sequences. Alig1tments of B. BG I 0 with zipper protein, C. BGO from red junglefowl and line C ll. Exons alternately coloured g1·cy and white. (TIFJ

Figm·c S4 Phylogenetic tree and alignment of sequences of coding region from "sig11al sequence to transmembrane" of B 12 genes compared to the sequences amplified by the "universal primers" from eDNA of CB and CC chickens, as described in Figltre 2. ( f!F)

Figm·e S5 PCR, cloning and sequencing links and orients cosmic! clusters VI and V, and cosmic! cluster V with the TRIM region, validating and extending the interpretations of the fibre­FISH experiments. A. Representation of fi,·e BAC clones with the right ends in the Bf-BL region, extending to the left into the TRIM cluster (by sequence) or to the BG region by hybridisation indicated by black boxes. Fif.'l!re from Ruby et al 2005, with permission fi·orn Springer 2013. B. Agarose gel of PC R products from three BACs using "universal primers" for BG genes, along with names of BG genes identified following cloning and sequencing. C. Sequence alignment of clones recovered after PCR from genomic DNA (C-BI2 chicken) using primers on the right end of the cosmid cG24 from duster VI and from the left end of cosmic! cG•I3 fi·orn cluster V, compared to the ends of the cosmid clusters as detennined by sequences of cG43, cG3 and cGN. (flfF)

Figut·c S6 The two haematopoietic BG genes BG 7 and BG II arc well expressed in bone marrow but absent from liver, whereas the two tissue BG genes BG•l and BG I 0 are well expressed in duodenum and poorly expressed in bone marrow. RNA

PLOS Genetics I www.plosgenet ics.org 17

The Dynamically Evolving BG Family

expression levels of BG7, BG II , BG•l and BG 10 were compared by RT-qPCR. BG levels across tissues were normalised to the reference gene cydophilin A, and relative fold diOerence in expression was calculated hy normalising to spleen (spleen = I; fold values given above the bars; standard errors arc from experimental triplicates on one plate). Data is representative of three experiments. (fiF)

Figure S7 cDNAs isolated from B 12 chickens and transfec­tattts with B 12 genes (black) validate the intron/ exon structures predicted (blue) based on comparison with BG cDNAs from other haplot}1Jes. A. BGO (CTIIG) eDNA clones from caecal tonsil (accession number KC955 131 ), B. BG I (11.5) clones after PCR from eDNA from L cells transfected with ll.S gene (arrows indicate open reading frames) (accession ntnnbers KC955132 to KC955136), C. and D. IIGIO (1313, zipper protein-like) and BG II ( 13A, 22E) clones after PCR from eDNA from L cells transfccted with cosmids cG 13 and cG222.

( I'll·)

Figut·e SO There is only one 3' UTR exon for the BGO gene in B21 chickens Qine 1\~ . A. Dot plot analysis qf the sequence around the 3'UTR fi·orn the WGS sequence shows duplication including the 3'UTR, with an insertion in the second copy. 1ltrec pairs of primers were desig11ed just outside the apparent duplication and used for PCR from genomic DNA from a line N chicken, with the sizes expected for amplicons from the region with and without a duplication indicated below the dot plot. B. Picture of the amplification products separated by agarose gel electrophoresis, showing m;Uor amplified band below 3 kB compared to markers, as c"pcctcd if the duplication is not found in the genome. C . Alig11ment between the \VGS sequence 2.1 and the end sequences from the 3 kb band from PCR I . (PDJo)

Figu•·c S9 The promoters of BGl (8.5) and BG9 (F8) genes contain sequence deri\•ed fi·om other genes. A. Sequence alignment of the BG I promoter with the promoter of the neighbouring Blcc Qectin·like) gene and with the consensus of other BG genes, showing that the upstremn portion of the BG I promoter is derived from the Blec promoter, followed by a region of unknown origin, finishing with the proximal region being similar to other BG genes. B. Sequence alig1t.ment of the BG9 promoter with the sequence of an upstre;un gene Loc425771 Q1)1Jothetical protein) found in the red junglefowl WGS sequence (positions 182878- 183380) and in the sequence of BAC AB268588.1 (positions 29633- 30097, sequence used in fig11re), which presumably became part of the BG9 promoter of the B 12 haplot)1Je upon deletion of the intetvening sequences, as depicted in Figure 9. Interesting!)', this novel sequence contains some tissue specific transcription f.'lctor binding sites (such as Gfi-1 described as the repressor induced by T-cell activation, Lyfl described as important in the maturation ofT-cells, and Nkx-2 found only in the brain and spinal cord), which correlate with the expression in T cells and brain found in Figure 2. (1'1 flo)

Figut·e SIO Classification of cytoplasmic exons reveals that the downstream IJO% follow the same pattern as the 3'UTR, but the upstream 20% do not. Each cytoplasmic exon of all BG genes from the B 12 haplol)1Je was given an individual identification number, a distance matrix for the sequences of all cxons against all other exons was constructed, and then a rule-based algorithm was employed to identify gm ups of exons with sequences with at least

June 2014 I Volum e 10 I Issue 6 I e 100441 7

Page 21: Salomonsen, Jan, Chattaway, John A, Chan, Andrew C Y

80% nucleotide identity. Visual inspection led to a few instances of splitting or merging groups, based on maximising shared nuclcotidcs. A total of 57 groups were fonncd, with 18 groups having on I}• one member and the remaining 39 groups having up to 39 members each. The figure shows the C)'IOplasmic cxons (identified by group number) for each BG gene of the D 12 haplotype, stacked for maximum alignment. A. Arrangement of all the cytoplasmic exons. 13. Arrangement without those cxons which are unique (that is, arc in groups with only one member). Phylogenetic trees comparing the nucleic acid sequences of cxons in common between each pair of IJG genes fi·om lll 2 haplotype for C. fi rst 20% and D. last 80% of C)'toplasmic tails. ( riFI•)

Flgut·e Sll Six BG genes have the potential to conditionally express ITThi motifs, and all BG genes contain 1- 5 introns which could be expressed as protein sequences without disnapting the continuit)' of the C)'toplasmic tail. A. Depicted arc ITI I\ ls which are not preceded by a stop codon, introns which arc in-frame wilh the two flanking C)'toplasmic exons, in-frame stop codons and the resulting in-fi-amc intron sequences without in-frame stop codons. II. The amino acid sequences of all predicted introns that contain an lTil\lmotif which is not preceded by a stop codon are shown as an alignment. ITIMs arc highlighted in red. All numbers arc relative to the start of the preceding heptad repeat. The sequences arc named by the BG gene of origin and the ITII\1 number, starting with the 5' most ITIM which is not preceded by a stop codon. The amino acid sequence which is encoded by the C)•toplasmic repeat exons is highlighted in blue. The first and last amino acids <Jrc omitted because they are partially encoded by the neighbouring exons, which may be altcm<~th·cly spliced. The reading fmmc of the second short cytoplasmic exon can be altered by the ITII\1 containing intron, which can introduce in-fmmc stop codons. ( rii·l ·)

Figtll'C S12 Trees created using a Bayesian approach are consistent with the trees created using a neighbour joining (Nj) approach. The trees are based on alignments of the promoter, 5'UTR, signal sequence, V-like region, transmembrane region or 3'UTR, and were created using l\lrBayes (version 3. 1.2). The GTR substitution model was used with gamma-distributed rate variation across sites. The MCI\!C analysis used one chain, 20000 generations and was sampled every I 00 generations. Nodes with a posterior probability less than 0.5 have been collapsed. The results of the Bayesian approach are consistent with the ne ighbour j oining (Nj) approach, with the same &.-roups found as in Figure 5. (PNG)

Figut·c Sl3 Neither the AU test nor the SH test can support a single topology for the signal sequences, V-like region sequences or transmembrane sequences, in contrast to the analysis of the 5 'UTR for which there are three very closely-related best trees and for 3 'UTR for which there is a single best tree. The AU test cannot reject the null hypothesis for any V-like region, signal sequence or transmembrane region topolog)'. The SH test supports multiple topologies; however these supported topologies have different structures. Topologies were created in PA UP using the l\lP method, then AU and SH tests were p erformed in CONSEL. For the signal sequence the 66 shortest topologies were compared, for the V -like region the 52 shortest topologies, and for the transmembrane region the 60 shm1est topologies. The top ten topologies arc shown for each region. (TIFI•)

PLOS Genetics I www.plosgenetics.org 1B

The Dynamically Evolving BG Family

Figu•·e Sl4 Neighbour networks show a complex recombination pattern across the whole gene, but with topologies similar to trees made \\~th other approaches. Neighbour networks of whole eDNA, promoter, 5'UTR, signal sequence, V-like region, transmembmnc region, and 3'UTR were c reated with SplitsTrce. The groups of genes remain the same as with neighbour joining or Bayesian approaches, save for the whole eDNA of DGIO which clusten> with BG5, BG7 and JIG I 0 rather than with BG2. The complex networked stmcture of the trees indicates substantial past recombination within the exons, which is compatible with the recombination assessed using the Phi test, as implemented in SplitsTree. Tests were petfonned on sequences of the whole gene (p =O), promoter (p=0.86), 5'UTR (p =O.O l), V-like region (p = 0.0017) and 3'UTR (p=9.13e-12). Onl)' the promoter sequences do not appear to be undergoing significant recombina­tion. (PDiry

Figu•·e SIS Cacti produced b)' SAGUARO arc compatible with the topologies of l'U trees in Figure 5. A The genomic fmgmcnts detailed in Figure S2 have been aligned. Sequence is represented as a horizontal line, cxons as black boxes j oined by clte1•rons, and gaps as white space. B. The alignment was processed using Gblocks to ellSure that orthologous bases were in the same column and to remove gapped positions. Retained portions of the ali&'lllllent are identified using a black line, with white space representing portions of the alignment which were excluded. C. The Gblocks alignment was then processed with SAGUARO, and 33 cacti were produced, indicated by white boxes. D. Cacti which cover regions analysed in Figure 5 were selected for further analysis. A neighbour joining (l\U) tree was produced for each cactus using PHYLIP. Cactus 2 covers pa rt of Lhe promoter alignment, and shows same two groups of promoters as in Figure 5. Cactus 5 co,·er-s the first half of the 5'UTR, and shows the same two groups of genes as the 5' UTR in Figure 5. Cactus 19 covers part of the 5'UTR and part of the following intron. This tree has clustered most of the hem atopoietic DG genes together, like the 5'UTR tree in Figure 5. Cactus 18 covers the V-like region and produces a similar tree to the V-like region tree in Figure 5. Cactus 36 covers the tmnsmcmbrane region, and identifies the same groups as the tmnsmcmbt-ane tree in Figure 5. E. The 3'UTR sequence has been split into five cacti: 15, 23, 28, 12 and I which are 53, 176, 78, 70 and 71 nucleotides in length, respective!);· These sequences a re short, and the underlying nucleotide sequence is very similar (Figure S2); therefore one or two unique SNPs can radically alter the position of a particular sequence in the tree. Cactus 15 has the same pattern and groups as the 3'UTR tree in Figure 5. The remainder of the 3'UTR cacti produce the same groups of genes as the 3'UTR tree in Figure 5, but each one has a few genes which have migrated elsewhere in the tree . Therefore the 3'UTR partitioning is broadly supported by the SAGUARO analysis, and overall the SAGUARO analysis is compatible with the i'(J topologies presented in Figure 5. (PDI-)

Figm·c Sl6 The presence of hybrid BG genes in the Bl2 haplot}1Je shows no obvious pattern, consistent with a mndom process of recombination in the centre of the genes. The 14 BG genes of the Dl2 haplot}1Je (as in Figme I) arc depicted with coloured boxes illustmting presumed origin (as in Figure 5 but with the colours for the q'loplasmic ta il and 3'UTR reve rsed). (riFF)

Figu•·e Sl7 Identification of clusters of un-annotatcd BG genes in the WGS sequence (version 2.1 ) of the DQ(D21 -like) haplotype.

June 201 4 I Volume 10 I Issue 6 I e1004417

Page 22: Salomonsen, Jan, Chattaway, John A, Chan, Andrew C Y

Upper panel, fi rs t red junglcfowl cluster found in representation of the ENSEi\ I Ill , analrsis of a region assembled f01· chromosome 16. Lower panel, second red j unglefowl cluster found in representation of the Ei\'SEi\IBL att<ti)'Sis of a region of assembled contigs that are suspected but not shown to be patt of chromosome 16. 'll1ese representations taken from the E NSEi\ IBL website show our location of BG genes as d efined h)' a 131 .AST search with the 3'UTR of IJG genes (red vertical lines labelled BLAT/ BLAST hjts), the location of two identified BG genes (dark red boxes labelled Ensembl), and location of most of the cxons of the BG genes inappropriatcl)' linked (green boxes labelled Unigcne EST clusters). In essence, the prediction progmms r., ilcd to identifr the 5' end of the BG genes. (Pi\' G)

Figuo·c S18 Phrlogenetic tree comparing 3'UTR nucleotide sequences of all genes from the 1312 haplol)1>e and the genomic sequence ofredjunglefowl (Rjf, BQor B21-like haplotype, except BG I from 112 1) showing that manr genes from red junglefowl cluster I (pmv le) a rc the same as B 12 cluster VI (yellow), but red j unglefowl cluster 2 (teal) is not well-related to B 12 cluster VI (omngc). Genes from B 12 and red junglcfowl arc named with the same convention: numbers begin with BG I in the IIF-BL region, and then rise in order of the location of the gene (or apparent location the case of red junglefowl cluster 2) compared to the BF-

References L Doxiacfu GG~I, Ouing N, d< Groot NG, l\oort R, fiontrop RE (2000)

Unprecedentt"<l polpnorpiiU1n of ~lhc-DRil f'("gion ('onfigur.Hions in rhC'.ms m.,caquN.J lnununol 164: 3 193 3199.

2. Doxiacfu GG~I. do Groot N, Ouing N, Bloi:Jnill JH, llontrop RF. (201 l) Genomic plouticity of the ~IHC cl:m I A region in rht.nu rnacaquC"S: ('Xlcrnr..-e h3p!ot)VC din~rsitr :u th~ J>Oi'ubtion len"1 :u rt>w·:t1rcl hr mino;.."\tellilt:J. Inununogcnclks 63: 73 83.

3. J iang W, J ohruon C, J>)Olrnll13n J , Sim..:ck N, Nob!c J , ct al (2012) Copr number \·arbtion )e;,cU to consKI~r.lb!e din~r.)ity for U but not A h~p!Ot)l)('S of 1he hum.:tu KIR gcnr£ tncoding NK cell rc.~eptors. Genomr Rrs 22: 1845--185-L

4. L1nin LL (2005) NK edt recognition. Annu Re\· 111 1111111101 23: 225-274. 5. Sturte\·aut AH (192j) ·n1r C"0«1:s of unequal cro$Sing O\"tr :\t tht bar locus iu Drol<Jhi~•- Gt nttics tO: It 7- 147.

6. llaldu"' J!lS (1933) "!11e p.1rt playro br r<n1rrent mutation in c:\-olurion. Am Nat 67: 5 19.

i. ~liillcr HJ (1935) The orihrination of chrom.'1tin dd iciencif..s 35 1ninutc deletions subjm to in.sen ion elsewhere. Gencticot 17: 237- 252.

8. Hm:lcy j (19-l2J 1::'-olution: tht modem ~ynthcsis. london: t\lle n :m d U1min. 9. Walih Jll (1987) Pcni>tcncc of ~1ndem arr.lp: implieatiOIU for >.1tcllitc and

simp!e·~quenct- D NAs. Genetics 11 5: 553-567. 10. Ota T , Nci ~~ (1994) P h"t'rgent C'\ulmion and c:\·olution by thr binh-aml-dcath

process in the immunoglolmlin VH gene f.m>ilr. ~ lot Bioi Em l t I : 469 482. 11. Nei M, Gu X, SitnikO\"il T (1997) f.H>Iution hy thC' birth -:md-de.lth procf'S.S in

Jttultigcnc families of th~ n~rtehra te i.nununc S)'Stem. Proc N::ul Acad Sci US A 94: 7799- 7806.

12. Freeuun JI, Perry GH, l'ouk I, Rcdon R, ~lcCauoU SA, <t ol (2006) Copy number \ouiation: new imighLS in genome d.i\"Cni[y. Genom~ Res IG: 9-!9-961.

13. Force A, Lynch :--1, Picken FU, AmorC".s A, Yan \'1..., l'onlcthwait J (1999) Presen01tion of duplic~tc gtn~ by comp!cmemary, degenerati\~ rmu:ations. Genetics 151: 153 1- 1545.

H. Lynch :\1, Cow~ry JS (2()(X)) The e\ulution.uy f.1tc and cons.equencC".s of duplicate genN. Scitne< 290: l 151-l 155.

15. Lynch ~ I, O'!!elr ~I, \\'aW1 ll, Force A (200 1) 1l1e probability of prcsernrion of a ncl\iy a risen gene duplic.1te. Genetic• 159: 1789-160 1.

IG. Rooney AP, Piomki\'$b. H, N'ci ~~ (2002) Mo!ecular C\"Olmion of the nontantle-mly rtpC'atcd gcnC"S of the histone 3 wultigcuc Fhmily. ~ fol Oiol F.\"01 19: 68 75.

17. Nei ~ ~. Roo1"'r At' (2005) Concertro and birth-and-death ovolution of multigcnc families. Annu R~· Genet 39: 121- 152.

18. Frcom.,n JL, Perry G H, Feuk L, Redon R, ~lcCarrolt SA, <t aJ (2006) Copr munlx-r \Olriation: new insights in genome di,~rsi t)"· Genome Rt1 16:9-19-961.

19. Hugh<s T, Lilxrks DA (2007) ' !11e pattern of o\-olution of >n~1ller-><ale gene duplicates in rnauunalkm gcno mr-s i.s more consistent ''ith nro· than suufnnctionatisa tion. J ~lol F.,..,, 65: 57+-588.

20. Trnht rnc J •\ , ~ lartin ~ 1, Ward R, Oh<~.~h i ~~. Ptllctt F, ct aJ (2010) Mex-hauism.s: of cop)" number \"Jri;ttion •md t.rtnid gene fonnation in the KIR inununc gcnC' complC'x. l·lltln ~ lol Genet 19: 737 751.

PLOS Genet ics I 1'/Ww.plosgen e tics.org 19

The Dynamically Evolving BG Family

BL region. Note that RJF-BG5 and Rjf-BG·I arc quite different from all other fiG genes. ( l'IF10

Acknowledgments

\\'e th~nk Christi~n D o hring ~ttd Eliz~bcth l ~~ngler (or technic~l

assist~nec, ~nd Karel llal~ fro m lnnsb m ck for three C B chicks so long

ago. We th~nk two anonrmotts re\icwers for their reasonable comments and question s. We a cknowledge the gu est editor a nd on e a n o npnous

r e\iewer for requir ing the ana l)-.eS for Figm cs S6, S l2 , Si3, S H a nd S l 5, a s we ll as the extensi,·c discu;s io n of the dala in the light of c u rrent

evolut ionary theory. W e also ackn owledge a n ew red ewer in the second round o f rc\icw fo r rc<tu ir ing an expl~n~tion of blood , -essels in tissues. Finallr, we honour the m em ory of .\ lorte n Simonsen, who st~ rted us o fr

manr decades ago on the wo rk that fin~ll)' resulted in this story.

Author Contributions

Concei,·ed and desig ned the cxpe rime nts: J S JAC ACYC AI' DA.\1 OV FY FG Ct\ SB KSkJK. Pe rfomtcd the expaim e nts:JSJAC ACYC A I' S H

D t\.\ll'R OV Li\' FG KSkJ K . i\nalrzed the d~ta:JSJAC i\CYC Al' S H l>A:\ 1 OV U\' FY FG K S kjK. Contribuled reagents/m~terials/an~lrsis tools:JSJAC ACYC A I' SI.R Z W ALS KS t C B IlK OKG FY R Z FG Ct\ KSkJ K. \\'rote the papcr:JK J AC. D esigned som e o f the softw:tre u sed in

a nalrsis: j t\C.

21. Eirin -L6p<> J~I. Rcbordino• L, Rooncr AP, Ro1.as J (20 12) "ll1c birth-and­d~ath C\Ulution of multigene familit$ re\"isitcc.l. Genome l>p1 7: 170- 196.

22. 1\.atiu V, lla gthorsson U (20 13) Copr-number ch.:mge.s in t\'Oiution: rates, fitnC"Ss dfC"Cts and adat>ti\'C signiJicJ.nce. Front Gtnt't 4: 273.

23. !lcury J , ~Iiller ~L\1, l'o ntarotti P (1999) Stru<:turc and n -olution of tho C''Ol~nd~d ll i f.-:unilr. lmmnnnl Tocb r 20: 285-288.

2-l. Afrachc H, GourN P, r\inouchc S, l,ontarotti P, Oli\'-e D (20 I 2} 'l11e bUt')TOphilin (!lTl\) gene f.'lnil): from milk f.1 t to tl1e r<g11lo tion of the immune r~ponSt'. Immunogenetics 6~ : 781- 79..J.

25. Ab<lcr-DOrner L, Swamy ~ 1 , \\'ill iauu G, Hayd.1r AC, !l:u A (20 12) But)TOphil im: an emerging f.1111il)" of immune regulators. Trends luununol 33: 3-1 -IL

26. r\rnett HA, E..<cob:u SS, VinoyJI. (2009) Regulation of costimularion in the era ofbut)Tophilins. Cj1okine. 46: 370- 375.

27. i'lg11)"tn T, Lin XK, Zhang Y, Dong C (2006) !ln\!.2, a bnt')Tophilin-like mo!«ulc tlt."l t fu nctions to inhibit T cell :\cti\-ation.J lrnmunol 176: 735 1 7360.

28. Asnett HA, Escob."lr SS, Gonz._tltz·Suara E., Dudd.s\..1· AI.., Steffen L--\ , tt al (2007) lJTNL2, a butyrophilin/U7·likc mo!ccule, is a nC"g<lti\~ conirnulatory mo!('(Uit' modolaled in intcstin:U infl:mnuation. J lnunonol 178: 1523 1533.

29. Smith lA , Kucz.:dc BR, t\rnrnann JU, Rhodes DA, Aw n, et al (2010) llTN lA t , tho m:umnary gl>nu but')·rophitin, and !lT1\2A2 ore botl1 inhibitor> ofT cell acti\'oltion.J luununol 184: 35 14 -3525.

30. Yarno:wki T, Gop I, Gr.~f D, Crai.~ S, ~lartin-Orozco N, et al (2010) A bmyrophilin f.1mily member criticallr inhibits T cell acrhonion.J lmmuno! 185: 5907 5914.

31. Tl:u ,\ , Swarnr ~~. Alxkr-Domcr !., \\'illi:u1u G, !'aug DJ, et al (20 11) Rui)TOphilin-Jike I encodes an enlefOC)tt> protein that ~dccth·d)· regulatr.s functional intcracriom \\ilh T lpnpi10C)tcs. Proc 1\"ad Ar.td Sci U SA 108: 4376 -1381.

32. lbrir C, Guillaume Y, Nrocllec S, !'eigne C-~1. ~lonkkootn II, et al (2012) Key implirntion ofCD277 /lmt)"Tophilin-3 (llTi'\3A) in cellular otr<M seruing br a major human ·rS T-ctll suu;ct. U!ood t 20: 2269-2219.

33. Pab kodeti A, Sandstrom ;\, Sund1rcsan I, H;~~ ly C, Nedelloc S, <tal (2012) 'l11c mo!ccular b.1sis for moc.lulation of hum .. "ln V-f-)VS2 T cell n:.SJXUl.S('.S br C:D277/ but)TOphilin-3 (IITi'/3A)-sp<"CifK antibod its.J Bioi C:hcnt 287: 32780-32790.

34. \'3\-a>Wri S, Kumar A, Wan GS, Ram31tiancyulu GS, Ca\o llari ~ I, et at (20 13) llut')Tophilin 3:\ I UiaH.ls plto.sphol')iatcd a ntig~ns ami stimulates human y5 T cdb. Nat luununol H : 908 9 16.

35. ValrntOn)tC R, Hampe J, Hme K, Ko~mtid P, Albrecht ~I, ct ;:t.) (2005) S.·uco:dosi.s is a..'soci.:ued \,ith a tnu'K"ating splin~ site mutation in BTNL2. Nat Genet 37: 357- 36 1.

36. Sz)id !',Jagiello P, C sernok E, Gross 1\'L, Epp!en JT (2006) On tl1e 1\'egtner grnnulo rnato'is as...<ociatcd region on chromosome 6p21.3. ll~IC mrdical genC' tics 7: 21.

37. Spagnolo P, Sato H, Gnmcrs JC, Ronzoni F.A, Marshall SE, ct al (2007) t\nal)'lis of llTNL2 genetic polpnorphism.s in British and Dutch p:Hit'lllS \\ith ':u co:dosis. Ti.Muc Antigens 70: 2 19- 227.

Jun e 2014 I Volu me 10 I Issue 6 I e 1004417

Page 23: Salomonsen, Jan, Chattaway, John A, Chan, Andrew C Y

38. ~ato H, ~pagno!o 1', Silnoira L, \\'clih Kl , dn llois R~l . <tal (2007) IJTNL2 :11!t"'e as...--.ociati<ms \\ith chronic beryllium disea..~ in IIL-\-DPR I'Giu69-ucgat.iw indi\iduah. T is.suc Amigcn.s 70; 480-!86.

39. Konno S, T ak..·lhashi D, Hiuwa N, Hauori T, Tabhas.hi A, ct al (2009) G enetic imp.,c t of a bttt)Tophi1in-likc 2 (UTNL1j gene \OHiation on ~fie JgF. r~pomiHJtes5 to Dcnn:uophago:des farina(' (Dcr Q iu Jap.lncst•. Alkrgol lnt 58: 29 35.

40. Viken ~ I K , IJ!omhofl· A, OI.son ~ 1 , Ak!elien HE, Pociot F, et ol (2009) Reproducible .wociation ' ' ith l}lJC I diabetes in the extended class I region of the m ajor hinocomp:uibility complex. Genes lnunun 10: 323- 333.

41. Pothan S, Gowd r RE, Cooner R, lkcklr JB, Hancock 14 ct al (2009) Confirm<ltion of the non.·l a&ocia tion at the UTNL.2 locus \\ith ulccr.uin: colitis. T i..'S"ue ,\ntigens 7-J.: 322-329.

42. Hsueh K-C, Lin Y:J, Chang J ·S, \\'an I, T>ai F:J (20 10) BTi\'1.2 gene polrmorphh m.s may be as...~iatcd \\ith .!illiccptibiJity to Kawasaki dis.e.uc ~md fon n.'ltio n of coronary artery lesio m in Taiwanr-::e child ren. P.llr J Pedi..-ur 169: 713-7 19.

-J.3. Uan Y, Yucj, H~m :\1, Uuj, I.iu L (:2010) Aualp .is o f the ;usociation betw een llTNL2 polpnorphkm and tulxrcutosU in Chine!e llan population. Infect Geuet f.l·oiiO: 517- 521.

.J t YanL'lda Y, N'i.5hid.l T, Ichihar.t S, Sawotbe ~ I, ruku N, e t al (201 I) A_~;...~iation

of a polpnorphi!m of UTN2AI \dtlt myoc.1r<lia.l infa rction in E.ast 1\si..ln popubtiom. AtherosdcroJis 2 15: H S- 152.

45. Orozco G, Barton A, Erre S, IJing B, 1\'onh ington j , et ol (20 11) HLA·IJI'Ill­COI.I It\2 and three additional x~IIIC loci are independentlr aW>Ciated \\ith RAin a UK cohort. G cne5 Immun 12: 169-175.

46. Horibc H, Kato K, Ogu ri ~I, Yo;hi,b T , F•tiizuaki T , et al (20 11) t\.'!ociation of a polpnorphism of UTi\2AI \\ith hypertension in jap..1n~ indi\idu<1ls. t\rnj l!)l>ertens 24: 92·1-!)29.

47. Ymhi<b T , Kato K, Horibc H, Oguri M, Fukuda~[ , et al (201 1) A>..<o<:iation of a gell(•tic , ·ariant of llTN2AI \\ith d1ro n ic kidney di:ca..~ iu J apanese indi1iduals. Nephro!ogy 16: 642-{;48.

48. H ir.:unatm ~~~ O guri :\1, Ka to K , Yruh ida T , Fujimaki T , et al (:201 1) Jli..cociation o f a potpnorphhm of BT~2A I with rype 2 diabetes rnelliru.s in ja[J-1n<se indi1id1mh. Diobct ~ led 28: 1381- 1387.

49. Ogtzri ~ I, Kato K, Yoshida T, rttiimaki T , lloribe II, et al (2011) As..<ociation o f a genetic.· vari.3ut of RTK2AI with n1etaOOiic spu.lromc in l-~t Asi.1n popubtions. J ~led Genet 48: 78 7-792.

50. Heid HI\', Wimer S, Bruder G, Keenan T\\', jarasch EIJ (1983) But:)Tophilin, an Olpic01l pla5"ma membrane·associi'ltC'd glrcoprotein characteristic oflan ating manunary gl.1mb of din•ne !!] J«'ie-s. niochim Rio phys Acta 728: 228-238.

5 1. Stannners i\J, Rmwn L, Rhode-s D, T rowsdale J, Heck S (2000) BTL-II: a polpnorphic focus ''ith homo!ogy to the lmtrrophiliu g('nc familr, located at the border of the m;tior hist.ocOm[J-'tibilit:)· comp!ex class II aud cl:m liT regions in human and mo use. ImmunO"£t" l1etiC':S 5 1: 373- 382.

52. Rhodes DA, St1mmcrs :\1, ~lalcherek C , Beck S, T rowroale J (~00 1 ) Tlze duster of nTN gene.s in the extended m ajo r hUtocompatiUility complex. Genomic$ 71: 351-362.

53. lloydcn L\1, L<wisj~I. IJarbcc SIJ, Bas A, Cicardi ~1, et al (2008) Skintl , the protot)"pe of a newly idemitied immunoglobulin superf.-.mily gene duster, po.sirl\·elr selects tpiden nal g-.unmadelt."'1 T cells. Nat Gcuct -tO: G5G--662.

5-t. U:trlxe SD, \\'ooclwilrd ~lj, Turchinmich G, :Ment ion .U, [ .e,\is ~I, e t a l (2011) Skim- I is a llighlr .5ped fic, uniq ue selecting compo nent for c p:dennal T cells. Proc i\'atl Aca d Sci U S t\ I 08: 3330-3335.

55. Turchiuod ch G, Hayd -.y AC (2011) Skint- 1 idcntil1e-s a common rno~('('lllilr

mechanism for the den ... lopmC'nt ofinterferon··r-~c-ret ing ,·enm interleukin-17· sec-reting ·r8 T cells. lmmuni t)' 35: 59-68.

56. llri!es \\'F., ~IcGibbon 1\'H, [min ~IR (1950) On •mtltip!c alleles eficeting cdlulJ r antigcm in the chicken. G ene tics 35: 633 652.

57. Schicnnan LW, Nordskog A\1' (196 1) Relatiomhip of blood type to histocomp.-.tib ility in chickem. Scienre 13-t: 1008--1009.

58. Vilhelmm-.1 ~ 1, ~ l iggiano \'C, PinkJR, llal.1 K, llanrnanOI<iJ (1977) t\nal)~is of the alloimmune propertie-s of a recombinant g~nOt)pc in the ruajor h~tocomp.11ibilit:)· complex of the chicken. F.m J lnununol 7: 6H-6i9.

59. Kaufiuanj, Skjodt K, Salomomen j (199 1) The ll·G mulrigcne f.1milr of the chicken major hi.nocompa tib ility comp!ex. C:rit Re,· lmnnmo! II : 11 3- 1-1-3.

60. Salo momen J, Skjodt K, Crone :\f, Simo nsen ~ I ( 1987) T he ch ickeu el)1hr0<)1e-~pccific MHC antig~n. Characterizatio n and puriiication of the U-G antigen by monoclonal a ntibodic-s. Immunogenet i~ 25: 373- 382.

6 1. Goto R, ~lip1da CG, Young S, Wallace Rll, Abplanalp H, et al (1988) Iso lation of a eDNA clo ne frotn tht' ll-C subreg io n of the chicken h i.s toco mp.1.tibility (ll) com p!ex. Jmm uuogcnctks 2 7: 102-109.

62. ~ Iillcr ~ 1~ 1 , Abplanalp H, Goto R (1988) Genotyping chickens for the ll·G 5ubregion of the major histocompatibility complex using re-striction fragment leng th po lp no rphis.ms. Imm unogenetics 28: 37+-379.

63. ~ filler ~1~ 1 , Coto R, YoungS, [juj, H:udrJ (1990) Antigens similac to major histocomp~tibility comp!cx U-G are c"-pr~({"(l in the intestin :1.l epitheli11m in the chicktn. lnHnunogenetio 32: 45- 50.

64. K"ufrnau J, Salomo u!<uj, Skjodt K, 'llzocpc IJ (1990) Size polpnorphism of chicken majo r htitocompatibility complex--encoded fi-G molc.-cu!es i.s. d u(" to length \';lriatio n in the q1oplasmic hept~d repeat regio n. Proc i'\atl Arnd Sci U S A 87: 8277- 8281.

65. S:tlomomenj, Dunon D, Skjodt K, ll10rpc D , Vaiuio 0 , ct al ( 1991) Chicken m:.jor hiHOCOinpatibility comp!cx-encodcd ll-G antigens arc found o n many

PLOS Genetics I www.plosgenetics.org 20

The Dynamically Evolving BG Family

cell type:1 th:tt a re impon;1nt for the immune- ~ystem. Proc Natl Acad Sci US A 88: 1359- 1363.

66. Salomonsenj, Erimon II, Skjodt K, l.undgce.on T , Simonsen ~I , et al (1991) The uadjm-ant effect" of the polym ollJh ic ll-G antigens o f the chicken major histocompatibili ty complex a n11lp .ed uring purified mo!ecules incorpor.tted in liposomc.s. Eur J l1111nuuol 2 1: 6-1-9- 658.

67. ~filler :\ 1~1 ( 1991) ~n.e ~Jajor H istocomp.:ttibility Complex of the Chicke-n. Ph)1ogcucs.is o fluuuunc Functions.lloca R..·n o n FL: CRC: Press. pp. 15 1 169.

68. IJohring C, Riegcn 1', Salomonsen J , Skjodt K, K•ufimn J (1993) The extr.lcellular Jg V-l ike regions of the po lpnorphic fi·G antigt"ll5 o f the chicken ~ lhc bck s tructural fcaturC'5 cxpectcd for antibody \-ari..1b!c regioru. A\ian lmmuno!ogr in Progn:.ss: lmtintt l\';ttional de Ia Rc-c hnche Agronomiquc. pp. 145-152.

69. Elledrr D, Stepancts V, ~Ieider IJC, Scnigl 1', Ceryk J, ct al (2005) The rece-ptor for the .!iubgroup C ;t\ian ~arcoma and leukoill \ill.lles, Tvc, is rebtt."d to mamm11lian bu()Tophilim, member3i of the immunoglobulin 5uperf.1mily. j Virol 79: 10408- 10419.

70. fiiklc DD, ~ fumou S , Komm "C".s L (1996) Z ippe r p rotciu , a B·G protein with the aUili[y to regulate actin / rnymin I imernctiom in thC' inte.stinal brush border. J Diol Chern 271: 9075-9083.

71. Goto R~f, Wang Y, Ta)ior Rl, \\'akeneiii'S, llosomichi K, et al (2009) BC I l•as a 111ajor ro!c in :\IH C-Iinkcd r C'.s i.stancc to rnalign:un lp npho m3 in the chicken. l'roc Natl Ac.1d Sci US A 106: 16740- 16745.

72. Guillclllot F, llill:mlt A , PourquiC 0 1 llChar G, Chaw....~ J\..\1 , ct al (1988) A mo!erul11r m:1p of the chicken major histocomp.1ribility co rnp!e:o:: the class II beta gene-s arc do.:elr linked to tl•e d as.s I genes and the nucleol:tr orgomizer. ' llze E~lllO journal 7: 2i75- 2785.

73. Kaufiuan J, ~filnc S, Good TW, Walker UA, J aeob JP, ct al (1999) The chicken D locm is a minimal H.Sential majo r histocomp:nibility comp~ex. i\'aturc 40 I: 923--925.

H. :\Iiiier ~~~I, G o to R, Beruo t A, Zoorob R, AuHi-ay C, Uumste.:td N, Urilcs \ \'E ( 199~) . Two ~I he cl:w I •nd '"·o ~ I he class ll genes map to the chicken Rfp-Y ~ystcm ouuide the U comp~ex. Proc Natl Acad Sci USA 91: 4397- ·HOI.

75. Rubr T, Bed'Hom B, \\'ittzcll H, ~ Iorin V, Oudin A, et ol (2005) Characterisltion of" cluster of T RIM·Il30.2 gene's in tht chicken MHC B locu s. InumnuY,s~netics 57: IIG-128.

76. 1\'allnr H:J, A1ila D, Hunt I.G, Powell TJ, Riegert 1', t t al (2006) Peptide 1no tif.s of tlte single dominantly expr~d cl.u.s I mo!ccule explain the ~triking ~IHC--derenniuc-d re5porue to Rom .5.arcoma \i rus in chicktn!.. Proc N'ad Ac-ad Sci US A 103: 143+-1439.

77. Shaw I, Powell TJ, ~larston Dt\ , U:tker K, \lln Hateren A, et al (2007) Oi!Terent t\Ulutio n:uy hi.sto rie.s oftli C' two classical class I gt' IIC'.S uri a nd UF2 illustrate drift and selectio n \\ithin the stable ~IHC h3p!otypcs of chickens. J h rununol 178: 57+t -5752.

78. Kaufm11nj , S.-.Jo rnomenJ, Skjodt K (1989) U-G eDN A clo ne5 ha,·e multip!e ~mall repe.1ts and hybrid ize to both ch icken ~ IHC r~gioru. Immunogenetics 30: 440-45 1.

i 9. lnternatio n.'ll C h icken Geno me Sequencing Con .. tOrtium (200-t) Scqu~ncc and comp;u O\rive anal)'3is o f the ch icken geno me provide unique perspec-tin~ o n ,·erttbr.ue c\ulution. Na tm c -J- 32: 69.}- 716.

80. Miller ~IM, Coto R, t\bp!analp II (19M) t\ualpi> of the ll·G antigens of tire ch icken ~IHC by twu·d imcruional gel elcc trophorc.sis. Jnununogcnetics 20: 373-385.

81. Shiino T , llriles 1\'E, Goto R~l, Hosomichi K, Yanagip K, ct al (2007) F..xtended gene m~p ren •;1b trip:tnite motif, C ·t)pe le-ct·in, and Ig supc rf3.1nily t)l.>C gcucs withiu a wbrcgiou of the d 1ickcn ~IHC·U affecting infectiom dise.uc. J lmmunol 178:7162- 7172.

82. ~ Iiiier ~ 1~1 , Goto R, Young S, Chiriwlla J, Hawke D, ct al (1991) Immunoglobulin \-;ui."tb!e- rcgion-like tlom:.ins o f tliw rse ~e<1uence \\ithin the majo r h i.stoco•npatibilif)· complex of the ch icken. Proc ~atl Acad Sci US A 88: 4377-138 1.

83. Hosomichi K, ~Iiller ~ 1~1, Goto R~l , Wang Y, Suzuki S, et ol (2008) Contribution of m ut:ttion, recombination, and gene' corwers~on to ch icken ~IHC-ll hap'Ot)l>e diwNitr. J lunnuuol 181: 3393-3399.

8-l . Shiin;t. T , Ota :\[, Sl tim izu S, Kat5uyama Y, H a5himo to N 1 e t al (:2006) Rapid c\·o!ution of majorhistocom patibi lity comple'x class I gene.s in primate3 gt>nerates new disea..~ alleles in lnunans via h itchJ.ikiug diwrsitr. G enetics 173: 1555- 15 70.

8j, , .. w Oo.sterhout (2009) A new tl• eOI)' of ~IHC e\·o!ution: be roml selection o n the immune gcnes.Proc Uiol Sci 276: 657- 665.

86. Amado u C (1999) EYo!ution of the ~lhc class I regio n: the' frarne\\Urk hn><>thesis. Irnmuuogt."neti~ 49: 362- 367.

87. ~Iiiier ~IM, Goto R, ll ri les \\'F. (1988) lliochemical confinnation of recombination \\ithin the U-G 5ubre-gion of the chicken major h istoco mp .. -.t· ibility complex. ImmunogenetiC'S 27: 127-1 32.

88. \\'ihon ~U, Tork.u :\I, Jlaude A, :\ filne S,Jones T , et al (2000) Plasticity in the orpni7..atio n a nd !equcncC's- o fl tuuL.'ln KIR/Ier gene f3milie.s. Jlroc Natl Acad Sci u ~" 9 7: 4778-4783.

89. C atta\·ez F, Young 1\ .. T , G uethle in L.r\ , Rajaling<lm R, Khakoo S I, et al (2001) Cornp;~rison o f chimp:tn?.ee and lnunanleu kOC)"lC lg-like rt"'<'eptor g~uc-5 rc-\"cals framework and rnpidlr c\u hing gencs. J lmmunol 167: 5786-579-t.

90. Abi-Rached I') Parham P (2005) Natural selec tiou driw .s rccurreut fonnation o f acth -ating killer cell immunoglobulin-like receptor and l..y-J-9 from inhibitory homologues.J Rxp ~led 20 1: 1319-1332.

June 2014 I Volume 10 I Issue 6 I e1004417

Page 24: Salomonsen, Jan, Chattaway, John A, Chan, Andrew C Y

9 1. ~ lanL1lli T, l'rit><:h F., S.11nbrookj (1982) ~ lol<cul.1r Clouiug. A bboratory ~lanu.'l1. Co!d Spriug Harbor L,bor.\torr. Co!d Spring llarbor, N~w York.

92. S.1m~rook J , FriL«:h t::F, ~13ui;uis T (1989) ;\lo!e..ular C loning: Vol-3: A L1boratoty ;\13nual: ,\pp<ndL«s: CSH l:lboratory Pr<S>.

93. Rirg~rt P, Andc:rs<"n R, Bunut('ad 1\, Dohring C, Dominguf'z·St('glic-h M, n al (1996) ~lltc cllid~rn ~L"\ 2-microglobulin g<'ne U located on a no n·major his·tocomp.nibilit)' comp!e:t microchrommomc: a !'rn:t.JI, Cr+C-rich gene \\ith X ami Y I>OX<J in th< promoter. l'roc l\'atl ,\cad Sci US A 93: 1 243· 1 2~8.

~H. S.'llomomcnJ, ~ I anton n. A\ila U, Uuntstead N,Joh:msson u, Cl al (2003) Tilr propenie-5 of the 5inglc chicken ~ lfiC cla...~ical cla.s.s II alplta d tain {B-L\) gc:uc indkatc 311 ancittlt origin for rhc DR/ E·like i.Wt)l')(" of class II nto!ec-ult.s. Immunogenetics 55: 60.)-614.

95. l'<ck R, ;\ lunhr K K, Vainiu 0 (1982) F.'J' r<'>.lion of ll ·L (la· like) antig<m on macrophages fro 111 chickf n lpnphoid organs.J lnununol 12!): 4--5.

96. ~ l ast J , Godd«ru IJ~ I , p,.tt.,.-s K, Vand<samle F, lkrglnnan LR (1998) Charactcric..;nion of chicken ntOIIOC)1es, macrophagc-s and i111rnligiL1.ting cdls br chr monoclonal anciboc:lr KULJll. Vet luununol lrnmunop.uhol Gl: 343-357.

97. W<ining KC, Sclmltz U, ~ IOrutcr U, K:upers II, Sta<h<li I' (1996) llio!ogical prOfXrties of r«ambin:mt d 1ickcn interfrron-g;unm~L Eur J lrnmunol 26: 2H0-2H 7.

98. Ding M, Zhang ;\I, Wong j l, Rogers l\'E, lgnarro IJ , ct al (1998) Anti...,n;e knockd<J\\11 ofinducib!c nitric oxide ~pathase inhibiu induction of experimental autoimmune cncc-phalomrtlitU in SJUJ wicc.J luwwnol 160: 2j60-2564.

99. \\'u Z, Rothwell L, YouugJR, KaufmanJ, IJuttcr C , ct a\ (2010) Gwcrarion and char.tcteri7ation of chicken bone marrow-derh-cd dendritic cells. lm rnuno!ogy 129: 133-HS.

100. Goodman T , Lefmn~ois I. (1988) F.,..,r..sion of tl1c gamma-tld ta '!'-cell receptor on inte-stinal CUS+ imracpit1Jcli."'lll)1nphoc)1C'S. N'amre 333: 85:»- 838.

101. SalomomcuJ, Sorcnseu ~IR, ;\I anton UA, Rogers Sf, C:ollon T , <t al (200j) T wo CDJ gcnc.s m.1p 10 1hc chk-kC'n ~IHC, indic~uiug th:u CD 1 gt>ncs are ancicru and likdr to h:wc been prl".scut in 1hr primordial ~ IHC. Pnx i\"atl Acad Sci U S A 102: 8668 8673.

102. l.iuna TJ , Fronund U, Good RA (1972) t::ITccn of carlr rp:lophmph:unide trcatmcm on 1he de:vcloprncnt of l)1nphoid organs and inununo!ogical functions in the cltickcns. Im Arch Allcrgr Appl lnummol 42: 20- 39.

103. KaufinanJ, Andcrs<n R, Atila D, F.ngb<rgJ , l...1muril j , eta\ (1992) Di£rcrcm fC"atures of lhC' ~lfiC cl3.iS I hcterodirnrr ha,-c evoln~d at diHhent r.lte$.

Chkkcu ll~F ami bc1.1 2·microglobulin £Cquencts rC'\"tal invariant !\lrf.1.Cf' rffidu<S. J lnununol H8: 1532·1516.

104. Ro nqui51 F, Hudscnbcck Jl' (2003) ~lrD~r.,.. 3: lJJ)-oian phrlogeneric inference under mL"d modds. IJioinfonnJ tio 19: 1572 15H .

PLOS Genetics I www.plosgenetics.org 21

The Dynamically Evolving BG Family

IOj, Swofli:ml Dl. (2003) PAUl''. l,t)'logcn<Jic Anai)>U U1ing 1'.11-siniOII)' ('ancl Other :\lct.luxls). Vc~ion 4. Sinautr A'-<c:K"i.1tM, Sunderland, ~fas_Qchuce-tts.

106. Shimod.1irn H, H:u<g:~wa ~I (200 1) CO!\SEI" for as..«"S>i ng the confidence of phrlog~nctic tre-e ~kerion. llioinfonnatia 17: 1246- 124 7.

107. Hu!On 1)11 (1998) SplitsTr.c: analrzing and t'i,ualizing e\ulutional)' d.1Ll. Bioinfonnarics 1-1: 68- 73.

108. Htoon Dll, ll l)"nt lJ (2006) ,\pplicarion of l'h)logeneti<' l\'et"urk< in t::mluti<l n>l)' Sntdie.. ~lol lliol E\'0\ 23: 251 267.

109. flruon T , l'hillipe H, Dl)"lll lJ (2006) A •1uick and robrnt !L1ti.stical test to detect t11c pres.ence of rccomUination. Gene1ic.s 172: 266.)-268 1.

11 0. C:u1re-~;maj. (2000). Selection ofcomen"«< b!ocks from multip!e alignments for thei r us< in ph)'IO'~wctic :u!al)">is. ;\ lo!ec IJ:ol E\'01 17: 510 552.

111. Talawr.:t G, Castr~anaj . t2007). lrnprO\Ymem of phylogl'llie.s after rC"moving div'!'rgent and ambiguomlr align«! b!oc"-s from protein ~quC'nC( alignmenu. Sptru"" fli<ll56: 51H-577.

112. Zamani l\', Russell P, l:lntz H, HO<IJJ>II<r MP, ~l<ado\\-s JRS ct al (2013) UmUJ)f'ni.~ gt"nome-\\idc recognition of local relationship p:tttt"nu. ll~ IC 8'-'IIOIIIiCS J-1: 3-17.

11 3. Fclicnstcin J (2005) PHYLI I' (l,t )'logenr lnfer<nce Package) tYr:<iun 3.6. Uin rilmted hr the author. Dt p.1r1tnent of Genome Scit iiCC'11 Unh-tnit)' of \ \'ashington, s,e.,,tdc

I 14. ~ I an-nor :\10, HcnatarT , RatcliflC :\U (1993) Retro\ir.tl lr.:ttu_fonnatiou in \i tro of chicken T cells cxprCMing t·ither a!ph.1/bcta or KZHIIJn:tlddta T cell receptors br reticulocndothelirui< tims strain T. J l'.<p Mrd 177:617- 656.

115. ~ faccubbin lJ, Schiennan L (1986) ~IHC-reslrictC'd C)lotoxic n ·::JXU\SC f.lf chid:cn T cells: c-xprn.sion, augmentation, and donal cllaracttri7ation. J lnununol 136: 12- IG.

116. Walker D.-\, Hunt l.G, So"" AK, Skjodt K, Good TW, et al ·n,e dominantlr C~l>r<'!...ccd class I mo!C'CU!c of 1he chicken ~ IIIC is cxplai.ncc.l Lr roe'"hnion \dtl t 1he po1)1norphic JXptide trnmporter (ft\PJ gene-s. Proc Nail Acad Sci U S A 108: 8396 8101.

117. Korbei J O , Uru>n AE, Aflounitjl', G<xh,in II, Gmbert F, ct al (2007) l'.,ir<d· End :\ lapping Re,"'C'~lli E..xten~i re, StnKnu·al Varia1ion in the llurni1n Genomr. Science 318: ~20-426 .

118. l'erry GH, Yang r, ~larqun·llonct T, ~ lurphy C, ~itzg<'r"ld T, et al (2008) Copy number \-ariation aud c\"olution in humJ.n.s aud chimp.1nUC'5. Genome R<S 18: 1698 171.0.

119. Yang F, ~Hiller S, Jwt R, Ferguson-Smith ~1;\ , \\'icnberg J (1997) ComJ>.1r.lth-c chromoromc painting in 111J11U11:Us: hum.;tn and the lndiJ.n 1nuntiac l~funtiarus muuljJ.k \-agina1U). Ccnomks 39: 396-101.

June 20 14 I Volume 10 I Issue 6 I e100441 7