evidence of recent inter kingdom horizontal gene transfer between bacteria and candida is
TRANSCRIPT
BioMed CentralBMC Evolutionary Biology
ss
Open AcceResearch articleEvidence of recent interkingdom horizontal gene transfer between bacteria and Candida parapsilosisDavid A Fitzpatrick*, Mary E Logue and Geraldine ButlerAddress: School of Biomolecular and Biomedical Science, Conway Institute, University College, Dublin, Belfield, Dublin 4, Ireland
Email: David A Fitzpatrick* - [email protected]; Mary E Logue - [email protected]; Geraldine Butler - [email protected]
* Corresponding author
AbstractBackground: To date very few incidences of interdomain gene transfer into fungi have beenidentified. Here, we used the emerging genome sequences of Candida albicans WO-1, Candidatropicalis, Candida parapsilosis, Clavispora lusitaniae, Pichia guilliermondii, and Lodderomyces elongisporusto identify recent interdomain HGT events. We refer to these as CTG species because theytranslate the CTG codon as serine rather than leucine, and share a recent common ancestor.
Results: Phylogenetic and syntenic information infer that two C. parapsilosis genes originate frombacterial sources. One encodes a putative proline racemase (PR). Phylogenetic analysis also infersthat there were independent transfers of bacterial PR enzymes into members of thePezizomycotina, and protists. The second HGT gene in C. parapsilosis belongs to the phenazine F(PhzF) superfamily. Most CTG species also contain a fungal PhzF homolog. Our phylogeny suggeststhat the CTG homolog originated from an ancient HGT event, from a member of theproteobacteria. An analysis of synteny suggests that C. parapsilosis has lost the endogenous fungalform of PhzF, and subsequently reacquired it from a proteobacterial source. There is evidence thatSchizosaccharomyces pombe and Basidiomycotina also obtained a PhzF homolog through HGT.
Conclusion: Our search revealed two instances of well-supported HGT from bacteria into theCTG clade, both specific to C. parapsilosis. Therefore, while recent interkingdom gene transfer hastaken place in the CTG lineage, its occurrence is rare. However, our analysis will not detect ancientgene transfers, and we may have underestimated the global extent of HGT into CTG species.
BackgroundLateral or horizontal gene transfer (HGT) is defined as theexchange of genes between different strains or species [1].HGT introduces new genes into a recipient genome thatare either homologous to existing genes, or belong toentirely new sequence families. Large-scale genomicsequencing of prokaryotes has revealed that gene transferis an important evolutionary mechanism for these organ-isms [2,3]. HGT has been linked to the acquisition of drugresistance by benign bacteria [4], and also to the gain of
genes that confer the ability to catabolize certain aminoacids that are important virulence factors [5]. Howeverthere is much debate as to whether lateral gene transfer isan ubiquitous influence throughout prokaryotic genomeevolution [6]. Until recently, the process of gene transferhas been assumed to be of limited significance to eukary-otes [7]. The availability of diverse eukaryotic genomesequence data is dramatically changing our views on theimportant role gene transfer can play in eukaryotic evolu-tion.
Published: 24 June 2008
BMC Evolutionary Biology 2008, 8:181 doi:10.1186/1471-2148-8-181
Received: 17 January 2008Accepted: 24 June 2008
This article is available from: http://www.biomedcentral.com/1471-2148/8/181
© 2008 Fitzpatrick et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Page 1 of 15(page number not for citation purposes)
BMC Evolutionary Biology 2008, 8:181 http://www.biomedcentral.com/1471-2148/8/181
The rapid increase in fungal sequence data has promotedthis kingdom to the forefront of comparative genomics[8]. Whereas there is some documented evidence for HGTbetween fungal species [9-17] or from bacteria to fungi[18-28] [see additional file 1], overall very few incidenceshave been identified. There are two possible explanations:either gene transfer is indeed extremely rare amongstfungi, or it has not yet been thoroughly studied. Toaddress this question we investigated the frequency of suc-cessful recent interdomain HGT events between prokary-otes and yeast species belonging to the CTG clade. Wechose this course of action as we expect recent interdo-main HGT events to be more readily identified and sup-ported than more ancient transfers.
For the purposes of this study, we define CTG species asthe immediate relatives of C. albicans, including C. tropica-lis, C. parapsilosis, Clavispora lusitaniae, Pichia guilliermon-dii, and Lodderomyces elongisporus. These species have beencompletely sequenced, share a relatively recent commonancestor [29], and the codon CUG is translated as serinerather than leucine [30].
We used syntenic, phylogenetic and sequence based anal-yses to identify two cases of interdomain HGT betweenprokaryotes and C. parapsilosis, most likely involving theproteobacteria phylum. Our results suggest that extantCTG species do not readily take up exogenous DNA.
Results and discussionIdentification of horizontal gene transfer candidates through Blast database searchWe compared all available CTG gene sets against UniProtusing BlastP [31]. CTG genes with top database hits tobacterial species were identified as putative horizontallytransferred genes and the resultant Blast files wereinspected manually. A D. hansenii gene (protein ydhR pre-cursor) with a top database hit to a bacterial sequence wasnot considered for further analsyes as it has previouslybeen described [22]. After this process two genes from C.parapsilosis were considered for further analysis; oneencodes a putative proline racemase, and the secondencodes a member of the phenazine F superfamily.Related family members were identified by a secondround of database searching against GenBank to ensureall available genomic data was utilized.
Proline racemase phylogeny and characterizationThe C. parapsilosis gene (designated CPAG_02038) is mostsimilar to a proline racemase homolog from Burkholderiacenocepacia AU 1054 protein (66% pairwise identity; Fig-ure 1A). Amino acid racemases catalyze the interconver-sion of L- and D-amino acids by abstraction of the α-amino proton of the enzyme bound substrate [32].CPAG_02038 lies within a large contig and is also present
in a previously published genome survey of C. parapsilosis[33], suggesting its presence does not the result from con-tamination. We could not locate any related genes in anyother CTG genome (using BlastP or TBlastN). Familymembers are widely distributed throughout the prokaryo-tes however, and are also located within the Pezizomy-cotina.
We extracted 321 putative proline racemases from 207organisms, including members of the α, β, γ, and δ-pro-teobacteria, Actinobacteria, Fungi, Protozoa and Metazoa.Numerous species were found to have several familymembers [see additional file 2]; all were included forcomplete comparative purposes. A maximum likelihood(ML) phylogeny was reconstructed from an alignment ofall the PR proteins (Figure 2).
There are a large number of polytomies displayed in Fig-ure 2. These probably result from duplication of PR genesfollowed by diversifying selection, leading to a highdegree of sequence heterogeneity. For example, Agrobacte-rium tumefaciens str. C58 contains three PR homologs [seeadditional file 2], with an average amino acid pairwisepercentage identity of ~31%. Burkholderia cenocepacia AU1054 contains 2 proline racemase homologs [see addi-tional file 2], which are only 28% identical. To helpresolve the evolutionary history amongst PR homologs wereconstructed an additional ML phylogeny based on areduced dataset (Figure 3). We also reconstructed a Baye-sian phylogeny using the heterogeneous CAT site model.The CAT model can account for site-specific features ofsequence evolution and has been found to be more robustthan other methods against phylogenetic artifacts such aslong branch attraction [34]. The resultant Bayesian phyl-ogeny is highly congruent with the ML phylogeny (notshown).
The putative C. parapsilosis PR homolog lies in a stronglysupported (100% Bootstrap support (BP)) clade with Bur-kholderia species (Figures 2 &3 clade-A). Burkholderia are β-proteobacteria. However, no other β-proteobacteria, orindeed any other bacterial genus were found within clade-A (Figures 2 &3).
Although no PR homologs were identified in other CTGspecies, or indeed in any other of the Saccharomycotina,there are homologs in family members of the Pezizomy-cotina. A Pezizomycotina specific subclade is evident inour phylogeny containing Phaeosphaeria nodorum, Aspergil-lus niger and Gibberella zeae (Figures 2 &3 clade-B 100%BP). This subclade is found in a strongly supported cladewith members of the Actinobacteria (Figure 2 100% BP),containing Brevibacterium linens and an unclassifiedmarine actinobacterium and excluding Rubrobacter xylano-philus (Figure 2 87% BP). This suggests that these Pezizo-
Page 2 of 15(page number not for citation purposes)
BMC Evolutionary Biology 2008, 8:181 http://www.biomedcentral.com/1471-2148/8/181
mycotina species obtained their PR gene from theActinobacteridae subclass rather than the Rubrobacteri-dae subclass. This transfer event is another independentHGT event of a PR gene into fungi, and we hypothesize itoccurred early in the Pezizomycotina lineage, as it isshared by three distantly related species. Its patchyphyletic distribution suggests it has been subsequentlylost in other Pezizomycotina species.
There are also PR homologs in the Metazoans. These arefound in a eukaryote clade that also contains a number ofPezizomycotina representatives (Figures 2 &3 clade-C93% BP). Several scenarios can explain this phylogeneticpositioning. Firstly, the PR gene may have been present inthe last universal common ancestor of all eukaryotes buthas been differentially lost in all lineages except thoseleading to modern day Metazoa and Pezizomycotina.Alternatively, an ancient gene transfer from bacteria to thelast common ancestor (LCA) of Metazoa and Fungi couldhave occurred, with subsequent gene loss amongst differ-ent Metazoan and Fungal lineages. A third hypothesis is
that two independent gene transfers have occurred intothe Metazoan and Pezizomycotina lineages from unsam-pled bacterial donors. Finally, a transfer from unsampledbacteria into one of the eukaryote clades (either Metazoaor Pezizomycotina) may have occurred with subsequenttransfer from one eukaryotic group to the other.
A. niger, A. oryzae and G. zeae all contain multiple PRhomologs [see additional file 2]. One A. niger, one G. zeaeand the three A. oryzae PR homologs are nested in astrongly supported Pezizomycotina specific subclade (Fig-ures 2 &3 clade-D 100% BP). This subclade if foundwithin a larger predominately proteobacterial clade (Fig-ure 2 74% BP). This infers that there was an independentgene transfer event of a bacterial PR homolog into anancestral Pezizomycotina species.
The phylogenetic position of the C. parapsilosis PRhomolog (Figures 2 &3) resemble that described for theadenosine deaminase (ADA) gene in the Dekkera bruxel-lensis genome [21]. In that analysis, the authors suggest
A) An alignment of PR proteins from C. parapsilosis (CpPR CPAG_02038) and Burkholderia cenocepacia (BcPR) was generated with MUSCLEFigure 1A) An alignment of PR proteins from C. parapsilosis (CpPR CPAG_02038) and Burkholderia cenocepacia (BcPR) was generated with MUSCLE. These proteins are 66% identical B) An alignment of PhzF proteins from Candida parapsilosis (CpPhzF CPAG_03462) and Photorhabdus luminescens (PlPhzF), these are 61% identical.
(A) (B)
LSTEQMQTIANWTNLSETTFVFPATSDKALSSIQMQGIANWTNLSETTFILPAENPLA
YYVRIFTPQSELPFAGHPTIGTCHALLESYRVRIFTPGSELPFAGHPTIGTAHALLEA
LISAKDGVVVQECGAGLVKLTIVPG----LIQAREGRIVQECGAGLITLNVTERDEGQ
STSFELPDPVITPLSDTQINSLETDLGCALITFELPEPTITPLSSEQIDRLESILDCP
DKNLRPALVDVGARWIIARVADAKTVLSA DRALTPALIDVGARWIVAHTTGAEAVLAT
PAFPQLAKHNNELKATGVSIYGNWSD---PDYARLLEHDTQMNITGVCLYGAYHEGAE
HIEVRSFAPACGVDEDPVCGSGNGAVAAF DIEVRSFAPSCGVNEDPVCGSGNGSVAAF
RSHTN---EDKILKSSQGSVVNREGALKL RHHKVAMIDDKIVHSSQGKKLGRQGSVWL
ISNKKVLVGGDAVTCIEGKIRITHSDGKIFVGGSAVTCINGTITI-
MSSA-FKQVDVFTSKPFKGNPVAVIMDANMSLVPFKQVDVFTHRPFKGNPVAVVMDAQ
0 0
CpPhzFPlPhzF
89 90
CpPhzFPlPhzF
59 60
CpPhzFPlPhzF
29 30
CpPhzFPlPhzF
145 150
CpPhzFPlPhzF
115 120
CpPhzFPlPhzF
202 210
CpPhzFPlPhzF
175 180
CpPhzFPlPhzF
232 240
CpPhzFPlPhzF
259 270
CpPhzFPlPhzF
MNQDRLISTIETHTGGEPFRIVTSGLPRLKMKISRSLSTVEVHTGGEAFRIVTSGLPRLP
GDTIVAKRTWIKTHHDEIRKFLMYEPRGHAGDTIVQRRAWLKAHADEIRRALMFEPRGHA
DMYGGYLVDSVSDDADFGVIFLHNEGYSDHDMYGGYLTEPVSPNADFGVIFVHNEGYSDH
CGHGIIALASTAVKLGWVERTQPKTRVGIDCGHGVIALSTAAVELGWVQRTVPETRVGID
APCGFIEAFVKWDGEKVGNVRFVNVPSFMYAPCGFIEAFVQWDGEHAGPVRFVNVPSFIW
IKDATVDTPSFGEVIGDIAFGGAFYFYMNSRRDVSVDTPSFGTVTGDIAYGGAFYFYVDG
ASLDIPIGLSQVETLRRLGNEVKIAANKKYAPFDLPVRESAVEKLIRFGAEVKAAANATY
KVVHPEIAEINHVYGTIIDNISPDGGASQSPVVHPEIPEINHIYGTIIANAPRHAGSTQA
NVCIFADRQVDRSPTGSGTAGRAAQLFARGNCCVFADREVDRSPTGSGTGGRVAQLYQRG
ELQVGQVFTNESIVGSVFSAKVVKQVKYHGLLAAGDTLVNESIVGTVFKGRVLRETTVGD
FDAVIPEVEGNANVIGYANWIVDPDDEIGKFPAVIPEVEGSAHICGFANWIVDERDPLTY
GFLVREGFLVR-
0 0
CpPRBcPR
120120
CpPRBcPR
90 90
CpPRBcPR
60 60
CpPRBcPR
30 30
CpPRBcPR
210210
CpPRBcPR
180180
CpPRBcPR
150150
CpPRBcPR
270270
CpPRBcPR
240240
CpPRBcPR
300300
CpPRBcPR
330330
CpPRBcPR
Page 3 of 15(page number not for citation purposes)
BMC Evolutionary Biology 2008, 8:181 http://www.biomedcentral.com/1471-2148/8/181
that D. bruxellensis and Burkholderia species received theADA gene from a species not yet represented in the publicsequence databases. Our PR phylogeny suggests a similarevent may have occurred within clade-A, which contains
only C. parapsilosis and Burkholderia species (Figures 2 &3).Burkholderia species are known to have a genomic reper-toire that allows the transfer and receipt of exogenousDNA [35] and a number of studies have reported success-
Proline racemase maximum likelihood phylogenyFigure 2Proline racemase maximum likelihood phylogeny. The optimum model of protein substitution was found to be WAG+G. The number of gamma rate categories was 4 (alpha = 1.163). Bootstrap resampling (100 iterations) was undertaken and are displayed. For display purposes branches with less than 50% support were collapsed. Letters (A-E) in parentheses are used to distinguish clades and are discussed in the text. Branches are colored according to their taxonomy. Fungal branches and species names are colored green.
Burkholderia sp 383
Erwinia carotovora SCRI1043
Pseudomonas chlororaphis
Pseudomonas aeruginosa C3719
Ferroplasma acidarmanus fer1
Bacillus cereus W
Bacillus cereus E
33L
Bac
illus
ant
hrac
is s
tr A
2012
Bac
illus
ant
hrac
is s
tr A
mes
Bac
illus
cer
eus
AH
820
Bac
illus
thur
ingi
ensi
s st
r A
l Hak
amB
acill
us th
urin
gien
sis
str
9727
Bacillus cereus A
H187
Bacillus cereus G
9241
Bacillus cereus A
TC
C 10987
Bacillus cereus A
H1134
Bacillus cereus B
4264B
acillus cereus AT
CC
14579B
acillus cereus G9842
Bacillus w
eihenstephanensis KB
AB
4
Bacillus sp B
14905
Lysinibacillus sphaericus C3-41
Trypanosoma cruzi strain CL Brener
Trypanosoma cruzi strain C
L Brener
Clostridium
botulinum A str. ATC
C 3502
Clostridium
botulinum Bf
Clostridium
difficile 630
Clostridium
sp OhILA
s
Alkaliphilus m
etalliredigenes QY
MF
Clostridium
sticklandii
Dorea longicatena D
SM 13814
Clostridium
scindens ATCC
35704
Rhodobacter sphaeroides ATCC 17029
Rhodobacter sphaeroides 241
Rubrobacter xylanophilus DSM 9941
Brevibacterium linens BL2
Saccharopolyspora erythraea NRRL 2338
Streptomyces coelicolor A3(2)
Streptomyces averm
itilis MA4680
Clavibacter michiganensis strain NCPPB 382
Clavibacter michiganensis
Renibacterium salm
oninarum ATCC 33209
Brevibacterium linens BL2
Haloarcula marismortui ATCC 43049
Methylobacterium nodulans ORS 2060
Methylobacterium sp 446
Bur
khol
deria
am
bifa
ria M
C40
6B
urkh
olde
ria m
ultiv
oran
s A
TC
C 1
7616
Mar
icau
lis m
aris
MC
S10
Rho
dosp
irillu
m r
ubru
m A
TC
C 1
1170
Bur
khol
deria
xen
ovor
ans
LB40
0
Bor
dete
lla p
etrii
DS
M 1
2804
Mes
orhi
zobi
um lo
ti M
AFF
3030
99
Bruc
ella
mel
itens
is 1
6M
Bruc
ella
abo
rtus
biov
ar 1
str
9941
Bruc
ella
sui
s 13
30
Bru
cella
sui
s A
TCC
2344
5
Och
roba
ctru
m a
nthr
opi A
TCC
4918
8
Aura
ntim
onas
sp
SI85
9A1
Hoe
flea
phot
otro
phic
a D
FL43
Oce
anib
ulbu
s in
dolife
x HEL4
5
Oce
anico
la g
ranu
losu
s HTC
C2516
Fulvi
mar
ina
pela
gi H
TCC25
06
Sinor
hizo
bium
med
icae
WSM
419
Sinorh
izobiu
m m
elilot
i 102
1
Parac
occu
s de
nitri
fican
s PD12
22
Mic
rosc
illa
mar
ina
AT
CC
231
34
Alg
orip
hagu
s sp
PR
1
Rob
igin
itale
a bi
form
ata
HT
CC
2501
Fla
voba
cter
iale
s ba
cter
ium
HT
CC
2170
Kor
dia
algi
cida
OT
1
Psy
chro
flexu
s to
rqui
s A
TCC
700
755
Flav
obac
teria
les
bact
eriu
m A
LC-1
Leeu
wen
hoek
iella
bla
nden
sis
ME
D21
7
Dok
doni
a do
ngha
ensi
s M
ED
134
Met
hylo
bact
eriu
m s
p 44
6
Par
acoc
cus
deni
trifi
cans
PD
1222
Agr
obac
teriu
m tu
mef
acie
ns s
tr C
58R
hizo
bium
legu
min
osar
um W
SM
2304
Rhi
zobi
um e
tli C
FN
42
Rhi
zobi
um le
gum
inos
arum
WS
M13
25
Rhi
zobi
um le
gum
inos
arum
bv
vici
ae 3
841
Mar
inom
onas
sp M
ED121
Hahell
a ch
ejuen
sis K
CTC 239
6
Vibrio
para
haem
olytic
us R
IMD 2
2106
33
Vibrio para
haemolyticu
s AQ3810
Vibrio sp
Ex2
5
Vibrio alginolyt
icus 1
2G01
Shewanella lo
ihica PV4
Shewanella amazonensis
SB2B
Shewanella pealeana ATCC700345
Shewanella halifaxensis HAW-EB4
Shewanella woodyi ATCC 51908
Shewanella sediminis HAWEB3
Shewanella benthica KT99Colw
ellia psy
chreryt
hraea 34H
Pseudoalte
romonas tunica
ta D2
Alteromonadales b
acteriu
m TW7
Bacillus cereus A
TC
C 10987
Bacillus cereus A
H187
Bacillus cereus A
H820
Bacillus anthracis str A
mes
Bacillus thuringiensis str 9727
Bacillus cereus N
VH
0597-99
Bacillus thuringiensis str A
l Hakam
Bacillus cereus E
33L
Bacillus cereus G
9241
Bacillus cereus A
TCC
14579
Bacillus cereus B
4264
Bacillus cereus A
H1134
Bacillus cereus G
9842
Bacillus w
eihenstephanensis KB
AB
4
Paracoccus denitrificans PD1222
Brevibacterium linens BL2
Rhodococcus sp RHA1
Rhodococcus sp RHA1
Xanthomonas axonopodis pv citri str 3
06
Aspergillus oryzae
Aspergillus oryzae
Roseobacter sp MED193
Roseobacter sp SK20926
Rhizobium leguminosarum WSM2304
Solibacter usitatus Ellin6076
Roseiflexus castenholzii DSM13941
Mesorhizobium loti MAFF303099Rhodobacterales bacterium HTCC2150
Brucella suis 1330Brucella canis ATCC 23365
Brucella melitensis 16M
Brucella ovis ATCC25840
Ochrobactrum anthropi ATCC49188
Paracoccus denitrificans PD1222Rhizobium leguminosarum WSM2304
alpha proteobacterium BAL199
Pseudomonas aeruginosa PAO1
Pseudomonas aeruginosa C3719
Pseudomonas aeruginosa 2192
Pseudomonas aeruginosa UCBPPPA14
Pseudomonas aeruginosa PA7Silicibacter pomeroyi DSS3
Jannaschia sp CCS1
Psychromonas ingrahamii 37
Rhodobacterales bacterium HTCC2255
Roseovarius sp. TM1035
Roseovarius sp 217Roseobacter sp Azw3B
Sinorhizobium medicae WSM419
Sinorhizobium meliloti 1021
Agrobacterium tumefaciens str C58
Oceanibulbus indolifex HEL45
Oceanicola granulosus HTCC2516
Roseovarius nubinhibens ISM
Stappia aggregata IAM 12614
Silicibacter sp TM1040
Phaeobacter gallaeciensis 2.10
Phaeobacter gallaeciensis BS107
Pseudomonas putida W619
Agrobacterium tumefaciens str C58
Rhizobium etli CFN 42Rhizobium leguminosarum WSM2304
Rhizobium leguminosarum WSM1325
Rhizobium leguminosarum bv viciae 3841
Mesorhizobium sp BNC1
Rhizobium leguminosarum WSM2304
Burkholderia ambifaria MC406Burkholderia multivorans ATCC1761 6
Alkalilimnicola ehrlichei MLHE1
Reinekea sp MED297
Chromohalobacter salexigens DSM3043
Marinobacter algicola DG893
Marinobacter aquaeolei VT8
Marinobacter sp ELB17
Plesiocystis pacifica SIR-1
Verminephrobacter eiseniae EF012
Pseudomonas fluorescens Pf5
Serratia proteamaculans 568
Herpetosiphon aurantiacus ATCC23779
Photorhabdus luminescens subsp laum
ondii TTO1
Burkholderia pseudomallei DM
98
Psychrobacter cryohalolentis K5
Blastopirellula marina D
SM 3645
Gem
mata obscuriglobus U
QM
2246
Rhodopirellula baltica SH
1
Pseudom
onas aeruginosa C3719
Pseudom
onas phage D3
Pseudom
onas aeruginosa UC
BP
PP
A14
Pseudom
onas aeruginosa PA
O1
Pseudom
onas aeruginosa PA
7
Acinetobacter baumannii ATC
C 17978
Acinetobacter baumannii
Chrom
ohalobacter salexigens DS
M 3043
Chrom
obacterium violaceum
AT
CC
12472
Pseudom
onas fluorescens PfO
1
Pseudom
onas putida F1
Pseudom
onas putida KT
2440
Pseudom
onas putida GB
1
Pseudom
onas entomophila L48
Burkholderia phym
atum S
TM815
Burkholderia xenovorans LB
400
Burkholderia phytofirm
ans PsJN
Myxococcus xanthus D
K 1622
Xanthom
onas axonopodis str 306
Xanthom
onas campestris str8510
Xanthom
onas oryzae MA
FF
311018
Xanthom
onas oryzae pv oryzae KA
CC
10331X
anthomonas oryzae B
LS256
Xanthom
onas campestris str A
TC
C33913
Bur
khol
deria
thai
land
ensi
s B
t4
Bur
khol
deria
thai
land
ensi
s T
XD
OH
Bur
khol
deria
thai
land
ensi
s E
264
Bur
khol
deria
mal
lei A
TC
C 2
3344
Bur
khol
deria
mal
lei G
B8
hors
e 4
Bur
khol
deria
pse
udom
alle
i 171
0b
Bur
khol
deria
pse
udom
alle
i K96
243
Bur
khol
deria
pse
udom
alle
i 14
Bur
khol
deria
pse
udom
alle
i DM
98B
urkh
olde
ria th
aila
nden
sis
MS
MB
43
Bur
khol
deria
okl
ahom
ensi
s E
O14
7
Bur
khol
deria
cen
ocep
acia
MC
03
Bur
khol
deria
cen
ocep
acia
AU
105
4B
urkh
olde
ria s
p 38
3
Bur
khol
deria
am
bifa
ria M
C40
6
Bur
khol
deria
cep
acia
AM
MD
Bur
khol
deria
vie
tnam
iens
is G
4
Bur
khol
deria
mul
tivor
ans
AT
CC
176
16
Bur
khol
deria
ubo
nens
is B
u
Bur
khol
deria
dol
osa
AU
O15
8
Bur
khol
deria
dol
osa
AU
O15
8
Phaeobacter gallaeciensis B
S107
Phaeobacter gallaeciensis 2.10
Roseobacter sp M
ED
193
Roseobacter sp S
K20926
Stappia aggregata IA
M 12614
Marinom
onas sp MW
YL1
Roseovarius sp 217
Roseovarius sp. T
M1035
Silicibacter pom
eroyi DS
S3
Sili
ciba
cter
sp
TM
1040
Bacillu
s ce
reus
W
Bacillu
s ce
reus
AH82
0
Bacillu
s ce
reus
Bacillu
s cer
eus E
33L
Bacillu
s an
thra
cis s
tr Am
es
Bacillu
s th
urin
gien
sis s
tr Al H
akam
Baci
llus
cere
us A
H18
7
Baci
llus
cere
us A
TCC
145
79
Baci
llus
cere
us A
TCC
109
87
Baci
llus
cere
us G
9842
Baci
llus
thur
ingi
ensi
s AT
CC
356
46
Bac
illus
cer
eus
G92
41
Bac
illus
cer
eus
H30
81.9
7
Bac
illus
cer
eus
AH
1134
Bac
illus
thur
ingi
ensi
s st
r 972
7
Bac
illus
cer
eus
NV
H05
97-9
9
Bac
illus
wei
hens
teph
anen
sis
KB
AB
4
Bac
illus
thur
ingi
ensi
s A
TCC
3564
6
Plesiocystis pacifica SIR-1
Microscilla marina ATCC 23134
Flavobacteriales bacterium HTCC2170
Colwellia psychrerythraea 34H
Shewanella amazonensis SB2B
Shewanella loihica PV4
Shewanella sediminis HAWEB3
Shewanella woodyi ATCC 51908
Shewanella pealeana ATCC 700345
Shewanella halifaxensis HAW-EB4Roseiflexus sp RS1
Chloroflexus aurantiacus J10
Hoeflea phototrophica DFL4243
Nematostella vectensisNematostella vectensisNematostella vectensis
Strongylocentrotus purpuratusStrongylocentrotus purpuratusStrongylocentrotus purpuratus
Tetraodon nigroviridisDanio frankei
Silurana
Monodelphis domestica
Canis familiaris
Bos grunniens x Bos taurus
Equus caballus
Mus musculus x Rattus norvegicus
Mus musculus
Mus musculusMus musculus
Homo sapiens
Macaca mulatta
Pan troglodytes
Gallus gallus
Ornithorhynchus anatinus
Aspergillus nidulans FGSC A4Aspergillus nigerAspergillus terreus NIH2624Aspergillus fumigatus Af293Aspergillus fumigatus A1163
Neosartorya fischeri NRRL 181
Methylobacterium nodulans ORS 2060
Gibberella zeae PH1Aspergillus oryzae
Aspergillus niger
(C)
(D)
(E)
Burkholderia phymatum STM815
Burkholderia xenovorans LB400
Burkholderia
phytofirm
ans PsJ
N
Burkholderia cenocepacia AU 1054
Burkholderia cenocepacia PC184
Burkholderia cenocepacia M
C03
Burkholderia ambifaria MC406
Burkholderia
cepacia
AMMD
Burkholderia sp 383
Burkholderia
multiv
orans ATCC 1761
Burkholderia
ubonensis
Rubro
bacte
r xyla
noph
ilus D
SM 9
941
mar
ine a
ctino
bacte
rium
PHSC20
C1
Brevib
acte
rium
linen
s BL2
100
97
64
99
75
77
99
92
71
99
Phaeosphaeria
nodorum SN15
Asperg
illus n
iger
Gibber
ella z
eae P
H1
Candida parapsilosis
(A)
(B)
Fungiα-proteobacteria
γ-proteobacteriaActinobacteria
β-proteobacteria
miscellaneous bacteria
TrypanosomaMetazoaFirmicutes
Page 4 of 15(page number not for citation purposes)
BMC Evolutionary Biology 2008, 8:181 http://www.biomedcentral.com/1471-2148/8/181
Page 5 of 15(page number not for citation purposes)
Reduced Proline racemase maximum likelihood phylogeny with active site alignmentFigure 3Reduced Proline racemase maximum likelihood phylogeny with active site alignment. Bootstrap resampling (100 iterations) was undertaken and percentages are displayed. Fungal branches are shown in green. An alignment around the active site is also displayed. Clade letters in parentheses correspond to those in Figure 2. The phylogeny is rooted around the Meta-zoan/Pezizomycotina specific clade (clade-C), all members of this clade have a threonine at the active site. C. parapsilosis and its phylogenetic neighbors have a threonine instead of a cysteine at the active site (clade-A). A. oryzae, A. niger and G. zeae all con-tain cysteine at the active site (clade-D). A. flavus, A. oryzae, A. niger, A. nidulans and G. zeae also have cysteine at the active site (clade-C).
Brucella suisBrucella melitensisParacoccus denitrificansAgrobacterium tumefaciens
Silicibacter
Aspergillus oryzae
Aspergillus oryzaeAspergillus oryzae
Aspergillus niger
Gibberella zeae
Psychromonas ingrahamii
Brucella suisBrucella melitensis
Paracoccus denitrificans
Agrobacterium tumefaciens
Burkholderia cenocepaciaBurkholderia cenocepaciaBurkholderia multivorans
Burkholderia ambifariaBurkholderia cepacia
Burkholderia phymatum
Burkholderia xenovoransBurkholderia phytofirmans
Burkholderia cenocepacia
Candida parapsilosis
Burkholderia ambifaria
BurkholderiaBurkholderia cepacia
Burkholderia cenocepaciaBurkholderia cenocepacia
Burkholderia xenovoransBurkholderia phytofirmansBurkholderia phymatum
Pan troglodytesHomo sapiens
Mus musculusRattus norvegicus
Aspergillus niger
Neosartorya fischeriAspergillus fumigatus
Aspergillus nidulans100
100
100
95
55
55
100
100
100
55
95
100
7590
70
100
65
100
100
95
100
95
100
100
100
100
100
100 90
95
100
100
95
100
100
60
95
55
65
100
90
100
Silicibacter
Aspergillus terreus
Aspergillus flavus
100
100 Aspergillus niger Gibberella zeae
Phaeosphaeria nodorum
100
Aspergillus flavus
Aspergillus flavus
marine actinobacteriumBrevibacterium linens
DRSPTGSGVTARIALDRSPTGSGVTARIALDRSPTGSCVTARLALDRSPTGSCVAARVALDRSPTGSCVTARMALDRSPTGSCVTARMAL
DRSPTGSCVTARMALDRSPTGSCVIARMAL
DRSPTGSGTAGRAAQDRSPTGSGTGGRVAQDRSPTGSGTGGRVAQDRSPTGSGTGGRVAQDRSPTGSGTGGRVAQDRSPTGSGTGGRVAQDRSPTGSGTGGRVAQ
DRSPTGSGTGGRVAQDRSPTGSGTGGRVAQDRSPTGSGTAGRVAQ DRSPCGSGTCARIATDRSPCGTGTSARVAADRSPCGSGTASRIAVDRSPCGSGTCARIATDRSPCGSGTSARLAI
DRSPTGTALSARMAVDRSPCGTGTSARMAQDRSPCGTGTSARMAQDRSPCGTGTSARMAQDRSPCGTGSSGFVACDRSPCGTGTSAKLACDRSPCGTGTSAKLAC
DRSPCGTGTSAKLACDRSPCGTGTSAKLACDRSPCGTGTSAKLACDRSPCGTGTSAKVACDRSPCGTGTSAKVACDRSPCGTGTSAKVAC
DRSPTGTGCSARMAV
DRSPTGTGCSARMAV
DRSPTGTGCSARMAVDRCPTGTSVSARMAI
DRSPTGTAVSARMALDRSPCGTGSSSRLAI
DRCPCGTGSCARMAV
DRSPTGSGVTARIALDRSPTGSGVTARIAL
DRCPCGTGSSARMAVDRCPCGTGSSARMAV
DRSPCGTGSSARMAVDRSPCGTGSSARMAVDRSPCGTGSSARMAV
DRSPTGTGCSARMAV
(A)
(C)
(D)
(B)
BMC Evolutionary Biology 2008, 8:181 http://www.biomedcentral.com/1471-2148/8/181
ful gene transfers into Burkholderia species [36,37]. It ispossible therefore that there have been other successfulgene transfers into this bacterial lineage.
The vast majority of amino acids found in living cells cor-respond to the L-stereoisomer [38]. However, D-aminoacids are long known to be found in the cell walls of Grampositive and negative bacteria, where they are essentialcomponents of peptidoglycan [39]. Apart from low levelsof D-amino acids derived from spontaneous racemizationas a result of aging [40], it was assumed that only L-aminoacid enantiomers were present in eukaryotes [41]. How-ever, recent studies have reported the presence of numer-ous D-amino acids in an array of organisms, includingmammals [42]. The first eukaryotic (proline) amino acidracemase has recently been described from the humanpathogen Trypanosoma cruzi [43]. A high degree ofsequence similarity was observed between the T. cruzi andbacterial homologs [43]. Our phylogeny infers that T.cruzi obtained its PR homolog through interdomain HGTfrom a member of the Firmicutes (subclass Clostridia), asit is grouped beside members of this group with a highdegree of support (Figure 2 clade-E 96% BP). We per-formed database searches [44], against other Protozoangenomes including Trypanosoma brucei, Trypanosoma con-golense and Trypanosoma annulata. We failed to locate ahomolog in all species except for T. vivax.
Previous analysis has shown that T. cruzi and T. vivax arenot each others closest phylogenetic neighbors, relative tothe other species sampled [45]. This suggests an ancestralTrypanosoma gained the PR gene and multiple losses indifferent Trypanosoma lineages has subsequentlyoccurred.
Gene order around PR homologsThe C. parapsilosis PR homolog lies close to an ortholog(CPAG_02041) of orf19.1135 from C. albicans (Figure 4).The gene order to the left of this ORF is conserved in allCTG species, the order to the right is conserved in mostCTG species apart from C. parapsilosis and L. elongisporus.C. parapsilosis and L. elongisporus are closely related [29],and an examination of synteny suggests that the PR gene(together with a second ORF, cpar5437) were insertedbetween CPAG_2041 and CPAG_2037 (Figure 4).cpar5437 encodes a neutral amino acid (AA) transporter.The presence of an AA transporter beside the PR homologis interesting. If the putative proline racemase has a role inamino acid metabolism, then the presence of the trans-porter may be the result of an adaptive translocation toenhance the activity of the PR gene. Unlike the PR ORF theAA transporter is fungal in origin. Most CTG species con-tain a single neutral AA transporter; however C. parapsilosisand D. hansenii have four.
We located tRNA genes for nearly all CTG species besidethe large conserved syntenic block (Figure 4). It has beenshown that tRNA genes are associated with genomicbreakpoints [46]. We hypothesize that a genomic rear-rangement has occurred at this site in the LCA of C. parap-silosis and L. elongisporus. We cannot determine if thebacterial PR homolog was inserted into the LCA of L.elongisporus/C. parapsilosis and subsequently lost in L.elongisporus, or gained by C. parapsilosis after speciation.
We also investigated the gene order around the Pezizomy-cotina PR homologs [see additional file 3]. Gene syntenyaround the PR homologs found in clade-D (Figures 2 &3)is not conserved (not shown). Interestingly however, bothA. niger and G. zeae in clade-D (Figures 2 &3) have genescontaining a FAD dependent oxidoreductase domain inclose proximity to their PR homologs (not shown).According to Pfam [47], FAD dependent oxidases includeD-amino acid oxidases, that catalyze the oxidation of neu-tral and basic D-amino acids into their correspondingketo acids. The presence of these oxidases may be anotherexample of an adaptive translocation to enhance the activ-ity of the PR gene in these Pezizomycotina species.
A. oryzae has three PR homologs (Figures 2 &3 clade-C).All of these have orthologs in its close relative A. flavus(Figure 3 clade-C), and synteny around these is conserved[see additional file 3 clade-D]. The remaining two speciesin clade-C are A. niger and G. zeae. There is no evidence ofconserved gene order within these species, or with A.oryzae or A. flavus. Gene order around the A. flavus and A.terreus PR homologs found in the Metazoan/Pezizomy-cotina clade (Figures 2 &3) is also conserved [see addi-tional file 3], as is the order between A. fumigatus and N.fishceri [see additional file 3]. We could not locate aminoacid transporters or FAD dependent oxidases beside anyof the PR homologs found in clades B or C (Figure 2).
Proline racemase codon usageIt has been shown that recently acquired genes often dis-play an atypical codon preference when compared toother genes in the genome [48,49]. However, the trans-ferred PR homologs have a codon usage consistent withthe rest of their genomes [see additional file 4]. We under-took an analysis of variation in synonymous codon usageon all PR genes shown in Figure 2. Homologs from relatedspecies cluster together [see additional file 5]. For exam-ple, the Actinobactria, the Firmicutes and the Burkholderiaspecies all inhabit unique areas in two dimensional corre-spondence analysis space [see additional file 5].
The majority of fungal and Metazoan PRs are clusteredtogether [see additional file 5]. The C. parapsilosis PRhomolog has a codon usage distinct from the other Pezi-zomycotina fungal PR homologs [see additional file 5],
Page 6 of 15(page number not for citation purposes)
BMC Evolutionary Biology 2008, 8:181 http://www.biomedcentral.com/1471-2148/8/181
which is unsurprising as C. parapsilosis belongs to the Sac-charomycotina subphylum. The C. parapsilosis homolog isalso separate from the Burkholderia (β-proteobacteria)genes with which it forms a closely related phylogeneticgroup (Figures 2 &3). This suggests that the gene may have
originated from a genome with no other close relativesamong the species analyzed here.
Proline racemase activityThe PR active site from Trypanosoma cruzi, Clostridium stick-landii, Agrobacterium tumefaciens, Brucella melitensis and
Gene order around C. parapsilosis proline racemase geneFigure 4Gene order around C. parapsilosis proline racemase gene. Species names and identifiers are shown in each box. Gene identifiers relate to annotations from the Broad Institute [66]. On the left hand side orthologous genes are stacked under one another in pillars. Relative positions of t-RNA genes are shown and may indicate a breakpoint. After the breakpoint, synteny is conserved between C. albicans, C. dubliniensis, C. tropicalis, D. hansenii and Cl. lusitaniae. Synteny between C. parapsilosis and L. elongisporus is conserved but differs to the other CTG species. C. parapsilosis has a proline racemase (PR CPAG_02038) and a neutral amino acid transporter (AA cpar5437) insertion in this region. cpar5437 is absent from the Broad gene list but present in our manual gene call.
C. albicansorf19.1136
C. parapsilosisCPAG_02042
L. elongisporusLELG_00344
CLUG_03490
P. guillermondiiPGUG_01868
C. tropicalisCTRG_03467
C. dubliniensis
D. hansenii
F11187g
C. albicansorf19.1139
C. parapsilosisCPAG_02044
L. elongisporusLELG_00346
CLUG_03488
P. guillermondiiPGUG_01866
C. tropicalisCTRG_03469
C. dubliniensis
D. hansenii
F11231g
C. albicansorf19.1137
C. parapsilosisCPAG_02043
L. elongisporusLELG_00345
CLUG_03489
P. guillermondiiPGUG_01867
C. tropicalisCTRG_03468
C. dubliniensis
D. hansenii
F11209g
tRNA
C. albicansorf19.1135
C. parapsilosisCPAG_02041
L. elongisporusLELG_00343
CLUG_03491
P. guillermondiiPGUG_01869
C. tropicalisCTRG_03466
C. dubliniensis
D. hansenii
F11165g tRNA
tRNA
tRNA
tRNA
AAPR
C. parapsilosis
CPAG_02037
C. parapsilosis
CPAG_02036
L. elongisporus
LELG_00342
L. elongisporus
LELG_00341
C. albicansorf15282
CLUG_03492
C. tropicalisCTRG_03464
C. dubliniensis
D. hanseniiF11121g
C. albicansorf15281
C. tropicalisCTRG_03463
C. dubliniensis
Cl. lusitaniae Cl. lusitaniaeCl. lusitaniaeCl. lusitaniae
Cl. lusitaniae
Cd11070 Cd11090
Cd11110
Cd11080Cd11060
Cd11100
C. parapsilosis
C. parapsilosis CPAG_02038
cpar5437
Page 7 of 15(page number not for citation purposes)
BMC Evolutionary Biology 2008, 8:181 http://www.biomedcentral.com/1471-2148/8/181
Pseudomonas aeruginosa all contain cysteine at amino acidposition 330 [43,50]. This amino acid is essential forenzymatic function, because substitution with serineabolishes activity [41]. However, PR homologs fromhuman, mouse, Rhizobium and Brucella contain a threo-nine instead of a cysteine at position 330 [41]. Weobserved that cysteine is found in the equivalent positionin many of the bacterial proteins. The Pezizomycotina PRgenes found in clade-B and clade-D contain a cysteine atthe active site (Figure 3). The PR homologs found in theMetazoan/Pezizomycotina clade (clade-B) have a threo-nine at position 330. Similarly, the C. parapsilosis PRhomolog, together with its relatives from Burkholderia allcontain a threonine (Figure 3). However, Burkholderia spe-cies have multiple PR homologs [see additional file 2]with a cysteine as the active site (not shown). It is not clearwhat effect the substitution has on enzyme activity. It hasbeen suggested that homologs containing threonine at theactive site are not true PRs [41], but may instead belong toa superfamily. We cannot detect any difference in the abil-ity of C. parapsilosis, the other CTG species or any of thePezizomycotina species to utilize D-proline as growthmedia (data not shown). We therefore cannot confidentlyinfer the function of the PR homologs in the fungi ana-lyzed here.
Phenazine F phylogeny and characterisationThe C. parapsilosis gene (designated CPAG_03462) is mostsimilar to a Photorhabdus luminescens phenazine F (PhzF)protein with 61% pairwise identity (Figure 1B). Phena-zines are biologically active compounds, all of which havea characteristic tricyclic ring system and have been shownto confer a selective growth advantage to organisms whichsecrete them, as they possess broad-spectrum antibioticactivity towards bacteria, fungi and higher eukaryotes[51]. In Pseudomonas, the best studied phenazine pro-ducer, PhzF is part of an operon required for the conver-sion of chorismic acid to phenazine-1-carboxylate (PCA)[52]. PhzF homologs were identified in most of the CTGspecies tested as well as several other fungal species. How-ever, we could not identify a PhzF homolog in the L. elong-isporus genome, even when multiple TBlastN and BlastNsearches were used.
PhzF homologs were extracted from GenBank for subse-quent phylogenetic analysis. In total 181 representativeprotein coding sequences distributed amongst 154 organ-isms were used. These taxa were distributed amongst α, β,γ and δ-proteobacteria, Actinobacteria, Fungi, Firmicutes awell as other bacterial groups.
We aligned all sequences and reconstructed a PhzF MLphylogeny (Figure 5). The C. parapsilosis PhzF homolog isfound in a clade with members of the β-proteobacteria(Burkholderia multiovorans, Burkholderia cepacia, Burkholde-
ria ambifaria), α-proteobacteria (Roseovarius) and the γ-proteobacteria (Azotobacter vinelandii, Acinetobacter bau-mannii, Shewanella baltica and Photorhabdus luminescens)(81% BP). In contrast, all other PhzF homologs from CTGspecies are in a completely separate clade (Figure 5). Theseform a sister group (63% BP) to PhzF homologs fromother Saccharomycotina species (C. glabrata, Saccharomy-ces cerevisiae, Kluyveromyces lactis and Vanderwaltozymapolyspora). All three clades are grouped together in a largerclade with high support (75% BP).
The sister group relationship between the PhzF homologsfrom the Ascomycota and the proteobacteria clade isintriguing (Figure 5), as it suggests that an ancestral Sac-charomycotina species gained the PhzF homolog from aproteobacteria. The bacterial PhzF gene has subsequentlybeen retained after multiple speciation events, but lost inC. parapsilosis. We hypothesize that C. parapsilosis hasrecently reacquired a bacterial PhzF homolog from a pro-teobacterial source, as it is grouped (81% BP) within aproteobacterial subclade. To test this hypothesis we recon-structed constrained trees that placed C. parapsilosistogether with the remaining Ascomycota species [seeadditional file 6 C-H]. The AU test of phylogenetic treeselection [53], showed that the original unconstrainedtree (groups C. parapsilosis with proteobacteria) receivesthe optimal likelihood tree score, and the differences inlikelihood scores when compared to the constrained trees[see additional file 6], are significant (P < 0.05). This isalso supported by spectral analysis [see additional file 7].
Our phylogeny shows that the Schizosaccharomyces pombePhzF homolog is found in a clade containing all CTGPhzF homologs (Figure 5 99% BP). Furthermore it isgrouped beside D. hansenii (66% BP). S. pombe is not amember of the Saccharomycotina, it belongs to theTaphrinomycotina subphylum. The genome sequences ofSchizosaccharomyces japonicus and Schizosaccharomyces oct-osporus have recently been completed [54]. We could notlocate a PhzF homolog in S. japonicus but did locate ahomolog in S. octosporus using a TBlastN search strategy.Phylogenetic analysis has shown that S. pombe and S. oct-osporus are more closely related to one another than to S.japonicus [55]. Therefore we hypothesize that the LCAancestor of S. pombe and S. octosporus gained the PhzF genefrom an ancestral D. hansenii-like species after speciationfrom S. japonicus. We reconstructed a constrained tree thatplaced S. pombe outside the Saccharomycotina clade [seeadditional file 6B]. The approximately unbiased test ofphylogenetic tree selection (AU test) [53], showed that thephylogenetic inferences of the unconstrained tree are sig-nificantly better (P < 0.05) than the constrained tree [seeadditional file 6]. This infers that S. pombe has obtained aPhzF homolog from a member of the CTG clade.
Page 8 of 15(page number not for citation purposes)
BMC Evolutionary Biology 2008, 8:181 http://www.biomedcentral.com/1471-2148/8/181
A small basidiomycete clade is evident amongst prokary-ote species (Figure 5). Both Ustilago maydis and Malasseziaglobosa belong to the Ustilaginomycotina subphylum.Therefore our phylogeny infers that an ancestral Ustilagi-nomycotina species gained a PhzF gene from an unknown
bacterial source, and both species have retained this afterspeciation.
A correspondence analysis of synonymous codon usagefor all PhzF homologs was also performed and is shown
PhzF maximum likelihood phylogenyFigure 5PhzF maximum likelihood phylogeny. The optimum model of protein substitution was found to be WAG+G. The number of gamma rate categories was 4 (alpha = 0.873). Bootstrap resampling (100 iterations) was undertaken and are dis-played. For display purposes branches with less than 50% support were collapsed. Branches are colored according to their tax-onomy. Fungal branches are shown in green. The S. pombe PhzF homolog is highlighted with a red rectangle.
Saccharopolyspora erythraea
Mycobacterium ulcerans Agy99
Salinispora tropica CNB440
Brevibacterium linens BL2
Corynebacterium glutamicum
Corynebacterium glutamicum
Congregibacter litoralis KT71
Methylibium
petroleiphilum PM
1
Zymom
onas mobilis ZM
4
Gluconacetobacter diazotrophicus PAL5
Kineococcus radiotolerans SRS30216
Arthrobacter sp FB24
Arthrobacter chlorophenolicus A6
Arthrobacter aurescens TC1
Acidovorax sp JS42
Acidovorax avenae A
AC
00
Nocardioides sp JS
614
Janibacter sp HTC
C2649
Pse
udom
onas
syr
inga
e st
r D
C30
00P
seud
omon
as s
yrin
gae
1448
A
Pse
udom
onas
syr
inga
e B
728aC
hromobacterium
violaceumP
seudomonas fluorescens P
f5
Bradyrhizobium
sp BTA
i1
Bradyrhizobium
sp OR
S278 P
seud
omon
as p
utid
a W
619
Pse
udom
onas
aer
ugin
osa
PA
O1
Pse
udom
onas
aer
ugin
osa
PA
7
Pse
udom
onas
aer
ugin
osa
PA
CS
2
Pse
udom
onas
aer
ugin
osa
C37
19
Pse
udom
onas
aer
ugin
osa
UC
BP
PP
A14
Frankia sp E
AN
1pec
Frankia alni A
CN
14aF
rankia sp CcI3
Malassezia globosa C
BS
7966
Ustilago m
aydis 521A
cidiphilium cryptum
JF5
Agrobacterium
tumefaciens str C
58B
urkholderia cenocepacia AU
1054D
elftia acidovorans SP
H1
Rho
dofe
rax
ferr
iredu
cens
T11
8
Pol
arom
onas
nap
htha
leni
vora
ns C
J2
Pol
arom
onas
sp
JS66
6
Del
ftia
acid
ovor
ans
SP
H1
Com
amon
as te
stos
tero
ni K
F1
Burkholderia phymatum STM815
Burkholderia xenovorans LB400
Burkholderia phytofirmans PsJN
Burkholderia oklahomensis C6786
Burkholderia oklahomensis EO
147
Burkholderia thailandensis MSM
B43
Burkholderia pseudomallei 91
Burkholderia pseudomallei B7210
Burkholderia pseudomallei 9
Burkholderia pseudom
allei 7894
Burkholderia pseudom
allei BC
C215
Burkholderia thailandensis E
264
Burkholderia thailandensis T
XD
OH
Burkholderia m
allei ATC
C 23344
Burkholderia pseudom
allei 668
Burkholderia pseudom
allei 305
Burkholderia pseudom
allei K96243
Burkholderia pseudom
allei 14
Burkholderia cenocepacia AU 1054
Burkholderia cenocepacia MC03
Burkholderia cenocepacia PC184
Burkholderia sp 383
Burkholderia ubonensisBurkholderia dolosa AUO158
Burkholderia multivorans ATCC1761
Burkholderia ambifaria MC406
Burkholderia cepacia AMM
D
Burkholderia vietnamiensis G4
Erwinia carotovora SCRI1043
Erwinia chrysanthemi str 3937
Sodalis glossinidius str morsitans
Yersinia frederiksenii ATCC 33641
Roseova
rius s
p 217
Candida parapsilosis
Azotobacter vinelandii AvOP
Burkholderia ambifaria MC406
Burkholderia cepacia AMMD
Burkholderia multivorans ATCC 17616
Photorhabdus luminescens TTO1
Shewanella baltica OS155
Acinetobacter baumannii
Acinetobacter baumannii ATCC 17978
Acinetobacter baumannii
Candida glabrata
Saccharomyces cerevisiae YJM789
Kluyveromyces lactis
Vanderwaltozyma polyspora
Clavispora lusitaniae
Debaryomyces hansenii CBS767
Schizosaccharomyces pombe 972hPichia guilliermondii ATCC 6260Pichia stipitis CBS 6054Candida tropicalisCandida dubliniensisCandida albicans SC5314
Candida albicans
Steno
troph
omon
as m
alto
philia
R55
13
Xanthomonas oryz
ae MAFF 311018
Xanthomonas oryz
ae KACC10331
Xanthomonas oryz
ae BLS256
Xanth
omon
as a
xono
podis
str 3
06
Xantho
monas
campe
stris
str 85
Xanth
omon
as c
ampe
stris
ATC
C3391
3
Xanth
omon
as ca
mpe
stris
pv ca
mpe
stris
Ste
notro
phom
onas
mal
toph
ilia
R55
13
Xan
thom
onas
cam
pest
ris A
TCC
3391
3
Xan
thom
onas
cam
pest
ris s
tr 85
Xant
hom
onas
axo
nopo
dis
str 3
06
Xant
hom
onas
ory
zaeB
LS25
6
Xant
hom
onas
ory
zae
KAC
C10
331
Xanth
omon
as o
ryza
e M
AFF31
1018
Paracoccus denitrificans P
D1222
Zym
omonas m
obilis ZM
4
Alkaliphilus m
etalliredigenes QY
MF
Clostridium
difficile QC
D32g58
Clostridium
difficile 630
Natronom
onas pharaonis DS
M 2160
Halobacterium
sp NR
C1
Cam
pylo
bact
er fe
tus
subs
p fe
tus
8240
Clo
strid
ium
diff
icile
QC
D-6
3q42
Clo
strid
ium
ace
tobu
tylic
um A
TC
C 8
24
Clo
strid
ium
bei
jerin
cki N
CIM
B 8
052
Des
ulfit
obac
teriu
m h
afni
ense
Y51
Clo
strid
ium
bol
teae
AT
CC
BA
A
Bac
illus
sp
B14
905
Clo
strid
ium
sp
OhI
LAs
Bac
illus
pum
ilus H
erpetosiphon aurantiacus
Cal
dice
llulo
siru
ptor
sac
char
olyt
icus
Bla
stop
irellu
la m
arin
a D
SM
364
5
Rho
dopi
rellu
la b
altic
a S
H1
Rho
dopi
rellu
la b
altic
a S
H1
Des
ulfu
rom
onas
ace
toxi
dans
DS
M 6
84
Cya
noth
ece
sp C
CY
0110
Syn
echo
cyst
is s
p P
CC
680
3
Ros
eoba
cter
sp
CC
S2
Rho
doba
cter
ales
bac
teriu
m H
TCC
2654
Salin
ispo
ra a
reni
cola
CN
S205
Salin
ispo
ra tr
opic
a C
NB
440
Reinekea sp MED297Bradyrhizobium japonicum USDA 110Hoeflea phototrophica DFL4343
Rhizobium leguminosarum WSM2304
Rhizobium leguminosarum 3841
Rhizobium leguminosarum WSM1325
Rhizobium etli CFN 42
Sinorhizobium meliloti 1021Sinorhizobium medicae WSM419
Rhodopseudomonas palustris BisB18
Rhodopseudomonas palustris CGA009
Rhodopseudomonas palustris TIE1
alpha proteobacterium BAL199
alpha proteobacterium BAL199
alpha proteobacterium BAL199
Haloba
cteriu
m sp
NRC1
Lyng
bya
sp P
CC 810
6
Trichodesm
ium erythra
eum IMS101
Chloroflexu
s aurantia
cus
Chrom
obac
teriu
m v
iola
ceum
Ralst
onia
eut
roph
a JM
P134
Burkh
olde
ria c
enoc
epac
ia M
C03
Jannaschia sp CCS1
Dinoroseobacter shibae
Sagittula stellata
Methylobacterium populi BJ001
Xanthobacter autotrophicus Py2
Stappia aggregata IAM 12614
Ochrobactrum anthropiBrucella suis 1330
Sinorhizobium meliloti 1021
Sinorhizobium medicae WSM419
Pseudomonas a
eruginosa P
AO1
Erwinia ca
rotovora SCRI1043
Burkholderia sp 383
Burkholderia sp 383
Fungiα-proteobacteria
γ-proteobacteriaActinobacteria
β-proteobacteria
miscellaneous bacteriaArchaea
Page 9 of 15(page number not for citation purposes)
BMC Evolutionary Biology 2008, 8:181 http://www.biomedcentral.com/1471-2148/8/181
in additional information [see additional file 8]. The S.pombe PhzF homolog has a codon usage pattern very sim-ilar to the D. hansenii protein.
Gene order around PhzFAnalysis of the genes adjacent to the PhzF homolog in C.parapsilosis shows that there is a high conservation of gene
synteny and supports our hypothesis that PhzF wasrecently acquired in this species (Figure 6). Homologs inthe other CTG species are located in completely differentregions of the genome relative to C. parapsilosis (notshown). For example, the C. albicans PhzF homolog islocated between orf19.5619 and orf19.5621, whereas theC. parapsilosis homolog is found between orf19.6689 &
Gene order around C. parapsilosis PhzF geneFigure 6Gene order around C. parapsilosis PhzF gene. Species names and gene identifiers are shown in each box. Orthologous genes are stacked under one another in pillars. The C. parapsilosis PhzF homolog (CPAG_03462) is highlighted with a red box. Synteny relative to the C. parapsilosis PhzF homolog is conserved in all species. Other CTG PhzF homologs are found in com-pletely different regions of the genome relative to C. parapsilosis. L. elongisporus is the only CTG species missing a PhzF gene, and there is no evidence for a pseudogene in the genome.
C. albicansorf19.6685
C. parapsilosisCPAG_03465
L. elongisporusLELG_05169
CLUG_01660
P. guillermondiiPGUG_05624
C. tropicalisCTRG_05149
C. dubliniensis
D. hanseniiC14586
C. albicansorf19.6687
C. parapsilosisCPAG_03463
L. elongisporusLELG_05171
CLUG_01662
P. guillermondiiPGUG_05626
C. tropicalisCTRG_05178
C. dubliniensis
D. hanseniiC14630g
C. albicansorf19.6686
C. parapsilosisCPAG_03464
L. elongisporusLELG_05170
CLUG_01661
P. guillermondiiPGUG_05625
C. tropicalisCTRG_05179
C. dubliniensis
D. hanseniiC14608
C. albicansorf19.6684
C. parapsilosisCPAG_03466
L. elongisporusLELG_05168
CLUG_01659
P. guillermondiiPGUG_05623
C. tropicalisCTRG_05150
C. dubliniensis
D. hanseniiC14564
C. albicansorf19.6689
C. parapsilosisCPAG_03460
L. elongisporusLELG_05173
Cl. lusitaniaeCLUG_01663
P. guillermondiiPGUG_05628
C. tropicalisCTRG_05176
C. dubliniensis
D. hanseniiC146740g
C. parapsilosisCPAG_03462
Cl. lusitaniaeCl. lusitaniaeCl. lusitaniaeCl. lusitaniae
Cd73180Cd73190Cd73200Cd73320 Cd73170
Page 10 of 15(page number not for citation purposes)
BMC Evolutionary Biology 2008, 8:181 http://www.biomedcentral.com/1471-2148/8/181
orf19.6687 relative to C. albicans SC5314 (Figure 6).However, the L. elongisporus genome contains no PhzFhomolog, either at a position equivalent to the C. parapsi-losis copy or elsewhere in the genome.
We propose that the LCA of L. elongisporus and C. parapsi-losis lost the PhzF gene present in the other CTG species,and a second (new) copy was subsequently gained by C.parapsilosis after speciation. We have partial sequence data(unpublished) from Candida orthopsilosis, a species soclosely related to C. parapsilosis that it was once designatedC. parapsilosis group II [56]. We located a C. orthopsilosis PRhomolog that is 83% identical (at the amino acid level) tothe C. parapsilosis copy. This implies that the commonancestor of C. parapsilosis and C. orthopsilosis acquired thebacterial PhzF homolog after speciation from L. elongis-porus.
Mechanisms of gene transfer into fungi are poorly under-stood. To date no DNA uptake mechanism has been iden-tified in CTG species. Interkingdom conjugation betweenbacteria and yeast has been observed however [57-59].Similarly, Saccharomyces cerevisiae has been shown to betransformant competent under certain conditions [60].CTG species are known interact with bacteria in vivo [61],and it is therefore possible that interkingdom conjugationand transformation may facilitate DNA transfer in C. par-apsilosis. These mechanisms may also be applicable to thePezizomycotina species examined in this analysis.
ConclusionWe investigated the frequency of recent interkingdomgene transfer between CTG and bacterial species. Welocated two strongly supported incidences of HGT, bothwithin the C. parapsilosis genome. We also located inde-pendent transfers into the Pezizomycotina, Basidiomy-cotina and Protozoan lineages.
We cannot determine the exact origin of the PR homolog(CPAG_02038) found in the C. parapsilosis genome. How-ever, based on its phylogenetic position it either origi-nated from a Burkholderia source, or more likely anorganism not yet represented in the sequence databases.Our PR phylogenetic analysis also suggests there were twoindependent transfers into Pezizomycotina species, onefrom an Actinobacterial source, and the second is from anunknown proteobacterial source. There is also evidencethat T. cruzi has obtained its PR homolog from a Firmi-cutes species. The transferred PR genes analyzed herebelong to a superfamily of proline racemases, althoughwe cannot determine their exact function in the fungalspecies examined. Their proximity to an amino acid trans-porter (in C. parapsilosis) and a FAD dependent oxidore-ductase (in A. niger and G. zeae) suggests they do have arole in amino acid metabolism. Furthermore, evidence of
multiple independent transfers into fungi suggests theprotein does confer a biological advantage, although wecannot determine what is. The bacteria-derived PR genehas the potential to be a novel antifungal drug target asthere would be no undesired host protein-drug interac-tions.
The bacterial PhzF homolog (CPAG_03462) found in C.parapsilosis most likely originated from a proteobacterialsource. Most CTG species examined contained PhzFhomologs, with the exception of L. elongisporus. The crys-tal structure the PhzF homolog in S. cerevisiae has beendetermined and while its function remains unknown, it isnot thought to be involved in phenazine production [62].We postulate that the PhzF homolog present in other CTGspecies was initially lost by the ancestor of C. parapsilosisand L. elongisporus, but subsequently regained by C. parap-silosis through HGT. The loss of eukaryote genes and sub-sequent reacquisition of a prokaryotic copy has previouslybeen described in yeast, and can confer specific metaboliccapabilities. An analysis of the biotin biosynthesis path-way discovered that the ancestor of Candida, Debaryomy-ces, Kluyveromyces and Saccharomyces lost the majority ofthe pathway after the divergence from the ancestor of Y.lipolytica. However, Saccharomyces species have rebuilt thebiotin pathway through gene duplication/neofunctional-ization after horizontal gene transfer from α and γ proteo-bacterial sources [20]. The acquisition of the URA1 gene(encoding dihydroorotate dehydrogenase) from Lactoba-cillus and replacement of the endogenous gene in S. cere-visiae, allowed growth under anaerobic conditions [19].Similarly, acquisition of BDS1 (alkyl-aryl-sulfatase) fromproteobacteria may have enabled the survival of S. cerevi-siae in a harsh soil environment [19]. Our PhzF phylogenysuggests that the PhzF homolog found in most CTG spe-cies originated from an ancient HGT event, from a mem-ber of the proteobacteria. Our analysis also shows that S.pombe has obtained a PhzF homolog from a CTG species,most likely one closely related to D. hansenii. There is alsophylogenetic evidence showing that an ancestral Ustilagi-nomycotina species gained a PhzF gene from an unknownbacterial source. We cannot however, determine the bio-logical advantage to the organisms.
Although it was not the major goal of this study, we didlocate HGT from bacteria into fungal genomes outside theCTG clade, and also inter-fungal transfers. In a previousanalysis of HGT in diplomonads, fifteen genes were foundto have undergone HGT [18]. There is phylogenetic evi-dence that these genes have undergone independenttransfers into other eukaryotic lineages including Fungi.Therfore, in eukaryotes just as HGT has affected some spe-cies more than others [63], there may be groups of genesthat are more likely to be taken up through HGT than oth-ers. We cannot test this directly however, as we have not
Page 11 of 15(page number not for citation purposes)
BMC Evolutionary Biology 2008, 8:181 http://www.biomedcentral.com/1471-2148/8/181
identified all cases of HGT from bacteria to fungi outsidethe CTG clade.
Our analysis indicates that recent interkingdom genetransfer into extant CTG species is negligible. This sup-ports a previous hypothesis that genetic code alterationsblocks horizontal gene transfer [64]. It should be notedhowever that we searched for recent bacterial gene trans-fers into individual CTG species, and not for more ancienttransfers. We took this approach because the presence ofrecently gained bacterial genes in a eukaryote genomeshould be readily detected compared to older transfers.Similarly, we have not investigated eukaryote-to-eukary-ote transfers. It is therefore possible that we have underes-timated the overall rate of HGT into the CTG lineage. Thediscovery of HGT in other fungal lineages implies thatHGT plays an important role in fungal evolution anddeserves further analysis. In particular a strategy which candetect ancient gene transfers would be meaningful.
MethodsSequence dataThe complete C. albicans (SC5314) genome (Assembly19) was obtained from the Candida genome database[65]. The Broad institute have sequenced and annotatedfive CTG species (C. albicans (WO-1), C. tropicalis, L. elong-isporus, P. guilliermondii, and Cl. lusitaniae). Thesegenomes were obtained directly from the Broad Institute[66]. Gene sets for the C. dubliniensis were downloadedfrom GeneDB [44].
The incomplete C. parapsilosis geneome was downloadedfrom the Sanger Institute [67]. Gene annotations wereperformed using two separate approaches. The firstinvolved a reciprocal best BLAST [31] search with a cutoffE- value of 10-7 of Candida albicans SC5314 protein codinggenes against the unannotated C. parapsilosis genome. TopBLAST hits longer than 300 nucleotides were retained asputative open reading frames. The second approachinvolved a pipeline of analysis that combined several dif-ferent gene prediction programs, including ab initio pro-grams SNAP [68], Genezilla [69], and AUGUSTUS [69],with gene models from Exonerate [70] and Genewise [71]based on alignments of proteins and Expressed SequenceTags. Putative gene sets from both approaches wereimported into Artemis [72] and cross corroborated manu-ally. The resultant gene sets contained 5,823 protein-cod-ing genes. The C. parapsilosis genome was also annotatedby the Broad Institute, and where possible we have usedthe gene names they assigned.
The UniProt database (v11.1) was downloaded [73].Database searches against GenBank refer to release 164.0.
Blast based approach to detect potential horizontally transferred genesTaking one CTG species at a time, we located gene familiesof interest by comparing individual protein coding genesagainst the UniProt database (v11.1) using the BlastPalgorithm [31] with a cutoff expectation (E) value of 10-20.To use all available sequence data, CTG proteins with atop database hit to a bacterial protein in UniProt wereextracted for a second round of database searching againstGenBank (E value of 10-20). Proteins which also had a topdatabase hit to a bacterial protein in GenBank were con-sidered as possible incidences of horizontal gene transfer.All putative homologs were extracted from GenBank andsearched against the relevant CTG genome to ensure areciprocal best Blast hit. For completeness, CTG proteinsnot yet deposited in GenBank were added to gene familiesof interest where appropriate.
Accession numbers for all sequences used in this analysiscan be found in additional material [see additional file 2].
Phylogenetic methodsGene families were aligned using MUSCLE (v3.6) [74]using the default settings. Obvious alignment ambiguitieswere corrected manually.
Phylogenetic relationships were inferred using maximumlikelihood methods. Appropriate protein models of sub-stitution were selected for each gene family using Model-Generator [75]. One hundred bootstrap replicates werethen carried out with the appropriate protein model usingthe software program PHYML [76] and summarized usingthe majority-rule consensus method.
We performed the approximately unbiased test of phylo-genetic tree selection [53], to assess whether differences intopology between constrained and unconstrained genetrees are no greater than expected by chance.
Codon usage analysis and spectral analysisTo determine if the putative HGT genes had a differentcodon usage pattern to the host genome an analysis ofvariation in synonymous codon usage was undertakenusing the GCUA software [77]. Individual correspondenceanalyses of raw codon counts for the Candida parapsilosis,Ustilago maydis, Malassezia globosa, Aspergillus flavus,Aspergillus niger, Gibberella zeae, Aspergillus oryzae, Phae-osphaeria nodorum, and Schizosaccharomyces pombegenomes were performed, with the first four principal axesbeing used to evaluate synonymous codon usage patterns.Similar analyses were also carried out on members of theproline racemase and phenazine F gene families displayedin Figures 2 and 4. We used spectrum [78] to perform aspectral analysis on a subset of the phenazine data.
Page 12 of 15(page number not for citation purposes)
BMC Evolutionary Biology 2008, 8:181 http://www.biomedcentral.com/1471-2148/8/181
Authors' contributionsDAF, MEL and GB were involved in the design phase. MELpredicted genes in unannotated genomes. DAF sourcedhomologs, examined synteny and performed phyloge-netic analyses. DAF and GB drafted the manuscript. Allauthors read and approved the final manuscript.
Additional material
AcknowledgementsThe authors wish to acknowledge the Wellcome Trust Sanger Institute and Broad institute of MIT & Harvard for releasing data ahead of publication. We would like to acknowledge the financial support of the Irish Research Council for Science, Engineering and Technology (IRCSET), the Irish Health Research Board (HRB) and Science Foundation Ireland (SFI). We wish to acknowledge the SFI/HEA Irish Centre for High-End Computing (ICHEC) for the provision of computational facilities and support. We thank Mike Lorenz and Paul Dyer for fungal strains. We also thank Jason Stajich for help with gene annotations.
Additional file 1Examples of reported incidences of interkingdom gene transfer between prokaryotes and fungi. One Kluyveromyces lactis gene (KLLA0D19949g) previously highlighted [22], been omitted as it is no longer recognized as an ORF. Y. lipolytica genes denoted with a * and ^ indicate possible gene duplications after HGT.Click here for file[http://www.biomedcentral.com/content/supplementary/1471-2148-8-181-S1.doc]
Additional file 2GenBank accession numbers for PR (A) and PhzF (B) sequences used in this analysis. Species identified with an * use the accession numbers created by the Broad Institute [66] or the Wellcome Trust Sanger Institute [67].Click here for file[http://www.biomedcentral.com/content/supplementary/1471-2148-8-181-S2.doc]
Additional file 3Gene order around Pezizomycotina proline racemase genes. Species names and identifiers are shown in each box. PR genes are labeled. Gene identifiers relate to annotations from the Broad Institute. Clade letters in parentheses correspond to those in Figure 2. There is evidence for con-served gene synteny between some species such as A. oryzae and A. flavus (clade-C). A. flavus/A. terreus and N. fischeri/A. fumigatus in the Metazoan/Pezizomycotina clade (B). The A. flavus gene denoted by a * is absent from the Broad gene set but we were able to locate it with a BlastX search.Click here for file[http://www.biomedcentral.com/content/supplementary/1471-2148-8-181-S3.eps]
Additional file 4Correspondence analysis of codon usage. Correspondence analysis of codon usage in the C. parapsilosis (1), U. maydis (2), M. globosa (3), A. flavus (4), A. niger (5), G. zeae (6), A. oryzae (7), P. nodorum (8), and S. pombe (9) genomes. Transferred genes are highlighted. All have a codon usage similar to the rest of their genomes which is unsurpris-ing as transferred genes have been shown to ameliorate their codon usage to their hosts [79].Click here for file[http://www.biomedcentral.com/content/supplementary/1471-2148-8-181-S4.pdf]
Additional file 5Correspondence analysis of codon usage in the proline racemase gene family analyzed in this study. Major groups are color-coded. The C. par-apsilosis PR gene has a codon usage pattern distinct from other fungal species in this analysis. It is also quite distinct from the Burkholderia (β-proteobacteria) species, which were found to be its phylogenetic neighbors.Click here for file[http://www.biomedcentral.com/content/supplementary/1471-2148-8-181-S5.eps]
Additional file 6Trees for approximately unbiased test for PhzF homologs. Tree A is the original unconstrained topology, which groups C. parapsilosis with pro-teobacteria. Topology B is a constrained tree that places S. pombe outside the Saccharomycotina clade. Topologies C-H are constrained and place C. parapsilosis amongst the other Saccharomycotina species. Log likelihood scores for each tree are given. To assess the likelihood that any differences in topology between the inferred trees is no more significant that that expected by chance, we performed the approximately unbiased test. The AU test shows that the unconstrained tree receives the optimal likelihood tree score. Furthermore, the differences in likelihood scores when com-pared to the constrained trees are significant (P < 0.05). Therefore based on these results the placement of the C. parapsilosis homolog in the pro-teobacterial clade to the exclusion of the Saccharomycotina and S. pombe within the Saccharomycotina clade is significant.Click here for file[http://www.biomedcentral.com/content/supplementary/1471-2148-8-181-S6.eps]
Additional file 7PhzF spectral analysis. Analysis was performed on the Saccharomycotina and selected proteobacterial clade. Bars above the x-axis represent fre-quency of support for each split. Bars below the x-axis represent the sum of all corresponding conflicts. Clad grams above columns represent the cor-responding splits in the data. There is no support for the placement of C. parapsilosis with the other Saccharomycotina species.Click here for file[http://www.biomedcentral.com/content/supplementary/1471-2148-8-181-S7.eps]
Additional file 8Correspondence analysis of codon usage in the PhzF gene family ana-lyzed in this study. Major groups are color-coded. The C. parapsilosis PhzF gene has a codon usage pattern similar to other CTG species ana-lyzed. It is quite distinct from the proteobacterial species that were found to be its phylogenetic neighbors.Click here for file[http://www.biomedcentral.com/content/supplementary/1471-2148-8-181-S8.eps]
Page 13 of 15(page number not for citation purposes)
BMC Evolutionary Biology 2008, 8:181 http://www.biomedcentral.com/1471-2148/8/181
References1. Doolittle WF: Lateral genomics. Trends Cell Biol 1999, 9(12):M5-8.2. Jain R, Rivera MC, Moore JE, Lake JA: Horizontal gene transfer
accelerates genome innovation and evolution. Mol Biol Evol2003, 20(10):1598-1602.
3. Eisen JA: Assessing evolutionary relationships amongmicrobes from whole-genome analysis. Curr Opin Microbiol2000, 3(5):475-480.
4. Woo PC, To AP, Lau SK, Yuen KY: Facilitation of horizontaltransfer of antimicrobial resistance by transformation ofantibiotic-induced cell-wall-deficient bacteria. Med Hypotheses2003, 61(4):503-508.
5. Martin K, Morlin G, Smith A, Nordyke A, Eisenstark A, Golomb M:The tryptophanase gene cluster of Haemophilus influenzaetype b: evidence for horizontal gene transfer. J Bacteriol 1998,180(1):107-118.
6. Kurland CG, Canback B, Berg OG: Horizontal gene transfer: acritical view. Proc Natl Acad Sci U S A 2003, 100(17):9658-9662.
7. Andersson JO: Lateral gene transfer in eukaryotes. Cell Mol LifeSci 2005, 62(11):1182-1197.
8. Dujon B: Hemiascomycetous yeasts at the forefront of com-parative genomics. Curr Opin Genet Dev 2005, 15(6):614-620.
9. Friesen TL, Stukenbrock EH, Liu Z, Meinhardt S, Ling H, Faris JD, Ras-mussen JB, Solomon PS, McDonald BA, Oliver RP: Emergence of anew disease as a result of interspecific virulence gene trans-fer. Nat Genet 2006, 38(8):953-956.
10. Inderbitzin P, Harkness J, Turgeon BG, Berbee ML: Lateral transferof mating system in Stemphylium. Proc Natl Acad Sci U S A 2005,102(32):11390-11395.
11. Kavanaugh LA, Fraser JA, Dietrich FS: Recent evolution of thehuman pathogen Cryptococcus neoformans by intervarietaltransfer of a 14-gene fragment. Mol Biol Evol 2006,23(10):1879-1890.
12. Khaldi N, Collemare J, Lebrun MH, Wolfe KH: Evidence for hori-zontal transfer of a secondary metabolite gene clusterbetween fungi. Genome Biol 2008, 9(1):R18.
13. Paoletti M, Buck KW, Brasier CM: Selective acquisition of novelmating type and vegetative incompatibility genes via inter-species gene transfer in the globally invading eukaryoteOphiostoma novo-ulmi. Mol Ecol 2006, 15(1):249-262.
14. Slot JC, Hallstrom KN, Matheny PB, Hibbett DS: Diversification ofNRT2 and the origin of its fungal homolog. Mol Biol Evol 2007,24(8):1731-1743.
15. Slot JC, Hibbett DS: Horizontal transfer of a nitrate assimila-tion gene cluster and ecological transitions in fungi: a phylo-genetic study. PLoS ONE 2007, 2(10):e1097.
16. Waller RF, Slamovits CH, Keeling PJ: Lateral gene transfer of amultigene region from cyanobacteria to dinoflagellatesresulting in a novel plastid-targeted fusion protein. Mol BiolEvol 2006, 23(7):1437-1443.
17. Wei W, McCusker JH, Hyman RW, Jones T, Ning Y, Cao Z, Gu Z,Bruno D, Miranda M, Nguyen M, Wilhelmy J, Komp C, Tamse R,Wang X, Jia P, Luedi P, Oefner PJ, David L, Dietrich FS, Li Y, DavisRW, Steinmetz LM: Genome sequencing and comparative anal-ysis of Saccharomyces cerevisiae strain YJM789. Proc Natl AcadSci U S A 2007, 104(31):12825-12830.
18. Andersson JO, Sjogren AM, Davis LA, Embley TM, Roger AJ: Phylo-genetic analyses of diplomonad genes reveal frequent lateralgene transfers affecting eukaryotes. Curr Biol 2003,13(2):94-104.
19. Hall C, Brachat S, Dietrich FS: Contribution of horizontal genetransfer to the evolution of Saccharomyces cerevisiae.Eukaryot Cell 2005, 4(6):1102-1115.
20. Hall C, Dietrich FS: The Reacquisition of Biotin Prototrophy inSaccharomyces cerevisiae Involved Horizontal Gene Trans-fer, Gene Duplication and Gene Clustering. Genetics 2007,177(4):2293-2307.
21. Woolfit M, Rozpedowska E, Piskur J, Wolfe KH: Genome surveysequencing of the wine spoilage yeast Dekkera (Brettanomy-ces) bruxellensis. Eukaryot Cell 2007, 6(4):721-733.
22. Dujon B, Sherman D, Fischer G, Durrens P, Casaregola S, LafontaineI, De Montigny J, Marck C, Neuveglise C, Talla E, Goffard N, FrangeulL, Aigle M, Anthouard V, Babour A, Barbe V, Barnay S, Blanchin S,Beckerich JM, Beyne E, Bleykasten C, Boisrame A, Boyer J, CattolicoL, Confanioleri F, De Daruvar A, Despons L, Fabre E, Fairhead C,Ferry-Dumazet H, Groppi A, Hantraye F, Hennequin C, Jauniaux N,
Joyet P, Kachouri R, Kerrest A, Koszul R, Lemaire M, Lesur I, Ma L,Muller H, Nicaud JM, Nikolski M, Oztas S, Ozier-Kalogeropoulos O,Pellenz S, Potier S, Richard GF, Straub ML, Suleau A, Swennen D,Tekaia F, Wesolowski-Louvel M, Westhof E, Wirth B, Zeniou-MeyerM, Zivanovic I, Bolotin-Fukuhara M, Thierry A, Bouchier C, CaudronB, Scarpelli C, Gaillardin C, Weissenbach J, Wincker P, Souciet JL:Genome evolution in yeasts. Nature 2004, 430(6995):35-44.
23. Gojkovic Z, Knecht W, Zameitat E, Warneboldt J, Coutelis JB, Pyn-yaha Y, Neuveglise C, Moller K, Loffler M, Piskur J: Horizontal genetransfer promoted evolution of the ability to propagateunder anaerobic conditions in yeasts. Mol Genet Genomics 2004,271(4):387-393.
24. Brinkman FS, Macfarlane EL, Warrener P, Hancock RE: Evolution-ary relationships among virulence-associated histidinekinases. Infect Immun 2001, 69(8):5207-5211.
25. Temporini ED, VanEtten HD: An analysis of the phylogenetic dis-tribution of the pea pathogenicity genes of Nectria haema-tococca MPVI supports the hypothesis of their origin byhorizontal transfer and uncovers a potentially new pathogenof garden pea: Neocosmospora boniensis. Curr Genet 2004,46(1):29-36.
26. Wenzl P, Wong L, Kwang-won K, Jefferson RA: A functionalscreen identifies lateral transfer of beta-glucuronidase (gus)from bacteria to fungi. Mol Biol Evol 2005, 22(2):308-316.
27. Garcia-Vallve S, Romeu A, Palau J: Horizontal gene transfer ofglycosyl hydrolases of the rumen fungi. Mol Biol Evol 2000,17(3):352-361.
28. Klotz MG, Klassen GR, Loewen PC: Phylogenetic relationshipsamong prokaryotic and eukaryotic catalases. Mol Biol Evol1997, 14(9):951-958.
29. Fitzpatrick DA, Logue ME, Stajich JE, Butler G: A Fungal phylogenybased on 42 complete genomes derived from supertree andcombined gene analysis. BMC Evol Biol 2006, 6:99.
30. Sugita T, Nakase T: Non-universal usage of the leucine CUGcodon and the molecular phylogeny of the genus Candida.Syst Appl Microbiol 1999, 22(1):79-86.
31. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lip-man DJ: Gapped BLAST and PSI-BLAST: a new generation ofprotein database search programs. Nucleic Acids Res 1997,25(17):3389-3402.
32. Cardinale GJ, Abeles RH: Purification and mechanism of actionof proline racemase. Biochemistry 1968, 7(11):3970-3978.
33. Logue ME, Wong S, Wolfe KH, Butler G: A genome sequence sur-vey shows that the pathogenic yeast Candida parapsilosis hasa defective MTLa1 allele at its mating type locus. Eukaryot Cell2005, 4(6):1009-1017.
34. Lartillot N, Brinkmann H, Philippe H: Suppression of long-branchattraction artefacts in the animal phylogeny using a site-het-erogeneous model. BMC Evol Biol 2007, 7 Suppl 1:S4.
35. Langley R, Kenna DT, Vandamme P, Ure R, Govan JR: Lysogeny andbacteriophage host range within the Burkholderia cepaciacomplex. J Med Microbiol 2003, 52(Pt 6):483-490.
36. Eberl L, Tummler B: Pseudomonas aeruginosa and Burkholde-ria cepacia in cystic fibrosis: genome evolution, interactionsand adaptation. Int J Med Microbiol 2004, 294(2-3):123-131.
37. Tuanyok A, Auerbach RK, Brettin TS, Bruce DC, Munk AC, DetterJC, Pearson T, Hornstra H, Sermswan RW, Wuthiekanun V, PeacockSJ, Currie BJ, Keim P, Wagner DM: A horizontal gene transferevent defines two distinct groups within Burkholderia pseu-domallei that have dissimilar geographic distributions. J Bac-teriol 2007.
38. Buschiazzo A, Goytia M, Schaeffer F, Degrave W, Shepard W, Gre-goire C, Chamond N, Cosson A, Berneman A, Coatnoan N, AlzariPM, Minoprio P: Crystal structure, catalytic mechanism, andmitogenic properties of Trypanosoma cruzi proline race-mase. Proc Natl Acad Sci U S A 2006, 103(6):1705-1710.
39. Lamzin VS, Dauter Z, Wilson KS: How nature deals with stereoi-somers. Curr Opin Struct Biol 1995, 5(6):830-836.
40. Fisher GH: Appearance of D-amino acids during aging: D-amino acids in tumor proteins. Exs 1998, 85:109-118.
41. Chamond N, Gregoire C, Coatnoan N, Rougeot C, Freitas-Junior LH,da Silveira JF, Degrave WM, Minoprio P: Biochemical characteri-zation of proline racemases from the human protozoan par-asite Trypanosoma cruzi and definition of putative proteinsignatures. J Biol Chem 2003, 278(18):15484-15494.
Page 14 of 15(page number not for citation purposes)
BMC Evolutionary Biology 2008, 8:181 http://www.biomedcentral.com/1471-2148/8/181
Publish with BioMed Central and every scientist can read your work free of charge
"BioMed Central will be the most significant development for disseminating the results of biomedical research in our lifetime."
Sir Paul Nurse, Cancer Research UK
Your research papers will be:
available free of charge to the entire biomedical community
peer reviewed and published immediately upon acceptance
cited in PubMed and archived on PubMed Central
yours — you keep the copyright
Submit your manuscript here:http://www.biomedcentral.com/info/publishing_adv.asp
BioMedcentral
42. Wolosker H, Blackshaw S, Snyder SH: Serine racemase: a glialenzyme synthesizing D-serine to regulate glutamate-N-methyl-D-aspartate neurotransmission. Proc Natl Acad Sci U SA 1999, 96(23):13409-13414.
43. Reina-San-Martin B, Degrave W, Rougeot C, Cosson A, Chamond N,Cordeiro-Da-Silva A, Arala-Chaves M, Coutinho A, Minoprio P: A B-cell mitogen from a pathogenic trypanosome is a eukaryoticproline racemase. Nat Med 2000, 6(8):890-897.
44. GeneDB [http://www.genedb.org]45. Stevens JR, Gibson WC: The evolution of pathogenic trypano-
somes. Cad Saude Publica 1999, 15(4):673-684.46. Fischer G, James SA, Roberts IN, Oliver SG, Louis EJ: Chromo-
somal evolution in Saccharomyces. Nature 2000,405(6785):451-454.
47. Pfam [http://pfam.sanger.ac.uk]48. Medigue C, Rouxel T, Vigier P, Henaut A, Danchin A: Evidence for
horizontal gene transfer in Escherichia coli speciation. J MolBiol 1991, 222(4):851-856.
49. Ochman H, Lawrence JG, Groisman EA: Lateral gene transfer andthe nature of bacterial innovation. In Nature Volume 405. Issue6784 ENGLAND ; 2000:299-304.
50. Rudnick G, Abeles RH: Reaction mechanism and structure ofthe active site of proline racemase. Biochemistry 1975,14(20):4515-4522.
51. Blankenfeldt W, Kuzin AP, Skarina T, Korniyenko Y, Tong L, Bayer P,Janning P, Thomashow LS, Mavrodi DV: Structure and function ofthe phenazine biosynthetic protein PhzF from Pseudomonasfluorescens. Proc Natl Acad Sci U S A 2004, 101(47):16431-16436.
52. Parsons JF, Song F, Parsons L, Calabrese K, Eisenstein E, Ladner JE:Structure and function of the phenazine biosynthesis proteinPhzF from Pseudomonas fluorescens 2-79. Biochemistry 2004,43(39):12427-12435.
53. Shimodaira H: An approximately unbiased test of phylogenetictree selection. Syst Biol 2002, 51(3):492-508.
54. The Schizosaccharomyces group at the Broad Institute[http://www.broad.mit.edu/annotation/genome/schizosaccharomyces_group/MultiHome.html]
55. Bullerwell CE, Leigh J, Forget L, Lang BF: A comparison of threefission yeast mitochondrial genomes. Nucleic Acids Res 2003,31(2):759-768.
56. Lin D, Wu LC, Rinaldi MG, Lehmann PF: Three distinct genotypeswithin Candida parapsilosis from clinical sources. J Clin Micro-biol 1995, 33(7):1815-1821.
57. Heinemann JA, Sprague GF Jr.: Bacterial conjugative plasmidsmobilize DNA transfer between bacteria and yeast. Nature1989, 340(6230):205-209.
58. Inomata K, Nishikawa M, Yoshida K: The yeast Saccharomyceskluyveri as a recipient eukaryote in transkingdom conjuga-tion: behavior of transmitted plasmids in transconjugants. JBacteriol 1994, 176(15):4770-4773.
59. Sawasaki Y, Inomata K, Yoshida K: Trans-kingdom conjugationbetween Agrobacterium tumefaciens and Saccharomyces cere-visiae, a bacterium and a yeast. Plant Cell Physiol 1996,37(1):103-106.
60. Nevoigt E, Fassbender A, Stahl U: Cells of the yeast Saccharomy-ces cerevisiae are transformable by DNA under non-artificialconditions. Yeast 2000, 16(12):1107-1110.
61. Hogan DA, Kolter R: Pseudomonas-Candida interactions: anecological role for virulence factors. Science 2002,296(5576):2229-2232.
62. Liger D, Quevillon-Cheruel S, Sorel I, Bremang M, Blondeau K, Aboul-fath I, Janin J, van Tilbeurgh H, Leulliot N: Crystal structure ofYHI9, the yeast member of the phenazine biosynthesis PhzFenzyme superfamily. Proteins 2005, 60(4):778-786.
63. Keeling PJ, Burger G, Durnford DG, Lang BF, Lee RW, Pearlman RE,Roger AJ, Gray MW: The tree of eukaryotes. Trends Ecol Evol2005, 20(12):670-676.
64. Silva RM, Paredes JA, Moura GR, Manadas B, Lima-Costa T, Rocha R,Miranda I, Gomes AC, Koerkamp MJ, Perrot M, Holstege FC, Bouche-rie H, Santos MA: Critical roles for a genetic code alteration inthe evolution of the genus Candida. Embo J 2007,26(21):4555-4565.
65. The Candida Genome Database [http://www.candidagenome.org]
66. The Candida group at the Broad Institute[http:www.broad.mit.edu/annotation/genome/candida_group/MultiHome.html]
67. The Wellcome Trust Sanger Institute [http://www.sanger.ac.uk/]
68. Korf I: Gene finding in novel genomes. BMC Bioinformatics 2004,5:59.
69. Majoros WH, Pertea M, Salzberg SL: TigrScan and Glimmer-HMM: two open source ab initio eukaryotic gene-finders. Bio-informatics 2004, 20(16):2878-2879.
70. Slater GS, Birney E: Automated generation of heuristics for bio-logical sequence comparison. BMC Bioinformatics 2005, 6(1):31.
71. Birney E, Clamp M, Durbin R: GeneWise and Genomewise.Genome Res 2004, 14(5):988-995.
72. Rutherford K, Parkhill J, Crook J, Horsnell T, Rice P, Rajandream MA,Barrell B: Artemis: sequence visualization and annotation. Bio-informatics 2000, 16(10):944-945.
73. The UniProt database [ftp://ftp.uniprot.org/pub/databases/]74. Edgar RC: MUSCLE: multiple sequence alignment with high
accuracy and high throughput. Nucleic Acids Res 2004,32(5):1792-1797.
75. Keane TM, Creevey CJ, Pentony MM, Naughton TJ, McLnerney JO:Assessment of methods for amino acid matrix selection andtheir use on empirical data shows that ad hoc assumptionsfor choice of matrix are not justified. BMC Evol Biol 2006, 6:29.
76. Guindon S, Gascuel O: A simple, fast, and accurate algorithmto estimate large phylogenies by maximum likelihood. SystBiol 2003, 52(5):696-704.
77. McInerney JO: GCUA: general codon usage analysis. Bioinfor-matics 1998, 14(4):372-373.
78. Charleston MA: Spectrum: spectral analysis of phylogeneticdata. Bioinformatics (Oxford, England) 1998, 14(1):98-99.
79. Lawrence JG, Ochman H: Amelioration of bacterial genomes:rates of change and exchange. J Mol Evol 1997, 44(4):383-397.
Page 15 of 15(page number not for citation purposes)