Contact
www.rna.uni-jena.de
[Figure 2: schematic of the Mx1-virus interaction across Species 1 to 4. Labels: Infection; Mx1 specific binding; No binding; Mx1; Virus.]
Fig. 2 An 'arms race' between the host Mx1 gene and a virus, shown as an example, results in high selection pressure on the host to evolve a defence against the pathogen. The virus in turn evolves countermeasures to evade the host immune system [2]. The hypervariable loop region of Mx1 is shown schematically.
When positive selection has occurred, the ratio between the non-synonymous (dN) and synonymous (dS) substitution rates is shifted: certain amino acid changes are favored if they increase the host's fitness, for example against an infection.
The dN/dS (ω) ratio may then reach values greater than 1, and we call such sites positively selected [1].
Detecting such sites gives researchers insight into the evolution of a gene and might also help to develop countermeasures against pathogens (Fig. 2).
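As a rough illustration of how ω is read, the following Python sketch forms the dN/dS ratio from two rates and labels the result. The function names and the tolerance are assumptions made for this example only; PoSeiDon estimates ω per site with CODEML under codon models, not by a simple division.

```python
def omega(dn: float, ds: float) -> float:
    """Return the dN/dS ratio; the ratio is undefined when dS is zero."""
    if ds <= 0:
        raise ValueError("dS must be positive to form a ratio")
    return dn / ds


def classify(w: float, tol: float = 1e-9) -> str:
    """Label a ratio against the conventional threshold of 1.

    The tolerance `tol` is a hypothetical choice for this sketch.
    """
    if w > 1 + tol:
        return "positive selection"
    if w < 1 - tol:
        return "purifying selection"
    return "neutral"


if __name__ == "__main__":
    for dn, ds in [(0.3, 0.1), (0.1, 0.4)]:
        print(dn, ds, classify(omega(dn, ds)))
```

Values above 1 correspond to the positively selected sites discussed above; values below 1 indicate that amino acid changes are mostly removed by selection.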
Positive Selection
Example: Positively Selected Sites in Bat Mx1
Mx1, F3x4, 13 bat species

Region            % sites with ω > 1   avg(ω)   M7 vs M8 -2(ln λ)   M7 vs M8 p-value
full (aa 1-649)   6.26                 3.45     101.39              <0.001
aa 1-90           21.3                 2.76     24.05               <0.001
aa 91-183         NA                   NA       0.19                0.908
aa 184-649        6.59                 3.83     112.69              <0.001

M8 BEB sites (PP 0.95/0.99) per region:
full (aa 1-649): 205; 209; 361; 439; 443; 494; 549; 562; 569; 570; 572; 573; 574; 575; 578; 581
aa 1-90: 16; 17; 19; 22; 23; 25; 26; 27; 31; 38; 40; 44; 46
aa 91-183: none
aa 184-649: 205; 209; 361; 436; 439; 443; 494; 549; 562; 569; 570; 572; 573; 574; 575; 578; 581
Tab. 1 Results of the evolutionary analyses for positively selected sites in bat Mx1, shown as an example for codon frequency F3x4 and the paired NS site models M7 and M8, which disallow and allow for positive selection, respectively.
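The relation between the -2(ln λ) statistic and the p-value in Table 1 can be sketched in a few lines of Python. This example assumes a chi-squared distribution with 2 degrees of freedom for M7 vs M8, which is the usual convention for this model pair but is not stated on the poster; for 2 degrees of freedom the upper-tail probability has the closed form exp(-x/2), so no statistics library is needed.

```python
import math


def lrt_pvalue_df2(stat: float) -> float:
    """Upper-tail chi-squared probability for 2 degrees of freedom.

    For df = 2 the survival function is exactly exp(-x / 2).
    The choice df = 2 for M7 vs M8 is an assumption of this sketch.
    """
    if stat < 0:
        raise ValueError("the LRT statistic cannot be negative")
    return math.exp(-stat / 2.0)


# -2(ln lambda) values as listed in Table 1.
table1 = {
    "full (aa 1-649)": 101.39,
    "aa 1-90": 24.05,
    "aa 91-183": 0.19,
    "aa 184-649": 112.69,
}

pvalues = {region: lrt_pvalue_df2(stat) for region, stat in table1.items()}

if __name__ == "__main__":
    for region, p in pvalues.items():
        print(f"{region}: p = {p:.3g}")
```

Under this assumption only the aa 91-183 fragment yields a clearly non-significant value (about 0.91), in line with the absence of positively selected sites reported for that fragment.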
Using PoSeiDon, we identified the loop L4 of Mx1 as a hot spot for positive selection in bats [2], as previously shown for primates [4]. By splitting the alignment at possible recombination events identified by the pipeline, we also found strong evidence of positive selection in the N-terminal region of bat Mx1 (Table 1, fragment aa 1-90).
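The effect of splitting the alignment can be made concrete by comparing the BEB site lists from Table 1. The Python sketch below assumes that the site numbers reported for the fragments use full-length protein coordinates and that the site list for the full alignment ends at 581; both are readings of the table, not statements from the poster.

```python
# BEB sites as listed in Table 1 (full alignment and the three fragments).
full = {205, 209, 361, 439, 443, 494, 549, 562, 569, 570,
        572, 573, 574, 575, 578, 581}

fragments = {
    "aa 1-90": {16, 17, 19, 22, 23, 25, 26, 27, 31, 38, 40, 44, 46},
    "aa 91-183": set(),
    "aa 184-649": {205, 209, 361, 436, 439, 443, 494, 549, 562, 569,
                   570, 572, 573, 574, 575, 578, 581},
}

# Union of all sites found in any fragment.
in_fragments = set().union(*fragments.values())

# Sites detected only after splitting, and sites lost by splitting.
only_in_fragments = sorted(in_fragments - full)
only_in_full = sorted(full - in_fragments)

if __name__ == "__main__":
    print("only in fragments:", only_in_fragments)
    print("only in full alignment:", only_in_full)
```

With these lists, every site found in the full alignment is recovered in a fragment, while the N-terminal sites and site 436 appear only in the fragment-wise analysis.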
References
[1] Yang, Ziheng. PAML 4: phylogenetic analysis by maximum likelihood. Molecular Biology and Evolution 24.8 (2007): 1586-1591.
[2] Fuchs, Jonas, et al. Evolution and antiviral specificity of interferon-induced Mx proteins of bats against Ebola-, Influenza-, and other RNA viruses. Submitted.
[3] Pond, Sergei L. Kosakovsky, et al. GARD: a genetic algorithm for recombination detection. Bioinformatics 22.24 (2006): 3096-3098.
[4] Mitchell, Patrick S., et al. Evolution-guided identification of antiviral specificity determinants in the broadly acting interferon-induced innate immunity factor MxA. Cell Host & Microbe 12.4 (2012): 598-604.
PoSeiDon combines the following software:
TranslatorX (v1.1), Abascal et al. (2010); PMID 20435676
Muscle (v3.8.31), Edgar (2004); PMID 15034147
RAxML (v8.0.25), Stamatakis (2014); PMID 24451623
Newick Utilities (v1.6), Junier and Zdobnov (2010); PMID 20472542
MODELTEST, Posada and Crandall (1998); PMID 9918953
HyPhy (v2.2), Pond et al. (2005); PMID 15509596
GARD, Pond et al. (2006); PMID 17110367
PaML4/CodeML (v4.8), Yang (2007); PMID 17483113
Inkscape (v0.48.5)
Ruby (v2.3.1)
Fig. 4 Workflow of the PoSeiDon pipeline and example output. The PoSeiDon pipeline comprises in-frame alignment of homologous protein-coding sequences, detection of putative recombination events and evolutionary breakpoints, phylogenetic reconstructions and detection of positively selected sites in the full alignment and all possible fragments. Finally, all results are combined and visualized in a user-friendly and clear HTML web page. The resulting alignment fragments are indicated with colored bars in the HTML output.
[Figure 4: workflow diagram with the stages User Input (FASTA), Alignment, Recombination, Tree, Positive Selection and Output. Labelled tools and steps: TranslatorX and Muscle (nucleotide and amino acid alignments); Model test, GARD and KH test (resulting fragments); RAxML and Newick Utilities; CODEML with NS site models M0, M1a, M2a, M7, M8 and M8a, codon frequencies F61, F1x4 and F3x4, dN/dS (ω) ratios, BEB, and LRT with chi-squared test for M1a vs M2a, M7 vs M8 and M8a vs M8. The example output shows a codon alignment excerpt (positions 558 to 582) divided into Fragment 1 and Fragment 2, an amino acid color legend, and per-site values for F3x4, F1x4 and F61.]
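The in-frame alignment step named in the Fig. 4 caption can be illustrated with a short Python sketch: the coding sequence is aligned at the protein level, and the gaps are then projected back onto the codons. This is a minimal sketch of the general idea only, not TranslatorX's implementation; the function name is hypothetical, and the input is assumed to be a coding sequence without a terminal stop codon.

```python
def backtranslate(aligned_protein: str, cds: str) -> str:
    """Project protein-level gaps onto the unaligned coding sequence.

    Each residue consumes one codon; each gap '-' becomes '---'.
    """
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    residues = aligned_protein.replace("-", "")
    if len(codons) != len(residues):
        raise ValueError("codon count does not match residue count")
    codon_iter = iter(codons)
    return "".join("---" if aa == "-" else next(codon_iter)
                   for aa in aligned_protein)


if __name__ == "__main__":
    # Toy input: three codons and a protein alignment with one gap.
    print(backtranslate("V-MK", "GTTATGAAG"))
```

Because gaps are always inserted as whole codons, the reading frame is preserved, which is what codon models such as those in CODEML require.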
PoSeiDon
Here we present PoSeiDon, an easy-to-use web service (Fig. 1) to detect positively selected sites and recombination events in an alignment of coding sequences. Sites under positive selection can give insight into the evolutionary history of the sequences, for example by revealing mutation hot spots that accumulated as a result of virus-host 'arms races' during evolution (Fig. 2).
PoSeiDon is easy to use: just provide your nucleotide coding sequences as one multi-FASTA file and enter your email address (Fig. 1). The outcome is a user-friendly web page that provides all intermediate results and data files and graphically displays recombination events and positively selected sites (Fig. 4).
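Since the input consists of coding sequences, it can be useful to check a multi-FASTA file for frame problems before submitting it. The Python sketch below is a hypothetical pre-check, not part of PoSeiDon; it assumes the standard genetic code for stop codons and treats a stop as internal when it is not the last codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumption)


def read_fasta(text: str) -> dict:
    """Parse multi-FASTA text into {header: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = ""
        elif name is not None:
            records[name] += line.upper()
    return records


def frame_problems(seq: str) -> list:
    """Return a list of reasons why seq may not be an in-frame CDS."""
    problems = []
    if len(seq) % 3 != 0:
        problems.append("length is not a multiple of 3")
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # A stop codon anywhere except the final position is suspicious.
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        problems.append("internal stop codon")
    return problems


example = ">seq1\nGTTATGAAG\n>seq2\nGTACTGAAAC\n>seq3\nATGTAAGGG\n"
report = {name: frame_problems(seq)
          for name, seq in read_fasta(example).items()}

if __name__ == "__main__":
    for name, problems in report.items():
        print(name, problems or "ok")
```

Sequences flagged here would likely need trimming or correction before a codon-based analysis is meaningful.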
Detection of Recombination
Since recombination can have a profound impact on evolutionary processes and can adversely affect phylogenetic reconstruction and the accurate detection of positive selection, screening for it should be a default step in every comparative evolutionary study.
Within PoSeiDon, we use GARD [3] to detect possible breakpoints within an alignment.
Fragments are further screened for positive selection independently (Fig. 4).
[Figure 3: GARD breakpoint plot. x-axis: breakpoint location (0 to 2000); y-axis: model averaged support (0 to 0.9). Labels: 1-1947; 549; 270, 549.]
The Web Server
Fig. 1 Web interface of PoSeiDon. The Web server is freely available at http://www.rna.uni-jena.de/poseidon.
Fig. 3 In this example we detected two possible breakpoints in an alignment of 13 Mx1 sequences from bats [2].
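The fragment boundaries in Table 1 can be related to the breakpoint labels of the Fig. 3 plot with a small Python sketch. It assumes that the labels 270 and 549 are nucleotide positions in a 1947-nt alignment and that a breakpoint marks the last position of a fragment; neither is stated explicitly on the poster, and the function name is hypothetical.

```python
def fragments_from_breakpoints(length_nt: int, breakpoints: list) -> list:
    """Return (nucleotide range, amino acid range) pairs, 1-based inclusive.

    Assumes codon-aligned breakpoints, where a breakpoint b is the last
    nucleotide of the fragment that ends there.
    """
    bounds = [0] + sorted(breakpoints) + [length_nt]
    if any(b % 3 != 0 for b in bounds):
        raise ValueError("breakpoints and length must be multiples of 3")
    result = []
    for start, end in zip(bounds, bounds[1:]):
        nt_range = (start + 1, end)
        aa_range = (start // 3 + 1, end // 3)
        result.append((nt_range, aa_range))
    return result


mx1_fragments = fragments_from_breakpoints(1947, [270, 549])

if __name__ == "__main__":
    for nt_range, aa_range in mx1_fragments:
        print(f"nt {nt_range[0]}-{nt_range[1]} -> aa {aa_range[0]}-{aa_range[1]}")
```

Under these assumptions the two breakpoints yield the three fragments aa 1-90, aa 91-183 and aa 184-649 that are analysed separately in Table 1.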
Many thanks to Prof. Dr. Manja Marz, Prof. Dr. Georg Kochs and Jonas Fuchs, as well as to the RNA bioinformatics group in Jena and the DFG for funding (SPP-1596).
PoSeiDon - a Web Server for the Detection of Evolutionary Recombination Events and Positive Selection
Martin Hölzer (1,2) and Manja Marz (1-6)
(1) Faculty of Mathematics and Computer Science, Friedrich Schiller University Jena, Germany
(2) European Virus Bioinformatics Center (EVBC), Friedrich Schiller University Jena, Germany
(3) FLI Leibniz Institute for Age Research, Jena, Germany
(4) Michael Stifel Center Jena, Germany
(5) Aging Research Center (ARC), Jena, Germany
(6) German Center for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Germany