Genome-wide search for type III secretion system - V. Porter


Upload: tru-ugc

Post on 15-Jul-2015



Methods

Self-Designed Effector Prediction Program: Nine algorithms were trained on the gene products of E. tarda EIB202 to build an effector prediction program specific to our pathogen. Fifteen putative protein attributes were tested on six N-terminal lengths and on the whole sequence, to determine which N-terminal length gives the most accurate predictions. The algorithms were trained in WEKA on one set of positive effectors (n=12) and two sets of non-effectors (n=2600+ and n=40).
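
As a rough illustration of how sequence attributes of this kind can be computed over different N-terminal windows, the sketch below derives a GRAVY score and a small-residue fraction in Python. It is not the authors' code. The window lengths (20, 50, whole sequence), the function names, and the definition of "small" residues (close to the EMBOSS pepstats class) are assumptions; the poster does not state them. The hydropathy values are the standard Kyte-Doolittle scale.

```python
# Kyte-Doolittle hydropathy values, the scale behind the GRAVY score.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Assumption: one common definition of "small" residues; the poster does
# not say which definition was used.
SMALL_RESIDUES = set("ACDGNPSTV")


def n_terminal(sequence, length=None):
    """Return the first `length` residues, or the whole sequence if None."""
    sequence = sequence.upper()
    return sequence if length is None else sequence[:length]


def gravy(sequence):
    """Mean Kyte-Doolittle hydropathy over the standard residues."""
    values = [KYTE_DOOLITTLE[aa] for aa in sequence if aa in KYTE_DOOLITTLE]
    return sum(values) / len(values) if values else 0.0


def small_fraction(sequence):
    """Fraction of residues that belong to SMALL_RESIDUES."""
    if not sequence:
        return 0.0
    return sum(aa in SMALL_RESIDUES for aa in sequence) / len(sequence)


def feature_row(sequence, lengths=(20, 50, None)):
    """Compute both features for each window; `lengths` is hypothetical."""
    row = {}
    for length in lengths:
        window = n_terminal(sequence, length)
        label = "full" if length is None else f"N{length}"
        row[f"gravy_{label}"] = gravy(window)
        row[f"small_{label}"] = small_fraction(window)
    return row


# Made-up sequence, used only to show the call.
example = feature_row("MSTAGLKVD" * 8)
```

Rows like `example` could then be exported (for instance as CSV or ARFF) for training in a tool such as WEKA.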

Introduction

Gram-negative bacteria such as the fish pathogen Edwardsiella tarda EIB202 use the type III secretion system (T3SS) to secrete virulent effector proteins into the host [1]. The effectors are the most virulent, yet the hardest to predict, proteins of the secretion system: each effector exhibits distinct features, and the effector genes are widely distributed throughout the genome [2]. This led to the hypothesis that effectors are acquired through horizontal gene transfer separately from the rest of the T3SS and are then modified to fit the pathogen’s target host and mechanisms [3]. Once secreted, the effector proteins carry out specific functions within the host that modulate its normal function [4]. It is speculated that, within its genome of 3700+ genes, E. tarda contains approximately 30 T3SS effector proteins, 12 of which have been identified experimentally. Predicting the remaining 15+ effector proteins using bioinformatics could narrow down the search and significantly reduce the time and labour it takes to verify effectors experimentally.

[Flow chart labels: Putative Effector Features, Feature Selection, Classify Algorithm, Trained Model, Unclassified Genes, Predicted effectors, Predicted non-effectors, Prior Knowledge, Experimental validation]

[Injectisome image labels: Bacterial cell, Tip complex (LcrV), Translocator, Effector proteins, Translocation pore, Host cell. Embedded source caption: "Figure 5 | Hypothetical model of the function of the LcrV tip complex. a | No contact with host cell: LcrV forms the complex at the tip of the needle. b | Contact with the host cell membrane: the tip complex assists with the assembly of the translocation pore, serving as an assembly platform. c | Anti-LcrV antibodies are protective because they prevent the formation of the translocation pore." © 2006 Nature Publishing Group]

Discussion and Future Directions

This study is a first step towards creating a new species-specific effector prediction program. Work on feature selection and algorithm selection is ongoing. Once these factors are settled, the algorithms will be retrained, with the aim of obtaining more accurate prediction scores. Verifying the unknown effectors could later improve our understanding of the T3SS and help create a better multi-effector vaccine against E. tarda infection in Asian aquaculture.

Acknowledgements

The authors are grateful for the financial contributions from NSERC, Canada (USRA and Discovery Grant) and the Open Funding Project of the State Key Lab of Bioreactor Engineering, ECUST, Shanghai, China.

Figure 1: The type III secretion injectisome secreting effectors into the host (Cornelis, 2010).

References

1. Leung, K. Y., Siame, B. A., Tenkink, B. J., Noort, R. J., & Mok, Y. K. (2012). Edwardsiella tarda – Virulence mechanisms of an emerging gastroenteritis pathogen. Microbes and Infection, 14(1), 26-34.

2. McDermott, J. E., Corrigan, A., Peterson, E., Oehmen, C., Niemann, G., Cambronne, E. D., Sharp, D., Adkins, J. N., Samudrala, R., & Heffron, F. (2011). Computational prediction of type III and IV secreted effectors in gram-negative bacteria. Infection and Immunity, 79(1), 23-32.

3. Hajri, A., Brin, C., Hunault, G., Lardeux, F., Lemaire, C., Manceau, C., Boureau, T., & Poussier, S. (2009). A "repertoire for repertoire" hypothesis: Repertoires of type three effectors are candidate determinants of host specificity in Xanthomonas. PLoS One, 4(8), e6632.

4. Dean, P. (2011). Functional domains and motifs of bacterial type III effector proteins and their roles in infection. FEMS Microbiology Reviews, 35(6), 1100-1125.

5. Cornelis, G. R. (2010). The type III secretion injectisome, a complex nanomachine for intracellular toxin delivery. Biological Chemistry, 391(7), 745-751.

Figure 2: The flow chart model of creating a new effector prediction program. The process begins with adequate feature selection, then algorithm training, and then finally testing the program on unclassified genes to find new effector genes.
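
The training and classification steps of such a workflow can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it substitutes a hand-written Gaussian naive Bayes classifier for the WEKA algorithms used in the study, it skips the feature selection step, and the feature values, gene names, and function names are all made up.

```python
import math


def fit_gaussian_nb(rows, labels):
    """Estimate the prior, feature means, and feature variances per class."""
    model = {}
    for cls in set(labels):
        members = [row for row, label in zip(rows, labels) if label == cls]
        count = len(members)
        columns = list(zip(*members))
        means = [sum(col) / count for col in columns]
        # A small floor on the variance avoids division by zero.
        variances = [
            max(sum((x - m) ** 2 for x in col) / count, 1e-6)
            for col, m in zip(columns, means)
        ]
        model[cls] = (count / len(rows), means, variances)
    return model


def log_posterior(model, row, cls):
    """Unnormalised log posterior of `cls` for one feature row."""
    prior, means, variances = model[cls]
    total = math.log(prior)
    for x, m, v in zip(row, means, variances):
        total += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
    return total


def predict(model, row):
    """Return the class with the highest log posterior."""
    return max(model, key=lambda cls: log_posterior(model, row, cls))


# Hypothetical training data: [small-residue fraction, N-terminal instability].
train_rows = [
    [0.62, 0.55], [0.58, 0.50], [0.65, 0.60],   # known effectors
    [0.45, 0.30], [0.48, 0.25], [0.44, 0.28],   # known non-effectors
]
train_labels = ["effector"] * 3 + ["non-effector"] * 3

model = fit_gaussian_nb(train_rows, train_labels)

# Hypothetical unclassified genes to be sorted into the two predicted groups.
unclassified = {"gene_A": [0.61, 0.52], "gene_B": [0.46, 0.27]}
predictions = {name: predict(model, row) for name, row in unclassified.items()}
```

Genes predicted as effectors would then be the candidates passed on to experimental validation.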

Attribute                   Positive Effectors      Negative Effectors
                            Average    St.Dev.      Average    St.Dev.
Molecular Weight (kDa)      31.2       ±38.2        38.4       ±23.3
G+C Content                 60.0       ±10.7        60.8       ±5.7
pI                          6.7        ±1.8         7.2        ±1.8
Instability Index           40.5       ±10.7        39.4       ±10.0
A280 Molar Ext. Coef.       0.56       ±0.41        1.03       ±0.57
CAI                         0.66       ±0.1         0.69       ±0.08
GRAVY Score (N20)           -0.091     ±0.48        -0.023     ±0.42
Small Peptides (N20)        60.0%      ±14.4%       46.4%      ±11.7%
N-terminal Instability      0.53       ±0.20        0.29       ±0.16
Coiled-Coil Regions         0.25       ±0.45        0.07       ±0.25
Aliphatic Index             91.9       ±18.7        97.3       ±16.5

Table 1: Statistical analysis of the significant attributes of effectors compared to non-effectors
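
One way to judge whether two group means like these differ is Welch's t-test, which can be computed from summary statistics alone. The sketch below applies it to the N-terminal Instability row. The poster does not say which test was used or which non-effector set the table is based on, so the choice of Welch's test and the sample sizes (n=12 and n=40) are assumptions.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    a = sd1 ** 2 / n1   # squared standard error of group 1
    b = sd2 ** 2 / n2   # squared standard error of group 2
    t = (mean1 - mean2) / math.sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df


# N-terminal Instability: effectors 0.53 ±0.20, non-effectors 0.29 ±0.16.
# Sample sizes are assumed, not stated for this table.
t_stat, dof = welch_t(0.53, 0.20, 12, 0.29, 0.16, 40)
```

Under these assumptions the statistic comes out near 3.8 with roughly 15 degrees of freedom; the p-value would be read from a t distribution with that many degrees of freedom.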

Results

• The top-scoring algorithm was the Bayesian Network, with an area under the ROC curve (AUC) of 0.833.

• The data so far were too discrete to decide which N-terminal length was best.

• A statistical analysis of the selected attributes is presented in Table 1. Significant attributes were marked in bold.
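
For readers unfamiliar with the metric, the area under the ROC curve equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. The Python sketch below computes it that way. The scores are made up for illustration and are not the study's data; the function name is hypothetical.

```python
def roc_auc(positive_scores, negative_scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5   # a tie counts as half a correct ranking
    return wins / (len(positive_scores) * len(negative_scores))


# Made-up classifier scores for three effectors and three non-effectors.
toy_auc = roc_auc([0.9, 0.7, 0.4], [0.5, 0.3, 0.2])
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported 0.833 should be read.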