Using informatics to focus bacterial pathogenicity studies

Post on 20-Dec-2015


Using informatics to focus bacterial pathogenicity studies

Goal:

Use informatic analyses to generate new testable hypotheses about pathogen protein function and pathogenicity mechanisms

Test the hypotheses in the laboratory


• Gramicidin S (Consden et al., 1947), partial insulin sequence (Sanger and Tuppy, 1951)

• First codon assignment UUU/phe (Nirenberg and Matthaei, 1961)

• 3.5 kb RNA bacteriophage MS2 (Fiers et al., 1976); 5.4 kb bacteriophage φX174 (Sanger et al., 1977)

• Early databases: Dayhoff, 1972; Erdmann, 1978

Need for informatics in biology: origins

(from the National Center for Biotechnology Information)

Explosion of data

22 of the 33 publicly available microbial genome sequences are for bacterial pathogens

Approximately 18,000 pathogen genes with no known function!

>95 bacterial pathogen genome projects in progress…

Pathogen Informatics

- Pseudomonas aeruginosa
  - Three-dimensional comparative protein modeling
  - Phylogenetic analysis of gene families
  - Other analyses: Regulatory network complexity

- Pathogenomics Project
  - Detecting eukaryote:pathogen homologs
  - Detecting pathogenicity islands

Pseudomonas aeruginosa

• Found in soil, water, plants, animals

• Common cause of hospital-acquired infection: ICU patients, burn victims, cancer patients

• Almost all cystic fibrosis (CF) patients infected by age 10

• Intrinsically resistant to many antibiotics

• No vaccine

Outer membrane protein OprF

• Nonspecific porin

• Required for
  - Maintenance of cell shape
  - Growth in low-osmolarity environments

• OprF- clinical mutant with multiple antimicrobial resistance being characterized

• Adhesin in plant colonizing Pseudomonas species

• Proposed vaccine component

Gram Negative Cell Envelope

[Figure: Gram-negative cell envelope, labelled from outside to inside: LPS (with Mg++), outer membrane containing a porin pore, peptidoglycan, periplasm, cytoplasmic membrane]

Structure of the outer membrane protein A transmembrane domain

Pautsch and Schulz (1998). Nature Structural Biology 5:1013-1017

No channel formation detected

OprF   1 -QGQNSVEIEAFGKRYFTDSVRNMKN-------ADLYGGSIGYFLTDDVELALSYGEYH
OmpA   1 APKDNTWYTGAKLGWSQYHDTGLINNNGPTHENKLGAGAFGGYQVNPYVGFEMGYDWLG
             *     *              *           *   **     *     *

OprF  52 DVRGTYETGNKKVHGNLTSLDAIYHFGTPGVGLRPYVSAGLA-HQNITNINSDSQGRQQ
OmpA  60 RMPYKGSVENGAYKAQGVQLTAKLGYPIT-DDLDIYTRLGGMVWRADTYSNVYGKNHDT
                  *         * *          *  *   *       *  *

OprF 110 MTMANIGAGLKYYFTENFFAKASLDGQYGLEKRDNGHQG--EWMAGLGVGFNFG
OmpA 118 GVSPVFAGGVEYAITPEIATRLEYQWTNNIGDAHTIGTRPDNGMLSLGVSYRFG
                 *  *  *                            *  ***   **

OprF and OmpA share only 15% identity
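The identity figure can be checked directly from the alignment. The following is a minimal Python sketch, not the method used in the original work: the two strings are transcribed from the alignment on this slide, and the function name `percent_identity` is made up for illustration. It counts identical residues over the columns where neither sequence has a gap; other denominators (for example, all alignment columns) are also in common use and give a slightly different percentage.

```python
# Aligned sequences transcribed from the OprF/OmpA alignment ('-' marks a gap).
OPRF = (
    "-QGQNSVEIEAFGKRYFTDSVRNMKN-------ADLYGGSIGYFLTDDVELALSYGEYH"
    "DVRGTYETGNKKVHGNLTSLDAIYHFGTPGVGLRPYVSAGLA-HQNITNINSDSQGRQQ"
    "MTMANIGAGLKYYFTENFFAKASLDGQYGLEKRDNGHQG--EWMAGLGVGFNFG"
)
OMPA = (
    "APKDNTWYTGAKLGWSQYHDTGLINNNGPTHENKLGAGAFGGYQVNPYVGFEMGYDWLG"
    "RMPYKGSVENGAYKAQGVQLTAKLGYPIT-DDLDIYTRLGGMVWRADTYSNVYGKNHDT"
    "GVSPVFAGGVEYAITPEIATRLEYQWTNNIGDAHTIGTRPDNGMLSLGVSYRFG"
)


def percent_identity(a, b):
    """Return (identical, compared, percent) for two aligned sequences.

    Columns where either sequence has a gap are excluded from the
    denominator; this is one convention among several.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    identical = sum(1 for x, y in pairs if x == y)
    compared = len(pairs)
    return identical, compared, 100.0 * identical / compared


identical, compared, pct = percent_identity(OPRF, OMPA)
print(f"{identical} identical of {compared} compared columns ({pct:.1f}%)")
```

With the ungapped-column convention the result should come out close to the 15% quoted on the slide.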

Model of the N-terminus of OprF based on OmpA

Brinkman, Bains and Hancock (2000). Journal of Bacteriology 182:5251-5255

OprF model (yellow and green) aligned with the crystal structure of OmpA (blue)

Many residues are in the same three-dimensional environment, though on different strands


OprF and OmpA similarity

Residues implicated in blocking channel formation in OmpA are not conserved in OprF

Planar Lipid Bilayer Apparatus

[Figure: planar lipid bilayer apparatus, with labels for the bathing solution, planar bilayer membrane, protein, voltage source and current amplifier]

The N-terminus of OprF forms channels in a lipid bilayer membrane 

[Figure: histogram of the number of events (0 to 40) against single channel conductance (0.2 to 3 nS)]

Upstream of OprF is a probable sigma factor gene, sigX

[Figure: gene map showing sigX upstream of oprF, with promoter and transcription terminator marked]

Disruption of sigX reduces expression of OprF

1. Marker
2. Wildtype
3. sigX- mutant
4. oprF- mutant

[Figure: sigX and oprF in P. aeruginosa and P. fluorescens, shown without SigX expression and with SigX expression]

18 ECF sigma factors in the P. aeruginosa genome

Percent Regulators as a Function of Genome Size

[Figure: regulators (%, 0 to 10) plotted against number of genes (0 to 7000) for genomes 1 to 13, grouped as specialized environments versus free-living]

Genomes represented: 1, Mycoplasma genitalium; 2, Chlamydia trachomatis; 3, Treponema pallidum; 4, Borrelia burgdorferi; 5, Chlamydia pneumoniae; 6, Helicobacter pylori ---; 7, Helicobacter pylori---; 8, Haemophilus influenzae; 9, Neisseria meningitidis; 10, Mycobacterium tuberculosis; 11, Bacillus subtilis; 12, Escherichia coli; 13, Pseudomonas aeruginosa.

P. aeruginosa Genome Sequence Analysis: Outer Membrane Proteins (OMPs)

Approximately 150 OMPs predicted, including three large paralogous families:

• OprM family of putative efflux and type I secretion proteins (18 members)

• OprD family of putative amino acid, peptide and aromatic compound transporters (19 members)

• TonB family of putative iron-siderophore receptors (34 members)

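OprM Family

[Figure: phylogenetic tree of the OprM family (AprF, OpmM, OpmH, OpmF, OpmK, OpmL, OpmN, OpmQ, OpmD, OprN, OpmE, OpmJ, OpmA, OprM, OprJ, OpmB, OpmG, OpmI) together with TolC; groups annotated "Multidrug efflux?" and "Protein secretion?"]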

OprM structural model based on TolC


Future Developments

• Modeling of other outer membrane proteins in Neisseria species.

• Developing better algorithms for secondary structure prediction

Pathogenomics

Goal:

Identify previously unrecognized mechanisms of microbial pathogenicity using a unique combination of informatics, evolutionary biology, microbiology and genetics.

Pathogenicity

Processes of microbial pathogenicity at the molecular level are still minimally understood

Pathogen proteins have been identified that manipulate host cells by interacting with, or mimicking, host proteins.

Idea: Could we identify novel virulence factors by identifying pathogen genes more similar to host genes than you would expect based on phylogeny?
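One way to make this idea concrete is to compare, for each pathogen gene, its best similarity score against eukaryotic sequences with its best score against other bacteria, and to rank genes by the ratio. The sketch below is an illustrative assumption, not the project's actual screening algorithm: the `GeneHits` record, the `rank_eukaryote_like` function, the `min_ratio` threshold and the example scores are all hypothetical, and the scores are assumed to come from some prior database search.

```python
from dataclasses import dataclass


@dataclass
class GeneHits:
    """Hypothetical per-gene summary of a prior sequence database search."""
    gene: str
    best_eukaryote_score: float  # best score against any eukaryotic sequence
    best_bacterial_score: float  # best score against bacteria other than the pathogen itself


def rank_eukaryote_like(hits, min_ratio=1.0):
    """Return (gene, ratio) pairs whose eukaryote/bacteria score ratio exceeds
    min_ratio, sorted from most to least eukaryote-like."""
    candidates = []
    for h in hits:
        if h.best_bacterial_score <= 0:
            # No bacterial hit at all: treat any eukaryotic hit as maximally unexpected.
            ratio = float("inf") if h.best_eukaryote_score > 0 else 0.0
        else:
            ratio = h.best_eukaryote_score / h.best_bacterial_score
        if ratio > min_ratio:
            candidates.append((h.gene, ratio))
    return sorted(candidates, key=lambda c: c[1], reverse=True)


# Made-up scores, for illustration only.
example = [
    GeneHits("geneA", 300.0, 100.0),
    GeneHits("geneB", 50.0, 200.0),
    GeneHits("geneC", 120.0, 0.0),
    GeneHits("geneD", 150.0, 100.0),
]
for gene, ratio in rank_eukaryote_like(example):
    print(gene, ratio)
```

A score ratio is only a first filter. Whether a high ratio reflects horizontal transfer or chance similarity still needs a phylogenetic check.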

Eukaryotic-like pathogen genes

- YopH, a protein-tyrosine phosphatase, of Yersinia pestis

- Enoyl-acyl carrier protein reductase (involved in lipid metabolism) of Chlamydia trachomatis

[Figure: phylogenetic tree (scale bar 0.1) with taxa Aquifex aeolicus, Haemophilus influenzae, Escherichia coli, Anabaena, Synechocystis, Chlamydia trachomatis, Petunia x hybrida, Nicotiana tabacum, Brassica napus, Arabidopsis thaliana and Oryza sativa; node support values 100, 100, 100, 96, 63, 64, 52, 83, 99]

Pathogens

Anthrax                    Necrotizing fasciitis
Cat scratch disease        Paratyphoid/enteric fever
Chancroid                  Peptic ulcers and gastritis
Chlamydia                  Periodontal disease
Cholera                    Plague
Dental caries              Pneumonia
Diarrhea (E. coli etc.)    Salmonellosis
Diphtheria                 Scarlet fever
Epidemic typhus            Shigellosis
Mediterranean fever        Strep throat
Gastroenteritis            Syphilis
Gonorrhea                  Toxic shock syndrome
Legionnaires' disease      Tuberculosis
Leprosy                    Tularemia
Leptospirosis              Typhoid fever
Listeriosis                Urethritis
Lyme disease               Urinary tract infections
Melioidosis                Whooping cough
Meningitis                 Hospital-acquired infections

Pathogens

Chlamydophila psittaci      Respiratory disease, primarily in birds
Mycoplasma mycoides         Contagious bovine pleuropneumonia
Mycoplasma hyopneumoniae    Pneumonia in pigs
Pasteurella haemolytica     Cattle shipping fever
Pasteurella multocida       Cattle septicemia, pig rhinitis
Ralstonia solanacearum      Plant bacterial wilt
Xanthomonas citri           Citrus canker
Xylella fastidiosa          Citrus variegated chlorosis

Bacterial wilt

Informatics/Bioinformatics

• BC Genome Sequence Centre

• Centre for Molecular Medicine and Therapeutics

Evolutionary Theory

• Dept of Zoology

• Dept of Botany

• Canadian Institute for Advanced Research

Pathogen Functions

• Dept. Microbiology

• Biotechnology Laboratory

• Dept. Medicine

• BC Centre for Disease Control

Host Functions

• Dept. Medical Genetics

• C. elegans Reverse Genetics Facility

• Dept. Biological Sciences SFU

Interdisciplinary group

Approach

• Screen for candidate genes. Search pathogen genes against sequence databases. Identify those with eukaryotic similarity/motifs.

• Rank candidates. How much like the host protein? Is information available about the protein?

• Evolutionary significance. Horizontal transfer? Similar by chance?

• Prioritize for biological study. Previously studied biologically? Can UBC microbiologists study it? C. elegans homolog?

• Modify screening method/algorithm.

Bacterium to Eukaryote Horizontal Transfer

[Figure: phylogenetic tree (scale bar 0.1) with taxa Bacillus subtilis, Escherichia coli, Salmonella typhimurium, Staphylococcus aureus, Clostridium perfringens, Clostridium difficile, Trichomonas vaginalis, Haemophilus influenzae, Actinobacillus actinomycetemcomitans and Pasteurella multocida]

N-acetylneuraminate lyase (NanA) of the protozoan Trichomonas vaginalis is 92-95% similar to NanA of Pasteurellaceae bacteria.

N-acetylneuraminate lyase – role in pathogenicity?

Pasteurellaceae

• Mucosal pathogens of the respiratory tract

T. vaginalis

• Mucosal pathogen, causative agent of the STD trichomoniasis

N-acetylneuraminate lyase (sialic acid lyase, NanA)

Involved in sialic acid metabolism

Role in Bacteria: Proposed to parasitize the mucous membranes of animals for nutritional purposes

Role in Trichomonas: ?

• Sialidase: hydrolysis of glycosidic linkages of terminal sialic residues in glycoproteins and glycolipids, releasing free sialic acid

• Transporter: transport of free sialic acid

• NanA: free sialic acid converted to N-acetyl-D-mannosamine + pyruvate

Eukaryote to Bacteria Horizontal Transfer?

[Figure: phylogenetic tree (scale bar 0.1) with taxa rat, human, Escherichia coli, Caenorhabditis elegans, pig roundworm, Methanococcus jannaschii, Methanobacterium thermoautotrophicum, Bacillus subtilis, Streptococcus pyogenes, Aquifex aeolicus, Acinetobacter calcoaceticus, Haemophilus influenzae and Chlorobium vibrioforme]

GMP reductase of E. coli is 81% similar to the corresponding enzyme studied in humans and rats

Role in virulence not yet investigated

Eukaryote to Bacteria Horizontal Transfer?

Ralstonia solanacearum cellulase (endo-1,4-beta-glucanase) is 56% similar to an endoglucanase present in a number of fungi.

Demonstrated virulence factor for plant bacterial wilt

[Figure: phylogenetic tree with taxa Hypocrea jecorina EGLII, Trichoderma viride EGL2, Penicillium janthinellum EGL2, Macrophomina phaseolina EGL2, Cryptococcus flavus CMC1, Ralstonia solanacearum egl, Humicola insolens CMC3, Humicola grisea CMC3, Aspergillus aculeatus CMC2, Aspergillus nidulans EGLA, Macrophomina phaseolina egl1, Aspergillus aculeatus CEL1, Aspergillus niger EGLB and Vibrio species manA]

World Research Community

Functional studies

Prioritized candidates

Study function of similar gene in model host, C. elegans.

Study function of gene.

Investigate role of bacterial gene in disease: Infection study in model host

C. elegans

DATABASE

Contact other groups for possible collaborations.

Pathogenicity Islands

• Virulence genes commonly in clusters

• Associated with
  - tRNA sequences
  - Transposases, integrases and other mobility genes
  - Flanked by repeats

G+C Analysis: Identifying Pathogenicity Islands

Yellow circle = high %G+C

Pink circle = low %G+C

tRNA gene lies between the two dots

rRNA gene lies between the two dots

Both tRNA and rRNA lie between the two dots

Dot is annotated as a transposase

Dot is annotated as an integrase

Neisseria meningitidis serogroup B strain MC58
Mean %G+C: 51.37   STD DEV: 7.57

%G+C   SD  Location            Strand  Product
37.22  -1  1831577..1832527    +       pilin gene inverting
39.95  -1  1834676..1835113    +       VapD-related
51.96      1835110..1835211    -       cryptic plasmid A-related
39.13  -1  1835357..1835701    +       hypothetical
40.00  -1  1836009..1836203    +       hypothetical
42.86  -1  1836558..1836788    +       hypothetical
34.74  -2  1837037..1837249    +       hypothetical
43.96      1837432..1838796    +       conserved hypothetical
40.83  -1  1839157..1839663    +       conserved hypothetical
42.34  -1  1839826..1841079    +       conserved hypothetical
47.99      1841404..1843191    -       put. hemolysin activ. HecB
45.32      1843246..1843704    -       put. toxin-activating
37.14  -1  1843870..1844184    -       hypothetical
31.67  -2  1844196..1844495    -       hypothetical
37.57  -1  1844476..1845489    -       hypothetical
20.38  -2  1845558..1845974    -       hypothetical
45.69      1845978..1853522    -       hemagglutinin/hemolysin-rel.
51.35      1854101..1855066    +       transposase, IS30 family
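The SD column can be reproduced from the genome mean and standard deviation. The Python sketch below is an assumption inferred from the table values (ORFs more than one SD below the mean are marked -1, more than two SD below are marked -2, and the label appears to be capped at 2), not a description of the original software. The function names `gc_percent`, `sd_class` and `low_gc_runs` are hypothetical, as is the `min_len` threshold for reporting a cluster.

```python
def gc_percent(seq):
    """Percent G+C of a nucleotide sequence, ignoring characters other than A, C, G, T."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * gc / acgt


def sd_class(gc, mean, sd, cap=2):
    """Whole standard deviations from the genome mean, truncated toward
    zero and capped at +/- cap (assumed labelling rule)."""
    z = (gc - mean) / sd
    return max(-cap, min(cap, int(z)))


def low_gc_runs(classes, min_len=3):
    """Return (first_index, last_index) for runs of at least min_len
    consecutive ORFs whose class is -1 or lower."""
    runs, start = [], None
    for i, c in enumerate(list(classes) + [0]):  # trailing 0 closes a final run
        if c <= -1 and start is None:
            start = i
        elif c > -1 and start is not None:
            if i - start >= min_len:
                runs.append((start, i - 1))
            start = None
    return runs


# %G+C values from the N. meningitidis MC58 table, in table order.
MEAN, SD = 51.37, 7.57
orf_gc = [37.22, 39.95, 51.96, 39.13, 40.00, 42.86, 34.74, 43.96, 40.83,
          42.34, 47.99, 45.32, 37.14, 31.67, 37.57, 20.38, 45.69, 51.35]
classes = [sd_class(g, MEAN, SD) for g in orf_gc]
print(classes)
print(low_gc_runs(classes))
```

Runs of consecutive low-%G+C ORFs are only candidates. The associated features listed earlier (tRNA genes, transposases, integrases, flanking repeats) would still need to be checked.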

%G+C of ORFs: Analysis of Variance

• %G+C variance is similar within a given species

• Low %G+C variance correlates with an intracellular lifestyle for the bacterium and a clonal nature (P = 0.004)

• Neisseria meningitidis +/- 7%

• Chlamydia species +/- 2%

Intracellular bacteria ecologically isolated?

Future Developments

• Identify eukaryotic motifs and domains in pathogen genes

• Identify further motifs associated with
  - Pathogenicity islands
  - Virulence determinants

• Functional tests for new potential virulence factors

www.pathogenomics.bc.ca

Informatics as a focus

• Outer membrane protein modeling: Focus mutational studies and studies of surface exposed sequences

• Phylogenetic analyses: Focus study of gene mutants under certain environmental conditions

• Other analyses - Regulatory network complexity: Change focus of regulation studies

• Eukaryote:pathogen homologs: Focus identification of “mimics”

• Pathogenicity islands: Focus identification of recently obtained virulence determinants

Acknowledgements

• Pathogenomics group: Ann Rose, Steven Jones, Ivan Wan, Hans Greberg, Yossef Av-Gay, David Baillie, Bob Brunham, Stefanie Butland, Rachel Fernandez, Brett Finlay, Patrick Keeling, Audrey de Koning, Sarah Otto, Francis Ouellette, Peter Wall Institute

• Pseudomonas Genome Project: PathoGenesis Corp. (Ken Stover) and University of Washington (Maynard Olson)

• Outer membrane proteins: Manjeet Bains, Kendy Wong, Canadian Cystic Fibrosis Foundation

• Bob Hancock