comparative genomics final results

Comparative Genomics Final Results Ben Dan Deepak Esha Kelley Pramod Raghav Smruthy Vartika Will

Upload: andres

Post on 24-Feb-2016


Page 1: Comparative Genomics Final Results

Comparative Genomics: Final Results

Ben, Dan, Deepak, Esha, Kelley, Pramod, Raghav, Smruthy, Vartika, Will

Page 2: Comparative Genomics Final Results

Questions to be Addressed

1. Sixteen strains clustered with V. navarrensis type strain LMG15976

• 16S rRNA, pyrH, recA and rpoA
• Four formed a distinct cluster
• V. vulnificus is the closest relative to both lineages of V. navarrensis

“Is it a different species or biotype?”

2. V. navarrensis strains isolated from various sources.
• nav_2423 (VN1): Blood
• nav_2462 (VN2): Surface Wound
• nav_2541 (VN3): Sewage
• nav_2756 (VN4): Water

“Is Vibrio navarrensis pathogenic?”

Page 3: Comparative Genomics Final Results

Genes common/unique to V.vulnificus and V.navarrensis

Page 4: Comparative Genomics Final Results

SPECIATION??

Page 5: Comparative Genomics Final Results

Whole genome super matrix tree

[Figure: phylogenetic tree with tips VN1-VN4 and VV1-VV5 and outgroup Vp1; scale bar 0.02. Node ANI values shown (%): 98.35, 97.13, 95.58, 98.60, 97.76, 82.64, 81.94, 98.88, 98.92.]

Sequences were aligned using Clustal Omega. A concatenated alignment was generated, and a maximum likelihood phylogenetic tree with 100 bootstrap replicates was built using the Jones-Taylor-Thornton model of evolution and an assumption of a constant rate of change. The tree was rooted with Vibrio parahaemolyticus as an outgroup. All nodes had 100% bootstrap support and >0.98 posterior probability support (8 chains for 20,000 generations, sampling every 100th generation). ANI support for each node is shown. ANI values for internal nodes were calculated by taking the average ANI for all pairs of genomes representing the bifurcation.
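The internal-node ANI calculation described in the caption can be sketched as follows. This is a minimal illustration, assuming that "all pairs of genomes representing the bifurcation" means every pair with one genome from each child clade. The function name `node_ani` and the ANI values are hypothetical and are not taken from the slides.

```python
from itertools import product
from statistics import mean


def node_ani(left_tips, right_tips, pairwise_ani):
    """Average ANI over all genome pairs that span one bifurcation.

    left_tips / right_tips: genome names in the two child clades.
    pairwise_ani: dict mapping a (genome, genome) tuple to ANI in percent.
    """
    values = []
    for a, b in product(left_tips, right_tips):
        # The table may store each pair in one orientation only.
        key = (a, b) if (a, b) in pairwise_ani else (b, a)
        values.append(pairwise_ani[key])
    return mean(values)


# Hypothetical pairwise ANI values (percent), for illustration only.
pairwise_ani = {
    ("VN1", "VN2"): 96.0,
    ("VN1", "VN3"): 95.0,
    ("VN2", "VN3"): 99.0,
}

# ANI for the node joining VN1 with the clade (VN2, VN3).
print(node_ani(["VN1"], ["VN2", "VN3"], pairwise_ani))
```

Only pairs that cross the split contribute, so the within-clade value for (VN2, VN3) is ignored when scoring the deeper node.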

Page 6: Comparative Genomics Final Results

16S rRNA Tree

[Figure: 16S rRNA tree with tips VV1-VV5, VN1-VN4, Vibrio_vulnificus_CMCP6, Vibrio_vulnificus_YJ016 and Vibrio_parahaemolyticus; node support values shown: 95, 74, 11, 12; scale axis 0.000 to 0.015.]

• 16S is not informative for differentiating closely related Vibrio species.
• Full-length 16S rRNA sequences were assembled by mapping to the reference.
• Aligned using PyNAST.
• Bootstrapped ML tree was generated using MEGA.
• Rooted using V. parahaemolyticus.

Page 7: Comparative Genomics Final Results

PATHOGENICITY??

Page 8: Comparative Genomics Final Results

Approach I

[Flowchart elements:]
• Annotated Dataset
• Existence of Toxins: Presence / Absence
• Machinery for Incorporation: Yes / No
• Correlation with Pathway (KEGG)
• Connecting the dots
• Outcomes: Pathogenic or Putatively Pathogenic; Potentially Pathogenic; Unlikely Pathogenic

Approach II

[Flowchart elements:]
• Gene Predictions
• Reference Strains: Annotation Files from NCBI
• Different combos of files
• OrthoMCL
• Generation of presence-absence matrix
• Heatmaps in R to view gene profiles
• Test for group significance (ANOSIM test)
• ID genes associated with groups (SIMPER test)

Page 9: Comparative Genomics Final Results

Approach II (contd)

[Flowchart, OrthoMCL workflow:] Gene files → Pre-Processing → Filter Fasta → BlastDb → All v/s All Blast → Blast Parser → Upload parsed data to Database → Find Protein Pairs → Markov Clustering → Cluster of Orthologs
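The step from ortholog clusters to a presence-absence matrix can be sketched as below. This is a minimal sketch, assuming the clusters are in the OrthoMCL groups format, one group per line as `group_id: taxon|gene taxon|gene ...`. The function name and the example group lines are hypothetical, not the data used in the slides.

```python
def presence_absence(group_lines):
    """Turn OrthoMCL-style group lines into a genome-by-cluster 0/1 table.

    Returns (taxa, rows): taxa is a sorted list of genome codes and
    rows maps each group id to a list of 0/1 values in taxa order.
    """
    members_by_group = {}
    taxa = set()
    for line in group_lines:
        line = line.strip()
        if not line:
            continue
        group_id, members = line.split(":", 1)
        # Each member looks like "taxon|gene"; keep only the taxon code.
        present = {m.split("|", 1)[0] for m in members.split()}
        members_by_group[group_id.strip()] = present
        taxa |= present
    taxa = sorted(taxa)
    rows = {
        group: [1 if taxon in present else 0 for taxon in taxa]
        for group, present in members_by_group.items()
    }
    return taxa, rows


# Hypothetical groups file content, for illustration only.
example = [
    "grp1000: VN01|g1 VN02|g7 VV01|g3",
    "grp1001: VV01|g9 VV01|g10",
    "grp1002: VN01|g2 VN02|g4",
]
taxa, rows = presence_absence(example)
print(taxa)
for group, row in rows.items():
    print(group, row)
```

Paralogs within one genome (as in `grp1001`) collapse to a single 1, which matches a presence/absence view rather than a copy-number view.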

Page 10: Comparative Genomics Final Results

Gene Profiles for All Strains

[Figure: dendrogram over a gene presence/absence heatmap; similarity axis 100 to 60. Tips, in order: VN02, VN03, VN04, VN01, VV05, VV04, VV03, VV02, VV01, V. vulnificus CMCP6, V. vulnificus YJ016, V. parahaemolyticus 0P, V. parahaemolyticus 33, V. cholerae 95, V. cholerae 61, V. splendidus 32, V. fischeri 11, V. fischeri 14. Group labels: Pathogens; Non-human pathogens.]

Group Average dendrogram generated from a simple matching resemblance matrix.
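As a small illustration of the resemblance measure named in the caption, the simple matching coefficient counts positions where two presence/absence profiles agree, including shared absences. The sketch below assumes profiles are equal-length lists of 0/1 values and expresses similarity in percent, as on the dendrogram axis. The profiles shown are hypothetical.

```python
def simple_matching(profile_a, profile_b):
    """Simple matching similarity (percent) between two 0/1 profiles.

    Both shared presences (1, 1) and shared absences (0, 0) count as matches.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same length")
    if not profile_a:
        raise ValueError("profiles must not be empty")
    matches = sum(1 for x, y in zip(profile_a, profile_b) if x == y)
    return 100.0 * matches / len(profile_a)


# Hypothetical gene profiles over five ortholog clusters.
strain_1 = [1, 1, 0, 0, 1]
strain_2 = [1, 0, 0, 0, 1]
print(simple_matching(strain_1, strain_2))
```

Group-average linkage then merges, at each step, the two clusters with the highest mean pairwise similarity between their members.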

Page 11: Comparative Genomics Final Results

Gene Profiles for All Strains

[Figure: 2D ordination plot; 2D Stress: 0.09. Legend groups: path, vul, nonpath, nav. Points: VC61, VC95, VP33, VP0P, VVCM, VVYJ, VV01, VV02, VV03, VV04, VV05, VF11, VF14, VS32, VN01, VN02, VN03, VN04.]

NMDS plot generated from a simple matching resemblance matrix. The dendrogram is a bit misleading about the relationship between V. splendidus and V. fischeri.

Page 12: Comparative Genomics Final Results

ANOSIM Statistical Test

ANOSIM is a nonparametric method that tests whether two or more groups of samples are significantly different.

R statistic - A measure of the strength of the difference between two groups. A value closer to +1 signifies more dissimilarity between the groups

Significance Level - tests the significance of the difference. Analogous to p-value.

Groups                            R statistic   Significance Level %
Pathogenic, V. vulnificus         0.487         0.6
Pathogenic, Non-Pathogenic        0.37          6
Pathogenic, V. navarrensis        0.712         1
V. vulnificus, Non-Pathogenic     1             1.8
V. vulnificus, V. navarrensis     1             0.8
Non-Pathogenic, V. navarrensis    1             2.9
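For readers unfamiliar with the test, the R statistic is commonly defined as R = (mean rank of between-group dissimilarities - mean rank of within-group dissimilarities) / (M / 2), where M = n(n - 1) / 2 is the number of sample pairs. The sketch below is a minimal pure-Python version under that definition, with a label-permutation significance level reported in percent. It is not the software used for the table above, and the distance matrix is hypothetical.

```python
import random
from itertools import combinations


def average_ranks(values):
    """Rank values from 1 upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def anosim_r(dist, labels):
    """ANOSIM R from a square dissimilarity matrix and group labels."""
    pairs = list(combinations(range(len(labels)), 2))
    ranks = average_ranks([dist[i][j] for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if labels[i] == labels[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if labels[i] != labels[j]]
    mean_between = sum(between) / len(between)
    mean_within = sum(within) / len(within)
    return (mean_between - mean_within) / (len(pairs) / 2)


def anosim(dist, labels, permutations=999, seed=0):
    """Return (R, significance level in percent) by permuting the labels."""
    rng = random.Random(seed)
    observed = anosim_r(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if anosim_r(dist, shuffled) >= observed:
            hits += 1
    return observed, 100.0 * (hits + 1) / (permutations + 1)


# Hypothetical dissimilarities: two tight groups that are far apart.
labels = ["A", "A", "B", "B"]
dist = [
    [0.0, 0.1, 0.7, 0.8],
    [0.1, 0.0, 0.9, 1.0],
    [0.7, 0.9, 0.0, 0.2],
    [0.8, 1.0, 0.2, 0.0],
]
r_value, significance = anosim(dist, labels)
print(r_value, significance)
```

With very small groups the number of distinct label arrangements is limited, so the smallest attainable significance level is bounded no matter how strong the separation is.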

Page 13: Comparative Genomics Final Results

Genes significantly different between V. vulnificus and V. navarrensis

[Heatmap: gene presence/absence; dendrogram similarity axis 100 to 60. Genomes, in order: VN02, VN03, VN04, VN01, VV05, VV04, VV03, VV02, VV01, V. vulnificus CMCP6, V. vulnificus YJ016, V. parahaemolyticus BB22OP, V. parahaemolyticus RIMD 2210633, V. cholerae O395, V. cholerae O1 biovar El Tor N16961, V. splendidus LGP32, V. fischeri MJ11, V. fischeri ES114.]

Page 14: Comparative Genomics Final Results

Genes significantly different between V. vulnificus and V. navarrensis

Hypotheticals / Conserved hypotheticals

[Heatmap: gene presence/absence across the same 18 genomes as in the gene-profile dendrogram; similarity axis 100 to 60.]

Page 15: Comparative Genomics Final Results

[Heatmap: gene presence/absence across the same 18 genomes as in the gene-profile dendrogram; similarity axis 100 to 60. Panel labels: Missing from V. navarrensis; Unique to V. navarrensis.]

Gene categories labelled on the heatmap:
• Biofilm, Chemotaxis, CPS, Pili, PTS, rtxC, Vibriobactin related, Type 1 & 2 Secretion, Slime biosynthesis, TonB, Toxin related, Antibiotic resistance, Simple sugar uptake, LPS
• N-acetyl transferase, acetyl transferase, glucokinase, Heme Biosynthesis / Iron acquisition, Adhesin, Chemotaxis

Page 16: Comparative Genomics Final Results

Genes significantly enriched in a priori defined “Pathogens” and “Non-pathogens” Groups

A SIMPER test was performed to identify genes that lead to differences between Pathogens (V. cholerae, V. parahaemolyticus, V. vulnificus) and Non-Pathogens (V. fischeri, V. splendidus). Genes were sorted by relative abundance in Pathogens, then by relative abundance in Non-Pathogens. Genomes are arranged based on the clustering pattern identified from the entire gene profile.

[Heatmap: gene presence/absence across the same 18 genomes as in the gene-profile dendrogram; similarity axis 100 to 60.]
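The idea behind a SIMPER-style ranking can be sketched as follows. This is an illustrative approximation only: it assumes the conventional Bray-Curtis decomposition, in which each gene's contribution to the dissimilarity of one between-group pair is |x - y| divided by the pair's total count, averaged over all between-group pairs. The slides do not state the software or settings used, and the profiles below are hypothetical.

```python
from itertools import product


def simper(group_a, group_b, gene_names):
    """Rank genes by average contribution to between-group dissimilarity.

    group_a, group_b: lists of 0/1 profiles (one list per genome).
    Returns (gene, contribution) tuples sorted from largest to smallest.
    """
    totals = [0.0] * len(gene_names)
    n_pairs = 0
    for x, y in product(group_a, group_b):
        denominator = sum(x) + sum(y)
        if denominator == 0:
            continue  # two empty profiles carry no information
        for i, (xi, yi) in enumerate(zip(x, y)):
            totals[i] += abs(xi - yi) / denominator
        n_pairs += 1
    if n_pairs == 0:
        raise ValueError("no informative between-group pairs")
    contributions = [total / n_pairs for total in totals]
    ranked = zip(gene_names, contributions)
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical profiles over three genes.
genes = ["geneA", "geneB", "geneC"]
pathogens = [[1, 1, 0], [1, 1, 0]]
non_pathogens = [[1, 0, 1]]
for gene, contribution in simper(pathogens, non_pathogens, genes):
    print(gene, contribution)
```

Genes present in every genome of both groups (like `geneA` here) contribute nothing, while genes fixed in one group and absent from the other rise to the top of the ranking.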

Page 17: Comparative Genomics Final Results

Genes significantly enriched in a priori defined “Pathogens” and “Non-pathogens” Groups


Transporters, transcription factors, hemolysins, exonucleases, carbohydrate metabolism (enormous gene variation)

[Heatmap: gene presence/absence across the same 18 genomes as in the gene-profile dendrogram; similarity axis 100 to 60.]

Page 18: Comparative Genomics Final Results

Genes significantly enriched in a priori defined “Pathogens” and “Non-pathogens” Groups


[Heatmap: gene presence/absence across the same 18 genomes as in the gene-profile dendrogram; similarity axis 100 to 60.]

Page 19: Comparative Genomics Final Results

A subset of Genes significantly enriched in a priori defined “Pathogens” and “Non-pathogens” Groups

In yellow: Genes related to type 1 secretion, chemotaxis, permeases, proteases, and LPS synthesis (capsular polysaccharides, lipoproteins, exopolysaccharides)

[Heatmap: gene presence/absence across the same 18 genomes as in the gene-profile dendrogram; similarity axis 100 to 60.]

Page 20: Comparative Genomics Final Results

A subset of Genes significantly enriched in a priori defined “Pathogens” and “Non-pathogens” Groups

Mostly hypotheticals (40), response regulators, glutathione synthase, starvation proteins

[Heatmap: gene presence/absence across the same 18 genomes as in the gene-profile dendrogram; similarity axis 100 to 60.]

Page 21: Comparative Genomics Final Results

A subset of Genes significantly enriched in a priori defined “Pathogens” and “Non-pathogens” Groups

Hypotheticals (153), transcription factors (21), urease operon (10), lipoproteins (16), chemotaxis (8), zinc uptake (3), siderophore synthesis & uptake (6, in 2 operons), luciferase operon (3 genes)

[Heatmap: gene presence/absence across the same 18 genomes as in the gene-profile dendrogram; similarity axis 100 to 60.]

Page 22: Comparative Genomics Final Results

Genes significantly different between the Clinical and Environmental Strains of V. navarrensis

Endonucleases (5), channel proteins (2), chemotaxis genes (5), permeases (2), transcriptional regulators (4), dehydratase (4)

Hypotheticals, flagellar proteins

[Heatmap: gene presence/absence across the same 18 genomes as in the gene-profile dendrogram; similarity axis 100 to 60. Annotation on the figure: Drives separation.]

Page 23: Comparative Genomics Final Results

A Subset of Genes significantly different between the Clinical and Environmental Strains of V. navarrensis

Gene labels on the heatmap: ATP-dependent endonuclease, chemotaxis, endonucleases, channel proteins, phage tail collar domain, flagellin, transcriptional regulators, dehydratase

[Heatmap: gene presence/absence across the same 18 genomes as in the gene-profile dendrogram; similarity axis 100 to 60.]

Page 24: Comparative Genomics Final Results

Previously Discussed Virulence Factors

Virulence Factor: Description

RTX Toxin: The rtxA gene encodes the RTX toxin, which is associated with septicemia and gastroenteritis.

Hemolysins: Exotoxins that lyse erythrocyte membranes by formation of pores, with the liberation of iron-binding proteins (transferrin, lactoferrin and hemoglobin). Four defined classes of hemolysins: TDH, TLH, δ-VPH, hlyA. Experimental evidence suggests hemolysins are involved in disease pathogenesis.

Siderophores: Low molecular weight compounds that have high affinity for iron. Studies show the association of siderophores with virulence in Vibrios.

Attachment Factors: Toxin Co-regulated Pilus (TCP) and Type IV pilus.

Secretion Systems: CTX is associated with Type II; RTX is associated with Type I.

Page 25: Comparative Genomics Final Results

Capsular Polysaccharides

•The most important virulence factor for V. vulnificus is its capsular polysaccharide (CPS).

•V. vulnificus is an extracellular pathogen that relies on its CPS to avoid phagocytosis by host defense cells and complement (Linkous and Oliver, 1999; Strom and Paranjpye, 2000).

•Unencapsulated mutants are susceptible to bactericidal activity in human serum (Shinoda et al., 1987).

•Presence of capsule is related to the colony morphology (Yoshida et al., 1985; Wright et al., 1999).

Page 26: Comparative Genomics Final Results

[Table: Class | Function | Gene, with one presence/absence column per strain (VV1-VV5, VN1-VN4). The per-strain marks were graphical and are not recoverable here.]

Capsular Polysaccharide | Involved in subunit transport and flanked by direct repeat DNA sequence | wzb, wzc
Capsular Polysaccharide | Capsular polysaccharide biosynthesis; LPS Biosynthesis; Capsular polysaccharide biosynthesis system |
Serum resistance genes | Serum resistance | trkA

Page 27: Comparative Genomics Final Results

Selected Hemolysins

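[Table: Class | Function | Gene, with one presence/absence column per strain (VV1-VV5, VN1-VN4). The per-strain marks were graphical and are not recoverable here.]

Hemolysins | HlyA (El Tor haemolysin) family | vvhA, vvhB
Hemolysins | Similar to hemolysin III of B. cereus | hlyIII
Hemolysins | | vllY
Virulence gene regulation | | hlyU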

Page 28: Comparative Genomics Final Results

Iron Acquisition

•Vibrio vulnificus pathogenicity - increased iron in the host results in increased susceptibility to infection (Weinberg 2000).

•As with other invasive bacterial pathogens, iron-scavenging siderophores and proteins that bind host iron-containing proteins were identified in V. vulnificus.

•A couple of studies indicated that the protease produced by V. vulnificus could be involved in acquisition of iron from heme proteins (Nishina et al., 1992; Okujo et al., 1996).

•Litwin and Calderwood (1993) cloned the V. vulnificus fur gene, which encodes the central regulator in iron metabolism in many bacteria.

•The essential role for vulnibactin in virulence was confirmed by Litwin et al. (1996). A V. vulnificus mutant in vuuA, the ferric vulnibactin receptor, could not use vulnibactin and showed decreased virulence in mice.

Page 29: Comparative Genomics Final Results

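[Table: Class | Function | Gene, with one presence/absence column per strain (VV1-VV5, VN1-VN4). The per-strain marks were graphical and are not recoverable here.]

Iron acquisition | Central regulator in iron metabolism | fur
Iron acquisition | Ferric vulnibactin receptor | vuuA
Iron acquisition | Vulnibactin utilization protein | viuB
Iron acquisition | Siderophore synthase |
Iron acquisition | Vulnibactin synthase |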

Page 30: Comparative Genomics Final Results

Flagella and Motility

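[Table: Class | Function | Gene, with one presence/absence column per strain (VV1-VV5, VN1-VN4). The per-strain marks were graphical and are not recoverable here.]

Flagella and Motility | Encodes the flagellar basal body | flgC
Flagella and Motility | Encodes flagellar hook protein | flgE
Flagella and Motility | Involved in flagellar biosynthesis | fliP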

Page 31: Comparative Genomics Final Results

The mystery behind RTX toxin

The following are the hits from the annotation for rtx:

• RTX toxin: Toxin metabolic process; cytolysis
• RTX protein: iron regulated protein

When we ran BLAST on these proteins against NCBI, we found the following hits:

• M6 family metalloprotease domain protein

• Iron regulated protein frpC

Page 32: Comparative Genomics Final Results

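RTX machinery

[Table: Class | Function | Gene, with one presence/absence column per strain (VV1-VV5, VN1-VN4). The per-strain marks were graphical and are not recoverable here.]

RTX | Toxin | rtxA
RTX | ATP-binding cassette transporter for rtxA | rtxB
RTX | Essential acylase of rtxA | rtxC
RTX | Unknown function in transport | rtxD
Type 1 Secretion System | Outer membrane protein | tolC
Type 1 Secretion System | ABC transporter | hlyB
Type 1 Secretion System | Membrane fusion protein | hlyD
Type IV Pilus | Adherence | (Present)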

Page 33: Comparative Genomics Final Results

Some other interesting factors!

[Table: Function | Gene, with one presence/absence column per strain (VV1-VV5, VN1-VN4). The per-strain marks were graphical and are not recoverable here.]

Heme receptor | hupA
DNA binding transcriptional regulator | hupB
Metalloprotease | vvpE
Hypothetical protein | vvp15, vvp22, vvp28
Adherence to human epithelial cells | pilD
Relating to loss in cytotoxic activity | purH, pyrH
Relating to decreased expression of hemolysins | toxR, toxS
Autoinducer II production | luxS

Page 34: Comparative Genomics Final Results

Conclusions

1. V. navarrensis is unlikely to be a pathogen to healthy human individuals.
• Absence of toxins
• Absence of CPS
• Presence of hemolysins similar to V. vulnificus

2. Very different profile from the compared Vibrios.

3. Vibrio navarrensis is not similar to the non-human pathogenic Vibrios.

4. Blood and environmental strains of V. navarrensis are very similar.
• Differences: LPS synthesis, Type-I secretion system, Permeases.

5. We still believe that these strains share a similar niche in the environment.

6. Vibrios are difficult to study owing to their metabolic versatility and wide range of animal hosts.

Page 35: Comparative Genomics Final Results

Questions?