Web applications for rapid microbial taxonomy identification

Ole Lund, Center for Genomic Epidemiology

Post on 21-Jan-2018

Whole genome sequence based Diagnostics

Infectious diseases are responsible for >25% of all global deaths

An increasing number of infectious diseases have a global epidemiology (e.g. SARS, avian flu, influenza, Salmonella etc.).

Rapid detection, identification and exchange of comparable information between public health laboratories globally are crucial to avoid or control global and local spread.

Routine microbial diagnostic

Sample → Culturing (1-2 days) → ID (1-2 days) → Antibiotic resistance (1-2 days) → Typing (1 – several weeks)

Whole genome sequence based diagnostic

Sample → Culturing (1-2 days) → ID, Resistance, Typing + much more (½-1 day)

Bacterial genomics

• Sequencing a bacterial genome costs ~$100 (on a desktop sequencer)

• Equipment will cost less than $100 000

• In Denmark 1 million clinical microbiology isolates are handled each year

– EU/USA ~100 million

– Globally ~ 1 billion (10 billion needed)

• Future limiting factor will not be sequencing but handling the sequences

K-mer based method works well for species identification

Benchmarking of methods for genomic taxonomy. Larsen MV, Cosentino S, Lukjancenko O, Saputra D, Rasmussen S, Hasman H, Sicheritz-Pontén T, Aarestrup FM, Ussery DW, Lund O. J Clin Microbiol. 2014 May;52(5):1529-39.
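As a minimal sketch of the k-mer idea (not the benchmarked implementation), the Python below scores a query sequence against reference genomes by the number of k-mers they share and reports the best-scoring reference. The function names, the toy sequences and k = 4 are assumptions for illustration only; real tools use longer k-mers and large reference databases.

```python
def kmers(seq, k):
    """Return the set of all overlapping k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def best_match(query, references, k=4):
    """Return (name, shared_count) for the reference sharing most k-mers with query."""
    query_kmers = kmers(query, k)
    scores = {
        name: len(query_kmers & kmers(ref, k))
        for name, ref in references.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy reference "genomes", invented for illustration.
references = {
    "species_A": "ACGTACGTGGCCTTAA",
    "species_B": "TTGACCATGCATGCAA",
}

print(best_match("ACGTACGTGG", references))  # ('species_A', 6)
```

Counting shared k-mers needs no alignment, which may be part of why the approach scales to raw or draft assemblies.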

K-mers: Not a new idea

Multilocus Sequence Typing of Total Genome Sequenced Bacteria. Larsen MV, Cosentino S, Rasmussen S, Friis C, Hasman H, Marvig RL, Jelsbak L, Pontén TS, Ussery DW, Aarestrup FM, Lund O. J Clin Microbiol. 2012 Apr;50(4):1355-61.

MLST typing
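MLST assigns a sequence type (ST) from the combination of allele numbers found at a fixed set of loci, commonly seven housekeeping genes. The sketch below shows only the final lookup step, assuming the alleles have already been identified in the genome; the locus names, allele numbers and profile table are hypothetical.

```python
def sequence_type(allele_calls, profiles, loci):
    """Look up the sequence type for a dict of locus -> allele number.

    Returns None when the allele combination is not in the profile table.
    """
    key = tuple(allele_calls.get(locus) for locus in loci)
    return profiles.get(key)


# Hypothetical scheme with three loci and two known profiles.
LOCI = ("locusA", "locusB", "locusC")
PROFILES = {
    (1, 1, 1): "ST1",
    (1, 2, 1): "ST2",
}

print(sequence_type({"locusA": 1, "locusB": 2, "locusC": 1}, PROFILES, LOCI))  # ST2
print(sequence_type({"locusA": 3, "locusB": 2, "locusC": 1}, PROFILES, LOCI))  # None
```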

Genotyping using whole-genome sequencing is a realistic alternative to surveillance based on phenotypic antimicrobial susceptibility testing. Zankari E, Hasman H, Kaas RS, Seyfarth AM, Agersø Y, Lund O, Larsen MV, Aarestrup FM. J Antimicrob Chemother. 2013 68:771-7.

Identification of acquired antimicrobial resistance genes. Zankari E, Hasman H, Cosentino S, Vestergaard M, Rasmussen S, Lund O, Aarestrup FM, Larsen MV. J Antimicrob Chemother. 2012 67:2640-4.

Antimicrobial resistance

Phenotyping by machine learning

• Earlier methods are all based on alignment to a database of genes with known (pheno-) types.

• Andreatta et al. took a radically different approach and sorted genomes of Gamma-Proteobacteria into pathogenic or non-pathogenic, and looked for gene families that were statistically associated with either pathogenic or non-pathogenic bacteria (Andreatta et al. 2010).

• First example of using machine learning techniques to determine the phenotype from WGS.

• Extended to work for all species of bacteria and using raw sequencing data as input (Cosentino et al. 2013).
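To illustrate what "statistically associated" can mean for a single gene family, the sketch below computes a one-sided Fisher's exact test on a 2x2 table of genome counts. The choice of test, the function name and the counts are assumptions made here for illustration; the cited papers may use a different statistic.

```python
from math import comb


def fisher_enrichment_p(path_with, path_total, nonpath_with, nonpath_total):
    """One-sided Fisher's exact test p-value.

    Probability, under no association, of seeing at least `path_with`
    pathogenic genomes carrying the gene family, given the table margins.
    """
    total = path_total + nonpath_total
    carriers = path_with + nonpath_with
    denom = comb(total, path_total)
    upper = min(carriers, path_total)
    p = 0.0
    for x in range(path_with, upper + 1):
        # Hypergeometric probability of exactly x carriers among pathogens.
        p += comb(carriers, x) * comb(total - carriers, path_total - x) / denom
    return p


# Invented counts: the family is present in 9 of 10 pathogenic genomes
# and in 1 of 10 non-pathogenic genomes.
print(fisher_enrichment_p(9, 10, 1, 10))  # about 5.5e-4
```

A small p-value flags the family as enriched among pathogens; testing many families at once would additionally call for a multiple-testing correction.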

User Statistics

Until now: 280,000 submissions

To be added: upload to public repositories (SRA/ENA)

102,000 sequences analyzed in 16,000 submissions

Phylogeny of the isolates by the NDtree method.

Joensen K G et al. J. Clin. Microbiol. 2014;52:1501-1510

[Figure: three phylogenetic trees (scale bars 3.0, 0.5 and 0.08) of replicate sequencing runs of Salmonella Typhimurium DT104 isolates 0508R6701, 0508R6707, 0508R6762 and NCTC 13348 on IonTorrent, MiSeq and HiSeq; one HiSeq run is labeled 0507R6701. Panel labels: SNPtree, CSIPhylogeny, NDtree; close reference, remote reference.]

PLoS One. 2014 Aug 11;9(8):e104984.

Controlled evolution

Johanne Ahrenfeldt, Submitted

Phylogenetic tree using neighbor joining

Johanne Ahrenfeldt, Submitted

Outbreak analysis of billions of strains: Real-time tracking of all microbial genomes

• OX values

• O10

– Number of earlier isolates (from within the last year) with less than 10 SNP differences to the current isolate

• Values do not need to be updated

• Mapped genomes may be stored as binary files

• Search can/should be restricted to isolates that cluster to the same template
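The O10 definition above can be sketched as a simple count. This is an illustration under stated assumptions, not the production method: isolates are represented as hypothetical dicts holding a template name, a sampling date and an equal-length mapped sequence, and every differing position counts as one SNP (no handling of gaps or ambiguous bases).

```python
from datetime import date, timedelta


def snp_distance(a, b):
    """Count positions where two equal-length mapped sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def ox_value(current, earlier, threshold=10, window_days=365):
    """Number of earlier isolates from within the window with fewer than
    `threshold` SNP differences to the current isolate (same template only)."""
    cutoff = current["date"] - timedelta(days=window_days)
    return sum(
        1
        for iso in earlier
        if iso["template"] == current["template"]
        and cutoff <= iso["date"] < current["date"]
        and snp_distance(iso["seq"], current["seq"]) < threshold
    )


# Toy isolates, invented for illustration.
current = {"template": "T1", "date": date(2015, 6, 1), "seq": "A" * 20}
earlier = [
    {"template": "T1", "date": date(2015, 1, 1), "seq": "A" * 18 + "CC"},    # 2 SNPs: counted
    {"template": "T1", "date": date(2015, 2, 1), "seq": "C" * 12 + "A" * 8},  # 12 SNPs: too distant
    {"template": "T1", "date": date(2013, 1, 1), "seq": "A" * 20},            # outside the window
    {"template": "T2", "date": date(2015, 3, 1), "seq": "A" * 20},            # other template: skipped
]

print(ox_value(current, earlier))  # 1
```

Because only earlier isolates enter the count, the value is fixed once an isolate has been processed, and the template filter keeps the comparison set small.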

Evergreen Trees

• User-submitted samples are compared against all close-matching sample clusters

• Ever-growing trees are built from the clusters

• Users can see all previous samples to which their sample is closely related

Global Data Exchange

Global repositories*

* Providers you would bet your life on to provide high-bandwidth programmatic access to deposition/retrieval forever: SRA/ENA/??

[Diagram labels: Hospital, Food safety agency, National CDC, Animal health; Sequence + metadata; Analysis www servers]

Metagenomic based diagnostic

Sample → ID, Resistance, Type + Everything (½-1 day)

Metagenomic based diagnostic with non-batch-mode sequencing (nanopore technologies)

Sample → ID, Resistance, Type + Everything (minutes)

Rapid whole genome sequencing for the detection and characterization of microorganisms directly from clinical samples. Hasman H, Saputra D, Sicheritz-Ponten T, Lund O, Svendsen CA, Frimodt-Møller N, Aarestrup FM. J Clin Microbiol. 2014 Jan;52(1):139-46.

It is not important to know where you are but where you are not

• Analysis of the absence/presence of specific strains/species may be more important for diagnosing infectious diseases than the general composition of the microbiome normally associated with metagenomics

Whole genome sequencing

• Is it a game changer in the combat against infectious diseases?

• Game changer? - what is new with WGS?

– Typing with ultimate resolution (bar epigenetics?)

• Resolution = 1/mutation rate = 1 year

– Can (soon) be done in a day

– Instant deep phenotyping (e.g. resistance/virulence genes)

– With falling prices surveillance may be ubiquitous

• Everything is under constant surveillance

– People, animals, planes, places, door handles …

– Information can be shared instantly around the globe
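The relation "Resolution = 1/mutation rate" can be written out as follows. This is a sketch assuming a constant rate of mutations per genome per year (a molecular clock); the symbols \mu, t and d are introduced here for illustration, and the slide's value of 1 year corresponds to roughly one mutation per genome per year.

```latex
\[
  \text{resolution} = \frac{1}{\mu}, \qquad
  \mu \approx 1\ \text{mutation per genome per year}
  \;\Rightarrow\; \text{resolution} \approx 1\ \text{year}
\]
\[
  \mathbb{E}[d] = 2\mu t
\]
```

Here d is the number of differences between two isolates that diverged from a common ancestor t years ago; the factor 2 arises because both lineages accumulate mutations.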

Transmission does not have to be zero

• But R0:

– The number of secondary infections that a case gives rise to on average

• Has to be below 1
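Why the threshold sits at 1 can be seen from a simple expected-value calculation: if every case causes R0 new cases on average, generation g contains R0^g expected cases, and the total over all generations is the geometric series 1/(1 - R0), which is finite only when R0 < 1. The sketch below assumes this idealized constant-R0 model; the function names are chosen here for illustration.

```python
def expected_cases_per_generation(r0, generations):
    """Expected number of cases in generations 0..generations from one index case."""
    return [r0 ** g for g in range(generations + 1)]


def expected_outbreak_size(r0):
    """Expected total number of cases from one index case (infinite if r0 >= 1)."""
    if r0 >= 1:
        return float("inf")
    return 1.0 / (1.0 - r0)


print(expected_cases_per_generation(0.5, 3))  # [1.0, 0.5, 0.25, 0.125]
print(expected_outbreak_size(0.5))            # 2.0
print(expected_outbreak_size(1.5))            # inf
```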

Game changer?

• Can WGS + IT be used to set R0 to less than 1 for some pathogens in some areas?

• Which are the best cases?

Thanks

DTU Systems Biology/CBS/Lund group

Mette Voldby Larsen

Martin Thomsen

Johanne Ahrenfeldt

Vanessa Jurtz

Jose L. Bellod Cisneros

Anna Maria Malberg Tetzschner

Ex members

Salvatore Cosentino

Student helpers

Jamie Neubert Pedersen

Valentin Ibanez

Rosa Allesøe

Camilla Lemvigh

DTU Systems Biology/CBS

Dave Ussery

Thomas Ponten

Dhany Saputra

Simon Rasmussen

Thomas Nordahl Petersen

DTU DMAC

Laurent Gautier

Marlene Dalgaard

DTU Food

Frank Aarestrup

Henrik Hasman

Rene S. Hendriksen

Shinny Leekitcharoenphon

Rolf Sommer Kaas

Marlene Hansen

Katrine Grimstrup Joensen

Oksana Lukjancenko

Copenhagen University/CMP

Thor Theander

Michael Alifrangis

Sidsel Nag

KCMC Moshi, Tanzania

Gibson Kibiki

Happiness Kumburu

Tolbert Sonda