Whole genome sequencing (WGS) as a clinical tool

Post on 10-Dec-2018


Whole genome sequencing (WGS) - there’s a new tool in town

Henrik Hasman DTU - Food

Welcome to the NGS world

TODAY

• Welcome
• Introduction to Next Generation Sequencing
• DNA purification (Hands-on)
• Lunch (Sandwiches – Hands-on)
• DNA quantification for NGS (Hands-on)
• Coffee … and … Library preparation (“at the movies”)
• Running the MiSeq (Show’n tell)
• Computer exercises (Hands-on)
• Goodbye

EU RL workshop on WGS: Advanced NGS on antimicrobial resistant bacteria

This one-day training course consists of hands-on and theoretical teaching focusing on NGS. It aims to introduce relevant tools so that attendees can prepare for the challenges of genomic techniques. You will learn how to prepare genomic DNA from a bacterial culture for DNA sequencing and receive a theoretical introduction to NGS, including library preparation and running the MiSeq DNA sequencer. Finally, you will perform an exercise on the analysis of NGS data in relation to species identification, antimicrobial resistance gene detection and plasmid typing.

DNA sequencing

Applied Biosystems (ABI) Genetic analyser: “First Generation” sequencing machine (capillary Sanger sequencing)

Second generation sequencing

• Illumina HiSeq/GAII systems: high throughput systems
• 454 Life Sciences (Roche): first Next Generation Sequencing machine
• Illumina MiSeq system: medium throughput system
• Ion Torrent PGM system: low/medium throughput system

Second generation sequencing machines

Workflow today at the clinical laboratory

Workflow with WGS at the clinical laboratory

Didelot et al, 2012.

Rough assembly and compression

Raw DNA sequences

Gene finding Comparison

Identification

Fine assembly

What is already known?
• Pathogenicity islands
• Virulence genes
• Resistance genes
• MLST type

Google maps like view

• Reports Outbreaks

Summary of:
• What is it?
• Has it been seen before?
• How can we fight/treat it?
• What is new/unusual?

Server side

Client side

What is novel?
• Vaccine targets
• Virulence genes
• Resistance genes
• SNPs

Tutorial on MiSeq workflow

MiSeq Sequencing Chemistry: ca. 20 min http://support.illumina.com/training/courses/MiSeq_Sequencing_Chemistry/index.html?iframe

DNA purification EasyDNA from Invitrogen

Qubit DNA quantification

http://www.youtube.com/watch?v=6HtnVUHMX_8

Questions?

Then “to the dungeons”

Normal Illumina workflow

Video on Sample preparation

http://www.youtube.com/watch?v=fs1A_Ik7Smo

Simplified protocol

Nextera XT sample prep video

http://www.youtube.com/watch?v=ectVoRJ-6HU

Manual: http://supportres.illumina.com/documents/myillumina/900851dc-01cf-4b70-9e95-d590531c5bd4/nextera_xt_sample_preparation_guide_15031942_c.pdf

Nextera XT tutorial

Nextera DNA Sample Prep. Kit: ca. 20 min http://support.illumina.com/training/courses/Nextera_Sample_Prep_Kits/index.html?iframe

Nextera XT library workflow

Adapters added by PCR

Index (barcode)

Multiplexing with Nextera XT

Library building

Library preparation movies

http://support.illumina.com/training/sequencing_training.ilmn (Login might be required)

How many bacteria in a library?

• 16-18 genomes around 5-6 Mb
  - E. coli
  - Klebsiella
  - Salmonella

• 24 genomes around 3 Mb
  - Enterococcus
  - Staphylococcus
  - Campylobacter
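The sample counts above follow from simple arithmetic: the bases produced by one run are shared between all indexed genomes in the library. The sketch below is only an illustration. The run yield (5 Gb), the target depth (50x) and the cap of 24 indexes are assumed values that are not stated on the slide, and `genomes_per_run` is a hypothetical helper name.

```python
def genomes_per_run(run_yield_bp, genome_size_bp, target_depth, max_indexes):
    """Number of genomes that fit in one run at the requested sequencing depth."""
    bases_per_genome = genome_size_bp * target_depth
    limited_by_yield = run_yield_bp // bases_per_genome
    # The number of available index (barcode) combinations is a second limit.
    return min(limited_by_yield, max_indexes)


# Assumed, illustrative values (not taken from the slides).
RUN_YIELD_BP = 5_000_000_000
TARGET_DEPTH = 50
MAX_INDEXES = 24

large_genomes = genomes_per_run(RUN_YIELD_BP, 5_500_000, TARGET_DEPTH, MAX_INDEXES)
small_genomes = genomes_per_run(RUN_YIELD_BP, 3_000_000, TARGET_DEPTH, MAX_INDEXES)
print(large_genomes, small_genomes)  # 18 24
```

With these assumed inputs, genomes of about 5.5 Mb are limited by yield, while 3 Mb genomes are limited by the number of indexes.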

The MiSeq principle

http://www.youtube.com/watch?v=l99aKKHcxC4

NGS output

Huge numbers of small fragments (35-500 bp)

Reference vs. de novo assembly

Known genome

Reference assembly

De novo assembly Smaller fragments (Unknown order)
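A toy sketch of the two strategies, assuming error-free reads and exact matching (real assemblers and mappers are far more elaborate, and every name below is hypothetical). Reference assembly places each read on a known genome; de novo assembly has no genome to consult and must infer the order from overlaps between the fragments.

```python
def reference_assembly(reads, reference):
    """Place each read at its first exact match on the known genome."""
    return {read: reference.find(read) for read in reads if read in reference}


def overlap(left, right, min_overlap):
    """Length of the longest suffix of `left` that is a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def de_novo_assembly(reads, min_overlap=3):
    """Greedily merge the two fragments with the longest overlap until none is left."""
    contigs = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    while True:
        best_size, best_i, best_j = 0, None, None
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i != j:
                    size = overlap(left, right, min_overlap)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Made-up example data.
reference = "ATGGCGTACGTTAGC"
reads = ["ACGTTAGC", "ATGGCGT", "GCGTACG"]  # unknown order

placements = reference_assembly(reads, reference)
contigs = de_novo_assembly(reads)
print(placements)
print(contigs)
```

In this small example both routes recover the same sequence. With repeats or sequencing errors the greedy de novo approach can produce several contigs or a wrong join, which is why it is only a sketch.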

Reference vs. de novo assembly

Data analysis of WGS data

Assembled data
• Species confirmation
• Resistance genes
• MLST
• Virulence genes
• Plasmids
• Epidemiological markers

NGS: Illumina, PacBio, 454..

Assembly pipeline

MLST: Allele 1, Allele 2, Allele 3, Allele 4, Allele 5 } ST

Resistance genes: resistance gene profile; list of genes (100% or >95%); theoretical resistance phenotype

Virulence genes

Species ID

Outbreak strain: SNP* based typing

*SNP – Single Nucleotide Polymorphism (extreme MLST)
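The MLST part of the diagram (a set of allele numbers that together define an ST) amounts to a table lookup. A minimal sketch with a hypothetical five-locus scheme follows. The locus names, allele numbers and ST labels are all invented for illustration and do not come from any real MLST database.

```python
# Hypothetical scheme: five loci, invented allele numbers and ST labels.
LOCI = ("locus1", "locus2", "locus3", "locus4", "locus5")

ST_TABLE = {
    (1, 1, 1, 1, 1): "ST1",
    (1, 2, 1, 1, 3): "ST2",
    (4, 2, 2, 1, 3): "ST3",
}


def sequence_type(alleles):
    """Look up the ST for a dict mapping each locus to its allele number."""
    profile = tuple(alleles[locus] for locus in LOCI)
    return ST_TABLE.get(profile, "unknown ST")


isolate = {"locus1": 1, "locus2": 2, "locus3": 1, "locus4": 1, "locus5": 3}
print(sequence_type(isolate))  # ST2
```

A single changed allele gives a different profile and therefore a different (or unknown) ST.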

1G bases 2-6 Mb

AAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAA
AAAATAAAAAAAAAAA
AATAAATAATAATAAA
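SNP based typing compares isolates position by position. As a small sketch, the four 16-base example sequences from the slide can be compared pairwise by counting differing positions. The labels `seq1` to `seq4` and the function names are hypothetical, and the sequences are assumed to be already aligned.

```python
from itertools import combinations

# The four example sequences from the slide, labelled for convenience.
SEQUENCES = {
    "seq1": "AAAAAAAAAAAAAAAA",
    "seq2": "AAAAAAAAAAAAAAAA",
    "seq3": "AAAATAAAAAAAAAAA",
    "seq4": "AATAAATAATAATAAA",
}


def snp_distance(a, b):
    """Count positions that differ between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def distance_table(sequences):
    """Pairwise SNP distances for every pair of labelled sequences."""
    return {
        (name_a, name_b): snp_distance(sequences[name_a], sequences[name_b])
        for name_a, name_b in combinations(sequences, 2)
    }


for pair, distance in distance_table(SEQUENCES).items():
    print(pair, distance)
```

Identical sequences have distance 0, which is the signal one would look for when asking whether two isolates could belong to the same outbreak.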

Data analysis of WGS data

Assembled data

http://cge.cbs.dtu.dk/services/all.php

Species confirmation

KmerFinder
Description: Breaks the genome into small 16-mers (k=16) and scans a DB of complete genomes for the best match.
Output: A sorted list of the number of best-matching 16-mers in a given complete genome (hits in sequence).
http://cge.cbs.dtu.dk/services/KmerFinder/

SpeciesFinder
Description: Identifies 16S in the genome and compares it to a database of 16S sequences.
Output: The best hit. A “TRUE” value means a perfect hit, a “FAIL” value means a close match.
http://cge.cbs.dtu.dk/services/SpeciesFinder/
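The k-mer idea can be sketched in a few lines: cut the query into k-mers, count how many of them occur in each reference genome, and sort the references by that count. This is a simplified illustration and not the actual KmerFinder scoring. The reference names and sequences are invented, and k is set to 4 instead of 16 so that the toy sequences stay short.

```python
from collections import Counter


def kmers(sequence, k):
    """Set of all overlapping substrings of length k."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def rank_references(query, references, k=16):
    """Sort reference genomes by the number of k-mers shared with the query."""
    query_kmers = kmers(query, k)
    hits = Counter()
    for name, genome in references.items():
        hits[name] = len(query_kmers & kmers(genome, k))
    return hits.most_common()


# Invented toy data; real genomes are millions of bases long.
references = {
    "refA": "ATGGCGTACGTTAGC",
    "refB": "TTTTCCCCGGGGAAAA",
}
ranking = rank_references("GGCGTACGTT", references, k=4)
print(ranking)  # [('refA', 7), ('refB', 0)]
```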

Example: KmerFinder

Example: ResFinder

VTEC O104:H4 outbreak strain

For publication in: Journal of Antimicrobial Chemotherapy

Genotyping using whole-genome sequencing is a realistic alternative to surveillance based on phenotypic antimicrobial susceptibility testing

Ea Zankari, Henrik Hasman, Rolf Sommer Kaas, Anne Mette Seyfarth, Yvonne Agersø, Ole Lund, Mette Voldby Larsen, Frank M. Aarestrup

• 200 isolates – reduced to 197 (Salmonella, E. coli, E. faecium, E. faecalis)
• 3,051 individual susceptibility tests

Table 2. Overview of resistance genes detected in the isolates by ResFinder with an ID ≥ 98.0%

Values are No. of isolates (%)*

Class          Resistance gene    S. Typhimurium   E. coli     E. faecalis   E. faecium
                                  (n = 49)         (n = 48)    (n = 50)      (n = 50)
Aminoglycoside str                0 (0)            0 (0)       5 (10.0)      1 (2.0)
               ant(6)-Ia          0 (0)            0 (0)       18 (36.0)     18 (36.0)
               ant(6')-Ii         0 (0)            0 (0)       0 (0)         49 (98.0)
               aph(3')-Ia         2 (4.1)          2 (4.2)     0 (0)         0 (0)
               aph(3')-Ic         2 (4.1)          0 (0)       0 (0)         0 (0)
               aph(3')-III        0 (0)            0 (0)       17 (34.0)     10 (20.0)
               aac(6')-aph(2'')   0 (0)            0 (0)       10 (20.0)     0 (0)
               strA/strB          19 (38.8)        10 (20.8)   0 (0)         0 (0)
               aadA1              5 (10.2)         19 (39.6)   0 (0)         0 (0)
               aadA2              2 (4.1)          4 (8.3)     0 (0)         0 (0)
               aadA4              0 (0)            1 (2.1)     0 (0)         0 (0)
               aadA13             1 (2.0)          0 (0)       0 (0)         0 (0)
Beta-lactam    pbp5               0 (0)            0 (0)       0 (0)         49 (98.0)
               blaTEM-1           21 (42.9)        12 (25.0)   0 (0)         0 (0)
               blaTEM-117         0 (0)            1 (2.1)     0 (0)         0 (0)
               blaCTX-M-14        0 (0)            1 (2.1)     0 (0)         0 (0)
               blaCARB-2          2 (4.1)          0 (0)       0 (0)         0 (0)
MLS            erm(B)             0 (0)            1 (2.1)     25 (50.0)     15 (30.0)
               lsa(A)             0 (0)            0 (0)       50 (100.0)    0 (0)
               lnu(B)             0 (0)            0 (0)       11 (22.0)     15 (30.0)
               msr(C)             0 (0)            0 (0)       0 (0)         44 (88.0)
               mph(A)             0 (0)            1 (2.1)     0 (0)         0 (0)
Phenicol       catA1              0 (0)            2 (4.2)     0 (0)         0 (0)
               floR               2 (4.1)          0 (0)       0 (0)         0 (0)
               cmlA1              0 (0)            3 (6.3)     0 (0)         0 (0)
               cat(pC194)         0 (0)            0 (0)       0 (0)         1 (2.0)
Sulphonamide   sul1               9 (18.4)         8 (16.7)    0 (0)         0 (0)
               sul2               20 (40.8)        7 (14.6)    0 (0)         0 (0)
               sul3               0 (0)            3 (6.3)     0 (0)         0 (0)
Tetracycline   tet(A)             1 (2.0)          11 (22.9)   0 (0)         0 (0)
               tet(B)             19 (38.8)        4 (8.3)     0 (0)         0 (0)
               tet(G)             2 (4.1)          0 (0)       0 (0)         0 (0)
               tet(M)             0 (0)            0 (0)       34 (68.0)     27 (54.0)
               tet(L)             0 (0)            0 (0)       24 (48.0)     5 (10.0)
               tet(S)             0 (0)            0 (0)       0 (0)         1 (2.0)
               tet(O)             0 (0)            0 (0)       0 (0)         1 (2.0)
Trimethoprim   dfrA1              1 (2.0)          9 (18.8)    0 (0)         0 (0)
               dfrA12             0 (0)            2 (4.2)     0 (0)         0 (0)
               dfrA14             1 (2.0)          0 (0)       0 (0)         0 (0)
               dfrA21             0 (0)            1 (2.1)     0 (0)         0 (0)
               dfrD               0 (0)            0 (0)       0 (0)         1 (2.0)
               dfrG               0 (0)            0 (0)       17 (34.0)     0 (0)
Glycopeptide   Van-A              0 (0)            0 (0)       0 (0)         1 (2.0)

MLS, Macrolide-Lincosamide-StreptograminB.
*The percentage of isolates carrying a resistance gene was determined by dividing the number of isolates harbouring the gene by the total number of isolates (per species).
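The percentages in Table 2 follow directly from the footnote: isolates harbouring the gene divided by the total number of isolates of that species. A short sketch that reproduces a few of the table values (the function name `percent_carrying` is hypothetical):

```python
def percent_carrying(n_with_gene, n_isolates):
    """Percentage of isolates of one species that harbour a given gene."""
    return round(100 * n_with_gene / n_isolates, 1)


# Counts and species totals taken from Table 2.
print(percent_carrying(21, 49))  # blaTEM-1 in S. Typhimurium -> 42.9
print(percent_carrying(12, 48))  # blaTEM-1 in E. coli -> 25.0
print(percent_carrying(49, 50))  # pbp5 in E. faecium -> 98.0
```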

Gene          blaTEM-1      blaCTX-M-14   blaCARB-2
Salmonella    21 (42.9%)    0 (0%)        1 (2.1%)
E. coli       12 (25.0%)    1 (2.1%)      2 (4.1%)

Initial comparison: 99.2% concordance

                        Phenotypic resistant   Phenotypic susceptible
Predicted resistant     475                    7
Predicted susceptible   16                     2553

After retest: 99.8% concordance

                        Phenotypic resistant   Phenotypic susceptible
Predicted resistant     475                    7
Predicted susceptible   0                      2569

Spectinomycin in E. coli
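The two concordance figures can be checked from the 2 x 2 counts: concordance is the share of tests where the WGS prediction and the phenotypic result agree. A minimal sketch (the function and argument names are hypothetical):

```python
def concordance(pred_r_pheno_r, pred_r_pheno_s, pred_s_pheno_r, pred_s_pheno_s):
    """Percentage of tests where prediction and phenotype agree."""
    agree = pred_r_pheno_r + pred_s_pheno_s
    total = agree + pred_r_pheno_s + pred_s_pheno_r
    return 100 * agree / total


# Counts from the slide: 16 discordant susceptible predictions, then 0.
before_retest = concordance(475, 7, 16, 2553)
after_retest = concordance(475, 7, 0, 2569)
print(round(before_retest, 1), round(after_retest, 1))  # 99.2 99.8
```

Both tables sum to 3,051, the number of individual susceptibility tests reported for the study.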

Example: VirulenceFinder

Plasmid markers

• Gram negative plasmids
• Gram positive plasmids

Threshold for %ID: 100%, 98%, 95%, 90%, 85%

Type of reads:
• Assembled genome/contigs
• 454 – single end reads
• 454 – paired end reads
• Illumina – single end reads
• Illumina – paired end reads
• Ion Torrent
• SOLiD – single end reads
• SOLiD – paired end reads
• SOLiD – mate pair reads

incF plasmid

PlasmidFinder

Workflow with WGS at the clinical laboratory

Modified from Didelot et al., 2012.

4-6 hours

E. coli in urine samples

A: ATCC 8739 reference
_d = direct sequencing on urine
_i = sequencing of isolate from urine

SNP tree (figure); tip labels: ST409, ST409, ST597, ST597, ST227, ST227

Strain KmerFinder SpeciesFinder ResFinder VirulenceFinder PlasmidFinder

C751

24_26

2007-1-12488

E64

Skejby2

https://dl.dropboxusercontent.com/u/51020933/EURL.zip
