

RESULTS (Continued)

Table 1. Comparison of wgMLST and wgSNP

Column headings: wgMLST allelic differences; wgSNP analysis with strict filtering (reference genome: E. coli O157:H7 str. Sakai, NCBI NC_002695.2).

METHODS (Continued)

DNA Extraction and Sequencing

Data Analysis

Copyright © 2019 Mérieux NutriSciences. All Rights Reserved. Copying, displaying, downloading, distributing, modifying or reproducing information contained in this document or any portion thereof in any electronic medium or in hard copy, or creating any derivative work based on such documents, is prohibited without the express written consent of Mérieux NutriSciences.

July 2019

INTRODUCTION

Escherichia coli O157:H7 and six other Shiga toxin-producing E. coli (STEC) serogroups (O26, O45, O103, O111, O121, and O145) are often referred to as the Top 7 STEC.

Subtyping of STEC is important for outbreak investigation. PFGE, MLVA, and rep-PCR are among the traditional investigation methods, but they are either time-consuming or of low resolution.

Whole genome sequencing (WGS) technology has been widely applied to speciation, subtyping, and distinguishing closely related strains. The performance of WGS must be examined to ensure appropriate implementation for STEC.

Whole Genome Sequencing Analysis for Top Seven Shiga Toxin-producing Escherichia coli

Authors: Jiaojie Zheng¹, Sarita Raengpradub Wheeler¹, Xuwen Wieneke¹, Timothy Freier¹

¹Mérieux NutriSciences, Chicago, Illinois, USA

International Association for Food Protection Annual Meeting

July 21-24, 2019 | Louisville, KY

OBJECTIVE

To assess the impact of selecting wgMLST or wgSNP pipelines on differentiation of Top 7 STEC strains.

RESULTS

Sequence Data Quality

All samples except sample ET18 passed quality control. ET18, which had low average genome coverage (15×, below the > 30× quality-control requirement), was omitted and not analyzed further.
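The coverage criterion can be illustrated with a short sketch. This is not the BioNumerics procedure used in the study; it assumes the common estimate of average coverage as total sequenced bases divided by genome size. The genome size (about 5.5 Mb), sample names, and read counts below are hypothetical values chosen for illustration.

```python
def average_coverage(read_count, mean_read_length, genome_size):
    """Estimate average depth as total sequenced bases / genome size."""
    return read_count * mean_read_length / genome_size


def passes_qc(coverage, threshold=30.0):
    """Apply the stated criterion: coverage must be greater than 30."""
    return coverage > threshold


# Hypothetical inputs (not data from the poster).
GENOME_SIZE = 5_500_000  # assumed approximate E. coli genome size in bp
samples = {
    "sample_A": (1_100_000, 250),  # (number of reads, mean read length)
    "sample_B": (330_000, 250),
}

for name, (n_reads, read_len) in samples.items():
    cov = average_coverage(n_reads, read_len, GENOME_SIZE)
    status = "pass" if passes_qc(cov) else "fail"
    print(f"{name}: {cov:.1f}x coverage, QC {status}")
```

With these assumed numbers, sample_B comes out at 15× and fails the check, mirroring the reason ET18 was excluded.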

wgMLST and wgSNP Analyses

The wgMLST similarity coefficient dendrogram (Figure 1) shows the relationship of 30 STEC samples and 5 NCBI genomes (similarity percentages are shown at the nodes).

• Within the Top 7 STEC groups, any two strains with different serotypes shared no more than 88.1% similarity.

• O157:H7 strains were well distinguished from non-O157:H7 strains, sharing only 9.7% similarity with other STEC.

• Six O157:H7 strains shared at least 98.2% similarity. Two groups of two O121 strains had over 98% similarity.
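To make the similarity percentages concrete, the sketch below compares two allele profiles. It assumes a simple categorical coefficient (the share of loci called in both strains that carry the same allele number); the poster does not state the exact coefficient or missing-locus handling used by BioNumerics, and the locus names and allele numbers are hypothetical.

```python
def compare_profiles(profile_a, profile_b):
    """Return (allelic_differences, percent_similarity) for two allele profiles.

    Each profile maps a locus name to an allele number, or None if the locus
    was not called. Only loci called in both profiles are compared, which is
    an assumption of this sketch.
    """
    shared = [
        locus
        for locus in profile_a
        if profile_a[locus] is not None and profile_b.get(locus) is not None
    ]
    if not shared:
        raise ValueError("no loci called in both profiles")
    differences = sum(1 for locus in shared if profile_a[locus] != profile_b[locus])
    similarity = 100.0 * (len(shared) - differences) / len(shared)
    return differences, similarity


# Hypothetical five-locus profiles (real wgMLST schemes have thousands of loci).
strain_1 = {"locus_1": 3, "locus_2": 7, "locus_3": 1, "locus_4": 12, "locus_5": None}
strain_2 = {"locus_1": 3, "locus_2": 8, "locus_3": 1, "locus_4": 12, "locus_5": 4}

differences, similarity = compare_profiles(strain_1, strain_2)
print(f"{differences} allelic difference(s), {similarity:.1f}% similarity")
```

The same comparison yields both quantities reported on the poster: a percent similarity for the dendrogram and a count of allelic differences for same-serotype comparisons.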

Table 1 compares wgMLST and wgSNP results for strains of the same serotype.

• Both wgMLST and wgSNP were able to further differentiate strains with the same serotype.

• The results of the two methods agreed with each other, despite the numeric differences.

• One difference between the two methods is that wgMLST is available only for genera with an established database, whereas wgSNP requires a reference genome.
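One reason the two methods can agree while reporting different numbers is the unit being counted: a SNP count tallies every differing position, whereas an allelic difference counts a locus once however many positions differ within it. The toy sketch below illustrates this with hypothetical, pre-aligned locus sequences. It is not the BioNumerics wgSNP pipeline, which maps reads to a reference genome and applies filtering.

```python
def snp_count(seq_a, seq_b):
    """Count positions that differ between two equal-length aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(seq_a, seq_b) if x != y)


def compare_genomes(loci_a, loci_b):
    """Return (total SNPs, number of loci with different alleles) over shared loci."""
    total_snps = 0
    allelic_differences = 0
    for locus in loci_a.keys() & loci_b.keys():
        n = snp_count(loci_a[locus], loci_b[locus])
        total_snps += n
        if n > 0:
            allelic_differences += 1  # one difference per locus, however many SNPs
    return total_snps, allelic_differences


# Hypothetical toy loci (illustration only).
isolate_a = {"locus_1": "ATGGCTAAC", "locus_2": "ATGTTTCCG", "locus_3": "ATGAAAGGT"}
isolate_b = {"locus_1": "ATGGCTAAC", "locus_2": "ATGTCTCAG", "locus_3": "ATGAAAGGA"}

snps, alleles = compare_genomes(isolate_a, isolate_b)
print(f"{snps} SNPs, {alleles} allelic differences")
```

In this toy case locus_2 carries two SNPs but contributes a single allelic difference, so the SNP total exceeds the allelic-difference total even though both point to the same pair of loci.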

Isolate Preparation: Cultured confirmed isolates on TSA

DNA Extraction: Qiagen DNeasy UltraClean microbial kit

Library Preparation: Illumina Nextera XT library preparation kit

Sequencing: Illumina MiSeq sequencer

CONCLUSIONS

wgMLST similarity coefficient clustering (showing percent similarity) shows the relatedness among all strains and is able to separate Top 7 STEC strains belonging to different serotypes without the need for a reference genome.

For strains of the same serotype, both wgMLST (showing allelic differences) and wgSNP (showing number of SNPs) analyses can be used to further identify genomic differences between strains.

Sequence Data Quality Control

● Checked raw data quality using BioNumerics software

● Genome coverage: > 30×

● Download NCBI genome sequence data to compare
    E. coli O26: SRR7827094
    E. coli O103: SRR7828044
    E. coli O111: SRR7827093
    E. coli O121: SRR5816151
    E. coli O145: SRR7828323

wgMLST

● Analyzed using BioNumerics software

● De novo assembly

● Assembly-based calls

● Assembly-free calls

● wgMLST clustering (UPGMA method)

● Download NCBI complete genome as reference
    E. coli O157:H7 str. Sakai: NC_002695.2

wgSNP

● Analyzed using BioNumerics software

● Used NCBI complete genome as reference

● SNP mapping to reference genome

● wgSNP clustering (UPGMA method)
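Both pipelines end with UPGMA clustering. The following is a minimal pure-Python sketch of UPGMA (average linkage with size-weighted distance updates), not the BioNumerics implementation. The isolate labels and pairwise distances are hypothetical and could stand for either allelic differences or SNP counts.

```python
def upgma(labels, dist):
    """Cluster with UPGMA and return the merges in order.

    labels: list of item names.
    dist: dict mapping frozenset({a, b}) -> distance between items a and b.
    Returns a list of (members_a, members_b, merge_distance) tuples.
    """
    clusters = {i: (name,) for i, name in enumerate(labels)}
    d = {}
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            d[(i, j)] = dist[frozenset((labels[i], labels[j]))]

    merges = []
    next_id = len(labels)
    while len(clusters) > 1:
        # Pick the closest pair of current clusters.
        (i, j), d_ij = min(d.items(), key=lambda kv: kv[1])
        ci, cj = clusters.pop(i), clusters.pop(j)
        # Distance from the merged cluster to every other cluster is the
        # average of the old distances, weighted by cluster size.
        new_d = {}
        for k in clusters:
            d_ik = d[(min(i, k), max(i, k))]
            d_jk = d[(min(j, k), max(j, k))]
            new_d[(k, next_id)] = (len(ci) * d_ik + len(cj) * d_jk) / (len(ci) + len(cj))
        # Drop distances involving the two clusters that were merged.
        d = {pair: v for pair, v in d.items() if i not in pair and j not in pair}
        d.update(new_d)
        clusters[next_id] = ci + cj
        merges.append((ci, cj, d_ij))
        next_id += 1
    return merges


# Hypothetical pairwise distances between four isolates.
labels = ["A", "B", "C", "D"]
dist = {
    frozenset(("A", "B")): 2,
    frozenset(("A", "C")): 10,
    frozenset(("A", "D")): 12,
    frozenset(("B", "C")): 10,
    frozenset(("B", "D")): 12,
    frozenset(("C", "D")): 4,
}

example_merges = upgma(labels, dist)
for left, right, height in example_merges:
    print(f"merge {left} + {right} at distance {height}")
```

Because UPGMA depends only on the pairwise distance matrix, the same routine applies whether the input comes from wgMLST or wgSNP; only the distance measure changes.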

METHODS

Isolates

A total of 30 E. coli isolates were sequenced and analyzed in this study. The number of isolates in each serogroup is listed below.

Serogroup        O26  O45  O103  O111  O121  O145  O157:H7  Other O157
No. of isolates  4    4    4     4     4     1     6        3

Figure 1. wgMLST similarity coefficient dendrogram
