In A Different Class? Whole genome classification of bacterial potato pathogens
Leighton Pritchard 1,2,3,4
1 Information and Computational Sciences, 2 Weeds, Pests and Diseases, 3 Centre for Human and Animal Pathogens in the Environment, 4 Dundee Effector Consortium, The James Hutton Institute, Invergowrie, Dundee, Scotland, DD2 5DA
Acceptable Use Policy
Recording of this talk, taking photos, discussing the content using email, Twitter, blogs, etc. is permitted (and encouraged), providing distraction to others is minimised.
These slides will be made available at http://www.slideshare.net/leightonp
Table of Contents
1 Introduction
  A Tangled Taxonomy
  The Insidious Dickeya Menace
2 Diagnostic Primers
  qPCR Primer Design From Whole Genomes
  Problems with Classification
3 Reclassification
  ANI Are You OK? Are You OK ANI?
  PYANI
  ANIm of SREs
  A New Hope
4 Conclusions
  Final thoughts
  Acknowledgements
Plant-pathogenic Enterobacteria a
a Toth et al. (2006) Annu. Rev. Phytopath. doi:10.1146/annurev.phyto.44.070505.143444
Erwinia, Dickeya, Pantoea and Pectobacterium spp.
Plant Cell Wall Degrading Enzymes (PCWDEs)
Global threat a
a EPPO Global Database
[Figure: world maps (long vs lat) of EPPO distribution status for Dickeya chrysanthemi, Dickeya dianthicola, Erwinia amylovora and Pantoea ananatis. Status categories: Absent (invalid record; confirmed by survey; unreliable record; no pest record; pest no longer present; pest eradicated); Present (no details; few occurrences; restricted distribution; widespread).]
A Tangled Taxonomy a b
a Pritchard et al. (2016) Anal. Methods doi:10.1039/c5ay02550h
b Williams et al. (2010) J. Bact. doi:10.1128/JB.01480-09
Enterobacterial taxonomy difficult to resolve, in general
Soft rot enterobacteria (SRE): Pectobacterium and Dickeya
Historical classification mostly polyphasic/phenotypic
SRE originally Erwinia spp., now three distinct genera (Dickeya, Pectobacterium, Erwinia)
Pectobacterium spp. used to be E. carotovora (and E. chrysanthemi)
Dickeya spp. used to be P. chrysanthemi
Not suitable for facultative bacteria?
Binomial nomenclature not designed for large amounts of genome data, or organised metadata curation
Tangled Taxonomy of SRE a
a Czajkowski et al. (2015) Ann. Appl. Biol. doi:10.1111/aab.12166
Old names hold over in the literature, collections, etc.
Name discontinuities affect analysis, databases
Dickeya spp. moving across Europe a b
a Toth et al. (2011) Plant Path. doi:10.1111/j.1365-3059.2011.02427.x
b Parkinson et al. (2015) Eur. J. Plant Path. doi:10.1007/s10658-014-0523-5
D. dianthicola is established across Europe
D. solani is an emerging, encroaching threat
Legislation a
a Pritchard et al. (2016) Anal. Methods doi:10.1039/c5ay02550h
European and Mediterranean Plant Protection Organisation (EPPO)
Member states should regulate D. dianthicola and E. amylovora as quarantine pests (A2 list)
Seed Potatoes (Scotland) Amendment Regulations (2010)
Zero tolerance policy for all Dickeya spp. on potatoes in Scotland, to ensure production of ‘clean’ (disease-free) seed potatoes for export
EUPHRESCO consortium: control and epidemiology across Europe
Legislation by taxonomy a
a Pritchard et al. (2016) Anal. Methods doi:10.1039/c5ay02550h
“Easy” to incorporate binomial nomenclature into legislation 1
Assumption: taxonomy can be determined precisely
Taxonomy is a human-imposed hierarchical classification that truly reflects nature
Assumption: taxonomy is a proxy for disease risk
Sharing a common ancestor with another pathogen is the primary factor that influences virulence
1 (easy for policy-makers to understand)
Issues with legislating by taxonomy a
a Pritchard et al. (2016) Anal. Methods doi:10.1039/c5ay02550h
Is current bacterial taxonomy objective and correct?
Taxonomy is ‘vertical’, but pathogenicity may be ‘laterally’ transferred (plasmid/transposon-borne, etc.) 2
Is a species concept even relevant for bacteria?
Mapping from taxonomy to phenotype is not one-to-one 3
Testing for disease phenotypes not exhaustive (facultative pathogens)
Relationship between genome and disease/risk not fully understood
2 Toth et al. (2006) Ann. Rev. Phyto. doi:10.1146/annurev.phyto.44.070505.143444
3 Deans et al. (2015) PLoS Biol. doi:10.1371/journal.pbio.1002033
Dickeya qPCR diagnostics a b c
a Pritchard et al. (2013) Plant Path. doi:10.1111/j.1365-3059.2012.02678.x
b Pritchard et al. (2013) Genome Announc. doi:10.1128/genomeA.00087-12
c Pritchard et al. (2013) Genome Announc. doi:10.1128/genomeA.00978-13
To legislate on or quarantine contaminated materials, one has to be able to identify and discriminate the pathogen
Having sequenced 25 Dickeya isolates, we were approached to develop diagnostics at the species/isolate level
qPCR is cheaper, quicker and easier than bacterial genome sequencing (for now, anyway...) 4
No qPCR primers existed to distinguish among Dickeya spp.
4 Czajkowski et al. (2015) Ann. Appl. Biol. doi:10.1111/aab.12166
qPCR Primer Design a
a Pritchard et al. (2012) PLoS One doi:10.1371/journal.pone.0034498
1 Bulk predict primer sets on all chromosomes (Primer3)
2 Predict cross-amplification in silico (primersearch)
3 Evaluate in vitro against panel of previously “unseen” isolates of known class and report performance metrics
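The selection logic behind these steps can be sketched as a filter over in silico amplification predictions. This is a minimal illustration under stated assumptions, not the published pipeline: the function name `diagnostic_primers`, the input layout (primer-set name mapped to the set of genomes it is predicted to amplify) and the toy data are all hypothetical; in practice the predictions would be parsed from Primer3 and primersearch output.

```python
def diagnostic_primers(amplifies, classes, target):
    """Return primer sets predicted to amplify every genome in the target
    class and no genome outside it.

    amplifies: dict, primer-set name -> set of genome names with a predicted
               amplicon (hypothetical layout, e.g. parsed from primersearch)
    classes:   dict, genome name -> class label
    target:    class label the primers should be diagnostic for
    """
    targets = {g for g, c in classes.items() if c == target}
    off_targets = set(classes) - targets
    return sorted(
        name
        for name, hits in amplifies.items()
        # keep sets that hit all targets and none of the off-targets
        if targets <= hits and not (hits & off_targets)
    )


# Toy data: labels and predictions are invented for illustration only.
classes = {"g1": "I", "g2": "I", "g3": "II", "g4": "III"}
amplifies = {
    "set_a": {"g1", "g2"},        # all of class I, nothing else
    "set_b": {"g1", "g2", "g3"},  # cross-amplifies a class II genome
    "set_c": {"g1"},              # misses one class I genome
}
print(diagnostic_primers(amplifies, classes, "I"))
```

Because the filter trusts `classes` completely, one mislabelled genome can be enough to reject every candidate primer set for a class.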
[Figure: schematic of genomes in classes I to V, with target and off-target genomes marked for classification.]
Classification is a problem! a
a Pritchard et al. (2013) Plant Path. doi:10.1111/j.1365-3059.2012.02678.x
First qPCR design gave no diagnostic primers for several Dickeya!
Misassigned species in GenBank made ‘training’ impossible.
[Figure: ML tree, recA]
Consequences of misclassification a b
a Pritchard et al. (2016) Anal. Methods doi:10.1039/c5ay02550h
b Varghese et al. (2015) Nucl. Acids Res. doi:10.1093/nar/gkv657
Real-world consequences
False positives (type I errors):
  clean samples rejected: economic cost
  farms quarantined/closed: economic/societal cost
False negatives (type II errors):
  (irreversible) introduction of infectious material
  potential for novel host jumps and spread
“Gold-standard”, correctly classified training and test sets essential to estimate classifier error rates.
MiSI: 18% of NCBI bacterial genomes misclassified at species level
Accurate classification is essential!
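To make the two error types concrete, here is a small sketch of how false positive and false negative rates might be estimated for a diagnostic test. The function name `error_rates` and the toy results are assumptions for illustration; the estimates are only as trustworthy as the labels in `truth`, which is why a correctly classified test set matters.

```python
def error_rates(truth, predicted):
    """Estimate (false positive rate, false negative rate) for a diagnostic.

    truth, predicted: equal-length lists of booleans
                      (True = pathogen present / test positive)
    """
    pairs = list(zip(truth, predicted))
    fp = sum(p and not t for t, p in pairs)   # clean sample called positive
    fn = sum(t and not p for t, p in pairs)   # infected sample called negative
    negatives = sum(not t for t in truth)
    positives = sum(truth)
    fpr = fp / negatives if negatives else float("nan")
    fnr = fn / positives if positives else float("nan")
    return fpr, fnr


# Invented results for eight samples: three infected, five clean.
truth = [True, True, True, False, False, False, False, False]
predicted = [True, True, False, True, False, False, False, False]
fpr, fnr = error_rates(truth, predicted)
print(f"false positive rate: {fpr:.2f}, false negative rate: {fnr:.2f}")
```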
DNA-DNA hybridisation a b
a Rossello-Mora and Amann (2001) FEMS Micro. Rev. doi:10.1016/S0168-6445(00)00040-1
b Chan et al. (2012) BMC Microbiol. doi:10.1186/1471-2180-12-302
“Gold Standard” for prokaryotic taxonomy, since 1960s. “70% identity ≈ same species.”
Denature DNA from two organisms.
Allow to anneal. Reassociation ≈ similarity, measured as ∆T of denaturation curves.
Proxy for sequence similarity - replace with genome analysis?
Average Nucleotide Identity (ANIm) a
a Richter and Rossello-Mora (2009) Proc. Natl. Acad. Sci. USA doi:10.1073/pnas.0906412106
1. Align genomes (MUMmer)
2. ANIm: mean % identity of all homologous region matches
DDH:ANIm “linear”
70% ID ≈ 95% ANIm
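As a rough sketch of the second step, identity and coverage can be computed from a list of aligned regions. Several things here are assumptions rather than anything stated on the slide: the function name `anim`, the input layout (aligned length and mismatch count per region, already parsed from an alignment), the choice of a length-weighted mean, and the toy numbers. Overlapping regions are not handled.

```python
def anim(blocks, query_length):
    """Compute (identity, coverage) for one query genome against one subject.

    blocks:       list of (aligned_length, mismatches) tuples, one per aligned
                  region (hypothetical pre-parsed values)
    query_length: total length of the query genome in bases
    """
    aligned = sum(length for length, _ in blocks)
    if aligned == 0:
        return 0.0, 0.0
    errors = sum(mismatches for _, mismatches in blocks)
    identity = 1 - errors / aligned    # length-weighted mean identity
    coverage = aligned / query_length  # fraction of the genome that aligned
    return identity, coverage


# Invented example: three aligned regions on a 10 kb query.
blocks = [(1000, 20), (3000, 90), (1000, 40)]
identity, coverage = anim(blocks, 10000)
print(f"identity={identity:.3f} coverage={coverage:.2f}")
print("same species by the 95% rule of thumb:", identity >= 0.95)
```

Reporting coverage next to identity is a deliberate choice: a high identity over a small aligned fraction says much less than the same identity over most of the genome.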
ANIm
ANIm is...
straightforward to apply to genomes
average identity of all ‘homologous’ regions
not dependent on dataset composition (unlike hierarchical clustering)
(just) another pairwise distance measure
approximate limiting case of MLST/MLSA/multigene comparisons
pyani software a
a http://widdowquinn.github.io/pyani
Python ANI module and script
Active development
Parallelises on clusters
Preprint coming soon (simulated genomes; lots of bacterial taxonomy!)
34 Dickeya spp. ANIm a
a Pritchard et al. (2016) Anal. Methods doi:10.1039/c5ay02550h
Nine species-level groups (two novel)
Correctly places three species misidentified in GenBank
[Figure: pairwise matrix plot of ANIm_percentage_identity (colour scale 0.00 to 1.00) for the 34 Dickeya genomes, labelled D. solani, D. dianthicola, D. dadantii, D. zeae, D. chrysanthemi, D. aquatica, D. paradisiaca and unassigned Dickeya spp.]
55 Pectobacterium spp. ANIm a
a Pritchard et al. (2016) Anal. Methods doi:10.1039/c5ay02550h
Ten species-level groups (four novel)
P. carotovorum split: several species - no clean Pcb/Pcc split
P. wasabiae split: two species
[Figure: pairwise matrix plot of ANIm_percentage_identity (colour scale 0.00 to 1.00) for the 55 Pectobacterium genomes, labelled P. atrosepticum, P. betavasculorum, P. carotovorum, P. wasabiae and P. sp. SCC3193.]
Interpreting ANIm a
a Pritchard et al. (2016) Anal. Methods doi:10.1039/c5ay02550h
Criticisms of ANIm
95% threshold ‘arbitrary’
Similarity classification, not phylogenetic reconstruction
No functional (or gene-based) interpretation of risk (cf. pangenome classification and analysis)
ANIm only considers ‘homologous’ regions
  how is ‘homologous’ defined?
  may be misled by HGT/LGT - low total extent of homology?
  is homology phenotypically significant for risk assessment?
Coverage plots help interpretation: exclude HGT/LGT bias.
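One way to read identity and coverage together is sketched below. The 95% identity cut-off is the threshold discussed in these slides; the 50% coverage cut-off, the function name `classify_pair` and the wording of the returned labels are assumptions chosen for illustration, not a published rule.

```python
def classify_pair(identity, coverage, id_threshold=0.95, cov_threshold=0.5):
    """Interpret one pairwise ANIm result using identity and coverage.

    identity, coverage: fractions in the range 0 to 1
    thresholds:         illustrative defaults, not fixed standards
    """
    if identity >= id_threshold and coverage >= cov_threshold:
        return "same species-level group"
    if identity >= id_threshold:
        # Near-identical sequence over a small shared region: the match may
        # reflect horizontally transferred DNA rather than shared ancestry.
        return "high identity, low coverage: check for HGT/LGT"
    return "different species-level groups"


# Invented values for three genome pairs.
print(classify_pair(0.98, 0.85))
print(classify_pair(0.97, 0.10))
print(classify_pair(0.90, 0.70))
```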
55 Pectobacterium spp. ANIm a
a Pritchard et al. (2016) Anal. Methods doi:10.1039/c5ay02550h
All isolates align over >50% of whole genome
[Figure: pairwise matrix plot of ANIm_alignment_coverage (colour scale 0.00 to 1.00) for the 55 Pectobacterium genomes.]
34 Dickeya spp. ANIm a
a Pritchard et al. (2016) Anal. Methods doi:10.1039/c5ay02550h
Most isolates align over >50% of whole genome
Community: two outlier species are questionable assignments as Dickeya
[Figure: pairwise matrix plot of ANIm_alignment_coverage (colour scale 0.00 to 1.00) for the 34 Dickeya genomes.]
A new classification scheme a
a Baltrus (2016) Trends Microbiol. doi:10.1016/j.tim.2016.02.004
ANIm graphs
ANIm identity/coverage scores define networks (143 genomes)
Natural clusterings
Natural clusterings occur in the data - ‘cliques’
‘Clique’ membership varies with %identity
‘Clique’ membership at given %identity a permanent classification
[Figure: two panels plotted against %identity (0.86 to 1.00): “Cluster confusion” (confused cluster members) and “Natural clusterings”.]
‘Genus’, ‘species’, ‘subspecies’ and ‘clonal’ genomotypes indicated
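The idea of cliques that change with the identity threshold can be sketched with a small graph routine. This is an illustration only: the function name `clusters_at`, the toy identity values, and the reading of a "confused" cluster as a connected group that is not a clique are all assumptions, not the method used to produce the plots.

```python
from itertools import combinations


def clusters_at(genomes, identity, threshold):
    """Group genomes into connected components at an identity threshold and
    report whether each component is a clique (every pair linked)."""
    # Build an undirected graph: an edge joins two genomes at or above threshold.
    adj = {g: set() for g in genomes}
    for a, b in combinations(genomes, 2):
        if identity[frozenset((a, b))] >= threshold:
            adj[a].add(b)
            adj[b].add(a)

    seen, result = set(), []
    for start in genomes:
        if start in seen:
            continue
        # Depth-first search collects everything reachable from `start`.
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node not in comp:
                comp.add(node)
                stack.extend(adj[node] - comp)
        seen |= comp
        # In a clique, every member is linked to all the other members.
        is_clique = all(len(adj[n] & comp) == len(comp) - 1 for n in comp)
        result.append((sorted(comp), is_clique))
    return result


# Invented pairwise identities for four genomes.
genomes = ["A", "B", "C", "D"]
identity = {
    frozenset(("A", "B")): 0.99,
    frozenset(("B", "C")): 0.96,
    frozenset(("A", "C")): 0.93,
    frozenset(("A", "D")): 0.88,
    frozenset(("B", "D")): 0.88,
    frozenset(("C", "D")): 0.87,
}
for threshold in (0.90, 0.95, 0.98):
    print(threshold, clusters_at(genomes, identity, threshold))
```

In this toy data A, B and C form a clean clique at 0.90, a connected but incomplete group at 0.95 (A and C are linked only through B), and separate again at 0.98, which is the kind of threshold dependence the slide describes.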
Genus genomotype
17 genus-level genomotypes indicated
Genus reclassification
Erwinia splits into 13 genomotypes
Dickeya splits into 3 genomotypes
Species genomotype
39 species-level genomotypes indicated
![Page 38: In A Di erent Class? - EAPR...Binomial nomenclature not designed for large amounts of genome data, or organised metadata curation Tangled Taxonomy of SREa a Czajkowski et al. (2015)](https://reader033.vdocument.in/reader033/viewer/2022050400/5f7de59c93ebf170d245874a/html5/thumbnails/38.jpg)
Species genomotype
Most species classifications remain unaffected
P. carotovorum subspecies confusion/splits
![Page 39: In A Di erent Class? - EAPR...Binomial nomenclature not designed for large amounts of genome data, or organised metadata curation Tangled Taxonomy of SREa a Czajkowski et al. (2015)](https://reader033.vdocument.in/reader033/viewer/2022050400/5f7de59c93ebf170d245874a/html5/thumbnails/39.jpg)
Suggested reclassifications
![Page 40: In A Di erent Class? - EAPR...Binomial nomenclature not designed for large amounts of genome data, or organised metadata curation Tangled Taxonomy of SREa a Czajkowski et al. (2015)](https://reader033.vdocument.in/reader033/viewer/2022050400/5f7de59c93ebf170d245874a/html5/thumbnails/40.jpg)
![Page 41: In A Di erent Class? - EAPR...Binomial nomenclature not designed for large amounts of genome data, or organised metadata curation Tangled Taxonomy of SREa a Czajkowski et al. (2015)](https://reader033.vdocument.in/reader033/viewer/2022050400/5f7de59c93ebf170d245874a/html5/thumbnails/41.jpg)
Final thoughts
Diagnostics and large-scale analyses need accurate classification
Historical collections/public databases have inaccuracies
Bacterial taxonomy can be messy!
Whole-genome classification with ANI works
Retrospective sequencing and classification: clean things up
Misclassification and hidden diversity may be difficult news for metagenomics...
Accurate MinION classification in-the-field with ANIm is possible?
![Page 42: In A Di erent Class? - EAPR...Binomial nomenclature not designed for large amounts of genome data, or organised metadata curation Tangled Taxonomy of SREa a Czajkowski et al. (2015)](https://reader033.vdocument.in/reader033/viewer/2022050400/5f7de59c93ebf170d245874a/html5/thumbnails/42.jpg)
![Page 43: In A Di erent Class? - EAPR...Binomial nomenclature not designed for large amounts of genome data, or organised metadata curation Tangled Taxonomy of SREa a Czajkowski et al. (2015)](https://reader033.vdocument.in/reader033/viewer/2022050400/5f7de59c93ebf170d245874a/html5/thumbnails/43.jpg)
Acknowledgements
Dickeya/Erwinia/Pectobacterium
Steve Baeyen (ILVO), Emma Campbell (JHI), Sean Chapman (JHI), John Elphinstone (Fera), Tracey Gloster (St Andrews), Rachel Glover (Fera), Sonia Humphris (JHI), Martine Maes (ILVO), Katrin Mackenzie (BioSS), Neil Parkinson (Fera), Minna Pirhonen (Helsinki), Gerry Saddler (SASA), Elaine Shemilt (Duncan of Jordanstone), Ian Simpson (Edinburgh), Ian Toth (JHI), Jan van der Wolf (Wageningen), Johan van Vaerenbergh (ILVO), Frank Wright (BioSS), Eirini Xemantilotou (St Andrews/JHI)
Nematodes
Peter Cock (JHI), John Jones (JHI/St Andrews), Robbie Rae (John Moores), Peter Thorpe (JHI)
Phytophthora
Miles Armstrong (Dundee), Anna Avrova (JHI), Jim Beynon (Warwick), Paul Birch (Dundee), David Cooke (JHI), Kath Denby (York), Eleanor Gilroy (JHI), Sarah Green (Forest Research), Edgar Huitema (Dundee), Rory McLean (Dundee), Hazel McLellan (Dundee), Sophien Kamoun (TSL), Gail Preston (Oxford), Paul Sharp (Edinburgh), Jens Steinbrenner (Warwick), Pieter van West (Aberdeen), Steve Whisson (JHI)
Computational and Systems Biology
David Broadhurst (Edith Cowan), Peter Cock (JHI), Mark Dufton (Strathclyde), Roy Goodacre (Manchester), Douglas Kell (Manchester), David Martin (Dundee), Iain Milne (JHI), Pedro Mendes (Manchester)
E. coli/other bacteria
Florence Abram (Galway), Martina Bielaszewska (Muenster), Fiona Brennan (Galway), Ken Forbes (Aberdeen), Nicola Holden (JHI), Ashleigh Holmes (JHI), Paul Hoskisson (Strathclyde), Helge Karch (Muenster), Norval Strachan (Aberdeen), David Studholme (Exeter), Nick Waters (Galway)
Metagenomics/Communities
Natalie Ferry (Salford), Thomas Freitag (JHI), Ryan Joynson (Liverpool), John Mitchell (St Andrews), Les Noble (Aberdeen), Jim Prosser (Aberdeen), V Anne Smith (St Andrews)
Potato
Micha Bayer (JHI), Glenn Bryan (JHI), Graham Etherington (TSL), Ingo Hein (JHI), Florian Jupe (JHI), Jonathan Jones (TSL), Dan Maclean (TSL)
...and many others...
![Page 44: In A Di erent Class? - EAPR...Binomial nomenclature not designed for large amounts of genome data, or organised metadata curation Tangled Taxonomy of SREa a Czajkowski et al. (2015)](https://reader033.vdocument.in/reader033/viewer/2022050400/5f7de59c93ebf170d245874a/html5/thumbnails/44.jpg)
Licence: CC-BY-SA
By: Leighton Pritchard
This presentation is licensed under the Creative Commons Attribution ShareAlike license: https://creativecommons.org/licenses/by-sa/4.0/