Introduction to polyphasic taxonomy. Peter Vandamme. EUROBIOFILMS - Third European Congress on Microbial Biofilms, Ghent, Belgium, 9 - 12 September 2013

Page 1: Peter Vandamme

Introduction to polyphasic taxonomy

Peter Vandamme

EUROBIOFILMS - Third European Congress on Microbial Biofilms

Ghent, Belgium, 9 - 12 September 2013

Page 2: Peter Vandamme

http://www.lm.ugent.be/

Page 3: Peter Vandamme

Content

• The observation of diversity: phenotypic and genotypic coherence allows bacterial species to be defined

• Taxonomy and species definitions vary with technology: old and new practices

• Phenotypic and numerical taxonomy

• DNA

• Phylogeny

• Polyphasic taxonomy

• Whole genome sequences

Page 4: Peter Vandamme

Observation of diversity in species

Campylobacter lari whole cell protein patterns

Page 5: Peter Vandamme

Observation of diversity in species

Campylobacter jejuni RAPD patterns

Page 6: Peter Vandamme

Observation of diversity in species: AFLP


Page 7: Peter Vandamme

Origin of diversity: genetic drift


Page 8: Peter Vandamme

Evolution

• Growth, genetic drift, physical separation and periods of selection lead to evolution and variation in bacterial genomes:

– Size & organization

– Content

– Sequence

Page 9: Peter Vandamme

Genome size and organization

• Genome size varies from 580,074 bp (Mycoplasma genitalium) to 9,105,828 bp (Bradyrhizobium japonicum)

Page 10: Peter Vandamme

Genome size and organization

• Genome size varies from 580,074 bp (Mycoplasma genitalium) to 9,105,828 bp (Bradyrhizobium japonicum)

• 1 circular chromosome (e.g. Escherichia coli, 4.6 – 5.4 Mbp)

• Multiple circular chromosomes (e.g. Ralstonia solanacearum, 3.7 Mbp and 2.1 Mbp; Burkholderia cenocepacia, 3.8 Mbp, 3.2 Mbp and 0.9 Mbp)

• 1 linear chromosome (e.g. Borrelia burgdorferi, 0.9 Mbp)

• 1 linear and 1 circular chromosome (e.g. Agrobacterium tumefaciens, 2.8 and 2.1 Mbp)

Page 11: Peter Vandamme

Variability in gene content

• Venn diagram showing core and accessory genes for Streptococcus species. The surfaces are approximately proportional to the number of genes (Lefébure and Stanhope 2007 Genome Biol. 8: R71)

Page 12: Peter Vandamme

Variability in gene content

• Venn diagram showing core and accessory genes for Streptococcus species. The surfaces are approximately proportional to the number of genes (Lefébure and Stanhope 2007 Genome Biol. 8: R71)

Page 13: Peter Vandamme

Gene content

The number of genes two genomes have in common depends on their evolutionary distance

[Figure: fraction of shared genes versus average number of nucleotide substitutions per site for 16S rRNA]

Page 14: Peter Vandamme

The species core and pan-genome


Page 15: Peter Vandamme

Fig. 2. GBS core genome

Copyright ©2005 by the National Academy of Sciences

Tettelin et al. (2005) Proc. Natl. Acad. Sci. USA 102, 13950-13955

Page 16: Peter Vandamme

Fig. 3. GBS pan-genome

Copyright ©2005 by the National Academy of Sciences

Tettelin et al. (2005) Proc. Natl. Acad. Sci. USA 102, 13950-13955
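
The core genome and pan-genome in these figures can be read as set operations over per-strain gene inventories: the core is what every strain carries, the pan-genome is everything seen in at least one strain. A minimal Python sketch, assuming made-up strain names and gene family IDs (this is not data from Tettelin et al.):

```python
# Hypothetical gene inventories: strain name -> set of gene family IDs.
# Names and contents are illustrative assumptions only.
genomes = {
    "strain_A": {"g1", "g2", "g3", "g4"},
    "strain_B": {"g1", "g2", "g3", "g5"},
    "strain_C": {"g1", "g2", "g6"},
}


def core_genome(gene_sets):
    """Gene families present in every genome."""
    return set.intersection(*gene_sets)


def pan_genome(gene_sets):
    """Gene families present in at least one genome."""
    return set.union(*gene_sets)


gene_sets = list(genomes.values())
core = core_genome(gene_sets)
pan = pan_genome(gene_sets)
accessory = pan - core  # strain-specific or partially shared gene families

print("core:", sorted(core))
print("pan:", sorted(pan))
print("accessory:", sorted(accessory))
```

Adding a strain can only shrink or preserve the core and only grow or preserve the pan-genome, which is the behaviour the two figures plot against the number of genomes.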

Page 17: Peter Vandamme

Lefébure et al. 2010: WGS of 96 C. coli and C. jejuni strains

The two species have a similar pan-genome size; however, C. coli has acquired a larger core genome and each species has evolved a number of species-specific core genes, possibly reflecting different adaptive strategies, in spite of their occurrence in the same niche (the gastrointestinal tract of several hosts).

Recombination within the core genome is frequent within species, rare between sister species, and extremely rare with other species.

The pan-genome of each species shows unique and cohesive features that define its genomic identity.

Page 18: Peter Vandamme

Difference in sequence?

• Relative occurrence of di-, tri-, tetra- (…) nucleotides: Karlin signatures

• Genes that are shared between organisms can differ considerably in sequence. The percentage sequence divergence in orthologous genes is described by the ANI parameter (ANI: average nucleotide identity)
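
A dinucleotide signature of the kind attributed to Karlin can be sketched as the ratio of each observed dinucleotide frequency to the product of its two mononucleotide frequencies. The Python below is illustrative only. Assumptions: it counts a single strand (the published signature also folds in the reverse complement), and the function names are hypothetical.

```python
from collections import Counter
from itertools import product


def dinucleotide_signature(seq):
    """Return rho_xy = f_xy / (f_x * f_y) for all 16 dinucleotides.

    Single-strand simplification; dinucleotides whose bases are absent
    from the sequence are reported as 0.0.
    """
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two bases")
    mono = Counter(seq)
    di = Counter(seq[i:i + 2] for i in range(n - 1))
    signature = {}
    for x, y in product("ACGT", repeat=2):
        f_x = mono[x] / n
        f_y = mono[y] / n
        f_xy = di[x + y] / (n - 1)
        signature[x + y] = f_xy / (f_x * f_y) if f_x and f_y else 0.0
    return signature


def signature_distance(sig_a, sig_b):
    """Mean absolute difference between two signatures."""
    return sum(abs(sig_a[k] - sig_b[k]) for k in sig_a) / len(sig_a)


sig = dinucleotide_signature("ACAC")
print(round(sig["AC"], 3))
```

Values near 1 mean a dinucleotide occurs about as often as its base composition predicts; the distance between two signatures gives a crude, alignment-free comparison of genomes.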

Page 19: Peter Vandamme


Page 20: Peter Vandamme

Variability in gene content

• Genomes seem to be composed of a core set of genes that is conserved among strains of the same species and accessory genes that are strain specific.

• Content and size of the core vary with species.

• Although it is clear that mechanisms exist for abundant and widespread genetic transfer between microbial lineages, the observation of phenotypic and genotypic clustering argues for genomic stability and cohesion. Lateral gene transfer (LGT) and recombination in particular are now considered cohesive rather than disruptive forces in bacterial species.

* Konstantinidis and Tiedje. 2005. Genomic insights that advance the species definition for prokaryotes. PNAS 102:2567-2572

Page 21: Peter Vandamme

How is this information used to define bacterial species?


Page 22: Peter Vandamme

“...Taxonomy is written by taxonomists for taxonomists; in this form the subject is so dull that few, if any, non-taxonomists are tempted to read it, and presumably even fewer try their hand at it. It is the most subjective branch of any biological discipline, and in many ways is more of an art than a science...”

(S. T. Cowan, 1971)

Page 23: Peter Vandamme

The bacterial species concept, definition & taxonomy

• There is a practical need to define bacterial species, as a name carries information.

• The approaches used to define bacterial species, past and present, reflect the state of the art in science and technology.

• The observation of phenotypic and genotypic clustering argues for genomic stability and cohesion.

• Such clusters could be called species.

Page 24: Peter Vandamme

The bacterial species concept, definition & taxonomy

• Progress in the field of taxonomy has been dominated by technological progress. Initially (until the 1950s), ‘conventional’ bacterial taxonomy placed heavy emphasis on analyses of phenotypic properties of the organism.

• To define and identify an organism, one must assess several of its phenotypic properties, from general to specific.

Page 25: Peter Vandamme

Phenotypic characterisation

Page 26: Peter Vandamme


Page 27: Peter Vandamme

Numerical taxonomy

• In the 1950s – 1960s it became evident that the analysis of large numbers of characteristics provided a more stable classification and a superior means to classify and identify bacteria.

• First generations of computers were used to analyze large data sets of biochemical and phenotypic characteristics
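
Numerical taxonomy of this kind compares strains over many characters scored as present or absent. The sketch below shows two common similarity coefficients (simple matching and Jaccard) in Python. The strain profiles and the choice of coefficients are illustrative assumptions; the slides do not say which coefficient was used.

```python
# Hypothetical phenotypic profiles: 1 = test positive, 0 = test negative.
strain_1 = [1, 1, 0, 1, 0, 1]
strain_2 = [1, 1, 0, 0, 0, 1]
strain_3 = [0, 0, 1, 0, 1, 0]


def simple_matching(a, b):
    """Fraction of characters with the same state (shared negatives count)."""
    if len(a) != len(b) or not a:
        raise ValueError("profiles must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def jaccard(a, b):
    """Shared positives divided by characters positive in either strain."""
    if len(a) != len(b):
        raise ValueError("profiles must be of equal length")
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 1.0


print(simple_matching(strain_1, strain_2), jaccard(strain_1, strain_2))
print(simple_matching(strain_1, strain_3), jaccard(strain_1, strain_3))
```

A full analysis would compute such a coefficient for every pair of strains and cluster the resulting similarity matrix; the more characters are scored, the less any single test result moves the outcome, which is the stability argument made above.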

Page 28: Peter Vandamme

Discovery of the secret of life

• DNA was used to classify bacteria!

• Determining the guanine plus cytosine base ratio (GC ratio) of the DNA of the organism can be part of this process.
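
Where a sequence is available, the GC ratio is simply the share of G and C among the counted bases. A minimal Python sketch, assuming a plain DNA string as input and a hypothetical function name (historically the value was measured experimentally, not computed from sequence):

```python
def gc_content(seq):
    """Percent G+C among unambiguous bases (A, C, G, T) of a DNA string."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G or T")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


print(gc_content("ATGCGC"))
```

Ambiguity codes such as N are skipped rather than counted, which is a design choice of this sketch.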

Page 29: Peter Vandamme

DNA-DNA hybridisation

• Single stranded whole genomic DNA of two strains is hybridised. The thermal stability of the obtained heterologous hybrid (expressed as a percentage value) is a measure of whole genome sequence similarity.

Page 30: Peter Vandamme

Ad Hoc Committee on Reconciliation of Approaches to Bacterial Systematics (Wayne et al. 1987 – TC [08/09/2013]: 3,261)

• The complete genome should be the reference standard to determine phylogeny and taxonomy

• Pending routine access to whole genome sequences, measuring the thermal stability between two genomes through DNA-DNA hybridization represents the best indirect assessment of the level of whole genome sequence similarity

• The phylogenetic definition of coherent phenotypic clusters, called species, generally would include strains with at least 60 - 70% DNA-DNA hybridization

Page 31: Peter Vandamme

What about phylogeny?

• DNA-DNA hybridisations between organisms considered closely related very often yielded low DNA-DNA hybridisation values, just like DNA-DNA hybridisations between completely different bacteria.

• Perhaps, if evolution of the whole genome cannot be measured, similarities in more conserved parts of the genome might be more accessible?

• A gene encoding a highly conserved function (chronometer) might be a good target: rRNA genes???

• DNA-rRNA hybridisations provided a framework of five rRNA superfamilies which corresponded with the five subdivisions in the Proteobacteria.

• Technological progress allowed ‘isolation’ and sequence analysis of conserved genes.

Page 32: Peter Vandamme

Molecular clocks (chronometers)

• The most widely used molecular clocks (‘single locus approaches’) are small subunit ribosomal RNA (SSU rRNA) genes

– Found in all domains of life (not the case with other chronometers)

• 16S rRNA in prokaryotes and 18S rRNA in eukaryotes

– Functionally constant

– Sufficiently conserved (change slowly) with variable regions (V1-V9), but too conserved to discriminate between closely related species

– Sufficient length

– Without (?) lateral gene transfer or recombination: differences should be primarily caused by point mutation, such that the number of nucleotide differences correlates with the number of changes through evolution
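
The link between observed nucleotide differences and the number of changes through evolution can be illustrated with a distance correction. The slides do not name a substitution model; the Jukes-Cantor formula d = -3/4 ln(1 - 4p/3) is used here only as one simple, well-known choice. The sequences are made up and assumed to be already aligned.

```python
import math


def p_distance(a, b):
    """Fraction of differing sites between two aligned sequences (gap columns skipped)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(1 for x, y in pairs if x != y) / len(pairs)


def jukes_cantor(p):
    """Estimated substitutions per site from an observed difference fraction p."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Hypothetical aligned fragments differing at one of ten sites.
seq_a = "ACGTACGTAC"
seq_b = "ACGTACGTAT"
p = p_distance(seq_a, seq_b)
print(p, round(jukes_cantor(p), 4))
```

The corrected distance is always at least as large as the observed fraction, because some sites will have changed more than once; the gap widens as sequences diverge.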

Page 33: Peter Vandamme

Ribosomal Database Project

• The Ribosomal Database Project (RDP)

• A large collection of rRNA sequences

• Provides a variety of analytical programs

• RDP Release 10, Update 32: May 14, 2013: 2,765,278 16S rRNAs

• http://rdp.cme.msu.edu/

Page 34: Peter Vandamme

• Phylogenetic trees reflecting similarity in ribosomal RNA sequences, but assumed to reflect organismal phylogeny, have now been prepared for all the major prokaryotic and eukaryotic groups.

Page 35: Peter Vandamme

'The All-Species Living Tree' Project

• Public databases accumulated poor quality and erroneously annotated sequences.

• The need for curated databases!

• http://www.arb-silva.de/projects/living-tree/


Page 36: Peter Vandamme

16S rRNA sequence analysis: advantages

• There are several technological and scientific advantages to using 16S rRNA gene sequences for studying the phylogeny of bacteria. The main assets are:

– The availability of a near-universal database

– The availability of highly conserved 16S rRNA primers

Page 37: Peter Vandamme

16S rRNA sequence analysis: caveats

• Often insufficient diversity to distinguish closely related species (Fox et al., 1992. How close is close: 16S rRNA sequence identity may not be sufficient to guarantee species identity).

Page 38: Peter Vandamme

16S rRNA sequence analysis: caveats

• Often insufficient diversity to distinguish closely related species (Fox et al., 1992. How close is close: 16S rRNA sequence identity may not be sufficient to guarantee species identity).

• Often too much diversity within species:

– 2.5-3% (Stackebrandt and Goebel. 1994. Taxonomic note: a place for DNA-DNA reassociation and 16S rRNA sequence analysis in the present species definition in bacteriology)

– 4-5% in 16S rRNA genes of epsilon proteobacteria

Page 39: Peter Vandamme

Limits of 16S rRNA based phylogeny

Page 40: Peter Vandamme

16S rRNA sequence analysis: caveats

• Often insufficient diversity to distinguish closely related species (Fox et al., 1992. How close is close: 16S rRNA sequence identity may not be sufficient to guarantee species identity).

• Often too much diversity within species:

– 2.5-3% (Stackebrandt and Goebel. 1994. Taxonomic note: a place for DNA-DNA reassociation and 16S rRNA sequence analysis in the present species definition in bacteriology)

– 4-5% in 16S rRNA genes of epsilon proteobacteria

• Tentative representation of the phylogeny of closely related bacteria

Page 41: Peter Vandamme

Evolutionary relationships of prokaryotes

• Chronometers: (genes of)

– ribosomal proteins and RNAs

– Cytochrome

– Fe-S proteins (e.g. ferredoxins)

– ATPase (synthesis/hydrolysis of ATP)

– recA (recombination protein)

– gyrB, groEL, rpoB...

Page 42: Peter Vandamme

Analysis of other chronometers to study phylogeny of bacteria?

• Pro: the less conserved nature of these genes facilitates a higher taxonomic resolution between closely related bacteria

• Con:

– Not universally present

– No universal databases

– Development of universal primers proved impossible

– Interference of recombination and lateral gene transfer

Page 43: Peter Vandamme

Limits of recA based phylogeny


Page 44: Peter Vandamme

Polyphasic taxonomy

• There is no single molecule that represents all organismal relationships adequately.

• Different molecules carry different types of information.

• A wealth of other methods was developed which were, just like the original biochemical tests, used to classify and identify bacteria. All of these methods carried some information that could be used as an indirect measure of whole genome similarity between isolates.

Page 45: Peter Vandamme

Chemotaxonomy - Respiratory quinones

Page 46: Peter Vandamme

Chemotaxonomy - Phospholipid analysis

Page 47: Peter Vandamme

Chemotaxonomy - Polyamine analysis

Page 48: Peter Vandamme

Chemotaxonomy - Whole cell fatty acids

Page 49: Peter Vandamme

Whole-cell protein electrophoresis: Azospirillum

SDS-PAGE and DNA-DNA hybridisation (Azospirillum)

[Figure: whole-cell protein patterns with % DNA-binding values against strains 7108T and 2787T. Strains shown: Au 2, LMG 7108T, Au 5, Au 7, Au 9, Au 10, Au 11, Au 12, DSM 2787T, Y 13, Y 9, ATCC 29145T, SpBr17. Species labels: A. halopraeferens, A. amazonense, A. brasilense, A. lipoferum]

Page 50: Peter Vandamme

Comparison of MALDI-TOF MS spectral patterns


Page 51: Peter Vandamme

Raman spectroscopy


Page 52: Peter Vandamme

Genotyping - Ribotyping

[Figure: ribotyping patterns of Lactobacillus sakei and Lactobacillus curvatus]

Page 53: Peter Vandamme

Genotyping - AFLP - Campylobacter

Page 54: Peter Vandamme

Polyphasic taxonomy

• Consensus approach to bacterial taxonomy which integrates several generally accepted ideas for the classification of bacteria

• Species delineation is based on DNA-DNA hybridisation experiments

• Bacterial phylogeny can be studied through comparative sequence analysis of conserved macromolecules such as 16S rRNA

• Polyphasic taxonomy determines and acknowledges the value of other methods for the delineation of bacteria at different hierarchical levels

• The aim is to collect as much information as possible in order to define a pragmatic consensus classification that facilitates identification

Page 55: Peter Vandamme

Polyphasic species definition

• The bacterial species appears to be an assemblage of isolates originating from a common ancestor population in which genetic drift resulted in clones with different degrees of recombination and characterized by:

– a certain degree of phenotypic consistency

– a significant degree of DNA-DNA hybridization

– over 97% 16S rRNA gene sequence similarity
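
The two numeric criteria can be combined into a rough screening rule. The Python sketch below is a simplification under stated assumptions: the 97% 16S rRNA cut-off is taken from this slide, the 70% DNA-DNA hybridisation cut-off is an assumed default (other slides mention 60 - 70%), phenotypic consistency is left out, and the function name and return strings are hypothetical.

```python
def species_screen(ssu_identity_pct, ddh_pct=None,
                   ssu_threshold=97.0, ddh_threshold=70.0):
    """Rough two-step screen for whether two strains may share a species.

    ssu_identity_pct: 16S rRNA gene sequence similarity in percent.
    ddh_pct: DNA-DNA hybridisation value in percent, or None if not measured.
    Thresholds are adjustable assumptions, not fixed rules.
    """
    if ssu_identity_pct <= ssu_threshold:
        # Below the 16S cut-off the strains are unlikely to be conspecific.
        return "different species"
    if ddh_pct is None:
        # High 16S similarity alone cannot confirm species identity.
        return "undetermined: DNA-DNA hybridisation needed"
    return "same species" if ddh_pct >= ddh_threshold else "different species"


print(species_screen(99.2, 85.0))
print(species_screen(99.2, 40.0))
print(species_screen(98.5))
print(species_screen(95.0))
```

The asymmetry is deliberate: high 16S similarity never confirms species identity by itself, which matches the caveat that 16S rRNA is often too conserved to separate closely related species.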

Page 56: Peter Vandamme

Polyphasic Genomic taxonomy

[Figure labels: observation 1, observation 2, observation 3]

Page 57: Peter Vandamme

Now that we have access to whole-genome sequences: what do they tell us?


Page 58: Peter Vandamme

Gene content could be used to define species …

• Venn diagram showing core and accessory genes for Streptococcus species. The surfaces are approximately proportional to the number of genes (Lefébure and Stanhope 2007 Genome Biol. 8: R71)

Page 59: Peter Vandamme

… and higher taxonomic units

• Venn diagram showing core and accessory genes for Streptococcus species. The surfaces are approximately proportional to the number of genes (Lefébure and Stanhope 2007 Genome Biol. 8: R71)

Page 60: Peter Vandamme

Average Nucleotide Identity?

• Genomes seem to be composed of a core set of genes that is conserved among strains of the same species and accessory genes that are strain specific

• Phylogenetic signal present in core genes (ANI values): 95% ANI corresponds with 70% DNA-DNA hybridisation


Page 61: Peter Vandamme

Average Nucleotide Identity?

• Phylogenetic signal present in core genes (ANI values): 95% ANI corresponds with 70% DNA-DNA hybridization

• ANI does not necessarily correlate with gene content

– ANI values reflect phylogeny

– Gene content reflects ecology

• Bacteria with considerable differences in gene content can still be classified in the same species
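
As a toy illustration of the ANI idea, the Python sketch below averages percent identity over pairs of orthologous genes. Assumptions: the ortholog pairs are already identified and aligned to equal length, the sequences are made up, and every gene gets equal weight. Published ANI tools work on whole genomes with alignment-based fragment matching, which this sketch does not attempt.

```python
def percent_identity(a, b):
    """Percent identical positions between two pre-aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return 100.0 * matches / len(a)


def average_nucleotide_identity(ortholog_pairs):
    """Unweighted mean of per-gene percent identities."""
    if not ortholog_pairs:
        raise ValueError("at least one ortholog pair is required")
    identities = [percent_identity(a, b) for a, b in ortholog_pairs]
    return sum(identities) / len(identities)


# Hypothetical aligned ortholog pairs from two genomes.
pairs = [
    ("ACGTACGTAC", "ACGTACGTAC"),  # identical
    ("ACGTACGTAC", "ACGTACGTAT"),  # one mismatch in ten
]
ani = average_nucleotide_identity(pairs)
print(f"ANI = {ani:.1f}%")
```

Because only shared genes enter the average, two genomes can score a high ANI while differing widely in accessory gene content, which is the point made in the bullets above.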

Page 62: Peter Vandamme

ANI based phylogeny

Figure 3

Page 63: Peter Vandamme

Conclusions (1)

– Whole genome sequences can become part of polyphasic taxonomy and the standard description of bacterial species.

– Whole genome sequences provide parameters for a superior reconstruction of organismal phylogeny and for the delineation of species as defined by DNA-DNA hybridization.

– Why hold on to DNA-DNA hybridization level as a standard?

Page 64: Peter Vandamme

Conclusions (2)

– Currently, fewer than 10,000 bacterial species have been described, representing far less than 0.1% of the existing bacterial diversity.

– The present practice of polyphasic taxonomy, as requested by the editorial boards of taxonomic journals, is counterproductive in light of the vast microbial diversity that remains to be described.