rachel adams - smbe euks meeting

Next-generational sequencing for microbial ecology: alpha diversity, beta diversity, and biases in high-throughput sequencing

Rachel Adams, Andrew Rominger, Sara Branco, Thomas Bruns

Post on 17-Jul-2015

Page 1: Rachel Adams - SMBE Euks Meeting

Next-generational sequencing for microbial ecology:

alpha diversity, beta diversity, and biases in high-throughput sequencing

Rachel Adams, Andrew Rominger, Sara Branco, Thomas Bruns

Page 3: Rachel Adams - SMBE Euks Meeting

Understudied but fundamental ecological habitat

Implications for human health

Sick building syndrome

Metrics are practically absent: composition and quantitative characteristics

Need comparison of “typical” buildings and high replication across settings to detect patterns

The microbiome of the built environment

Page 7: Rachel Adams - SMBE Euks Meeting

Architecture

Ventilation

Building function

Environmental setting

Residents

The What and Why of the indoor microbiome

Page 9: Rachel Adams - SMBE Euks Meeting

Fungi in the indoor microbiome, and beyond

Yeasts

Filaments

Saprobes

Page 10: Rachel Adams - SMBE Euks Meeting

Fungi in the indoor microbiome, and beyond

Yeasts

Saprobes

Symbionts: Parasites (−), Mutualists (+)

Page 11: Rachel Adams - SMBE Euks Meeting

Assessing environmental fungi

1. Estimated that 5-20% of fungi grow in culture

2. Identification requires a fungal taxonomist

Page 12: Rachel Adams - SMBE Euks Meeting

Assessing environmental fungi

[Figure: ribosomal DNA region, in order: SSU RNA (18S), ITS1, 5.8S, ITS2, LSU RNA (28S)]

Nuclear ribosomal internal transcribed spacer (ITS) region as a universal DNA barcode marker for Fungi - Schoch et al. 2012

Page 16: Rachel Adams - SMBE Euks Meeting

ACGAGTGCGT ACGCTCGACA AGACGCACTC AGCACTGTAG ATCAGACACG

10^4 to 10^7 sequence reads

High-throughput sequencing has greatly expanded capabilities in microbial ecology

Page 17: Rachel Adams - SMBE Euks Meeting

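[Figure: three samples with alpha diversities α1, α2, α3; pairwise beta diversities β12, β13, β23; overall gamma diversity γ]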

alpha, beta, gamma diversity
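
A minimal sketch of the three quantities on this slide, assuming richness (number of OTUs present) as the diversity measure and Jaccard dissimilarity as the pairwise beta measure. Neither choice is stated on the slide, and the sample contents are made up for illustration.

```python
from itertools import combinations

# Hypothetical presence data: the set of OTUs detected in each sample.
samples = {
    1: {"otu_a", "otu_b", "otu_c"},
    2: {"otu_b", "otu_c", "otu_d"},
    3: {"otu_c", "otu_e"},
}

def alpha(sample):
    """Alpha diversity as richness: number of OTUs in one sample."""
    return len(sample)

def beta(s1, s2):
    """Pairwise beta diversity as Jaccard dissimilarity (0 = identical, 1 = disjoint)."""
    union = s1 | s2
    if not union:
        return 0.0
    return 1.0 - len(s1 & s2) / len(union)

def gamma(all_samples):
    """Gamma diversity as richness of the pooled samples."""
    return len(set().union(*all_samples))

alphas = {i: alpha(s) for i, s in samples.items()}
betas = {(i, j): beta(samples[i], samples[j])
         for i, j in combinations(sorted(samples), 2)}
gamma_total = gamma(samples.values())
print(alphas, betas, gamma_total)
```

Other measures (Shannon for alpha, an abundance-based distance for beta) would slot into the same three functions.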

Page 22: Rachel Adams - SMBE Euks Meeting

Kunin et al. 2010

α_true < α_est

Groundtruthing high-throughput sequencing for alpha richness

Page 23: Rachel Adams - SMBE Euks Meeting

Groundtruthing high-throughput sequencing

Page 24: Rachel Adams - SMBE Euks Meeting

[Figure: True samples (α1, α2, α3), passed through high-throughput sequencing, give Observed samples (α1+, α2+, α3+)]

In terms of diversity, we know that α can be elevated in high-throughput sequenced communities...

Page 25: Rachel Adams - SMBE Euks Meeting

[Figure: True community (α1, α2, α3; β12, β13, β23), passed through high-throughput sequencing, gives Observed community (α1+, α2+, α3+; β12?, β13?, β23?)]

...but how does that change conclusions of ecological processes that are based on β diversity?

Page 26: Rachel Adams - SMBE Euks Meeting

A key component to community ecology: Linking processes to this compositional variation

Adams et al., ISME Journal, 2013

Beta diversity: the variation in species composition among sites

Page 27: Rachel Adams - SMBE Euks Meeting

Do errors that inflate alpha diversity bias conclusions on beta diversity between samples?

Why would it?

• Particular taxa in one environment grouping do not amplify or amplify in a way that skews relative abundance of all others*

• Clustering incorrectly groups divergent taxa or splits identical taxa

Hypothesis: No

While richness/diversity estimations will be off for any given sample, conclusions of beta-diversity will be robust to the errors

Question and hypotheses

Page 30: Rachel Adams - SMBE Euks Meeting

Simulation process

Initial community: Sample 1, Sample 2, …, Sample i (rows) by OTU1, OTU2, …, OTUj (columns)

Simulated community: Sample 1, Sample 2, …, Sample i (rows) by OTU1, OTU2, …, OTUk (columns)

Page 35: Rachel Adams - SMBE Euks Meeting

Simulation process

Initial communities: Expected relative abundance of OTUs

1. Variation in taxon-specific amplification: Biased relative abundance

2. Sequence error: Biased relative abundance + error

3. Clustering OTUs: Biased relative abundance + error + clustering

Simulated communities
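
The simulation steps on this slide can be sketched for a single sample as follows. This is an illustration under stated assumptions, not the authors' model: the beta-distribution parameters are taken from the backup slides (a=0.5, b=1.0 for PCR bias; a=b=0.5 for split location), but applying the bias as a multiplicative efficiency, and collapsing sequence error and clustering into one split per OTU with probability `p_split`, are simplifications. The function names and the example counts are hypothetical.

```python
import random

def amplify(abundances, rng, a=0.5, b=1.0):
    """Taxon-specific amplification: scale each OTU by a Beta(a, b) efficiency.
    The multiplicative form is an assumption."""
    return [x * rng.betavariate(a, b) for x in abundances]

def split_otus(abundances, rng, p_split=0.0334, a=0.5, b=0.5):
    """Sequence error plus clustering, simplified: with probability p_split
    an OTU is split in two at a location drawn from Beta(a, b)."""
    out = []
    for x in abundances:
        if rng.random() < p_split:
            frac = rng.betavariate(a, b)
            out.extend([x * frac, x * (1.0 - frac)])
        else:
            out.append(x)
    return out

def relative(abundances):
    """Convert abundances to relative abundances that sum to 1."""
    total = sum(abundances)
    return [x / total for x in abundances] if total > 0 else list(abundances)

def simulate(true_abundances, seed=0, p_split=0.0334):
    """Initial community -> biased -> split -> simulated relative abundances."""
    rng = random.Random(seed)
    biased = amplify(true_abundances, rng)
    observed = split_otus(biased, rng, p_split=p_split)
    return relative(observed)

true_sample = [500, 200, 100, 50, 25, 10, 5, 1]  # made-up OTU counts
simulated = simulate(true_sample, seed=1, p_split=0.1)
print(len(true_sample), len(simulated))
```

Splitting can only add OTUs, so the simulated sample has at least as many OTUs as the initial one, which mirrors the inflated alpha diversity discussed earlier.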

Page 36: Rachel Adams - SMBE Euks Meeting

Model summary – 2 types of errors

1. Create group differences that aren’t there (Type I error)

[Figure: NMDS ordinations (NMDS1 vs. NMDS2); panels: True, Perceived]

Page 37: Rachel Adams - SMBE Euks Meeting

Model summary – 2 types of errors

2. Lose group differences that are there (Type II error)

[Figure: NMDS ordinations (NMDS1 vs. NMDS2); panels: True, Perceived]

Page 39: Rachel Adams - SMBE Euks Meeting

Model summary output

1. Presence of bias: Statistical categorical differences

Groups     R2     p-value
Location   0.02   0.34
Season     0.20   0.001

2. Degree of bias: percentage difference between true and simulated communities

Normalized error = (Simulated distance – True distance) / True distance

Morisita-Horn distance metric
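
The normalized error on this slide can be sketched as below. The Morisita-Horn formula used here is the standard one computed on relative abundances (distance = 1 minus similarity); the slides do not show the exact implementation, and the example count vectors are made up.

```python
def morisita_horn_distance(x, y):
    """1 - Morisita-Horn similarity for two abundance vectors over the same OTUs."""
    total_x, total_y = sum(x), sum(y)
    if total_x == 0 or total_y == 0:
        raise ValueError("each sample needs at least one read")
    p = [v / total_x for v in x]
    q = [v / total_y for v in y]
    num = 2.0 * sum(pi * qi for pi, qi in zip(p, q))
    den = sum(pi * pi for pi in p) + sum(qi * qi for qi in q)
    return 1.0 - num / den

def normalized_error(true_distance, simulated_distance):
    """(Simulated distance - True distance) / True distance."""
    return (simulated_distance - true_distance) / true_distance

# Hypothetical true samples over three OTUs.
d_true = morisita_horn_distance([10, 10, 0], [0, 10, 10])
# Hypothetical simulated samples in which one OTU has been split in two.
d_sim = morisita_horn_distance([10, 5, 5, 0], [0, 10, 0, 10])
err = normalized_error(d_true, d_sim)
print(d_true, d_sim, err)
```

In this toy case the split makes the two samples look more different than they are, so the normalized error is positive.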

Page 40: Rachel Adams - SMBE Euks Meeting

Categorical differences are robust to high-throughput sequencing errors in alpha diversity, regardless of the underlying patterns of beta-diversity

The degree of bias is not affected by the underlying patterns of beta-diversity but is dependent on community characteristics

Model findings

Page 44: Rachel Adams - SMBE Euks Meeting

[Figure: p values (0.0 to 1.0) for True vs. Simulated communities; panels: No groups, Two groups]

Model summary – Type I & II error

Whether groups are different or the same will not be biased by inflated alpha diversity

Page 45: Rachel Adams - SMBE Euks Meeting

Model summary – Degree of bias

Degree of bias will be affected by

- the error rate of the platform and OTU-clustering

- the gamma diversity of the environment

- the precise shape of the species abundance distribution

But not the relationship among samples

Page 46: Rachel Adams - SMBE Euks Meeting

Increasing probability of sequencing error and over-splitting OTUs increases bias

[Figure: normalized error (0.0 to 0.6) vs. probability of splitting (1e-04, 0.0334, 0.0667, 0.1); panels: No groups, Two groups]

Page 47: Rachel Adams - SMBE Euks Meeting

Increasing OTU richness decreases bias

[Figure: normalized error (0.0 to 0.8) vs. number of OTUs (100, 600, 1100)]

Page 48: Rachel Adams - SMBE Euks Meeting

Shape of species abundance distribution (SAD) affects bias

[Figure: rank-abundance plot; x-axis: Rank (0 to 1200), y-axis: Abundance (0 to 5000)]

Page 49: Rachel Adams - SMBE Euks Meeting

Shape of species abundance distribution (SAD) affects bias

[Figure: normalized error (0.0 to 0.8) vs. increasing SAD variance (1.5, 2.5, 3.5)]

Page 50: Rachel Adams - SMBE Euks Meeting

As true community distance increases, degree of error decreases

[Figure: normalized error (0.2 to 0.6) vs. true distance (0.65 to 0.80)]

Page 51: Rachel Adams - SMBE Euks Meeting

Clustering is the main error-producing step

[Figure: R^2 values (0.0 to 0.5) for True, Amplified, and Split communities; Two groups]

Page 52: Rachel Adams - SMBE Euks Meeting

Simulation overview

Categorical analysis is very robust to high-throughput sequencing errors and biases

Degree of bias will be affected by error rate of the sequencing platform and OTU-clustering, the gamma diversity of the environment, the precise shape of the species abundance distribution

High-throughput error leads to an over-estimation of the difference between groups

Mean bias is ~20-40%

Incorrect OTU clustering accounts for most of that

Page 53: Rachel Adams - SMBE Euks Meeting

Steps

1. In silico: Add further complexity to simulations

2. In vitro: Empirically test artificially-created microbial communities

Page 54: Rachel Adams - SMBE Euks Meeting

Do errors that inflate alpha diversity bias conclusions on beta diversity between samples?

Why would it?

• Particular taxa in one environment grouping do not amplify or amplify in a way that skews relative abundance of all others*

• Clustering incorrectly groups divergent taxa or splits identical taxa

Hypothesis: No

While richness/diversity estimations will be off for any given sample, conclusions of beta-diversity will be robust to the errors

Question and hypotheses

Page 55: Rachel Adams - SMBE Euks Meeting

Air samples in a mycology classroom: a unique source distorts perceived species richness

Page 57: Rachel Adams - SMBE Euks Meeting

Mycology classroom appears to be less rich than other classrooms…

[Figure: Chao estimated richness (0 to 1000) vs. individuals (0 to 8000) for classrooms A to E]
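
A minimal sketch of a Chao richness estimate from per-OTU read counts, as plotted on this slide. It assumes the bias-corrected Chao1 form, which stays defined when there are no doubletons; the slide does not say which variant was used, and the counts are made up.

```python
def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)),
    where F1 and F2 are the numbers of singleton and doubleton OTUs."""
    observed = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    return observed + f1 * (f1 - 1) / (2.0 * (f2 + 1))

# Hypothetical read counts per OTU in one air sample.
example_counts = [10, 9, 8, 1, 1, 1, 2, 2]
estimate = chao1(example_counts)
print(estimate)
```

Because the estimate is driven by singletons and doubletons, a sample whose reads are taken up by a few dominant taxa has fewer reads left to detect rare OTUs at a given depth.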

Page 58: Rachel Adams - SMBE Euks Meeting

… but has higher biomass

[Figure: Penicillium spore equivalents (0 to 200) by classroom (A to E)]

Page 59: Rachel Adams - SMBE Euks Meeting

Composition of non-mycology classrooms is similar

[Figure: proportion (0 to 100) of taxa by classroom (A to E)]

Page 60: Rachel Adams - SMBE Euks Meeting

Mycology classroom dominated by a few taxa

[Figure: proportion (0 to 100) of taxa by classroom (A to E)]

Page 61: Rachel Adams - SMBE Euks Meeting

Puffballs dominate mycology classroom

Pisolithus, aka dog turd fungus

Battarrea, tall stiltball

Lycoperdon, common puffball

Page 62: Rachel Adams - SMBE Euks Meeting

Mycology classroom dominated by a few taxa

[Figure: proportion (0 to 100) of taxa by classroom (A to E), with asterisk annotations]

Adams et al., in review

Page 65: Rachel Adams - SMBE Euks Meeting

Beta diversity of mycology classroom: distinct communities

[Figure: NMDS ordination (NMDS1 vs. NMDS2); series: Observed, Taxonomy reassigned, Abundance reassigned]

Page 66: Rachel Adams - SMBE Euks Meeting

Conclusions

• While deciphering alpha diversity is problematic:

- Inflated alpha due to sequence error & clustering

- Deflated alpha due to unevenness

beta diversity calculations are robust to these errors in high-throughput sequencing

• Empirical test will be used to corroborate conclusions of in silico simulations

• High-throughput sequencing will continue to be a promising tool for microbial ecologists

Page 70: Rachel Adams - SMBE Euks Meeting

References – potential biases in high-throughput sequencing

DNA extraction: Frostegard et al Appl Environ Microbiol 1999; DeSantis et al FEMS Microbiology 2005; Feinstein et al Appl Environ Microbiol 2009; Morgan et al PLoS ONE 2010; Delmont et al Appl Environ Microbiol 2011

PCR amplification/Relative abundance: Amend et al Mol Ecol 2010; Engelbrektson et al ISME Journal 2010; Bellemain et al BMC Microbiol 2010; Schloss et al PLoS ONE 2011; Pinto & Raskin PLoS ONE 2012; Klindworth et al Nucleic Acids Res 2013

Sequencing error/Chimeras/OTU clustering: Huse et al Genome Biol 2007; Huse et al Environ Microbiol 2010; Kunin et al Environ Microbiol 2010; Quince et al BMC Bioinformatics 2010; Lee et al PLoS ONE 2012; Pinto & Raskin PLoS ONE 2012; Bachy et al ISME Journal 2013

Sequencing platform/protocol: Morgan et al PLoS ONE 2010; Luo et al PLoS ONE 2012

Even sampling depth: Schloss et al PLoS ONE 2011; Gihring et al Environ Microbiol 2012

Denoising: Gaspar & Thomas PLoS ONE 2013

Page 71: Rachel Adams - SMBE Euks Meeting

Empirical test of simulation results

[Figure: normalized error (0.0 to 0.8) vs. number of OTUs (100, 600, 1100)]

Page 72: Rachel Adams - SMBE Euks Meeting

PCR bias

[Figure, first panel: PCR bias: beta distribution with a=0.5, b=1.0; y-axis: Density (0.0 to 2.0)]

[Figure, second panel: Scatter around line of true abundance versus amplified abundance; x-axis: True abundance (0 to 1200), y-axis: Amplified abundance (0 to 1400)]

Page 73: Rachel Adams - SMBE Euks Meeting

OTU splitting bias

[Figure, first panel: Split bias: binomial distribution with n=100; x-axis: Number of splits (0 to 20), y-axis: Density (0.0 to 0.4); curves: p=0.001, p=0.0667, p=0.0334, p=0.0001]

[Figure, second panel: Split location: beta distribution with a=b=0.5; x-axis: Location of split (0.0 to 1.0), y-axis: Density (0.0 to 1.2)]
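
The split-count panel on this slide is a binomial distribution with n=100. A minimal sketch that tabulates it with the Python standard library, assuming the four p values listed on the slide; what the 100 trials correspond to in the model is not stated in the slides.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k splits in n trials with split probability p."""
    return comb(n, k) * p**k * (1.0 - p)**(n - k)

n = 100
expected_splits = {}
for p in (0.0001, 0.001, 0.0334, 0.0667):
    pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]
    # The mean of a binomial distribution is n * p.
    expected_splits[p] = sum(k * w for k, w in enumerate(pmf))
    print(p, round(expected_splits[p], 4))
```

At the smallest p almost no splits are expected, while at p=0.0667 the expected number is close to 7, which matches the spread of the curves along the 0 to 20 axis.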