introduction tointroduction to metagenomics9/15/2008 1 introduction tointroduction to metagenomics...

19
9/15/2008 1 Introduction to Introduction to Metagenomics Phil Hugenholtz Phil Hugenholtz Microbial Ecology Program History of the “meta” part Cell genomic cartoon Carl Woese

Upload: others

Post on 26-May-2020

30 views

Category:

Documents


0 download

TRANSCRIPT

9/15/2008

1

Introduction toIntroduction to

Metagenomics

Phil HugenholtzPhil Hugenholtz

Microbial Ecology Program

History of the “meta” part

Cell genomic cartoon

Carl Woese

9/15/2008

2

Falk Warnecke

9/15/2008

3

tame wild

most bacteria don’t grow on plates“the great plate count anomaly”

Norm Pace

9/15/2008

4

extraction

whole–cell hybridization

environmental sample

bulk DNA / RNA

PCR

cloningprobe design

nucleic acid hybridization

community rRNA / rDNA

nucleic acid probes

phylogenetic trees

sequencingcomparative analysis

rRNA / rDNA sequences and database

rRNA / rDNA clones

Bacteria,1998

9/15/2008

5

extraction

whole–cell hybridization

environmental sample

bulk DNA / RNA

PCR

cloningprobe design

nucleic acid hybridization

community rRNA / rDNA

nucleic acid probes

phylogenetic trees

sequencingcomparative analysis

rRNA / rDNA sequences and database

rRNA / rDNA clones

genomic

genomic

isolate community

Genomics Metagenomics

sequencing

9/15/2008

6

The catch…. it comes in small piecesThe catch…. it comes in small pieces

Average Sanger read length - 750 bases

Must assemble the reads together

Giant jigsaw puzzles

Metagenome assembly - like putting together several jigsaw puzzles

Falk Warnecke, JGI

9/15/2008

7

. . . with some pieces missing

Falk Warnecke, JGI

Can we still reconstruct?

Falk Warnecke, JGI

9/15/2008

8

Can we still reconstruct?

Falk Warnecke, JGI

Making metagenomes

Sample Extract DNA Shotgun clone High throughput sequence

3, 8, 40 kb

Assemble reads Call genes Bin fragments

9/15/2008

9

Will this really work??

need to test whole genome shot-gun sequencing on a very simple community to see if it will work!

pink biofilm community

Archaea

Sulfobacillus

ArchaeaLeptospirillum

10%4%

1%

Eucarya

85%

FISH countsTyson et al., 2004 Nature 428:37-43

9/15/2008

10

MetabolismExpected

proton pumps for pH

metal resistance genes

novel cytochromes for Fe2+ oxidation2chemotaxis genes for Fe2+ and O2

genes for CO2 and N2 fixation*

lots of amino acid and sugar transporters in the archaea

Unexpected* l t f if*only one set of nif genes

DNA photolyase genes

cellulose synthase in Leptospirillum -biofilm bouyancy

Fine-scale population structure

Shah et al (2005) BMC Bioinformatics 6:29Shah et al., (2005) BMC Bioinformatics 6:29

9/15/2008

11

Gross-scale population structure

- align metagenomic reads against reference g g ggenomes or genome fragments

- color-code to enhance visualization

- recently implemented in IMG/M

from Fig. 2 Coleman et al. (2006) Science 311(5768):17

does it work with more complex communities?

9/15/2008

12

Acid mine drainage Sargasso Sea Soil

1 10 100 1000 10000Species complexity

Susannah Tringe

90

100

30

40

50

60

70

80

% Sequence of Reads

0

10

20

Acid Mine DrainageBiofilm

Sargasso Sea Soil

Environmental Sample (Complexity)

(Low)(Moderate) (High)

Gene Tyson

9/15/2008

13

Acid mine drainage Sargasso Sea Soil

1 10 100 1000 10000Species complexity

?Susannah Tringe

Environmental Gene Tags (EGT)

Identify genes in sequence data (contigs to unassembled reads) from multipleenvironmental samples

Assign genes to their gene family, or higher level groupings

Compare relative abundance of different gene families according to habitat

9/15/2008

14

A

Adaptive gene for habitat AAdaptive gene for habitat BEssential gene

B

Environmental Gene Tags(EGTs)

Susannah Tringe

proteorhodopsins

hypotheticalsTringe et al. Science Mar 2005

9/15/2008

15

Tringe et al. Science Mar 2005

K+ transportNa+ transport

Tringe et al. Science Mar 2005

photosynthesis

antibiotics

9/15/2008

16

Snapshot studies

xspace

~90% of all current and pending metagenomic

j t

time

projects

Spatial series studiesHypersaline mat

xxxxx

space

xx

time

9/15/2008

17

Temporal series studies

Ventilator associated pneumonia

x x x x x x xspace

time

New sequencing technologies

Sample Extract DNA Shotgun clone High throughput sequence

pyrosequencing

3, 8, 40 kb

Assemble reads Call genes Bin fragments

9/15/2008

18

454 pyrosequencer

Sanger GS20 FLX XLR0.07 Mbp 35 Mbp 100 Mbp 400 Mbp per run700 bp 100 bp 200 bp 400 bp reads0.1 ¢ 0.03 ¢ 0.01¢ 0.003 ¢ per base

Mega-metagenomics

600 Mbp alone!

Future?????

9/15/2008

19

ProteinsMetaproteomics

Unexplored territory

3 3 3

Enhancing “basic” metagenomics

DNAMetagenomics

RNAMetatranscriptomics

AUG

AUG

AUG

AUG

AUG

AUG

1

2

3

1

2

3

1

2

3

(WGA) WGA

AUG

AUG

AUG

AUG

AUG

AUG

AUG

AUG

AUG

AUG

AUG

AUG

AUG

AUG

SIP

SIP

Cells

Microbialcommunity

Enrichedpopulation

Singlecell

Basic metagenomicapproach

( )

ab