Sequencing the World of Possibilities for Energy & Environment Annotation: function prediction and metabolic reconstruction Thanos Lykidis Genome Biology Program DOE-Joint Genome Institute [email protected]

Post on 13-Jan-2016

Page 1

Sequencing the World of Possibilities for Energy & Environment

Annotation: function prediction and metabolic reconstruction

Thanos Lykidis

Genome Biology Program

DOE-Joint Genome Institute

[email protected]

Page 2

Two main goals of genome analysis:

• Evolutionary analysis
  – How does an organism compare to the rest?

• Metabolic reconstruction
  – What can an organism do and how?

Page 3

Metabolic reconstruction

• Predict the biochemistry and physiology of an organism based on its genome sequence

• Explain known biochemical and physiological properties

Page 4

To do metabolic reconstruction we need to “annotate” the genome:

• Find the genes

• Understand (predict) what these genes are doing

Page 5

The “same-gene” problem

Page 6

Metabolic reconstruction: gene function

• Experiment
  – enzyme assays
  – mutants

• Computation
  – sequence comparison
    • BLAST, phylogenomics
    • protein family (Pfam, COG, InterPro)
  – chromosomal context
  – fusion

Page 7

Similarity-based annotation

Page 8

Similarity-based annotation

Page 10

The dgk example

Page 12

Extensive distribution of the dgk protein family

Page 17

Flow chart of the reconstruction process

Gene → Annotation → Reaction → Pathway
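The flow from gene to pathway can be read as a chain of lookups. The sketch below is only an illustration of that idea: every gene identifier, lookup table, and function name in it is a hypothetical placeholder, not a real annotation or a real database schema.

```python
from collections import defaultdict

# Hypothetical lookup tables; none of these entries come from a real genome.
GENE_TO_FUNCTION = {
    "gene_1": "choline kinase",
    "gene_2": "phosphocholine cytidylyltransferase",
    "gene_3": "hypothetical protein",
}
FUNCTION_TO_REACTION = {
    "choline kinase": "choline -> phosphocholine",
    "phosphocholine cytidylyltransferase": "phosphocholine -> CDP-choline",
}
REACTION_TO_PATHWAY = {
    "choline -> phosphocholine": "phosphatidylcholine biosynthesis",
    "phosphocholine -> CDP-choline": "phosphatidylcholine biosynthesis",
}


def reconstruct(gene_to_function):
    """Follow gene -> annotation -> reaction -> pathway for every gene.

    Returns a dict mapping each pathway to a list of (gene, reaction) pairs.
    Genes whose annotation maps to no known reaction are skipped.
    """
    pathways = defaultdict(list)
    for gene, function in sorted(gene_to_function.items()):
        reaction = FUNCTION_TO_REACTION.get(function)
        if reaction is None:
            continue  # annotation carries no reaction (e.g. hypothetical protein)
        pathway = REACTION_TO_PATHWAY.get(reaction)
        if pathway is not None:
            pathways[pathway].append((gene, reaction))
    return dict(pathways)


result = reconstruct(GENE_TO_FUNCTION)
for pathway, members in result.items():
    print(pathway, members)
```

Keeping the three tables separate mirrors the flow chart: a change in one annotation propagates to reactions and pathways without touching the other mappings.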

Page 18

Treponema pallidum is an uncultivated pathogenic bacterium.

Fitzgerald TJ et al., J. Bacteriol. 130:1333 (1977).

TP0671

Page 22

A, PIS bacterial

B, PIS eukaryotic

C, CLS eukaryotic

D, PGS

E, PSS

F, PCS

G, unknown

H, CPT/EPT

I, unknown

Page 23

Group H contains eukaryotic CPT/EPT

[Figure: phylogenetic tree of group H. Taxa labeled: A. aeolicus, C. reinhardtii, C. intestinalis, D. melanogaster, H. sapiens, C. elegans, A. thaliana, S. cerevisiae, T. denticola, T. pallidum, N. aromaticivorans, S. coelicolor, S. avermitilis. Node values shown: 670, 99, 86.]

Page 24

Based on the BLAST hits we get a hint that TP0671 is a CEPT

CDP-Cho + DAG → PtdCho (CPT; Eukarya)

CDP-Etn + DAG → PtdEtn (EPT; Eukarya)
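As a rough illustration of how BLAST hits yield such a hint, the sketch below picks the top-scoring hit per query from BLAST+ tabular output (`-outfmt 6`, default column order). The hit rows are invented for the example (the subject names and all scores are placeholders, not real results for TP0671), and the E-value cutoff is an arbitrary assumption. A best hit is only a starting hypothesis, which is why the prediction still has to be checked against metabolism.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(tabular_text, max_evalue=1e-5):
    """Return {query: (subject, evalue, bitscore)} for the top-scoring hit.

    Hits with an E-value above max_evalue (an assumed cutoff) are ignored.
    """
    best = {}
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(FIELDS, row))
        evalue = float(hit["evalue"])
        bitscore = float(hit["bitscore"])
        if evalue > max_evalue:
            continue
        current = best.get(hit["qseqid"])
        if current is None or bitscore > current[2]:
            best[hit["qseqid"]] = (hit["sseqid"], evalue, bitscore)
    return best


# Invented rows for illustration only; not real BLAST output.
SAMPLE = "\n".join("\t".join(r) for r in [
    ["TP0671", "subject_A", "35.0", "200", "130", "2", "10", "209", "15", "214", "2e-30", "120"],
    ["TP0671", "subject_B", "28.0", "180", "129", "3", "20", "199", "25", "204", "1e-12", "65.5"],
    ["TP0671", "subject_C", "22.0", "90", "70", "1", "5", "94", "8", "97", "0.5", "30.1"],
])

best = best_hits(SAMPLE)
print(best)
```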

Page 25

A functional prediction has to make sense in the context of metabolism

Page 26

Pathway

What is a pathway?

A sequence of reactions transforming one metabolite to another

choline → phosphocholine → CDP-choline → phosphatidylcholine

Enzymes, in order: choline kinase; phosphocholine cytidylyltransferase; phosphatidyltransferase

Everything should come together

Page 30

All genes of the pathway are present

choline → phosphocholine → CDP-choline → phosphatidylcholine

Enzymes, in order: choline kinase; phosphocholine cytidylyltransferase; phosphatidyltransferase
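Checking that all genes of a pathway are present amounts to comparing the enzymes each step requires with the functions annotated in the genome. A minimal sketch follows, using the three steps named on the slide; the function names and the example annotation sets are assumptions made for illustration.

```python
# Each step: (substrate, product, enzyme), taken from the pathway on the slide.
PATHWAY_STEPS = [
    ("choline", "phosphocholine", "choline kinase"),
    ("phosphocholine", "CDP-choline", "phosphocholine cytidylyltransferase"),
    ("CDP-choline", "phosphatidylcholine", "phosphatidyltransferase"),
]


def is_connected(steps):
    """True if every step's product is the substrate of the next step."""
    return all(steps[i][1] == steps[i + 1][0] for i in range(len(steps) - 1))


def missing_steps(steps, annotated_functions):
    """Return the enzymes the pathway needs that the genome does not annotate."""
    return [enzyme for _, _, enzyme in steps if enzyme not in annotated_functions]


# Hypothetical annotation sets for two imaginary genomes.
complete_genome = {
    "choline kinase",
    "phosphocholine cytidylyltransferase",
    "phosphatidyltransferase",
}
partial_genome = {"choline kinase", "phosphatidyltransferase"}

print(is_connected(PATHWAY_STEPS))
print(missing_steps(PATHWAY_STEPS, complete_genome))
print(missing_steps(PATHWAY_STEPS, partial_genome))
```

An empty result supports the prediction; a non-empty one points to either a wrong annotation or a gene that similarity searches failed to find.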

Page 31

Reconstruction of phospholipid biosynthesis in Treponema pallidum

[Pathway diagram. Metabolites: PtdOH, CDP-DAG, DAG, PtdSer, PtdEtn, PtdGlc, CL, PtdCho; Cho/Etn, P-Cho/P-Etn, CDP-Cho/CDP-Etn. Enzymes: PSS, PSD, PGS, CLS. Gene label: TP0107 (shown twice).]

Page 32

Working with no similarity

A → B → C (steps catalyzed by Enzyme 1 and Enzyme x)

Page 33

The plsX-plsY pathway

Page 34

Scoring phylogenetic profile similarity

Page 35

Scoring phylogenetic profile similarity

Page 36

Scoring phylogenetic profile similarity
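The slides do not state which score is used, so the following is only one plausible way to score profile similarity. It assumes a phylogenetic profile is a presence/absence vector over the same ordered list of genomes and compares two profiles with the Jaccard index and a simple matching fraction. The gene names and profiles are made up.

```python
def jaccard(p, q):
    """Jaccard index: genomes containing both genes / genomes containing either."""
    if len(p) != len(q):
        raise ValueError("profiles must cover the same genomes")
    both = sum(1 for a, b in zip(p, q) if a and b)
    either = sum(1 for a, b in zip(p, q) if a or b)
    return both / either if either else 0.0


def matching_fraction(p, q):
    """Fraction of genomes where the two genes agree (both present or both absent)."""
    if len(p) != len(q) or not p:
        raise ValueError("profiles must be non-empty and of equal length")
    return sum(1 for a, b in zip(p, q) if a == b) / len(p)


# Hypothetical presence (1) / absence (0) profiles across six genomes.
PROFILES = {
    "gene_a": [1, 1, 0, 1, 0, 1],
    "gene_b": [1, 1, 0, 1, 0, 0],
    "gene_c": [0, 0, 1, 0, 1, 0],
}


def rank_partners(query, profiles):
    """Rank all other genes by Jaccard similarity to the query profile."""
    scores = [(name, jaccard(profiles[query], prof))
              for name, prof in profiles.items() if name != query]
    return sorted(scores, key=lambda item: item[1], reverse=True)


ranking = rank_partners("gene_a", PROFILES)
print(ranking)
```

Genes that co-occur across many genomes rank near the top and become candidates for the same pathway, even when they share no sequence similarity with any characterized enzyme.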

Page 37

Clustering of fatty acid biosynthesis genes

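[Figure: fatty acid biosynthesis pathway and gene cluster in S. pneumoniae.]

Metabolites: acetyl-CoA, malonyl-CoA, malonyl-ACP, β-ketoacyl-ACP, β-hydroxyacyl-ACP, trans-2-enoyl-ACP, cis-2-enoyl-ACP, (s)-acyl-ACP, (u)-acyl-ACP

Pathway enzymes: acc, fabD, fabH, fabF, fabG, fabZ, fabA, fabI, fabK, fabM

Gene cluster labels (order not recoverable from the extraction): fabZ, accA, accD, accC, accB, fabF, fabG, fabD, fabK, acpP, fabH, HTH, fabM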
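Chromosomal clustering of this kind can be detected by grouping neighboring genes that lie on the same strand and are separated by short intergenic gaps. The sketch below shows that idea under stated assumptions: the gene names and coordinates are invented, and the 300 bp gap threshold is an arbitrary choice, not a value from the slides.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    start: int
    end: int
    strand: str  # "+" or "-"


def find_clusters(genes, max_gap=300, min_size=2):
    """Group adjacent same-strand genes whose intergenic gap is <= max_gap.

    Returns a list of clusters, each a list of gene names in chromosomal order.
    Runs shorter than min_size are not reported.
    """
    ordered = sorted(genes, key=lambda g: g.start)
    clusters, current = [], []
    for gene in ordered:
        if (current
                and gene.strand == current[-1].strand
                and gene.start - current[-1].end <= max_gap):
            current.append(gene)
        else:
            if len(current) >= min_size:
                clusters.append([g.name for g in current])
            current = [gene]
    if len(current) >= min_size:
        clusters.append([g.name for g in current])
    return clusters


# Invented coordinates for illustration only.
GENES = [
    Gene("g1", 100, 900, "+"),
    Gene("g2", 950, 1800, "+"),
    Gene("g3", 1850, 2600, "+"),
    Gene("g4", 5000, 5900, "+"),
    Gene("g5", 5950, 6500, "-"),
    Gene("g6", 6550, 7000, "-"),
]

clusters = find_clusters(GENES)
print(clusters)
```

An unannotated gene that falls inside a cluster of known pathway genes is a natural candidate for a missing step in that pathway.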

Page 38

Current Status of annotation

~50-80% precise, accurate predictions

~10-30% “twilight zone” predictions

~10-30% genome-specific genes

Page 39

Metabolic reconstruction: inferring physiology from sequence