Finding detailed relationships between proteins specific to phenotypes among microbial organisms Daniel Park Molecular Biology Institute, UCLA Yeates lab SoCalBSI August 24, 2006

Page 1: Finding detailed relationships between proteins specific to phenotypes among microbial organisms

Finding detailed relationships between proteins specific to phenotypes among microbial organisms

Daniel Park

Molecular Biology Institute, UCLA

Yeates lab

SoCalBSI

August 24, 2006

Page 2: Finding detailed relationships between proteins specific to phenotypes among microbial organisms

OUTLINE

• Phylogenetic profiles

• Ternary logic analysis

• Building COG & phenotype profiles

• Results of logic analysis

Page 3: Finding detailed relationships between proteins specific to phenotypes among microbial organisms

OUTLINE

• Phylogenetic profiles

• Ternary logic analysis

• Building COG & phenotype profiles

• Results of logic analysis

Page 4: Finding detailed relationships between proteins specific to phenotypes among microbial organisms

PHYLOGENETIC PROFILES

• Turning an earlier question on its side:

• From, “What proteins are found in a genome?”

• To, “What genomes contain a given protein?”
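The change of question on this slide can be sketched in a few lines of Python. This is only an illustration: the genome names, the protein names, and the helper functions `invert` and `profile` are hypothetical and do not come from the talk.

```python
from collections import defaultdict

# Hypothetical toy input: genome -> set of protein families found in it.
genome_to_proteins = {
    "genome_1": {"P1", "P2"},
    "genome_2": {"P2", "P3"},
    "genome_3": {"P1", "P2", "P3"},
}

def invert(genome_to_proteins):
    """Turn 'what proteins are in a genome?' into 'what genomes contain a protein?'."""
    protein_to_genomes = defaultdict(set)
    for genome, proteins in genome_to_proteins.items():
        for protein in proteins:
            protein_to_genomes[protein].add(genome)
    return dict(protein_to_genomes)

def profile(protein, protein_to_genomes, genome_order):
    """Presence/absence vector (1/0) of one protein across an ordered genome list."""
    present = protein_to_genomes.get(protein, set())
    return [1 if g in present else 0 for g in genome_order]

genome_order = sorted(genome_to_proteins)
protein_to_genomes = invert(genome_to_proteins)
p1_profile = profile("P1", protein_to_genomes, genome_order)
```

Each protein ends up with one fixed-length 0/1 vector, so any two proteins can be compared position by position across the same genomes.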

Page 5: Finding detailed relationships between proteins specific to phenotypes among microbial organisms

VARIATIONS OF PHYLOGENETIC PROFILES

• Relationships between protein families

• Relationships between protein family profile and given target ‘phenotype’ profile

Page 6: Finding detailed relationships between proteins specific to phenotypes among microbial organisms

OUTLINE

• Phylogenetic profiles

• Ternary logic analysis

• Building COG & phenotype profiles

• Results of logic analysis

Page 7: Finding detailed relationships between proteins specific to phenotypes among microbial organisms

COMPLEXITY OF CELLULAR PROCESSES

Page 8: Finding detailed relationships between proteins specific to phenotypes among microbial organisms

HIGHER ORDER RELATIONSHIPS: TERNARY LOGIC ANALYSIS

A B

Page 9: Finding detailed relationships between proteins specific to phenotypes among microbial organisms

8 LOGIC TYPES FOR PHYLOGENETIC PROFILE TRIPLETS
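The transcript does not list the eight logic types, so the sketch below shows only one plausible way to arrive at that number. It assumes the types are the two-input Boolean functions that depend on both inputs, with f(a, b) and f(b, a) counted as the same type. That counting rule is an assumption, not something stated on the slide.

```python
from itertools import product

# Input order for every truth table: (a, b) = (0,0), (0,1), (1,0), (1,1).
INPUTS = list(product((0, 1), repeat=2))

def depends_on_both(table):
    """True if the output changes with a for some b, and with b for some a."""
    f = dict(zip(INPUTS, table))
    on_a = any(f[(0, b)] != f[(1, b)] for b in (0, 1))
    on_b = any(f[(a, 0)] != f[(a, 1)] for a in (0, 1))
    return on_a and on_b

def swap_inputs(table):
    """Truth table of g(a, b) = f(b, a)."""
    f = dict(zip(INPUTS, table))
    return tuple(f[(b, a)] for a, b in INPUTS)

def logic_types():
    """Enumerate all 16 truth tables and keep one representative per type."""
    classes = set()
    for table in product((0, 1), repeat=4):
        if depends_on_both(table):
            # Assumption: a and b are unordered, so f(a, b) and f(b, a) are one type.
            classes.add(min(table, swap_inputs(table)))
    return sorted(classes)

types = logic_types()
```

Under these assumptions the 16 possible truth tables reduce to 8: the constants and the single-input functions drop out, and the two asymmetric pairs (such as a AND NOT b versus b AND NOT a) merge.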

Page 10: Finding detailed relationships between proteins specific to phenotypes among microbial organisms

MEASURING MUTUAL INFORMATION BETWEEN TWO PROFILES

U(x|y) = [H(x) + H(y) - H(x,y)] / H(x)

Where U is the uncertainty coefficient relating profiles x and y, and H is the Shannon entropy of the probability distributions

Range of U: [0,1]. Ex. U = 0.88 means an 88% decrease in the uncertainty of x once y is known

High value of U indicates high mutual information between x and y
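A minimal sketch of the uncertainty coefficient for two presence/absence profiles, with entropies estimated from the observed frequencies. The function names are illustrative, and returning 0.0 for a constant profile x (where H(x) = 0) is an assumption made here to avoid dividing by zero.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    n = len(values)
    return -sum((k / n) * log2(k / n) for k in Counter(values).values())

def uncertainty_coefficient(x, y):
    """U(x|y) = [H(x) + H(y) - H(x,y)] / H(x) for two equal-length profiles."""
    if len(x) != len(y):
        raise ValueError("profiles must have the same length")
    hx = entropy(x)
    if hx == 0:
        return 0.0  # assumption: a constant profile has no uncertainty to explain
    hxy = entropy(list(zip(x, y)))  # joint entropy over paired observations
    return (hx + entropy(y) - hxy) / hx

identical = uncertainty_coefficient([1, 1, 0, 0], [1, 1, 0, 0])
unrelated = uncertainty_coefficient([1, 1, 0, 0], [1, 0, 1, 0])
```

Identical profiles sit at the top of the range and statistically independent profiles at the bottom, which matches the [0,1] range given on the slide.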

Page 11: Finding detailed relationships between proteins specific to phenotypes among microbial organisms

MEASURING MUTUAL INFORMATION AMONG THREE PROFILES

U(c | f(a,b)) where f(a,b) is the logical combination of a and b

Constraints:

U(c|a) < x

U(c|b) < x

U(c|f(a,b)) > y

(here x and y are score thresholds, not profiles)
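The three constraints can be checked directly once U is available. The sketch below is an illustration under stated assumptions: `low` and `high` stand in for the thresholds x and y on the slide, the values 0.5 and 0.8 are arbitrary, and the profiles are toy data built so that c equals a AND NOT b.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    n = len(values)
    return -sum((k / n) * log2(k / n) for k in Counter(values).values())

def U(x, y):
    """Uncertainty coefficient U(x|y); 0.0 for a constant x (an assumption)."""
    hx = entropy(x)
    if hx == 0:
        return 0.0
    return (hx + entropy(y) - entropy(list(zip(x, y)))) / hx

def is_ternary_hit(c, a, b, f, low, high):
    """True if neither a nor b alone explains c, but f(a, b) does."""
    combined = [f(ai, bi) for ai, bi in zip(a, b)]
    return U(c, a) < low and U(c, b) < low and U(c, combined) > high

def and_not(a, b):
    """Logic function a AND NOT b on single 0/1 values."""
    return int(a == 1 and b == 0)

# Toy profiles over eight hypothetical genomes.
a = [1, 1, 0, 0, 1, 1, 0, 0]
b = [1, 0, 1, 0, 1, 0, 1, 0]
c = [and_not(ai, bi) for ai, bi in zip(a, b)]

hit = is_ternary_hit(c, a, b, and_not, low=0.5, high=0.8)
```

The point of the two low-score constraints is to keep only triplets where the combination carries information that neither pairwise comparison would reveal.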

Page 12: Finding detailed relationships between proteins specific to phenotypes among microbial organisms

OUTLINE

• Phylogenetic profiles

• Ternary logic analysis

• Building COG & phenotype profiles

• Results of logic analysis

Page 13: Finding detailed relationships between proteins specific to phenotypes among microbial organisms

COGs: CLUSTERS OF ORTHOLOGOUS GROUPS

Set of orthologous proteins from at least three different lineages

Cluster Functional group

Page 14: Finding detailed relationships between proteins specific to phenotypes among microbial organisms

COMBINATIONS OF COG PROFILES MATCHING A PHENOTYPE

Page 15: Finding detailed relationships between proteins specific to phenotypes among microbial organisms

ASSOCIATING MORE GENOMES WITH COGS

[Figure: bar chart of the no. of fully sequenced bacterial genomes over the last 9 years. x-axis: Years (1997, 2003, 2006); y-axis: No. of bacterial genomes (0 to 400). Labeled values include 66 and 354.]

Page 16: Finding detailed relationships between proteins specific to phenotypes among microbial organisms

BUILDING COG PROFILES

• 81,480 proteins

• 354 bacterial genomes

• 4,613 COGs
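One way to picture the resulting data structure is a binary matrix with one row per COG and one column per genome. The sketch below assumes the input is a list of (protein, genome, COG) assignments. That input format, the identifiers, and the function name `build_cog_profiles` are all hypothetical.

```python
def build_cog_profiles(assignments, genomes):
    """Build {cog_id: [0/1 per genome]} from (protein_id, genome, cog_id) triples."""
    column = {genome: i for i, genome in enumerate(genomes)}
    profiles = {}
    for _protein_id, genome, cog_id in assignments:
        row = profiles.setdefault(cog_id, [0] * len(genomes))
        row[column[genome]] = 1  # the COG is present if any member protein is
    return profiles

# Hypothetical toy data, not taken from the talk.
genomes = ["g1", "g2", "g3"]
assignments = [
    ("prot_001", "g1", "COG_A"),
    ("prot_002", "g1", "COG_B"),
    ("prot_003", "g2", "COG_A"),
    ("prot_004", "g3", "COG_B"),
    ("prot_005", "g3", "COG_B"),  # a second COG_B protein in the same genome
]

cog_profiles = build_cog_profiles(assignments, genomes)
```

Several proteins from one genome in the same COG collapse to a single 1, which is why tens of thousands of proteins reduce to a few thousand profile rows.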

Page 17: Finding detailed relationships between proteins specific to phenotypes among microbial organisms

BUILDING PHENOTYPE PROFILES

http://www.ncbi.nlm.nih.gov/genomes/lproks.cgi

Page 18: Finding detailed relationships between proteins specific to phenotypes among microbial organisms
Page 19: Finding detailed relationships between proteins specific to phenotypes among microbial organisms

OUTLINE

• Phylogenetic profiles

• Ternary logic analysis

• Building COG & phenotype profiles

• Results of logic analysis

Page 20: Finding detailed relationships between proteins specific to phenotypes among microbial organisms

Cumulative no. of protein triplets recovered at an uncertainty coefficient score greater than a given threshold
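The quantity plotted on this slide can be computed as below. This is a sketch with made-up scores; the function name `cumulative_counts` and the threshold grid are assumptions.

```python
def cumulative_counts(scores, thresholds):
    """For each threshold, count the triplets whose U score exceeds it."""
    return {t: sum(1 for s in scores if s > t) for t in thresholds}

# Hypothetical U(c | f(a,b)) scores for six triplets.
scores = [0.35, 0.52, 0.61, 0.61, 0.74, 0.90]
counts = cumulative_counts(scores, thresholds=[0.3, 0.5, 0.6, 0.8])
```

The counts can only fall or stay level as the threshold rises, so the curve shows how quickly candidate triplets thin out at stricter cutoffs.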

Page 21: Finding detailed relationships between proteins specific to phenotypes among microbial organisms

Frequency for each of the eight logic function types observed

Page 22: Finding detailed relationships between proteins specific to phenotypes among microbial organisms

CORRELATIONS WITH PHENOTYPES: TEMPERATURE RANGE

• For U > 0.8, one relationship between proteins was found:

Hyperthermophilicity = and( COG0432, !COG0225 )

U ( Hyp. | COG0432 ) = 0.26

U ( Hyp. | COG0225 ) = 0.29

U ( Hyp. | and( COG0432, !COG0225 ) ) = 0.71

[S] COG0432: Uncharacterized conserved protein

[O] COG0225: Peptide methionine sulfoxide reductase

Page 23: Finding detailed relationships between proteins specific to phenotypes among microbial organisms

LOGICAL COMBINATION OF COG PROFILES MATCHING A PHENOTYPE PROFILE

c = hyperthermophilicity

f = and( COG0432, !COG0225 )

a = COG0432 (Uncharacterized conserved protein)

b = !COG0225 (Peptide methionine sulfoxide reductase)
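To make the matching concrete, the rule can be applied genome by genome and compared with the phenotype profile. In the sketch below the presence/absence values and genome names are invented for illustration and say nothing about the real distribution of these COGs. The helper names are hypothetical as well.

```python
def apply_rule(cog0432, cog0225):
    """Evaluate and( COG0432, !COG0225 ) for each genome."""
    return [int(a == 1 and b == 0) for a, b in zip(cog0432, cog0225)]

def mismatches(predicted, phenotype, genomes):
    """Genomes where the logical combination disagrees with the phenotype."""
    return [g for g, p, c in zip(genomes, predicted, phenotype) if p != c]

# Invented toy profiles over five hypothetical genomes.
genomes = ["g1", "g2", "g3", "g4", "g5"]
cog0432 = [1, 1, 0, 1, 0]
cog0225 = [0, 1, 0, 0, 1]
hyperthermophile = [1, 0, 0, 0, 0]

predicted = apply_rule(cog0432, cog0225)
disagreements = mismatches(predicted, hyperthermophile, genomes)
```

Listing the disagreeing genomes is a quick way to see which organisms keep the combined score below 1.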

Page 24: Finding detailed relationships between proteins specific to phenotypes among microbial organisms

CONCLUSIONS

• There may be a correlation between the absence of methionine sulfoxide reductase and the presence of an uncharacterized conserved protein in hyperthermophiles.

Page 25: Finding detailed relationships between proteins specific to phenotypes among microbial organisms

CONCLUSIONS

– Classified ~80,000 proteins from 354 bacterial genomes into ~4,600 COGs

– Built COG and phenotype profile matrices for 354 fully sequenced bacterial genomes

– Support that ternary relationships among COGs are biologically significant

– Support that some logic types are seen in biology more than others: 1 (and)

57 (xor)

Page 26: Finding detailed relationships between proteins specific to phenotypes among microbial organisms

FUTURE DIRECTIONS

• Build a richer database of phenotype profiles

• Investigate relationships at lower cutoffs

• Experimentally characterize the unknown COG0432 by crystallography

Page 27: Finding detailed relationships between proteins specific to phenotypes among microbial organisms

ACKNOWLEDGEMENTS

Todd Yeates

Matteo Pellegrini

Yeates lab

Morgan Beeby

Brian O’Connor

Rest of the lab

SoCalBSI 2006

Jamil Momand

Wendie Johnston

Sandra Sharp

Nancy Warter-Perez

Ronnie Cheng

Fellow participants