PHAR 201 Lecture 08, 2012

From Reductionism Comes New Science: Protein Structure Data Reveals How Environmental Pressures Shape Evolution

PHAR 201/Bioinformatics I
Philip E. Bourne
Department of Pharmacology, UCSD

Introduction

• Previously we reviewed one system of reductionism – SCOP

• SCOP is used to assign superfamilies and families to complete proteomes in another resource called SUPERFAMILY

• Today we will see how this is used to do new science (Dupont et al. PNAS 2006 103(47) 17822-17827; PNAS 2010 doi: 10.1073/pnas.0912491107)

• We cast this new science in the context of the Gaia hypothesis

The SCOP Hierarchy v1.75
Based on 38221 Structures

Classes: 7
Folds: 1195
Superfamilies: 1962
Families: 3902
Domains: 110800

The Gaia Hypothesis

Gaia - a complex entity involving the Earth's biosphere, atmosphere, oceans, and soil; the totality constituting a feedback system which seeks an optimal physical and chemical environment for life on this planet.

James Lovelock

Gaia (pronounced /'geɪ.ə/ or /'gaɪ.ə/), "land" or "earth", from the Greek Γαῖα, is a Greek goddess personifying the Earth

We Show Some Support for the Gaia Hypothesis

Emergent properties of an organism have been influenced by the environment

These organisms in turn have influenced the environment

Nature’s Reductionism

There are ~20^300 possible proteins >>>> all the atoms in the Universe

11.2M protein sequences from 10,854 species (source RefSeq)

38,221 protein structures yield 1195 domain folds (SCOP 1.75)
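
The scale of the first figure can be checked with a short calculation. The sketch below assumes a 300-residue chain built from the 20 standard amino acids, and it uses the commonly quoted order-of-magnitude estimate of 10^80 atoms in the observable Universe; that estimate is an assumption and not a number from the slide.

```python
import math

AMINO_ACIDS = 20             # standard amino acid alphabet
CHAIN_LENGTH = 300           # residues, as in the ~20^300 estimate
ATOMS_IN_UNIVERSE_EXP = 80   # assumed: ~10^80 atoms (order-of-magnitude estimate)


def log10_sequence_space(alphabet: int, length: int) -> float:
    """Return log10 of the number of distinct sequences of the given length."""
    return length * math.log10(alphabet)


sequence_space_exp = log10_sequence_space(AMINO_ACIDS, CHAIN_LENGTH)
excess_exp = sequence_space_exp - ATOMS_IN_UNIVERSE_EXP

print(f"20^300 is about 10^{sequence_space_exp:.0f}")
print(f"that exceeds 10^{ATOMS_IN_UNIVERSE_EXP} by a factor of about 10^{excess_exp:.0f}")
```

Working in log10 keeps the comparison readable; Python integers could also hold 20**300 exactly if the full number were needed.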

What Does Nature’s Reductionism Tell Us?

• The advent of a new fold is a big deal

• From new folds come new function(s)

• Are these new folds enough to distinguish “species”?

To Answer this Question We Only Need to Make Use of Existing Resources

• SCOP – Further catalogs Nature’s reductionism into structural domains, folds, families and superfamilies

• SUPERFAMILY assigns the above to fully sequenced proteomes

Method – Distance Determination

Rows: SCOP fold superfamilies (FSF). Columns: organisms, with assignments taken from SUPERFAMILY.

Presence/Absence Data Matrix

FSF       C. intestinalis   C. briggsae   F. rubripes
a.1.1     1                 1             1
a.1.2     1                 1             1
a.10.1    0                 0             1
a.100.1   1                 1             1
a.101.1   0                 0             0
a.102.1   0                 1             1
a.102.2   1                 1             1

Distance Matrix

                  C. intestinalis   C. briggsae   F. rubripes
C. intestinalis   0                 101           109
C. briggsae                         0             144
F. rubripes                                       0
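
One way to go from a presence/absence matrix to a distance matrix is sketched below. It assumes the distance between two proteomes is the number of superfamilies present in one but absent from the other (a Hamming distance); the slide does not state the measure actually used, so this is an illustration only. The input is just the seven rows shown on this slide, so the results are far smaller than the full-proteome distances of 101, 109, and 144.

```python
from itertools import combinations

ORGANISMS = ["C. intestinalis", "C. briggsae", "F. rubripes"]

# Presence (1) / absence (0) of each superfamily, in ORGANISMS order.
PRESENCE = {
    "a.1.1":   (1, 1, 1),
    "a.1.2":   (1, 1, 1),
    "a.10.1":  (0, 0, 1),
    "a.100.1": (1, 1, 1),
    "a.101.1": (0, 0, 0),
    "a.102.1": (0, 1, 1),
    "a.102.2": (1, 1, 1),
}


def hamming_distances(presence, organisms):
    """For every pair of organisms, count the superfamilies found in only one of them."""
    distances = {}
    for i, j in combinations(range(len(organisms)), 2):
        differing = sum(1 for row in presence.values() if row[i] != row[j])
        distances[(organisms[i], organisms[j])] = differing
    return distances


distances = hamming_distances(PRESENCE, ORGANISMS)
for (first, second), value in distances.items():
    print(f"{first} vs {second}: {value}")
```

Other binary measures (for example a Jaccard distance, which ignores superfamilies absent from both proteomes) would slot into the same loop.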

The Answer Would Appear to be Yes

• It is possible to generate a reasonable tree of life from merely the presence or absence of superfamilies within a given proteome

Yang, Doolittle and Bourne (2005) PNAS 102(2): 373-378
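
To make the step from distances to a tree concrete, here is a minimal sketch that clusters the three organisms from the distance matrix slide. It uses UPGMA (average-linkage clustering) only because it is short to write; the slides do not say which tree-building method was used, so this is a stand-in and not the published procedure. The function name `upgma` and the nested-tuple tree format are choices made for this example.

```python
def upgma(names, dist):
    """Cluster taxa by UPGMA and return the tree as nested tuples.

    names: list of taxon labels.
    dist:  dict mapping frozenset({a, b}) to the distance between taxa a and b.
    """
    # Each cluster maps a label to (subtree, number_of_leaves).
    clusters = {name: (name, 1) for name in names}
    d = dict(dist)
    while len(clusters) > 1:
        # Pick the closest pair of current clusters.
        pair = min(d, key=d.get)
        a, b = sorted(pair)
        (tree_a, size_a), (tree_b, size_b) = clusters.pop(a), clusters.pop(b)
        merged = f"({a},{b})"
        # Distance to every remaining cluster is the size-weighted average.
        for other in clusters:
            d_a = d.pop(frozenset({a, other}))
            d_b = d.pop(frozenset({b, other}))
            d[frozenset({merged, other})] = (size_a * d_a + size_b * d_b) / (size_a + size_b)
        del d[pair]
        clusters[merged] = ((tree_a, tree_b), size_a + size_b)
    (tree, _), = clusters.values()
    return tree


NAMES = ["C. intestinalis", "C. briggsae", "F. rubripes"]
DISTANCES = {
    frozenset({"C. intestinalis", "C. briggsae"}): 101,
    frozenset({"C. intestinalis", "F. rubripes"}): 109,
    frozenset({"C. briggsae", "F. rubripes"}): 144,
}

tree = upgma(NAMES, DISTANCES)
print(tree)
```

With only three taxa the result simply reflects the smallest entry in the matrix; a real analysis would use hundreds of proteomes and a method such as neighbor joining that does not assume a constant rate of change.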

Moreover… Distribution among the three kingdoms as taken from SUPERFAMILY

• Superfamily distributions would seem to be related to the complexity of life

• Update of the work of Caetano-Anollés and Caetano-Anollés (2003) Genome Research 13:1563

Figure: Venn diagram of SCOP folds (765 total) among Eukaryota (650), Archaea (416), and Bacteria (564). Region counts: 2, 42, 10, 135, 118, 387, 17.

Figure: a second Venn diagram with region counts given as any genome / all genomes: 153/14, 9/1, 21/2, 310/0, 645/49, 29/0, 68/0.

Yang, Doolittle & Bourne (2005) PNAS 102(2) 373-8

The Unique Superfamily in Archaea – d.17.6

• Archaeosine tRNA-guanine transglycosylase (tgt), C2 domain

• First step in the biosynthesis of an archaea-specific modified base, archaeosine (7-formamidino-7-deazaguanosine)

• Found in tRNAs

• At present found exclusively in Archaea. Reference: InterPro IPR004804

Let us Take This a Step Further
Consider the Distribution of Disulfide Bonds among Folds

• Disulfides are only stable under oxidizing conditions

• Oxygen content gradually accumulated during the earth’s evolution

• The divergence of the three kingdoms occurred 1.8-2.2 billion years ago

• Oxygen began to accumulate ~ 2.0 billion years ago

• Logical deduction – disulfides more prevalent in folds (organisms) that evolved later

• This would seem to hold true

• Can we take this further?

Figure: Venn diagram of SCOP folds (708 total) among Eukaryota, Archaea, and Bacteria, giving the percent of folds with disulfides in each region: 0% (0/2), 16.7% (7/42), 0% (0/10), 31.9% (43/135), 14.4% (17/118), 4.7% (18/387), 5.9% (1/17).

Recap So Far

• Structure is a useful tool to study evolution since it is conserved over longer periods of geological time

• A coarse-grained characterization of structure, namely superfamily, distinguishes between species

• There is a tantalizing suggestion that proteomes may contain imprints of their ancient environment

Consider Changes in Metal Ion Concentrations

Chris Dupont, Scripps Institution of Oceanography (now JCVI)

Bioinformatics Final Exam 2004

Dupont, Yang, Palenik, Bourne. PNAS 2006 103(47) 17822-17827; PNAS 2010 doi: 10.1073/pnas.0912491107

Evolution of the Earth

• 4.5 billion years of change

• 300 ± 50 K

• 1-5 atmospheres

• Constant photoenergy

• Chemical and geological changes

• Life has evolved in this time

• The ocean was the “cradle” for 90% of evolution

• Whether the deep ocean became oxic or euxinic following the rise in atmospheric oxygen (~2.3 Gya) is debated, therefore both are shown (oxic ocean-solid lines, euxinic ocean-dashed lines).

• The phylogenetic tree symbols at the top of the figure show one idea as to the theoretical periods of diversification for each Superkingdom.

Figure: Theoretical Levels of Trace Metals and Oxygen in the Deep Ocean Through Earth’s History. Horizontal axis: billions of years before present (4.5 to 0). Vertical axis: concentration (O2 in arbitrary units, Zn and Fe in moles L-1). Curves: oxygen, zinc, iron, cobalt, manganese. Tree symbols: Bacteria, Archaea, Eukarya.

Replotted from Saito et al. (2003) Inorganica Chimica Acta 356: 308-318

Making the Metallome of Each Species – Can Only be Done from Structure

1. Start with SCOP

2. Each {super}family level assignment was checked manually for metal binding

3. All the structures representing the family had to bind the metal for it to be considered unambiguous

4. The literature was consulted to resolve ambiguities

5. SUPERFAMILY database used to map to proteomes

6. 23 Archaea, 233 Bacteria, 57 Eukaryota

7. Cu, Ni, Mo ignored (<0.3% of proteome)

Levels of Ambiguity

• An ambiguous superfamily binds different metals or has members that are not known to bind metals

• Ditto families

• Approx 50% of superfamilies and 10% of families are ambiguous

• Only unambiguous families used in this study
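
The unambiguity rule can be written down as a small check. This is only a sketch of the stated criterion, not the curation pipeline itself, which was manual and literature-backed. It assumes each representative structure is annotated with the set of metals it binds, and it treats a structure that binds more than one metal as ambiguous; that last choice is an assumption. The function name and the example families are hypothetical.

```python
def classify_metal_binding(structure_metals):
    """Classify a family from the metals bound by each representative structure.

    structure_metals: list of sets, one per structure; an empty set means
    no metal is known to be bound.
    Returns the metal symbol if every structure binds that same single metal,
    otherwise None (the family is treated as ambiguous).
    """
    if not structure_metals:
        return None
    first = structure_metals[0]
    if len(first) != 1:
        return None
    if all(metals == first for metals in structure_metals):
        return next(iter(first))
    return None


# Hypothetical families, for illustration only.
examples = {
    "all structures bind Fe": [{"Fe"}, {"Fe"}, {"Fe"}],
    "members bind different metals": [{"Fe"}, {"Zn"}],
    "one member binds no metal": [{"Fe"}, set()],
}

for label, structures in examples.items():
    print(f"{label}: {classify_metal_binding(structures)}")
```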

Superfamily Distribution As Well As Overall Content Has Changed

Bacteria Fe superfamilies

a.1.1, a.1.2, a.104.1, a.110.1, a.119.1, a.138.1, a.2.11, a.24.3, a.24.4, a.25.1, a.3.1, a.39.3, a.56.1, a.93.1, b.1.13, b.2.6, b.3.6, b.33.1, b.70.2, b.82.2, c.56.6, c.83.1, c.96.1, d.134.1, d.15.4, d.174.1, d.178.1, d.35.1, d.44.1, d.58.1, e.18.1, e.19.1, e.26.1, e.5.1, f.21.1, f.21.2, f.24.1, f.26.1, g.35.1, g.36.1, g.41.5

Eukaryotic Fe superfamilies

a.1.1, a.1.2, a.104.1, a.110.1, a.119.1, a.138.1, a.2.11, a.24.3, a.24.4, a.25.1, a.3.1, a.39.3, a.56.1, a.93.1, b.1.13, b.2.6, b.3.6, b.33.1, b.70.2, b.82.2, c.56.6, c.83.1, c.96.1, d.134.1, d.15.4, d.174.1, d.178.1, d.35.1, d.44.1, d.58.1, e.18.1, e.19.1, e.26.1, e.5.1, f.21.1, f.21.2, f.24.1, f.26.1, g.35.1, g.36.1, g.41.5

Metallomes are Very Diverse (Discriminatory)

• A quantile plot showing the percent of Bacterial proteomes each Fe-binding fold family occurs in (x).

• This plot also shows the average copy number of that fold family in the proteomes where it occurs (♦).

• Few Fe-binding folds are in most proteomes.

• Widespread Fe-binding folds are not necessarily abundant.

• Similar trends are observed for Zn, Mn, and Co in all three Superkingdoms.
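
The two quantities plotted for each fold family can be computed as follows. This is a minimal sketch that assumes the input is a list of domain copy numbers, one per proteome, with 0 meaning the family is absent; the function name and the example counts are hypothetical and are not data from the study.

```python
def occurrence_and_copy_number(counts_by_proteome):
    """Summarize one fold family across proteomes.

    counts_by_proteome: list of domain copy numbers, one entry per proteome
    (0 = family absent from that proteome).
    Returns (percent of proteomes in which the family occurs,
             average copy number over the proteomes where it occurs).
    """
    present = [count for count in counts_by_proteome if count > 0]
    if not present:
        return 0.0, 0.0
    percent = 100.0 * len(present) / len(counts_by_proteome)
    average_copies = sum(present) / len(present)
    return percent, average_copies


# Hypothetical copy numbers across eight proteomes.
widespread_family = [1, 1, 2, 1, 0, 1, 1, 1]   # common but low copy number
rare_family = [0, 0, 12, 0, 0, 0, 8, 0]        # rare but abundant where present

for label, counts in [("widespread", widespread_family), ("rare", rare_family)]:
    percent, copies = occurrence_and_copy_number(counts)
    print(f"{label}: in {percent:.1f}% of proteomes, {copies:.2f} copies on average")
```

Averaging only over the proteomes where a family occurs is what allows a widespread family to show a low copy number while a rare family shows a high one.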

Figure: quantile plot over the unique Fe-binding fold families (108 total). Axes: (x) percent of Bacterial proteomes which a fold family occurs in, 0 to 100; (♦) average copy number, 0 to 14.

Metal Binding Proteins are Not Consistent Across Superkingdoms

Figure, panel A: total Zn-binding domains in a proteome plotted against total domains in a proteome (tick labels 10, 10^2.5, 10^4, 10^5). Panel B: slope of fitted power law (0 to 2) for Zn, Fe, Mn, and Co in Archaea, Bacteria, and Eukarya.

Since these data are derived from current species they are independent of evolutionary events such as duplication, gene loss, horizontal transfer and endosymbiosis

Power Laws: Fundamental Constants in the Evolution of Proteomes

A slope of 1 indicates that a group of structural domains is in equilibrium with genome growth, while a slope > 1 indicates that the group of domains is being preferentially duplicated (or retained in the case of genome reductions).

van Nimwegen E (2006) in: Koonin EV, Wolf YI, Karev GP (Eds.). Power laws, scale-free networks, and genome biology
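
The slope in question is the exponent of a power law relating the number of domains in a group to the total number of domains in a proteome. The sketch below estimates it by ordinary least squares on log-transformed counts; the slides do not say how the fits were actually made, so treat the fitting method as an assumption. The data are synthetic, generated from an exact power law with exponent 1.5, purely to show that the fit recovers the slope.

```python
import math


def fit_power_law(total_domains, group_domains):
    """Least-squares fit of log10(group) = slope * log10(total) + intercept.

    Returns (slope, prefactor) for the power law group = prefactor * total**slope.
    """
    xs = [math.log10(x) for x in total_domains]
    ys = [math.log10(y) for y in group_domains]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, 10 ** intercept


# Synthetic proteomes: total domain counts and metal-binding domain counts
# generated from group = 0.001 * total**1.5 (illustrative values only).
totals = [1_000, 3_000, 10_000, 30_000, 100_000]
metal_binding = [0.001 * t ** 1.5 for t in totals]

slope, prefactor = fit_power_law(totals, metal_binding)
print(f"slope = {slope:.2f}, prefactor = {prefactor:.4f}")
```

A fitted slope above 1 for this synthetic group would be read, in the terms used above, as preferential duplication relative to overall genome growth.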

Why are the Power Laws Different for Each Superkingdom?

• Power laws are likely influenced by selective pressure. Qualitatively, the differences in the power law slopes describing Eukarya and Prokarya are correlated to the shifts in trace metal geochemistry that occur with the rise in oceanic oxygen

• We hypothesize that proteomes contain an imprint of the environment at the time of the last common ancestor in each Superkingdom

• This suggests that Eukarya evolved in an oxic environment, whereas the Prokarya evolved in anoxic environments

Do the Metallomes Contain Further Support for this Hypothesis?

Superkingdom  Fold Family                  %            Fe-binding  O2
Eukarya       Cytochrome P450              0.44 ± 0.48  heme        yes
Eukarya       Cytochrome c3-like           0.13 ± 0.3   heme        no
Eukarya       Cytochrome b5                0.12 ± 0.09  heme        no
Eukarya       Purple acid phosphatase      0.11 ± 0.08  amino       no
Eukarya       Penicillin synthase-like     0.07 ± 0.1   amino       yes
Eukarya       Hypoxia-inducible factor     0.07 ± 0.04  amino       yes
Eukarya       Di-heme elbow motif          0.06 ± 0.01  heme        no
Archaea       4Fe-4S ferredoxins           1.80 ± 0.7   Fe-S        no
Archaea       MoCo biosynthesis proteins   1.60 ± 0.3   Fe-S        no
Archaea       Heme-binding PAS domain      1.10 ± 1.0   heme        no
Archaea       HemN                         0.80 ± 0.20  Fe-S        note 1
Archaea       α-helical ferredoxin         0.60 ± 0.16  Fe-S        no
Archaea       biotin synthase              0.55 ± 0.1   Fe-S        no
Archaea       ROO N-terminal domain-like   0.5 ± 0.1    amino       note 2
Bacteria      High potential iron protein  0.38 ± 0.25  Fe-S        no
Bacteria      Heme-binding PAS domain      0.3 ± 0.4    heme        note 1
Bacteria      MoCo biosynthesis proteins   0.21 ± 0.15  Fe-S        no
Bacteria      HemN                         0.2 ± 0.15   Fe-S        no
Bacteria      4Fe-4S ferredoxins           0.2 ± 0.2    Fe-S        no
Bacteria      cytochrome c                 0.14 ± 0.2   heme        no
Bacteria      α-helical ferredoxin         0.12 ± 0.09  Fe-S        no

Overall percent of Fe bound by

Superkingdom  Fe-S     heme     amino
Eukarya       21 ± 9   47 ± 19  32 ± 12
Archaea       68 ± 12  13 ± 14  19 ± 6
Bacteria      47 ± 11  22 ± 12  31 ± 16

1. Some, but not all, PAS domains actually sense oxygen
2. The Rubredoxin oxygen:oxidoreductase (ROO) protein does not contact oxygen, but catalyzes an oxygen reduction pathway

e- Transfer Proteins
Same Broad Function, Same Metal, Different Chemistry Induced by the Environment?

Fe-S clusters

• Fe bound by S

• Cluster held in place by Cys

• Generally negative reduction potentials

• Very susceptible to oxidation

Cytochromes

• Fe bound by heme (and amino-acids)

• Generally positive reduction potentials

• Less susceptible to oxidation

The importance of “small class” Zn folds to Eukarya

Figure, panel A: total “small class” Zn binding domains plotted against the total number of domains in a proteome (log axes, 1 to 10000 against 100 to 100000).

Figure, panel B: Distribution of 53 unique small class Zn families. Venn diagram regions, each labeled with a count out of 53 and a count out of 28: Archaea 0/53, 1/28; Eukarya 30/53, 18/28; Bacteria 0/53, 0/28; remaining regions 5/53, 0/28; 11/53, 9/28; 7/53, 0/28; 0/53, 0/28.

Figure: levels of oxygen, zinc, iron, cobalt, and manganese against billions of years before present (4.5 to 0), with Bacteria, Archaea, and Eukarya marked; concentration axis as O2 in arbitrary units, Zn and Fe in moles L-1.

Hypothesis

• Emergence of cyanobacteria changed oxygen concentrations

• Impacted metal concentrations in the ocean

• Organisms used new metals in new ways to evolve new biological processes, e.g. complex signaling

• This in turn further impacted the environment

A Final Thought

Perhaps We Should Study Both the Life Sciences and Earth Sciences Together?