From Structure to Function Janet Thornton European Bioinformatics Institute

Page 1

From Structure to Function

Janet Thornton

European Bioinformatics Institute

Page 2

From Structure to Functional Annotation

Page 3

Mid-West Center for Structural Genomics (MCSG)

University of Toronto: Aled Edwards

Argonne National Laboratory: Andrzej Joachimiak

Northwestern University: Wayne Anderson

Washington University in St. Louis: Daved Fremont

UT Southwestern Medical Center: Zbyszek Otwinowski

University of Virginia: Wladek Minor

EBI / University College London: Janet Thornton, Christine Orengo

Page 4
Page 5

60 structures solved to date

ylxR hypothetical cytosolic protein

Hypothetical protein (EC4030_F)

Hypothetical protein (MTH1)

ygbM hypothetical protein (EC1530)

Conserved hypothetical protein (MT777)

cutA protein implicated in Cu homeostasis (TM1056)

Some examples … ~30% are ‘hypothetical proteins’

Page 6

TIM barrel enzymes – 18 different homologous families

>60 different E.C. numbers

EC wheel of TIM barrels

Structure of TIM barrel: Triose phosphate isomerase

Page 7

Pairwise sequence identity and conservation of enzyme function (Todd et al 2001)

• Single-domain proteins: >81,000 homologous enzyme / enzyme and enzyme / non-enzyme pairs

[Figure: bar chart of the fractional percentage (y-axis, 0% to 100%) of pairs with conserved and unconserved function in each sequence identity bin (x-axis, 0-10% through 91-100%).]
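The quantity plotted on this slide can be sketched as follows. This is a minimal illustration, not the procedure of Todd et al.: the slide does not say how "conserved" was defined, so the rule used here (agreement on the first `levels` fields of the EC number, with a non-enzyme partner written as `None` and counted as unconserved) is an assumption, as are the function names.

```python
import math
from collections import defaultdict


def same_function(ec_a, ec_b, levels=3):
    """Assumed rule: conserved if the first `levels` EC fields agree."""
    if ec_a is None or ec_b is None:  # enzyme / non-enzyme pair
        return False
    return ec_a.split(".")[:levels] == ec_b.split(".")[:levels]


def identity_bin(identity, width=10):
    """Map a percentage identity to the axis bins 0-10, 11-20, ..., 91-100."""
    index = max(0, math.ceil(identity / width) - 1)
    low = 0 if index == 0 else index * width + 1
    return (low, (index + 1) * width)


def conserved_fraction_by_bin(pairs, levels=3):
    """pairs: iterable of (percent_identity, ec_a, ec_b) for homologous pairs."""
    tallies = defaultdict(lambda: [0, 0])  # bin -> [conserved, total]
    for identity, ec_a, ec_b in pairs:
        tally = tallies[identity_bin(identity)]
        tally[0] += same_function(ec_a, ec_b, levels)
        tally[1] += 1
    return {b: conserved / total for b, (conserved, total) in sorted(tallies.items())}


# Made-up pairs for illustration only; they are not data from the study.
example_pairs = [
    (95, "2.6.1.1", "2.6.1.1"),
    (92, "2.6.1.1", "2.6.1.5"),
    (25, "2.6.1.1", "4.1.1.64"),
    (8, "4.1.1.17", None),
]
print(conserved_fraction_by_bin(example_pairs))
```

Raising `levels` to 4 demands identical EC numbers, which lowers the conserved fraction in any bin that contains pairs differing only in the last field.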

Page 8

From Structure To Biochemical Function

Gene → Protein → 3D Structure → Function

Given a protein structure:
• Where is the functional site?
• What is the multimeric state of the protein?
– PQS – Hannes Ponstingl (this morning)
• Which ligands bind to the protein?
• What is the biochemical function?

Page 9

Automated Structure Comparison

• The most powerful method for assigning function from structure is global or partial 3D structure comparison (e.g. Dali, SSAP, SSM)

• Hidden Markov Models derived from structural domains can often recognise distant relatives from sequence
– Christine Orengo (tomorrow)

Page 10

Aspartate Aminotransferase Superfamily

Aspartate Aminotransferase

2,2-Dialkylglycine Decarboxylase

Tyrosine Phenol-lyase

Ornithine Decarboxylase

Page 11

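Aspartate Aminotransferase Superfamily

Aspartate Aminotransferase (EC 2.6.1.1)

2,2-Dialkylglycine Decarboxylase (EC 4.1.1.64)

Tyrosine Phenol-lyase (EC 4.1.99.2)

Ornithine Decarboxylase (EC 4.1.1.17)

[Figure: pairwise comparison of the four enzymes. Unlabelled values on the diagram: 77, 76, 77, 76, 73, 79 and 11, 106, 9, 7, 7]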

Page 12

Aspartate Aminotransferase Family

Aspartate Aminotransferase (EC 2.6.1.1)

2,2-Dialkylglycine Decarboxylase (EC 4.1.1.64)

Tyrosine Phenol-lyase (EC 4.1.99.2)

Ornithine Decarboxylase (EC 4.1.1.17)

All bind the pyridoxal 5′-phosphate (PLP) cofactor

Page 13

Number of enzyme functions

[Figure: number of enzyme functions (y-axis, 0 to 60) for each superfamily (x-axis), shown for structural data and for structural and sequence data. Labelled superfamilies: α/β hydrolases, type I PLP-dependent enzymes, TIM barrel glycosyl hydrolases.]
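A count of this kind can be sketched in a few lines. The sketch assumes that one "enzyme function" means one distinct EC number, which the slide does not state; the function name and the input layout are illustrative.

```python
from collections import defaultdict


def functions_per_superfamily(annotations):
    """Count distinct EC numbers seen in each superfamily.

    annotations: iterable of (superfamily_name, ec_number), one entry per
    annotated protein; repeated EC numbers are counted once.
    """
    ec_sets = defaultdict(set)
    for superfamily, ec_number in annotations:
        ec_sets[superfamily].add(ec_number)
    return {name: len(ecs) for name, ecs in ec_sets.items()}


# The four EC numbers shown earlier in the talk, with one deliberate repeat.
annotations = [
    ("Aspartate aminotransferase superfamily", "2.6.1.1"),
    ("Aspartate aminotransferase superfamily", "2.6.1.1"),
    ("Aspartate aminotransferase superfamily", "4.1.1.64"),
    ("Aspartate aminotransferase superfamily", "4.1.1.17"),
    ("Aspartate aminotransferase superfamily", "4.1.99.2"),
]
print(functions_per_superfamily(annotations))
```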

Page 14

Convergent and Divergent Evolution

• Unrelated proteins can perform the same function (convergent evolution), sometimes using the same mechanism – sometimes using different mechanisms

• Related proteins can perform different functions – divergent evolution

Page 15

Active site convergence

Trypsin Subtilisin

Page 16

Alpha/beta hydrolase Trypsin Subtilisin

Brain platelet activating factor acetylhydrolase

CheB methylesterase

Clp protease

Page 17

Predicting Binding Site

Binding-site analysis: cutA

Most likely binding site

Surface clefts

Residue conservation

Conserved surface patches

Page 18

Identifying Binding Site Function Using Motifs

- 3D enzyme active site structural motifs (Craig Porter)

- Catalytic Site Atlas - Identification of catalytic residues (Gail Bartlett, Alex Gutteridge)

- Metal binding sites (Malcolm MacArthur)

- Binding site features (Gareth Stockwell)

- Automatically generated templates of ligand-binding and

- DNA binding motifs (Sue Jones, Hugh Shanahan)

- “Reverse” templates (Roman Laskowski)

JESS – fast template search algorithm (Jonathan Barker)

PINTS – Searches for similar clusters (Aloy, Russell … – EMBL Heidelberg)

Page 19

Catalytic Site Atlas

Enzyme reports from primary literature information

β-lactamase Class A
– EC: 3.5.2.6
– PDB: 1btl
– Reaction: β-lactam + H2O → β-amino acid
– Active site residues: S70, K73, S130, E166
– Plausible mechanism:

[Figure: reaction mechanism scheme drawn with the Ser, Lys, Ser and Glu side chains]

Page 20

3-D templates

• Use 3-D templates to describe the active site of the enzyme

– analogous to 1-D sequence motifs such as PROSITE, but in 3-D

• Sequence position independent

• Captures essence of functional site in protein

Page 21

TEmplate Search and Superposition (TESS)

• defines a functional site as a sequence-independent set of atoms in 3-D space

• search a new structure for a functional site

• search a database of structures for similar clusters

Wallace et al., 1997

e.g. serine proteinase catalytic triad
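The idea of a sequence-independent set of atoms can be illustrated with a brute-force search. This is not the TESS algorithm: in place of superposition it compares interatomic distances (a distance RMSD), chosen here so the example needs only the standard library. The data layout, the function names and the toy coordinates are assumptions, and the default cut-off of 1.2 Å is borrowed from the figure quoted later in the talk for CSA template searches; it is not calibrated for this score.

```python
import itertools
import math


def distance_rmsd(coords_a, coords_b):
    """RMS difference between corresponding interatomic distances."""
    index_pairs = list(itertools.combinations(range(len(coords_a)), 2))
    total = sum(
        (math.dist(coords_a[i], coords_a[j]) - math.dist(coords_b[i], coords_b[j])) ** 2
        for i, j in index_pairs
    )
    return math.sqrt(total / len(index_pairs))


def search_template(template, atoms, cutoff=1.2):
    """Find atom sets in a structure that resemble a template.

    template: list of (residue_name, atom_name, xyz)
    atoms:    list of (residue_name, residue_number, atom_name, xyz)
    Returns (score, residue_numbers) tuples sorted by score.
    """
    template_xyz = [xyz for _, _, xyz in template]
    # For each template atom, every structure atom of the same type is a candidate.
    candidates = [
        [a for a in atoms if (a[0], a[2]) == (res, atom)]
        for res, atom, _ in template
    ]
    hits = []
    for combo in itertools.product(*candidates):
        if len({(a[1], a[2]) for a in combo}) < len(combo):
            continue  # the same atom was picked twice
        score = distance_rmsd(template_xyz, [a[3] for a in combo])
        if score <= cutoff:
            hits.append((score, [a[1] for a in combo]))
    return sorted(hits)


# Toy Ser-His-Asp template and structure; coordinates are invented.
triad = [("SER", "OG", (0, 0, 0)), ("HIS", "NE2", (3, 0, 0)), ("ASP", "OD1", (3, 4, 0))]
structure = [
    ("SER", 195, "OG", (10, 10, 10)),
    ("HIS", 57, "NE2", (13, 10, 10)),
    ("ASP", 102, "OD1", (13, 14, 10)),
    ("SER", 20, "OG", (50, 50, 50)),
]
print(search_template(triad, structure))
```

Because residues are matched by type and geometry only, their order along the chain plays no part. A distance-based score cannot tell a site from its mirror image, which a superposition-based method can.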

Page 22

Pepsin

Page 23

Eukaryotic & Fungal Aspartic Proteinases: all-atom DTG-DTG Template

Aspartic Proteinase - Active Site residues - [DTG]x2

Page 24

A template of 8 atoms is sufficient to identify all Aspartic Proteinases

[Figure labels: Asp CO2, Gly C, Gly C, Asp O, Thr/Ser O, Thr O]

Aspartic Proteases: Active Site Template

Page 25

green = true, red = false

Aspartic Protease Template Search

against all PDB

Page 26

3D Templates to Characterise Functional Sites

Template searches

(189 enzyme active site templates)

(~600 Metal binding site templates)

Page 27

GARTfase

Cholesterol oxidase

IIAglc histidine kinase

Carbamoylsarcosine amidohydrolase

Dihydrofolate reductase

Ser-His-Asp catalytic triad

Database of enzyme active site templates: 189 templates

Page 28

MCSG structure

BioH – protein of unknown function involved in biotin synthesis in E. coli

An example

Structure: Rossmann fold, hence many structural homologues

Expected to be an enzyme

Sequence contains two Gly-X-Ser-X-Gly motifs typical of acyltransferases and thioesterases

Page 29

CSA template search: one very strong hit

Ser-His-Asp catalytic triad of the lipases with rmsd = 0.28 Å (template cut-off is 1.2 Å)

Experimentally confirmed by hydrolase assays

Novel carboxylesterase acting on short acyl chain substrates

Page 30

Templates of Active Sites

• Catalytic cluster conserved – simple template
– e.g. Aspartic Proteinase (DTG)x2

• Order and geometry of catalytic residues vary – multiple templates
– e.g. polymerases

• Same catalytic cluster used in many different enzyme functions – one template identifies multiple active sites in unrelated structures
– e.g. Asp/His/Ser catalytic triad is well conserved in structure

Page 31

Instances of convergence

• Ser-His-Asp triads
• Cys-His-Asp triads
• Ribonuclease T1s
• Malic enzyme and isocitrate dehydrogenase
• Haloperoxidases
• Creatinase and carboxypeptidase G2
• Glycosidases
• Class II extradiol-type dioxygenase and class III extradiol-type dioxygenase
• Receptor tyrosine phosphatase and low-molecular weight tyrosine phosphatase
• Pyridoxal 5' phosphate enzymes

James Torrance

Page 32

Template databases

• HAND CURATED
– Enzyme active sites (PROCAT) – 189 templates
  • Currently being extended
– Metal-binding sites – 600 templates

• AUTOMATED
– Ligand-binding sites – 10,000 templates
– DNA-binding sites – 800 templates

Page 33

Another example of convergent evolution: The DNA HTH Binding Motif

[Figure: structures labelled with PDB codes 1jhg, 1hcr, 1b9m, 1eto, 1lmb, 1ais, 1orc]

Sue Jones

Page 34

ProFunc – function from 3D structure

Homologous sequences of known function

Binding site identification and analysis

Homologous structures of known function

Functional sequence motifs, e.g. Q-x(3)-[GE]-x-C-[YW]-x(2)-[STAGC]

Enzyme active site 3D-templates

HTH-motifs

Electrostatics

Surface comparison

… etc

DNA-, ligand-binding and “reverse” templates

Residue conservation analysis
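The sequence motif quoted on this slide is written in PROSITE pattern syntax, which translates readily into a regular expression. The converter below is a minimal sketch and is not part of ProFunc; it handles single residues, `x`, `[...]`, `{...}` and `(n)` or `(n,m)` repeats, and leaves out the `<` and `>` terminus anchors. The function name and the test sequence are invented.

```python
import re

# One pattern element: a residue, x, an allowed set [..] or a forbidden
# set {..}, optionally followed by a repeat count (n) or (n,m).
ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into Python regex syntax."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        match = ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = match.groups()
        if core == "x":
            core = "."  # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"  # any residue except these
        repeat = ""
        if low:
            repeat = "{" + low + ("," + high if high else "") + "}"
        parts.append(core + repeat)
    return "".join(parts)


regex = prosite_to_regex("Q-x(3)-[GE]-x-C-[YW]-x(2)-[STAGC]")
print(regex)

# Invented sequence fragment, used only to show a match.
hit = re.search(regex, "MKQAAAGTCYLLSK")
print(hit.start(), hit.group())
```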

Page 35

Three MCSG Examples (James Watson)

Three examples show the varying levels of information that can be retrieved from structures:

1. Almost full functional information. GOOD

•APC 1040

2. General information. NOT SO GOOD

•APC 012

3. Little or no information obtained. UGLY

•APC 078

Page 36

Acknowledgements

• Roman Laskowski, James Watson, Richard Morris, Rafael Najmanovich, Fabian Glaser - EBI

• Christine Orengo, Annabel Todd, James Bray, Russell Marsden – University College, London

• MCSG members – Andrzej Joachimiak, Aled Edwards etc

• Funding: NIH - PSI; EU - SPINE; DoE – DNA Motifs; UK BBSRC LINK