Databanks + New tools = New insights. THE AXIOM. Simple Atom Depth Index Calculator. Protein fold barcoding: CATH – ADAPT…

Upload: rosaline-cox

Post on 20-Jan-2016


Page 1

Databanks + New tools = New insights

THE AXIOM

Simple Atom Depth Index Calculator

protein fold barcoding

CATH – ADAPT…

Page 2

protein folding

Birth of the Earth

Digging inside objects to discover their origins

SADIC: a new tool to analyze atom depth

Page 3

* Chakravarty S, Varadarajan R. Residue depth: a novel parameter for the analysis of protein structure and stability. Structure Fold Des. 1999 7:723-732

* Pintar A, Carugo O, Pongor S. Atom depth as a descriptor of the protein interior. Biophys J. 2003 84:2553-2561.

atom depth calculated as the distance from:

the closest external water*

the closest dot of the water-accessible surface*

the closest surface-exposed atom*
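The first of the three definitions above (distance from the closest external water) can be sketched in a few lines. This is only an illustration, not code from the cited papers: the function name and the plain coordinate tuples are assumptions, and deciding which waters count as "external" is left to the caller.

```python
import math

def depth_from_nearest_water(atom_xyz, water_xyz):
    """Distance (same units as the input, e.g. Å) from one atom to its closest water.

    atom_xyz  -- (x, y, z) of the atom
    water_xyz -- non-empty iterable of (x, y, z) water positions,
                 assumed to be already filtered to external waters
    """
    return min(math.dist(atom_xyz, w) for w in water_xyz)

# Hypothetical coordinates, for illustration only.
waters = [(10.0, 0.0, 0.0), (0.0, 4.0, 3.0), (-8.0, -8.0, 0.0)]
print(depth_from_nearest_water((0.0, 0.0, 0.0), waters))  # 5.0
```

The other two definitions follow the same pattern, with surface dots or surface-exposed atoms in place of the water positions.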

2D atom depth (HEWL, PDB ID 4LZT)

Page 4

2D atom depth and 3D atom depth (calculation of exposed volumes); HEWL, PDB ID 4LZT

Daniele Varrazzo, Andrea Bernini, Ottavia Spiga, Arianna Ciutti, Stefano Chiellini, Vincenzo Venditti, Luisa Bracci and Neri Niccolai. Three-dimensional Computation of Atomic Depth in Complex Molecular Structures. Bioinformatics 2005 21:2856-2860.

Page 5

3D atom depth: calculation of exposed volumes (HEWL, PDB ID 4LZT)

Page 6

3D atom depth: calculation of exposed volumes (HEWL, PDB ID 4LZT)

Depth index:

$D_{i,r} = 2V_{i,r} / V_{0,r}$

where $V_{i,r}$ is the exposed volume of a sphere of radius $r$ centered on atom $i$ of the molecule, and $V_{0,r}$ is the exposed volume of the same sphere when centered on an isolated atom.

The sphere radius $r$ should be the largest value for which $V_{i,r} = 0$ for the most buried atom.
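The depth index definition can be illustrated with a small Monte Carlo estimate. This is a sketch under stated assumptions, not the SADIC implementation: atoms are treated as hard spheres with a given radius, "exposed" means "not inside any atomic sphere", and the names `exposed_volume` and `depth_index` are hypothetical.

```python
import math
import random

def _sample_in_sphere(center, r, rng):
    """Uniform random point inside the sphere (rejection sampling from its bounding cube)."""
    while True:
        dx, dy, dz = (rng.uniform(-r, r) for _ in range(3))
        if dx * dx + dy * dy + dz * dz <= r * r:
            return (center[0] + dx, center[1] + dy, center[2] + dz)

def exposed_volume(center, atoms, r, n_samples=2000, seed=0):
    """Estimate the volume of the sphere (center, r) not covered by any atom.

    atoms -- list of (x, y, z, radius); hard-sphere model (an assumption).
    """
    rng = random.Random(seed)
    free = 0
    for _ in range(n_samples):
        p = _sample_in_sphere(center, r, rng)
        if all(math.dist(p, a[:3]) > a[3] for a in atoms):
            free += 1
    return free / n_samples * (4.0 / 3.0) * math.pi * r ** 3

def depth_index(i, atoms, r, **kw):
    """D_{i,r} = 2 V_{i,r} / V_{0,r} for atom i; requires r larger than the atom radius."""
    if r <= atoms[i][3]:
        raise ValueError("r must exceed the radius of atom i")
    center = atoms[i][:3]
    v_i = exposed_volume(center, atoms, r, **kw)       # sphere inside the molecule
    v_0 = exposed_volume(center, [atoms[i]], r, **kw)  # same sphere, isolated atom
    return 2.0 * v_i / v_0

# An isolated atom gives the maximum value of the index.
print(depth_index(0, [(0.0, 0.0, 0.0, 1.7)], r=9.0))  # 2.0
```

Because both volumes are estimated with the same random points, the isolated-atom case returns exactly 2, and an atom whose sphere is completely filled by neighbours returns 0.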

Page 7

[Plot: depth index Di,r (axis 0.0 to 2.0) as a function of the sphere radius r [Å] (axis 4.0 to 24.0)]

Page 8

3D vs 2D atom depth (HEWL, PDB ID 4LZT)

Thr 47 α carbon: Di,9 = 1.59

Ile 58 α carbon: Di,9 = 0.13

Trp 28 α carbon: Di,9 = 0.03

Page 9

3D atom depth analysis

from PDB ID 1UBQ

http://www.sbl.unisi.it/prococoa/

Di

Page 10

SBL Bioinformatics Projects

SADIC-related projects:

1. fold-dependent aa compositions of protein cores;

2. towards i-SADIC.

Projects unrelated to SADIC:

1. systematic analysis of PPI

Page 11

Di analysis of protein atoms

defining structural layers in protein 3D structures

each structural layer includes atoms with similar Di values

fast and accurate analysis of the aa content of structural layers

Page 12

Di analysis of protein atoms

3VTR (chitinolytic enzyme, 572 aa)

Ln   Di          color
L6   > 1.2       red
L5   1.0 – 1.2   orange
L4   0.8 – 1.0   yellow
L3   0.6 – 0.8   green
L2   0.4 – 0.6   blue
L1   0.2 – 0.4   indigo
L0   < 0.2       violet
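The layer table maps directly onto a lookup. A minimal sketch follows; the function name is hypothetical, and since the slide does not say which layer owns a value that falls exactly on an interior boundary (0.2, 0.4, ...), the code assumes it belongs to the upper layer, while 1.2 stays in L5 because L6 is given as "> 1.2".

```python
from bisect import bisect_right

# Lower bounds of layers L1..L5; L6 is handled separately because it is "> 1.2".
LOWER_BOUNDS = [0.2, 0.4, 0.6, 0.8, 1.0]
COLORS = ["violet", "indigo", "blue", "green", "yellow", "orange", "red"]

def layer(di):
    """Return the structural layer number (0..6) for a depth index value."""
    if di > 1.2:
        return 6
    # Assumption: a value exactly on an interior boundary goes to the upper layer.
    return bisect_right(LOWER_BOUNDS, di)

for di in (0.05, 0.5, 1.1, 1.29):
    n = layer(di)
    print(f"Di = {di:.2f} -> L{n} ({COLORS[n]})")
```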

Page 13

3D atom depth analysis

from PDB ID 1UBQ (http://www.sbl.unisi.it/prococoa/)

K63: N 0.19, CA 0.30, C 0.25, O 0.23, CB 0.50, CG 0.68, CD 0.91, CE 1.11, NZ 1.29 (Dimax: 1.29)

E24: N 0.38, CA 0.52, C 0.50, O 0.52, CB 0.76, CG 0.95, CD 1.17, OE1 1.24, OE2 1.24 (Dimax: 1.24)

L43: N 0.10, CA 0.05, C 0.11, O 0.18, CB 0.02, CG 0.02, CD1 0.02, CD2 0.00 (Dimax: 0.18)
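As a worked example, the per-atom values listed for the three ubiquitin residues can be reduced to one number per residue. The sketch assumes that Dimax is simply the largest Di among the atoms of a residue, which is what the name suggests; the helper name `di_max` is hypothetical.

```python
# Per-atom Di values as listed on the slide (PDB ID 1UBQ).
K63 = {"N": 0.19, "CA": 0.30, "C": 0.25, "O": 0.23, "CB": 0.50,
       "CG": 0.68, "CD": 0.91, "CE": 1.11, "NZ": 1.29}
E24 = {"N": 0.38, "CA": 0.52, "C": 0.50, "O": 0.52, "CB": 0.76,
       "CG": 0.95, "CD": 1.17, "OE1": 1.24, "OE2": 1.24}
L43 = {"N": 0.10, "CA": 0.05, "C": 0.11, "O": 0.18, "CB": 0.02,
       "CG": 0.02, "CD1": 0.02, "CD2": 0.00}

def di_max(atom_di):
    """Return (atom name, Di) for the most exposed atom of a residue.

    Assumption: Dimax is the maximum Di over the residue's atoms.
    Ties are resolved in favour of the first atom listed.
    """
    name = max(atom_di, key=atom_di.get)
    return name, atom_di[name]

for label, residue in (("K63", K63), ("E24", E24), ("L43", L43)):
    atom, value = di_max(residue)
    print(label, atom, value)
```

Under this reading, L43 is the only one of the three residues whose Dimax falls below 0.2.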

Page 14

Dimax analysis of protein residues

defining aa occupancy in protein structural layers

each structural layer includes residues with similar Dimax values

fast and accurate analysis of aa distribution in protein structures

Page 15

Dimax analysis of protein singles

quite a few proteins like to stay single (at least in the crystalline state)

Bioinformatiha 2, Firenze, 18 October

Page 16

DOOPS: a database of protein singles

Experimental Method: X-RAY (79,770)

Chain Type: Protein (74,456)

Only 1 chain in asym. unit (28,803)

Oligomeric state: 1 (21,193)

Number of Entities: 1 (3,517)

Homologue Removal @ 95% identity (2,410)

2,410 proteins in the dataset; 4,657,574 atoms; 589,383 residues

[Histogram: axis ticks from 2 to 1922 in steps of 160, and from 0 to 18 in steps of 2]
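The selection funnel can be read as a chain of filters, each keeping a fraction of the previous set. The short sketch below only restates the counts given on the slide and derives the step-by-step retention; the variable and function names are hypothetical.

```python
# Filter steps and surviving entry counts, as listed on the slide.
FUNNEL = [
    ("Experimental Method: X-RAY", 79770),
    ("Chain Type: Protein", 74456),
    ("Only 1 chain in asym. unit", 28803),
    ("Oligomeric state: 1", 21193),
    ("Number of Entities: 1", 3517),
    ("Homologue Removal @ 95% identity", 2410),
]

def retention(steps):
    """Yield (label, count, fraction kept relative to the previous step)."""
    previous = steps[0][1]
    for label, count in steps:
        yield label, count, count / previous
        previous = count

for label, count, kept in retention(FUNNEL):
    print(f"{label:35s} {count:7,d}  {kept:6.1%}")

overall = FUNNEL[-1][1] / FUNNEL[0][1]
print(f"overall: {overall:.2%} of the X-ray entries end up in DOOPS")
```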

Page 17

DOOPS: a database of protein singles

2,410 proteins in the dataset; 4,657,574 atoms; 589,383 residues

Swiss-Prot: 540,958 proteins in the dataset (192 million aa)

[Histograms for DOOPS and Swiss-Prot; only axis ticks survive (2 to 1922 in steps of 160; 0 to 18 in steps of 2; 0, 1000, 2000)]

Page 18

Dimax analysis of protein cores

calculation of % amino acid content in L0

the first quantitative analysis of a large array of protein cores!

DOOPS: 2,410 proteins; 4,657,574 atoms; 589,383 residues

core aa if Dimax < 0.2

ΣDOOPS aa(L0) = 106,088 (from 2,410 proteins), ~20 % of total molecular volume

aa              % in L0
Alanine         11.51
Cysteine         2.63
Aspartate        1.77
Glutamate        1.20
Phenylalanine    6.36
Glycine         10.81
Histidine        1.32
Isoleucine      11.74
Lysine           0.58
Leucine         16.27
Methionine       2.49
Asparagine       1.70
Proline          2.45
Glutamine        1.21
Arginine         0.83
Serine           4.85
Threonine        4.65
Valine          13.70
Tryptophan       1.43
Tyrosine         2.50
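The percentages in L0 come from counting the residues that satisfy the core criterion (Dimax < 0.2). A minimal sketch of that bookkeeping is shown here; the input format, a list of (residue name, Dimax) pairs, and the function name are assumptions, and the sample values are made up for illustration.

```python
from collections import Counter

def core_composition(residues, cutoff=0.2):
    """Percent amino acid content among core residues (Dimax < cutoff).

    residues -- iterable of (three-letter residue name, Dimax) pairs
    Returns a dict {residue name: percent of core residues}.
    """
    core = Counter(name for name, dimax in residues if dimax < cutoff)
    total = sum(core.values())
    if total == 0:
        return {}
    return {name: 100.0 * n / total for name, n in core.items()}

# Made-up input, for illustration only.
sample = [("LEU", 0.05), ("LEU", 0.10), ("VAL", 0.15), ("ALA", 0.19), ("LYS", 1.29)]
print(core_composition(sample))
```

Applied to every residue of every DOOPS protein, the same count would give a table like the one above.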

Page 19

Di analysis of protein cores: folding clues from aa core composition?

Class               Architectures   Topology   Homologous superfamily   Domains
1 (mainly α)        5               386        875                      37,038
2 (mainly β)        20              229        520                      43,881
3 (α & β)           14              594        1113                     90,029
4 (few sec. str.)   1               104        118                      2,588
Total               40              1313       2626                     173,536

Page 20

Di analysis of protein cores: folding clues from aa core composition?

DOOPS + CATH: selected Architectures with ≥ 10 PDB files

Architecture   # proteins (mono-domain)
1.10           213 (84)
1.20           84 (40)
1.25           19 (17)
1.50           10 (3)
2.10           17 (13)
2.30           57 (37)
2.40           94 (73)
2.60           134 (110)
2.80           12 (12)
3.10           84 (73)
3.20           52 (44)
3.30           139 (106)
3.40           218 (203)
3.60           10 (8)
3.90           49 (49)
total          1,190 (872)

Page 21

Towards protein folding barcodes

CATH-ADAPT: CATH - atom depth assisted protein tomography

[Barcode figure legend: aa % average value (av), av + σ, av + 2σ, av - σ, av - 2σ]

Cys: PDB ID 1UZK (A01), ribbon

Leu, Phe: PDB ID 1RG8 (A00), trefoil

Val: PDB ID 2IMH (A01), four layer sandwich

Ala: PDB ID 3CKC (A02), alpha horseshoe

% L0   1.10   1.20   1.25   1.50   2.10   2.30   2.40   2.60   2.80   3.10   3.20   3.30   3.40   3.60   3.90   overall
ALA   13.28  10.32  21.46  12.74   9.26  10.05   8.43   9.32   5.50  10.69  10.08  12.58  11.88  14.95  12.01  11.51
ARG    0.60   1.28   0.24   1.39   0.00   0.64   1.72   0.75   0.00   0.55   1.11   1.75   0.30   0.47   0.95   0.83
ASN    0.67   2.62   0.73   2.77   1.85   2.04   1.77   1.36   0.00   2.10   2.90   0.96   1.52   2.80   2.10   1.70
ASP    1.61   2.62   0.24   2.91   1.23   1.27   2.03   1.79   0.00   2.10   2.90   3.02   1.77   2.34   0.95   1.77
CYS    3.35   2.99   5.37   0.83  22.84   2.04   1.46   4.42   0.92   2.83   2.10   1.49   1.86   1.40   3.05   2.63
GLN    0.60   1.50   0.24   1.11   1.23   1.15   1.81   1.69   0.00   0.46   1.56   2.15   0.99   1.40   1.33   1.21
GLU    1.48   1.44   0.73   1.52   0.00   1.15   1.19   1.04   0.00   0.91   2.59   2.41   1.08   0.93   0.67   1.20
GLY    8.05   8.72   9.76  13.85  16.05   9.92  16.20  10.82   9.17   8.78  11.81  11.35  12.64  13.08   9.91  10.81
HIS    1.01   1.60   2.44   1.11   0.62   0.76   0.79   0.56   0.00   2.65   1.96   3.02   1.91   0.47   2.48   1.32
ILE   12.68   9.95  10.73   8.59   6.79  13.61  10.68  10.78  13.76  12.80  11.77  12.53  11.53   7.01  11.34  11.74
LEU   23.88  18.34  22.44  11.77   8.02  17.18  12.97  13.98  33.94  16.54  11.90  14.33  14.22  15.42  13.63  16.27
LYS    0.67   0.91   0.00   1.11   0.00   0.38   0.49   0.56   0.00   0.09   0.62   1.36   0.55   0.00   0.67   0.58
MET    2.62   4.17   1.71   4.99   0.00   2.80   2.65   3.15   1.83   2.93   2.76   2.41   2.39   3.27   1.91   2.49
PHE    6.44   6.79   2.93   4.57   4.32   7.12   7.06   6.73  15.60   7.22   4.95   6.18   6.07   4.21   6.01   6.36
PRO    1.34   2.46   3.41   2.63   3.09   3.31   3.00   2.78   0.00   3.29   2.90   1.84   2.25   1.40   1.81   2.45
SER    3.49   4.55   3.66   5.96   3.09   5.34   5.56   5.13   2.75   2.83   5.35   4.43   4.23   6.07   5.34   4.85
THR    2.28   4.81   4.15   7.20   5.56   3.31   5.12   4.47   0.92   3.20   5.22   4.25   4.94   5.14   5.91   4.65
TRP    1.01   1.55   0.00   2.77   3.70   0.38   1.63   2.78   2.75   2.19   1.52   0.66   1.26   0.47   2.10   1.43
TYR    2.62   3.69   0.24   4.57   2.47   1.27   2.69   4.38   0.92   3.29   3.12   1.58   2.32   0.00   2.29   2.50
VAL   12.34   9.68   9.51   7.62   9.88  16.28  12.75  13.51  11.93  14.53  12.88  11.70  16.29  19.16  15.54  13.70

# PDB: 1.10: 213 (84); 1.20: 84 (40); 1.25: 19 (17); 1.50: 10 (3); 2.10: 17 (13); 2.30: 57 (37); 2.40: 94 (73); 2.60: 134 (110); 2.80: 12 (12); 3.10: 84 (73); 3.20: 52 (44); 3.30: 139 (106); 3.40: 218 (203); 3.60: 10 (8); 3.90: 49 (49); overall: 2,410

Di of 173,536 CATH domains: 28 h 5′ (average comp. time 1.72 s/domain). Calculations performed on a 6-core 990X CPU-based computer.
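The barcode legend (av, av ± σ, av ± 2σ) suggests flagging architectures whose core content of a given amino acid lies far from the average. The sketch below applies that idea to the CYS and LEU rows of the table. It rests on assumptions: av and σ are taken here as the plain mean and population standard deviation over the 15 architectures, which the slides do not confirm, and the function name is hypothetical.

```python
from statistics import mean, pstdev

ARCHITECTURES = ["1.10", "1.20", "1.25", "1.50", "2.10", "2.30", "2.40", "2.60",
                 "2.80", "3.10", "3.20", "3.30", "3.40", "3.60", "3.90"]

# % in L0 per architecture, copied from the table rows for CYS and LEU.
CYS = [3.35, 2.99, 5.37, 0.83, 22.84, 2.04, 1.46, 4.42,
       0.92, 2.83, 2.10, 1.49, 1.86, 1.40, 3.05]
LEU = [23.88, 18.34, 22.44, 11.77, 8.02, 17.18, 12.97, 13.98,
       33.94, 16.54, 11.90, 14.33, 14.22, 15.42, 13.63]

def outliers(values, labels=ARCHITECTURES, n_sigma=2.0):
    """Return the labels whose value lies outside av ± n_sigma * σ.

    Assumption: av and σ are the unweighted mean and population standard
    deviation across architectures.
    """
    av, sigma = mean(values), pstdev(values)
    return [lab for lab, v in zip(labels, values) if abs(v - av) > n_sigma * sigma]

print("CYS:", outliers(CYS))  # ['2.10']
print("LEU:", outliers(LEU))  # ['2.80']
```

With these assumptions the two flagged architectures match the Cys example (2.10, ribbon) and the Leu example (2.80, trefoil) shown on the slide.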

Page 22

Towards protein folding barcodes

Putting the protein universe in order

Page 23

Towards protein folding barcodes

Putting the protein universe in order

Page 24

towards i-SADIC (implemented SADIC)

Page 25

towards i-SADIC (implemented SADIC)

H/D exchange rate profiles

Page 26

towards i-SADIC (implemented SADIC)

H/D exchange rate profiles

Page 27

towards i-SADIC (implemented SADIC)

H/D exchange rate profiles

Page 28

towards i-SADIC (implemented SADIC)

H/D exchange rate profiles

Page 29

towards i-SADIC (implemented SADIC)

H/D exchange rate profiles

Page 30

2D atom depth or 3D atom depth

H/D exchange rate profiles

Data from Pedersen TG, Thomsen NK, Andersen KV, Madsen JC, Poulsen FM. Determination of the rate constants k1 and k2 of the Linderstrom-Lang model for protein amide hydrogen exchange. A study of the individual amides in hen egg-white lysozyme. J Mol Biol. 1993 230(2):651-660.

dnwi, or atom distance from the nearest water molecule

Di,9, or atom depth index with a probe of radius 9 Å

Page 31

iSADIC atom depth, 3D atom depth

H/D exchange rate profiles

Di,9, or atom depth index with a probe of radius 9 Å

iDi,9 = aDi,9 + bASAi

cDi,9 + dDnwi

Page 32

iSADIC atom depth 3D atom depth

H/D exchange rate profiles

iDi,9 = aDi,9 + bASAi

cDi,9 + dDnwi

Page 33

protein-protein interface analysis

biological vs crystallographic interfaces

Page 34

crystallographic dimers

biological dimers

Page 35

Page 36

Page 37

ARG vs LYS

ARG atoms: N, CA, C, O, CB, CG, CD, NE, CZ, NH1, NH2, H, HA, HB2, HB3, HG2, HG3, HD2, HD3, HE, HH11, HH12, HH21, HH22

LYS atoms: N, CA, C, O, CB, CG, CD, CE, NZ, H, HA, HB2, HB3, HG2, HG3, HD2, HD3, HE2, HE3, HZ1, HZ2, HZ3