Analyzing the Simplicial Decomposition of Spatial Protein Structures Rafael Ördög, Zoltán Szabadka, Vince Grolmusz

Aims of our research

Aims

An easy-to-use protein database containing relevant geometrical data on proteins. (Capable of treating thousands of PDB entries at once.)

Drug discovery by data mining in the database.

Steps of our research

Steps

Cleaning and restructuring the PDB (RS-PDB)

Done by Zoltán Szabadka

Creating a database of geometrical & chemo-geometrical data

Under construction in our present research

Discovering rules, and creating learning systems for ligand pre-docking.

Mostly later work

Delaunay Decompositions

To find the Delaunay decomposition of a point set, we used Qhull, whose source is available at: http://www.qhull.org/.
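
As a minimal sketch of this step, the snippet below computes a Delaunay decomposition in Python through `scipy.spatial.Delaunay`, which delegates to Qhull internally. This is an assumption made for illustration: the slides do not say how Qhull was invoked, and the random coordinates and the helper name `delaunay_tetrahedra` are hypothetical stand-ins for real atom positions and the authors' own code.

```python
import numpy as np
from scipy.spatial import Delaunay


def delaunay_tetrahedra(points):
    """Return an (n_regions, 4) array of point indices, one row per tetrahedron."""
    tri = Delaunay(np.asarray(points, dtype=float))  # SciPy delegates to Qhull
    return tri.simplices


# Hypothetical stand-in for heavy-atom coordinates (in angstroms); real input
# would come from a PDB entry.
rng = np.random.default_rng(0)
atoms = rng.random((30, 3)) * 10.0

tetrahedra = delaunay_tetrahedra(atoms)
print(len(tetrahedra), "regions; corners of the first region:", tetrahedra[0])
```

Each row of the result lists the four corner points of one region by index, so per-region properties can be computed by looking the coordinates up in `atoms`.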

Important properties of Delaunay decompositions

Regions are defined by their circumspheres being empty (so the region itself is empty as well).

Regions are tetrahedra, except when more than 4 points lie on the same sphere.
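
The empty-circumsphere property can be checked directly for a single tetrahedron. The sketch below is illustrative only: the function names and the tolerance `tol` are assumptions, and the circumcenter is found by solving the 3×3 linear system with Cramer's rule rather than by whatever method Qhull uses.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def circumsphere(tet):
    """Center and radius of the sphere through the 4 corners of a tetrahedron."""
    p0 = tet[0]
    # |c - p_i|^2 = |c - p_0|^2 gives 2 (p_i - p_0) . c = |p_i|^2 - |p_0|^2
    rows = [[2.0 * (p[k] - p0[k]) for k in range(3)] for p in tet[1:]]
    rhs = [sum(p[k] ** 2 - p0[k] ** 2 for k in range(3)) for p in tet[1:]]
    d = det3(rows)
    if abs(d) < 1e-12:
        raise ValueError("degenerate tetrahedron (coplanar corners)")
    center = []
    for k in range(3):
        m = [row[:] for row in rows]
        for r in range(3):
            m[r][k] = rhs[r]  # Cramer's rule: replace column k by the right side
        center.append(det3(m) / d)
    return tuple(center), math.dist(center, p0)


def is_empty_region(tet, other_points, tol=1e-9):
    """True if no other point lies strictly inside the circumsphere."""
    center, radius = circumsphere(tet)
    return all(math.dist(center, q) >= radius - tol for q in other_points)


tet = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
print(circumsphere(tet), is_empty_region(tet, [(1.2, 1.2, 1.2)]))
```

Points that lie exactly on the sphere are accepted by the test, which corresponds to the case where more than four points share one sphere and the region is no longer a tetrahedron.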

Important properties of Delaunay decompositions

The regions form a partition of the convex hull of the point set A.

The graph defined by the edges of the Delaunay regions is the Delaunay graph. It can be used for searching closest neighbors.
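
A small sketch of how the Delaunay graph supports closest-neighbor search: the closest neighbor of a point is always joined to it by a Delaunay edge, so only graph neighbors need to be examined. The five points, the two tetrahedra, and the function names below are hypothetical; the tetrahedra were chosen by hand so that they form a valid Delaunay decomposition of these points.

```python
import itertools
import math
from collections import defaultdict


def delaunay_graph(tetrahedra):
    """Adjacency sets built from the six edges of every tetrahedron."""
    graph = defaultdict(set)
    for tet in tetrahedra:
        for i, j in itertools.combinations(tet, 2):
            graph[i].add(j)
            graph[j].add(i)
    return graph


def closest_neighbor(index, points, graph):
    """Closest other point, searching only among the Delaunay neighbors."""
    return min(graph[index], key=lambda j: math.dist(points[index], points[j]))


# Hypothetical example: two tetrahedra sharing the face (1, 2, 3).
points = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (1.2, 1.2, 1.2)]
tetrahedra = [(0, 1, 2, 3), (1, 2, 3, 4)]

graph = delaunay_graph(tetrahedra)
print(sorted(graph[1]), closest_neighbor(1, points, graph))
```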

Delaunay decomposition of heavy atoms of the protein in 1n9c with the ligand

Important properties of Delaunay decompositions

The “dual” structure can solve the Post Office problem: partitioning the city into service areas of given post offices, so that everyone belongs to the closest post office.

Duality here is only theoretical; in practice it is the same structure. (Voronoi diagram.)
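
To make the Post Office problem concrete, the sketch below assigns each location to its closest office by brute force. The resulting groups are exactly the locations that fall into each office's Voronoi cell; a real implementation would query the Voronoi or Delaunay structure instead of scanning every office. All names and coordinates are hypothetical.

```python
import math


def closest_office(location, offices):
    """Index of the office closest to the given location."""
    return min(range(len(offices)), key=lambda i: math.dist(location, offices[i]))


def service_areas(locations, offices):
    """Group locations by their closest office (their Voronoi cell)."""
    areas = {i: [] for i in range(len(offices))}
    for loc in locations:
        areas[closest_office(loc, offices)].append(loc)
    return areas


offices = [(0, 0), (10, 0)]
homes = [(1, 1), (9, 2), (4, 0)]
print(service_areas(homes, offices))
```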

Previous work

Singh, Tropsha and Vaisman

The point set was chosen to be the set of Cα atoms of the protein.

Aim: predict secondary protein structure.

In contrast: we chose the point set to be the set of all heavy atoms (non-hydrogen atoms).

Volume

Tetrahedrality

Volume and tetrahedrality

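Tetrahedrality: 0 for regular tetrahedra, and < 1

$$T = \frac{\sum_{i<j} (l_i - l_j)^2}{15 \left( \sum_i l_i / 6 \right)^2}$$

where $l_i$ ($i = 1, \dots, 6$) are the edge lengths of the tetrahedron.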
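
Both quantities are straightforward to compute from the four corner coordinates. The sketch below takes tetrahedrality as the sum of squared differences over all pairs of the six edge lengths, divided by 15 times the squared mean edge length, and takes the volume as one sixth of the absolute value of a 3×3 determinant. The function names are illustrative and not taken from the authors' code.

```python
import itertools
import math


def edge_lengths(tet):
    """The six edge lengths of a tetrahedron given as four 3D points."""
    return [math.dist(p, q) for p, q in itertools.combinations(tet, 2)]


def tetrahedrality(tet):
    """Sum over i<j of (l_i - l_j)^2, divided by 15 * (mean edge length)^2."""
    lengths = edge_lengths(tet)
    mean = sum(lengths) / 6.0
    spread = sum((a - b) ** 2 for a, b in itertools.combinations(lengths, 2))
    return spread / (15.0 * mean ** 2)


def volume(tet):
    """Volume as |det(a - d, b - d, c - d)| / 6."""
    a, b, c, d = tet
    u = [a[k] - d[k] for k in range(3)]
    v = [b[k] - d[k] for k in range(3)]
    w = [c[k] - d[k] for k in range(3)]
    det = (u[0] * (v[1] * w[2] - v[2] * w[1])
           - u[1] * (v[0] * w[2] - v[2] * w[0])
           + u[2] * (v[0] * w[1] - v[1] * w[0]))
    return abs(det) / 6.0


regular = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
corner = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
for tet in (regular, corner):
    print(volume(tet), tetrahedrality(tet))
```

The regular tetrahedron has all edges equal, so its tetrahedrality is zero; the corner tetrahedron mixes edges of length 1 and √2 and yields a small positive value.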

Frequency

Two-dimensional temperature plots of the frequency of regions with given volume and tetrahedrality:

In all proteins (our whole database)

In a given protein

Volume and tetrahedrality of all regions (Cα atoms)

Volume and tetrahedrality of all regions (Heavy atoms)

Volume and tetrahedrality of all regions

Volume and tetrahedrality of regions with ligand atom

Volume and tetrahedrality of regions with ligand atom

Volume and tetrahedrality of all regions (Heavy atoms)

Classifying by corner atoms

Question: are the different peaks in the earlier plots connected with the function of the corner atoms?

Classification by the symbols of the corner atoms

Classification by the hetid of the residues the atoms are found in

Question: how frequent are the different corner atom sets?
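
Classification by corner-atom symbols amounts to treating the four element symbols of a region as an unordered set with repetition and counting how often each one occurs. A minimal sketch, assuming each region is given simply as a tuple of four symbols (the input layout and function names are hypothetical):

```python
from collections import Counter


def corner_set(symbols):
    """Order-independent label for a region, e.g. ('O', 'C', 'N', 'C') -> 'C C N O'."""
    return " ".join(sorted(symbols))


def corner_set_frequencies(regions):
    """Relative frequency of every corner set, most frequent first."""
    counts = Counter(corner_set(symbols) for symbols in regions)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.most_common()}


# Made-up regions, only to show the bookkeeping.
regions = [("C", "N", "C", "O"), ("O", "C", "C", "N"),
           ("C", "C", "C", "C"), ("N", "C", "O", "O")]
print(corner_set_frequencies(regions))
```

The same counting works for hetid-based classification by passing residue identifiers instead of element symbols.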

Most frequent corner sets

C C N O  25%
C C C O  17%
C C C N  11%
C C C C  10%
C C O O  10%
C N O O   9%
C N N O   6%
C C N N   5%
C O O O   2%
Other     5%

Connection of volume and tetrahedrality with the corner atom set

Volume and tetrahedrality of all regions (Heavy atoms)

Frequency of metals in different types of tetrahedra

   CCNO CNOO CNNO COOO CCNN NNOO NOOO CNNN NNNO OOOO CCCS NNNN CCNS CCSS NOSS
ZN    5    7    3    2    0   14   14    0   38    2    0    0    4    0   32
MG    1   11    4    1    0   15   10    6    5    6    0    5    0    0    0
FE    2    3    0    1    1    0    0    2    0    0    2    1    0    5    0
MN    0    0    0    3    1    3    4    0    4    5    0    0    0    0    0
CA    0    0    0    0    0    0    1    0    2   48    0    0    0    0    0

Ca appears almost exclusively in the vicinity of four oxygen atoms.

Zn prefers NOSS and NNNO type tetrahedra, but is also frequent in CNOO, NNOO and NOOO.

Only Zn was found in NOSS
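
The observations above can be read off the table mechanically. The sketch below encodes the table values as given and computes, for a metal, the share of its entries that fall into one tetrahedron type. It assumes the entries are comparable frequencies within each row (the slide does not state their unit), and the helper names are hypothetical.

```python
TYPES = ("CCNO CNOO CNNO COOO CCNN NNOO NOOO CNNN "
         "NNNO OOOO CCCS NNNN CCNS CCSS NOSS").split()

# Values copied from the table, one row per metal, in the order of TYPES.
TABLE = {
    "ZN": [5, 7, 3, 2, 0, 14, 14, 0, 38, 2, 0, 0, 4, 0, 32],
    "MG": [1, 11, 4, 1, 0, 15, 10, 6, 5, 6, 0, 5, 0, 0, 0],
    "FE": [2, 3, 0, 1, 1, 0, 0, 2, 0, 0, 2, 1, 0, 5, 0],
    "MN": [0, 0, 0, 3, 1, 3, 4, 0, 4, 5, 0, 0, 0, 0, 0],
    "CA": [0, 0, 0, 0, 0, 0, 1, 0, 2, 48, 0, 0, 0, 0, 0],
}


def share(metal, tet_type):
    """Fraction of a metal's row total that falls into one tetrahedron type."""
    row = TABLE[metal]
    return row[TYPES.index(tet_type)] / sum(row)


def metals_found_in(tet_type):
    """Metals with a nonzero entry for the given tetrahedron type."""
    col = TYPES.index(tet_type)
    return [metal for metal, row in TABLE.items() if row[col] > 0]


print(round(share("CA", "OOOO"), 3), metals_found_in("NOSS"))
```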

Thank you

About the geometric extension

Presently we cannot handle:

Missing atoms

Precision errors, non-tetrahedral regions

The PDB is handled as a juggled input.

The resulting database can only be used for quality statistical purposes.

Strongly restricted database: no missing atoms, 2.2 Å resolution, includes protein.

5757 such PDB entries (June 23, 2006)

Our current research addresses the problems above.

Recent problems

For example, the atoms of an aromatic ring should lie on one circle, in one plane, and hence on one sphere, but they refuse to:

The distortion is minor, not recognizable by eye.

Is it just measurement error? Or is it due to the structure around the ring?

In contrast, some atoms not expected to fall on one sphere tend to do so.

Structure of the geometric extension

Essential:

Corner

Reference to the atoms in the RS-PDB

Region

The radius and the coordinates of the center of the circumsphere

Volume and tetrahedrality of the tetrahedron

Three types of bond graph codes

Hetid, atom name, and symbol sets assigned to the region's corner set

and more

Additional: Edge, Neighbor, (Ligand) Atom
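
One way to picture the essential records is as two small data classes. This is only a sketch of the listed fields: every class name, field name, and type below is an assumption, the encoding of the bond graph codes is not specified on the slide, and the example values are made up.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Corner:
    atom_id: int  # hypothetical key referencing an atom record in the RS-PDB


@dataclass(frozen=True)
class Region:
    corners: Tuple[Corner, Corner, Corner, Corner]
    center: Tuple[float, float, float]      # center of the circumsphere
    radius: float                           # radius of the circumsphere
    volume: float
    tetrahedrality: float
    bond_graph_codes: Tuple[str, str, str]  # three codes; encoding unspecified
    hetid_set: Tuple[str, ...]
    atom_name_set: Tuple[str, ...]
    symbol_set: Tuple[str, ...]


# Made-up values, only to show how a record would be filled in.
region = Region(
    corners=tuple(Corner(i) for i in (11, 12, 15, 40)),
    center=(0.5, 0.5, 0.5),
    radius=0.866,
    volume=1.0 / 6.0,
    tetrahedrality=0.0706,
    bond_graph_codes=("?", "?", "?"),
    hetid_set=("ALA", "GLY"),
    atom_name_set=("C", "CA", "N", "O"),
    symbol_set=("C", "C", "N", "O"),
)
print(region.symbol_set, len(region.corners))
```

The additional Edge, Neighbor, and (Ligand) Atom records could be modelled the same way once their fields are known.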