n. sukumar, curt breneman - cheminformaticsreccr.chem.rpi.edu/presentations/acs_boston_2007.pdf ·...
![Page 1: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/1.jpg)
Bio- and chem-Informatics: Where do the twain meet?
N. Sukumar, Curt Breneman,
Kristin P. Bennett, Charles Bergeron,
Mark J. Embrechts, Changjian Huang,
Shekhar Garde, Rahul Godawat, Ishita Manjrekar,
Theresa Hepburn, C. Matthew Sundling, Margaret McLellan, Michael Krein
ACS, August 2007 http://reccr.chem.rpi.edu
![Page 2: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/2.jpg)
Birth of Informatics
40,000 BC Experiment/Observation
~1700 AD Mathematical Theory
1950+ Computation
1970+ Simulation
1990+ Informatics/Data Mining
![Page 3: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/3.jpg)
Cheminformatics/Bioinformatics: Statement of the Problem
Experiment: Assay Screening or Gene Data (the more data the better)
Data: No Prior Hypothesis
![Page 4: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/4.jpg)
Data Representation Statistical Model Biological Activity
Cheminformatics & Bioinformatics: One science, two tongues
The Confusion of Tongues by Gustave Doré (1865). Engraving based on the Minaret of Samarra
[Chemical structure diagram; atom labels N, N, Cl, O]
AAACCTCATAGGAAGCATACCAGGAATTACATCA…
![Page 5: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/5.jpg)
The vocabulary of Cheminformatics & Bioinformatics
The Tower of Babel by Pieter Brueghel the Elder (1563)
Data Representation Statistical Model Biological Activity
[Chemical structure diagram; atom labels N, N, Cl, O]
AAACCTCATAGGAAGCATACCAGGAATTACATCA…
![Page 6: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/6.jpg)
The grammar of Cheminformatics & Bioinformatics
Data Representation Statistical Model Biological Activity
The Building of the Tower of Babel by Abel Grimmer (1570-1619)
![Page 7: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/7.jpg)
Model Building, Validation and Applicability Domains
Generic Data Mining Tools
Cheminformatics
Bioinformatics
Alignment-free Molecular Property Descriptors
Protein Chromatography Modeling
Drug Design and QSAR
Protein Kinetic Stability Prediction
Simulation-based Protein Affinity Descriptors
Cheminformatics at RECCR (funded under the Molecular Libraries Roadmap Initiative of NIH)
Director: Curt Breneman
![Page 8: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/8.jpg)
Structures Descriptors Model Activity
Structural Descriptors
Physicochemical Descriptors
Topological Descriptors
Geometrical Descriptors
Descriptors
[Chemical structure diagram; atom labels N, N, Cl, O]
AAACCTCATAGGAAGCATACCAGGAATTACATCA…
![Page 9: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/9.jpg)
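Electron Density Derived Molecular Surface Properties
– Electrostatic Potential: $EP(\mathbf{r}) = \sum_{\alpha} \frac{Z_{\alpha}}{|\mathbf{r} - \mathbf{R}_{\alpha}|} - \int \frac{\rho(\mathbf{r}')\, d\mathbf{r}'}{|\mathbf{r} - \mathbf{r}'|}$
– Electronic Kinetic Energy Density: $K(\mathbf{r}) = -(\psi^{*}\nabla^{2}\psi + \psi\nabla^{2}\psi^{*})$, $G(\mathbf{r}) = -\nabla\psi^{*} \cdot \nabla\psi$
– Electron Density Gradients: $\nabla\rho \cdot N$
– Kinetic Energy Gradients: $\nabla G \cdot N$, $\nabla K \cdot N$
– Laplacian of the Electron Density: $L(\mathbf{r}) = -\nabla^{2}\rho(\mathbf{r}) = K(\mathbf{r}) - G(\mathbf{r})$
– Local Average Ionization Potential: $PIP(\mathbf{r}) = \sum_{i} \frac{\rho_{i}(\mathbf{r})\,|\epsilon_{i}|}{\rho(\mathbf{r})}$
– Bare Nuclear Potential (BNP): $BNP(\mathbf{r}) = \sum_{\alpha} \frac{Z_{\alpha}}{|\mathbf{r} - \mathbf{R}_{\alpha}|}$
– Fukui function: $F^{-}(\mathbf{r}) = \rho_{HOMO}(\mathbf{r})$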
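As an illustration of how the simplest of these properties could be evaluated, the sketch below computes the bare nuclear potential, a sum of Z/|r − R| over nuclei, at a set of surface points. The function name, the toy two-nucleus geometry and the unit system are assumptions made for this example; it is not the RECON/TAE implementation.

```python
import numpy as np

def bare_nuclear_potential(points, charges, centers):
    """Evaluate BNP(r) = sum_a Z_a / |r - R_a| at each row of `points`."""
    points = np.asarray(points, dtype=float)    # (n_points, 3)
    charges = np.asarray(charges, dtype=float)  # (n_atoms,)
    centers = np.asarray(centers, dtype=float)  # (n_atoms, 3)
    # Pairwise point-to-nucleus distances, shape (n_points, n_atoms).
    dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=-1)
    return (charges / dist).sum(axis=1)

# Hypothetical toy system: two unit charges on the z axis.
charges = [1.0, 1.0]
centers = [[0.0, 0.0, -1.0], [0.0, 0.0, 1.0]]
surface_points = [[0.0, 0.0, 0.0], [0.0, 0.0, 3.0]]
bnp = bare_nuclear_potential(surface_points, charges, centers)
```

The electrostatic potential adds an electronic term (the integral over the electron density), which needs a density model and is left out of this sketch.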
![Page 10: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/10.jpg)
RECON/TAE Descriptors in MOE
![Page 11: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/11.jpg)
Surface Property Distribution RECON/TAE Descriptors
Surface histograms represent electronic property distributions to provide data for descriptors
PIP (Local Ionization Potential) surface property for a member of the Lombardo blood-brain barrier dataset.
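A minimal sketch of how a surface property distribution might be turned into histogram descriptors: property values sampled on surface elements are binned, and each bin reports the fraction of surface area falling in it. The area weighting, the bin count and the value range are assumptions for illustration, not the documented RECON binning scheme.

```python
import numpy as np

def histogram_descriptors(values, areas, lo, hi, n_bins=10):
    """Area-weighted histogram of a surface property.

    Returns the fraction of total surface area in each bin and the bin edges.
    Values outside [lo, hi] are not counted in any bin.
    """
    values = np.asarray(values, dtype=float)
    areas = np.asarray(areas, dtype=float)
    hist, edges = np.histogram(values, bins=n_bins, range=(lo, hi), weights=areas)
    return hist / areas.sum(), edges

# Hypothetical PIP values and the areas of the surface elements carrying them.
pip_values = [9.0, 10.5, 11.0, 13.5]
element_areas = [1.0, 1.0, 2.0, 4.0]
fractions, edges = histogram_descriptors(pip_values, element_areas, 8.0, 14.0, n_bins=3)
```

Each entry of `fractions` could then serve as one descriptor for the molecule.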
![Page 12: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/12.jpg)
• Breneman, C.M., et al., New developments in PEST shape/property hybrid descriptors. J. Comput.-Aided Mol. Design, 17, 231-240, 2003.
• Wagener, M., J. Sadowski, and J. Gasteiger, J. Am. Chem. Soc., 117, 7769-7775, 1995.
Topological RECON Autocorrelation Descriptors implemented by Bill Katt
RAD: Recon Autocorrelation Descriptors
Uses Integrated TAE Surface Properties
Function binned by distance Rxy between atoms x and y.
$A(R_{xy}) = \frac{1}{n} \sum_{x,y} P_{x} \times P_{y}$
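The autocorrelation function above can be sketched as follows: for every atom pair, the product of the two atomic property values is assigned to a distance bin, and each bin is averaged over the n pairs it contains. Euclidean distances, the bin edges and the function name are assumptions for this example; the RAD implementation may use topological (bond-count) distances instead.

```python
import numpy as np

def autocorrelation_descriptors(coords, props, bin_edges):
    """A(R) = (1/n) * sum over atom pairs (x, y) in distance bin R of P_x * P_y."""
    coords = np.asarray(coords, dtype=float)
    props = np.asarray(props, dtype=float)
    i, j = np.triu_indices(len(props), k=1)          # each unordered pair once
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    prod = props[i] * props[j]
    which = np.digitize(dist, bin_edges) - 1          # bin index of each pair
    out = np.zeros(len(bin_edges) - 1)
    for b in range(len(out)):
        in_bin = which == b
        if in_bin.any():
            out[b] = prod[in_bin].mean()              # average over the n pairs in the bin
    return out

# Hypothetical three-atom example on a line, with integrated properties 1, 2, 3.
coords = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [3.0, 0.0, 0.0]]
props = [1.0, 2.0, 3.0]
rad = autocorrelation_descriptors(coords, props, bin_edges=[0.5, 1.5, 2.5, 3.5])
```

Pairs whose distance falls outside the outermost bin edges are ignored, and empty bins are reported as zero.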
![Page 13: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/13.jpg)
CoEPrA competition http://www.coepra.org/
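Basic information about the three datasets.

| round | number of samples (calibration) | number of samples (prediction) | number of amino acids | number of COEPRA descriptors |
| --- | --- | --- | --- | --- |
| 1 | 89 | 88 | 9 | 5787 |
| 2 | 76 | 76 | 8 | 5144 |
| 3 | 133 | 133 | 9 | 5787 |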
Comparative Evaluation of Prediction Algorithms (CoEPrA): a competition organized to provide objective testing of various algorithms via a process of blind prediction for chemical and biological data. Each round consisted of a training set and a test set of amino acid residue sequences (octa-/nonapeptides), with 643 COEPRA descriptors per residue, the nature of which is unknown. The activities are binding affinities to the HLA-A*0201 major histocompatibility complex.
C. Bergeron, T. Hepburn, C. M. Sundling, N. Sukumar, K. P. Bennett and C. M. Breneman, Protein and Peptide Letters (submitted)
![Page 14: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/14.jpg)
Results of CoEPrA competition

Each column pair (LOO CV calibration, prediction) corresponds to one of the three methods: exponential KPLS, PLS, Gaussian KPLS.

| round | descriptors | LOO CV calibration | prediction | LOO CV calibration | prediction | LOO CV calibration | prediction |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | COEPRA | 0.6253 | 0.45525 | 0.72626 | 0.67786 | 0.72143 | 0.69124 |
| 1 | MOE/RAD | 0.26075 | 0.34384 | 0.40711 | 0.38614 | 0.42743 | 0.49457 |
| 1 | SIMIL | 0.5121 | 0.35226 | 0.5751 | 0.54868 | 0.583 | 0.61791 |
| 1 | COEPRA+MOE/RAD | 0.67972 | 0.46358 | 0.7415 | 0.66128 | 0.72444 | 0.69411 |
| 1 | COEPRA+SIMIL | 0.61979 | 0.45916 | 0.73493 | 0.66398 | 0.72071 | 0.69336 |
| 1 | all | 0.67466 | 0.46638 | 0.73861 | 0.66339 | 0.72661 | 0.69404 |
| 2 | COEPRA | 0.29793 | 0.40054 | 0.4984 | 0.74618 | 0.47026 | 0.58993 |
| 2 | MOE/RAD | 0.095377 | 0.14367 | 0.3228 | 0.54602 | 0.30109 | 0.44067 |
| 2 | SIMIL | 0.14204 | 0.19956 | 0.6134 | 0.42729 | 0.48197 | 0.51488 |
| 2 | COEPRA+MOE/RAD | 0.29279 | 0.40289 | 0.50207 | 0.78416 | 0.46442 | 0.59074 |
| 2 | COEPRA+SIMIL | 0.27913 | 0.41212 | 0.50504 | 0.75443 | 0.47468 | 0.59474 |
| 2 | all | 0.27472 | 0.41359 | 0.5091 | 0.78195 | 0.46891 | 0.59566 |
| 3 | COEPRA | 0.30222 | 0.15262 | 0.3544 | 0.19975 | 0.37345 | 0.21932 |
| 3 | MOE/RAD | 0.16246 | -0.13471 | 0.1037 | 0.035348 | 0.17685 | 0.19992 |
| 3 | SIMIL | 0.23704 | 0.0321 | 0.33496 | 0.11797 | 0.32633 | 0.16893 |
| 3 | COEPRA+MOE/RAD | 0.30319 | 0.17773 | 0.3541 | 0.2115 | 0.37538 | 0.24234 |
| 3 | COEPRA+SIMIL | 0.30468 | 0.14853 | 0.35641 | 0.19684 | 0.37588 | 0.21944 |
| 3 | all | 0.30459 | 0.17254 | 0.35553 | 0.2078 | 0.37697 | 0.24051 |
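For readers unfamiliar with leave-one-out cross-validation (LOO CV), the sketch below shows the procedure on synthetic data: each sample is held out in turn, a model is fitted to the rest, and the held-out predictions are scored together. Two points are assumptions: the slide does not say which statistic is tabulated, so a cross-validated q² is used here, and ridge regression stands in for the PLS and kernel PLS models actually used in the competition entries.

```python
import numpy as np

def loo_q2(X, y, ridge=1.0):
    """Leave-one-out cross-validated q^2 for a ridge regression model."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i                 # hold out sample i
        Xtr, ytr = X[keep], y[keep]
        mu_x, mu_y = Xtr.mean(axis=0), ytr.mean()
        Xc = Xtr - mu_x                          # centre using training data only
        w = np.linalg.solve(Xc.T @ Xc + ridge * np.eye(p), Xc.T @ (ytr - mu_y))
        preds[i] = (X[i] - mu_x) @ w + mu_y
    press = np.sum((y - preds) ** 2)             # predictive residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - press / ss_tot

# Synthetic stand-in for a descriptor matrix and binding affinities.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.05 * rng.normal(size=40)
q2 = loo_q2(X, y, ridge=1e-3)
```

With thousands of descriptors and fewer than 150 samples, as in the CoEPrA rounds, a latent-variable or kernel method is the more natural choice, which is presumably why PLS and KPLS appear in the table.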
![Page 15: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/15.jpg)
ROMS implemented by Changjian Huang and Mark J. Embrechts
![Page 16: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/16.jpg)
![Page 17: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/17.jpg)
Protein RECON
![Page 18: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/18.jpg)
![Page 19: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/19.jpg)
![Page 20: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/20.jpg)
1CGP – DNA Complex
![Page 21: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/21.jpg)
Representations of DNA Structure: Can we improve on ATCG?
• Promoter regions and transcription factor binding sites require specific identification
• Most successful methods represent DNA by sequence of letters
• DNA bases assumed to act independently
• Several higher order multibase models exist
• Sequence data for training/validation is usually limited
• Representation of DNA has little to do with the energetics of binding of protein to DNA
![Page 22: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/22.jpg)
“Dixel” DNA descriptors
• A “basis set” of all possible nucleotide base pairs with all possible neighbors results in a set of base pair “triplets”.
• Ab initio properties of each base pair and its two flanking base pairs (end-capped) are computed.
• The central base pair is encoded and stored as a RECON “Dixel” object.
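The triplet basis set described above can be enumerated directly. The sketch below lists all triplets read along one strand, groups those that describe the same duplex read from the opposite strand, and slides a three-base window along a sequence to find the triplet centred on each interior base pair. Whether the Dixel library stores strand-equivalent triplets separately is not stated on the slide, so both counts are computed; all names here are hypothetical.

```python
from itertools import product

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

# Every central base with every possible pair of neighbours: 4**3 triplets.
triplets = ["".join(t) for t in product("ACGT", repeat=3)]

# A duplex triplet and its reverse complement are the same physical object.
unique_duplexes = sorted({min(t, reverse_complement(t)) for t in triplets})

def triplet_windows(sequence):
    """Triplet centred on each interior base of `sequence`."""
    return [sequence[i - 1:i + 2] for i in range(1, len(sequence) - 1)]

windows = triplet_windows("AAACCT")
```

Each window would then be used as a lookup key into the precomputed table of central-base-pair properties.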
![Page 23: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/23.jpg)
Base pair properties perturbed by flanking base pairs – raw “Dixel” data
![Page 24: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/24.jpg)
Dixels for EP Dixels for PIP
![Page 25: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/25.jpg)
CRP Dixel Model and Conserved base-pairs
Descriptor Importance by Property
[Chart: vertical axis 0 to 4; horizontal axis 1 to 22; series BNP, DGN, DRN, EP, G, LAPL, PIP]
![Page 26: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/26.jpg)
![Page 27: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/27.jpg)
![Page 28: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/28.jpg)
PEST: Molecular Shape/Property Hybrid Encoding
• Curt M. Breneman, C. Matthew Sundling, N. Sukumar, Lingling Shen, William P. Katt and Mark J. Embrechts, “New developments in PEST shape/property hybrid descriptors” J. Computer-Aided Mol. Design, 17, 231–240, (2003)
• Karthigeyan Nagarajan, Randy Zauhar, and William J. Welsh, “Enrichment of Ligands for the Serotonin Receptor Using the Shape Signatures Approach” J. Chem. Inf. Model., 45, 49-57 (2005)
• A TAE property-encoded surface is subjected to internal ray reflection analysis.
• A ray is initialized with a random location and direction inside the molecular surface and is reflected within the electron density isosurface until the molecular surface is adequately sampled.
• Molecular shape information is obtained by recording the ray-path information, including segment lengths, reflection angles and property values at each point of incidence.
• Adds shape information that encodes the spatial relationships of surface properties
• Alignment-free
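The ray reflection procedure can be sketched on a simple closed surface. Below, an ellipsoid stands in for the electron density isosurface (an assumption made so that intersections have a closed form), a single ray is reflected specularly many times, and the segment length plus a surface property value at each point of incidence are collected into a two-dimensional histogram. The property function, semi-axes and bin counts are hypothetical; this is not the PEST code.

```python
import numpy as np

def trace_pest_rays(semi_axes, n_segments, surface_property, seed=0):
    """Reflect one ray inside an ellipsoid; return segment lengths and the
    property value at the point of incidence that ends each segment."""
    rng = np.random.default_rng(seed)
    axes = np.asarray(semi_axes, dtype=float)
    s = 1.0 / axes**2                      # surface: sum(s * x**2) == 1

    # Random start point inside the surface and random unit direction.
    v = rng.normal(size=3)
    p = axes * (0.5 * rng.random() * v / np.linalg.norm(v))
    d = rng.normal(size=3)
    d /= np.linalg.norm(d)

    lengths, props = [], []
    for k in range(n_segments + 1):
        # Larger root of the quadratic for the ray-surface intersection.
        a = np.sum(s * d * d)
        b = np.sum(s * p * d)
        c = np.sum(s * p * p) - 1.0
        t = (-b + np.sqrt(max(b * b - a * c, 0.0))) / a
        hit = p + t * d
        if k > 0:                          # skip the partial first segment
            lengths.append(t)
            props.append(surface_property(hit))
        n = s * hit                        # outward normal direction at the hit
        n /= np.linalg.norm(n)
        d = d - 2.0 * np.dot(d, n) * n     # specular reflection
        p = hit
    return np.array(lengths), np.array(props)

axes = (1.0, 1.5, 2.0)
lengths, props = trace_pest_rays(axes, 500, lambda r: r[2] / axes[2])
hist, length_edges, prop_edges = np.histogram2d(
    lengths, props, bins=(8, 8), range=((0.0, 4.0), (-1.0, 1.0)))
```

The flattened `hist` array plays the role of a shape/property hybrid descriptor vector. Because only path lengths and surface values are recorded, it does not depend on how the molecule is oriented.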
![Page 29: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/29.jpg)
Regional Shape/Property Surface Encoding in PEST Analysis
![Page 30: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/30.jpg)
Molecular Surface analysis using PEST rays
Property-Encoded Electron Density-derived Surface
EP(2,5) EP(6,1)
![Page 31: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/31.jpg)
PPEST Protein Shape/Property Descriptors
PPEST implemented by Qiong Luo and Matt Sundling
![Page 32: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/32.jpg)
1POC EP pH 3.0
1POC EP pH 6.0
1POC EP pH 4.0
1POC EP pH 7.0
1POC EP pH 5.0
1POC EP pH 8.0
1POC (Bee-venom Phospholipase A2) pH Sensitive EP Encoding by Matt Sundling
![Page 33: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/33.jpg)
1POC LENGTH/EP Protein PEST
1POC EP pH 6.0
1POC EP pH 4.0
1POC EP pH 7.0
1POC EP pH 5.0
1POC EP pH 8.0
![Page 34: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/34.jpg)
Role of water in proteins
• Liquid water constitutes one of the essential components of biological systems and it is difficult to overstate the role of water in biological structure and function.
• Proteins crystallize with several units of water weakly bound to the rest of the protein.
• Water provides the thermodynamic driving force for proteins to fold and self-assemble.
• It mediates not only tertiary and quaternary interactions, but also interactions between different biomolecules and between biomolecules and ligands or surfaces.
• Water is also known to take part in specific enzymatic reactions.
• Protein conformational dynamics appear to be linked (slaved) to the dynamics of vicinal water, thereby affecting protein function.
• Water in the vicinity of proteins and other biomolecules critically influences protein structure, dynamics, function and other thermodynamic and kinetic properties.
![Page 35: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/35.jpg)
Simulation-derived Hydration-based descriptors
Statistical analysis of the dynamics of water distributions solvating proteins used to create a set of regional property descriptors:
Water O fluctuation Structure Electron density
• average local water density,
• water density fluctuations,
• local water orientations,
• electron density profile due to water packing and orientations (polarization),
• electrostatic potential on protein surface induced by the vicinal water structuring,
• dynamics of local water.
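The first two descriptors in the list, average local water density and its fluctuations, can be sketched as simple statistics over simulation frames. The example below bins water-oxygen positions on a cubic grid and reports the mean density and the variance of the count per cell. The grid, the cubic box and the function name are assumptions; the descriptors on the slide are regional properties defined relative to the protein surface rather than on a uniform grid.

```python
import numpy as np

def water_density_stats(frames, box, n_cells):
    """Mean number density and count variance of water oxygens per grid cell.

    frames: iterable of (n_waters, 3) coordinate arrays in a cubic box of side `box`.
    """
    counts = []
    for xyz in frames:
        hist, _ = np.histogramdd(np.asarray(xyz, dtype=float),
                                 bins=(n_cells,) * 3,
                                 range=[(0.0, box)] * 3)
        counts.append(hist)
    counts = np.stack(counts)                   # (n_frames, n, n, n)
    cell_volume = (box / n_cells) ** 3
    mean_density = counts.mean(axis=0) / cell_volume
    fluctuation = counts.var(axis=0)            # variance of the per-cell count
    return mean_density, fluctuation

# Hypothetical two-frame trajectory in a box of side 2 split into 2x2x2 cells.
frames = [
    [[0.5, 0.5, 0.5]],
    [[0.5, 0.5, 0.5], [1.5, 1.5, 1.5]],
]
mean_density, fluctuation = water_density_stats(frames, box=2.0, n_cells=2)
```

Cells near a hydrophobic patch would be expected to show different density and fluctuation values from bulk cells, which is the kind of contrast a descriptor can capture.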
![Page 36: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/36.jpg)
First hydration shell properties of CspB protein
Protein amino acids (green = hydrophobic, blue = positively charged, red = negatively charged)
local water-O density local water-H density local electron density projected onto the triangulated protein surface
Hydration-based descriptors developed and implemented by Shekhar Garde, Rahul Godawat and Ishita Manjrekar
![Page 37: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/37.jpg)
PMF expansion based method
• Developing an efficient alternative to full simulations by means of a potentials-of-mean-force expansion
• Employing a library of lower-order correlation functions derived from explicit simulations to predict the average equilibrium density and the orientation profile of water in the space surrounding biomolecules or ligands.
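One common way to write such an expansion is given below as a sketch. The notation is an assumption, since the slide gives no formula: $\rho_0$ is the bulk water density, $\beta = 1/k_B T$, $\mathbf{r}_i$ are the positions of the solute sites, and $W^{(1)}$, $W^{(2)}$ are one-site and two-site potentials of mean force taken from the simulation-derived library.

```latex
\rho(\mathbf{r} \mid \mathbf{r}_1, \dots, \mathbf{r}_n) \approx
\rho_0 \exp\!\Big[ -\beta \sum_{i} W^{(1)}(\mathbf{r}; \mathbf{r}_i)
                   -\beta \sum_{i<j} W^{(2)}(\mathbf{r}; \mathbf{r}_i, \mathbf{r}_j)
                   - \cdots \Big],
\qquad
W^{(1)}(\mathbf{r}; \mathbf{r}_i) = -k_B T \ln g_i(|\mathbf{r} - \mathbf{r}_i|)
```

Truncating after the first sum reduces the prediction to a product of site-water pair correlation functions, $\rho(\mathbf{r}) \approx \rho_0 \prod_i g_i(|\mathbf{r} - \mathbf{r}_i|)$, which is why a small library of lower-order correlation functions is enough to avoid a full simulation.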
Water density values in space surrounding an alpha-helix (left) and a protein X (right) predicted using the PMF expansion (cyan) and obtained from exact simulation (magenta)
![Page 38: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/38.jpg)
Eleven different canonical sites that represent protein atoms in the PMF-library, obtained from clustering of AMBER force-field data
| Atom Type | Charge | Size (nm) | Epsilon (kJ/mol) | % in typ. protein |
| --- | --- | --- | --- | --- |
| CT | -0.29039 | 0.34 | 0.45773 | ~9% |
| CT | -0.01545 | 0.34 | 0.45773 | ~25% |
| CT | 0.27946 | 0.34 | 0.45773 | ~4% |
| C | 0.59735 | 0.34 | 0.35982 | ~15% |
| C1 | -0.14331 | 0.34 | 0.35982 | ~7% |
| OH | -0.66428 | 0.307 | 0.88031 | ~2% |
| O | -0.57349 | 0.296 | 0.87864 | ~15% |
| O2 | -0.81021 | 0.296 | 0.87864 | ~3% |
| N1 | -0.41573 | 0.325 | 0.71128 | ~15% |
| N2 | -0.89731 | 0.325 | 0.71128 | ~2% |
| S | -0.16423 | 0.356 | 1.04600 | ~0.6% |
![Page 39: N. Sukumar, Curt Breneman - Cheminformaticsreccr.chem.rpi.edu/Presentations/ACS_Boston_2007.pdf · 2007. 11. 24. · testing of various algorithms via a process of blind prediction](https://reader034.vdocument.in/reader034/viewer/2022052104/603f64fb4bcd9e62042d136f/html5/thumbnails/39.jpg)
http://reccr.chem.rpi.edu
RECCR is funded under the Molecular Libraries Roadmap Initiative of NIH (# 1P20HG003899-01 of 09-23-2005)
Thank you!