Julia Salas CS379a 1-24-06

Upload: mikaia

Post on 28-Jan-2016


TRANSCRIPT

Page 1: Julia Salas CS379a 1-24-06

Julia Salas

CS379a

1-24-06

Page 2: Julia Salas CS379a 1-24-06

Aim of the Study

• To survey the docking and scoring algorithms available today

• Evaluate protocols for three tasks:

1. Prediction of the conformation of ligand bound to protein target

2. Virtual screening of database to identify leads

3. Prediction of binding affinities

General Methods

• Investigate several docking programs using a variety of different target types

• Use a large set of “closely related compounds” (compound set) for each target type

Page 3: Julia Salas CS379a 1-24-06

Target Types/Targets Used

• Target Types: 7 protein classes represented

• Targets: 8 proteins of interest to GSK

• Variety: Diversity of mechanisms, binding site shape, binding site chemical environment

Page 4: Julia Salas CS379a 1-24-06

Goal: Represent a typical pharmaceutical compound collection

Compound Sets Used (Ligands)

• Compound/Ligand Sets: 1303 compounds

– 150-200 “closely related” compounds

– Compounds have experimentally determined affinities

– Affinities of compounds in a single set span a min of 4 orders of magnitude

– Each set has shown biological activity towards target protein

– Each set has a max of 20% inactive and 20% extremely active compounds

– Each set has published (2-54) cocrystal structures with the target protein
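The selection criteria above can be expressed as a small check. This is only a sketch: the function names are hypothetical, affinities are assumed to be given in a single concentration unit (for example nM), and the slides do not state the cut-offs that define "inactive" or "extremely active", so those counts are passed in by the caller.

```python
import math


def affinity_span_orders(affinities):
    """Orders of magnitude between the weakest and strongest affinity.

    Assumes all values are positive and share one unit (e.g. nM).
    """
    return math.log10(max(affinities) / min(affinities))


def set_meets_criteria(affinities, n_inactive, n_extremely_active,
                       min_span=4.0, max_fraction=0.20):
    """Check the span and composition limits listed on the slide.

    n_inactive and n_extremely_active are supplied by the caller because
    the slide does not give the thresholds that define those labels.
    """
    n = len(affinities)
    return (affinity_span_orders(affinities) >= min_span
            and n_inactive / n <= max_fraction
            and n_extremely_active / n <= max_fraction)
```

For example, a set whose affinities run from 0.5 to 50000 (same unit) spans 5 orders of magnitude and passes the span test.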

Page 5: Julia Salas CS379a 1-24-06

Compound Sets Used (Ligands)


Page 6: Julia Salas CS379a 1-24-06

Docking and Scoring Algorithms

Docking Algorithms

• Evaluated 10 programs with different algorithms and scoring functions:

– 19 protocols total

Procedure

• Each method evaluated by an expert, no time restrictions or other constraints

• Evaluators did not have cocrystal structures, only ligand structure and protein active site residues

Same ligand starting structure:

• Optimized to a (local) min

• “Reasonable” bond distances/angles

• Correct atom hybridization

• 4 structures provided (differ in ionization)

• SMILES (text-based) structure description

Page 7: Julia Salas CS379a 1-24-06

Analysis of Docking Programs and Scoring Functions

• 19 protocols evaluated on three tasks:

1. Prediction of the conformation of ligand bound to protein target

2. Virtual screening of database to identify leads

3. Prediction of binding affinities

Page 8: Julia Salas CS379a 1-24-06

Prediction of Ligand Conformation Bound to Protein Target

• Compare predictions to (136) cocrystal structures using:

1. rmsd for heavy atoms

2. Volume overlap Tanimoto similarity index

• Two standards for success: rmsd within

– 2Å (correct orientation), black bars

– 4Å (within binding site), gray bars

• Can evaluate both the scoring function and the overall methods
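The rmsd criterion above can be sketched as follows. The assumptions here are not from the slides: both poses are given as heavy-atom coordinates in the same protein frame and in the same atom order, no superposition or symmetry correction is applied, and the 2Å and 4Å limits are treated as inclusive. Function names are hypothetical.

```python
import math


def heavy_atom_rmsd(crystal, docked):
    """RMSD in Å between two poses given as lists of (x, y, z) tuples.

    Assumes hydrogens are already removed and atoms are in matching order.
    """
    if not crystal or len(crystal) != len(docked):
        raise ValueError("poses must be non-empty and have the same atoms")
    squared = sum((a - b) ** 2
                  for p, q in zip(crystal, docked)
                  for a, b in zip(p, q))
    return math.sqrt(squared / len(crystal))


def classify_pose(rmsd):
    """Apply the two success standards from the slide."""
    if rmsd <= 2.0:
        return "correct orientation"
    if rmsd <= 4.0:
        return "within binding site"
    return "miss"
```

A pose that passes the 2Å standard also passes the 4Å standard, so the two labels are reported as the tightest one met.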

I_X, I_D = volume overlap integrals for the crystal and docked structures

O_X,D = volume overlap between the crystal and docked poses

0 ≤ T_vol ≤ 1
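The formula itself did not survive in the transcript. Assuming the index takes the standard Tanimoto form, T_vol = O_X,D / (I_X + I_D − O_X,D), it could be computed as below. The function name and argument names are hypothetical, and how the overlap integrals are obtained is outside this sketch.

```python
def tanimoto_volume(i_x, i_d, o_xd):
    """Standard Tanimoto index on volume overlaps (assumed form).

    i_x, i_d: self-overlap volumes of the crystal and docked structures.
    o_xd: overlap volume between the crystal and docked poses.
    """
    denominator = i_x + i_d - o_xd
    if denominator <= 0:
        raise ValueError("overlap volumes are inconsistent")
    return o_xd / denominator
```

With this form, identical poses (O_X,D = I_X = I_D) give 1 and non-overlapping poses give 0, which matches the stated range.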

Page 9: Julia Salas CS379a 1-24-06

Prediction of Ligand Conformation Bound to Target: Conclusions

The good…

• Docking programs could generate crystal conformations

• For “all” targets (excluding HCVP), at least one program could dock ≥40% of ligands within 2Å

– 90% of ligands could be docked within 4Å, with 100% docked in the correct location

The bad…

• The best-performing program changes from target to target

• Scoring functions led to consistently incorrect predictions

• HCVP had very weak predictions

Page 10: Julia Salas CS379a 1-24-06

Virtual Screening of Database to Identify Leads

• Ability to identify the active compounds

1. Enrichment: How quickly did the protocol identify the active compounds vs. random chance?

• Success: Identify at least 50% of the active compounds within the top 10% of the score-ordered list (halfway between random and max)

2. Lead Identification: Cost analysis…how many compounds do you need to screen to find at least one active compound from each class?

• All active compound classes ID’d within top 10%

• Percent actives vs. percent compounds screened measured
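Both screening measures above can be sketched in a few lines. The assumptions here are mine, not the slides': lower scores rank better (energy-like scoring; reverse the sort for programs where higher is better), the top 10% is rounded down to a whole number of compounds, and all names are hypothetical.

```python
def fraction_actives_in_top(scored, top_fraction=0.10):
    """Fraction of all actives found in the top slice of the ranked list.

    scored: list of (score, is_active) pairs; lower score = better (assumed).
    """
    ranked = sorted(scored, key=lambda pair: pair[0])
    total_actives = sum(1 for _, active in ranked if active)
    if total_actives == 0:
        raise ValueError("no active compounds in the list")
    n_top = max(1, int(len(ranked) * top_fraction))
    found = sum(1 for _, active in ranked[:n_top] if active)
    return found / total_actives


def compounds_to_cover_classes(ranked_classes):
    """Number of compounds screened before every active class is seen once.

    ranked_classes: class label per compound in score order, None if inactive.
    Returns None if the list contains no actives.
    """
    wanted = {label for label in ranked_classes if label is not None}
    seen = set()
    for position, label in enumerate(ranked_classes, start=1):
        if label is not None:
            seen.add(label)
            if seen == wanted:
                return position
    return None
```

Under the slide's criterion, a protocol succeeds at enrichment when `fraction_actives_in_top(scored) >= 0.5`, and at lead identification when `compounds_to_cover_classes(...)` is no more than 10% of the list length.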

Page 11: Julia Salas CS379a 1-24-06

Prediction of Binding Affinities

• Calculated docking scores compared to measured affinity

• Docking scores were autoscaled and then compared

• Conclusions:

– No statistically significant correlation between scoring function and measured affinity
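The comparison above can be sketched as follows. The slides do not define "autoscaled"; this assumes the usual meaning (mean-centre, then divide by the sample standard deviation) and uses a Pearson correlation as the comparison, which is also an assumption. No significance test is included, and the function names are hypothetical.

```python
import math


def autoscale(values):
    """Mean-centre and scale to unit sample standard deviation."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    if sd == 0:
        raise ValueError("values have zero variance")
    return [(v - mean) / sd for v in values]


def pearson_r(scores, affinities):
    """Pearson correlation between docking scores and measured affinities."""
    if len(scores) != len(affinities):
        raise ValueError("inputs must have the same length")
    zs, za = autoscale(scores), autoscale(affinities)
    return sum(a * b for a, b in zip(zs, za)) / (len(scores) - 1)
```

Autoscaling puts scores from different programs on a common scale, so protocols with very different score ranges can be compared against the same affinity data.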

Page 12: Julia Salas CS379a 1-24-06

Conclusions and Discussion Questions

• Docking programs were able to generate poses that resemble cocrystal structures

• Largest difficulties were in determining the small molecule structure, not placing ligand in binding site

• Scoring functions were not successful in predicting the best structures

• Active compounds could be identified in a pool of decoys

• Docking scores could not be correlated to affinity

Question 1: What factors may have contributed to the failure of these programs to predict small molecule conformation?

Question 2: The failure of the programs to predict HCVP structures was attributed to the enzyme’s large active site. Why? Additionally, should flexibility/dynamics be considered?

Question 3: Compound classes were defined by similar backbone structure. Although all compounds in a class had measured affinities, can we assume they all have the same binding mode?