Pose and affinity prediction by ICM in D3R GC3 Max Totrov Molsoft


Post on 03-May-2018


Pose prediction method: ICM-dock

• ICM-dock:
- pre-sampling of ligand conformers
- multiple-trajectory Monte Carlo with gradient minimization in internal coordinates
- receptor represented by grid potentials
- multiple receptor conformations used when needed to address flexibility
- pose re-ranking with ICM VLS score
- optional chemical biasing by APF from available experimental ligand structures/templates
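As a rough illustration of the "Monte Carlo with gradient minimization" idea listed above, here is a minimal sketch on a toy one-dimensional energy. It is not ICM code: the energy function, step sizes, temperature and all function names are hypothetical stand-ins, and real docking works on many internal coordinates against grid potentials.

```python
import math
import random


def toy_energy(x):
    """Hypothetical 1-D stand-in for a docking energy with several local minima."""
    return 0.1 * x * x + math.sin(3.0 * x)


def toy_gradient(x):
    """Analytical derivative of toy_energy."""
    return 0.2 * x + 3.0 * math.cos(3.0 * x)


def minimize(x, steps=200, rate=0.01):
    """Plain gradient descent, standing in for the local minimizer."""
    for _ in range(steps):
        x -= rate * toy_gradient(x)
    return x


def mc_minimize(start, n_steps=200, kT=0.5, max_jump=2.0, seed=0):
    """One trajectory: random move, local minimization, Metropolis test."""
    rng = random.Random(seed)
    current = minimize(start)
    e_current = toy_energy(current)
    best, e_best = current, e_current
    for _ in range(n_steps):
        trial = minimize(current + rng.uniform(-max_jump, max_jump))
        e_trial = toy_energy(trial)
        # Accept downhill moves always, uphill moves with Boltzmann probability.
        if e_trial <= e_current or rng.random() < math.exp(-(e_trial - e_current) / kT):
            current, e_current = trial, e_trial
            if e_current < e_best:
                best, e_best = current, e_current
    return best, e_best


def multi_trajectory(starts, **kwargs):
    """Run independent trajectories and keep the lowest-energy result."""
    runs = (mc_minimize(s, seed=i, **kwargs) for i, s in enumerate(starts))
    return min(runs, key=lambda result: result[1])
```

Running several trajectories from different starts, as in `multi_trajectory([-4.0, 0.0, 4.0])`, mirrors the multiple-trajectory idea: each run can get trapped, but the lowest-energy result across runs is kept.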

Atomic Property Fields (APF)

Totrov M. Atomic Property Fields: generalized 3D pharmacophoric potential for automated ligand superposition, pharmacophore elucidation and 3D QSAR. Chem Biol Drug Des. 2008;71(1):15-27.

3D pharmacophore: arrangement of molecular properties in space that confers activity.

Generalization of the point pharmacophore concept:
- Discrete pharmacophoric points → continuous distributions
- Moieties represented as Ph4 types → vectors of atomic properties
- $f_j^i$: vector of properties $i$ for atom $j$
- Atomic similarity measure: dot product of property vectors, $\sum_i f_j^i f_k^i$

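Atomic Property Field (APF), a continuous 3D potential:

$P_i(\mathbf{r}) = \sum_j f_j^i \exp\left(-(\mathbf{r}-\mathbf{r}_j)^2 / l_{APF}^2\right)$

Pseudo-energy (score) of a compound in the APF, summed over the compound's atoms $m$ and property components $i$:

$E_{APF} = -\sum_m \sum_i f_m^i P_i(\mathbf{r}_m)$

Implementation:
- on a 3D (multi)grid
- continuous derivatives (spline)
- fast potential for molecular mechanics/optimization in combination with force-field energy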
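The field and the pseudo-energy can be evaluated directly from their definitions, with a Gaussian that decays with distance from each template atom. The sketch below is a direct-summation illustration only; the actual implementation uses a splined 3D grid. The atom representation (a position tuple plus a property tuple), the function names and the default width of 1.0 are assumptions made for the example.

```python
import math


def apf_potential(point, template_atoms, component, width=1.0):
    """P_i(r): sum over template atoms j of f_j^i * exp(-|r - r_j|^2 / l^2).

    template_atoms is a list of (position, properties) pairs; `width` plays
    the role of l and its default value is a placeholder.
    """
    total = 0.0
    for position, props in template_atoms:
        d2 = sum((a - b) ** 2 for a, b in zip(point, position))
        total += props[component] * math.exp(-d2 / width ** 2)
    return total


def apf_pseudo_energy(compound_atoms, template_atoms, width=1.0):
    """E_APF: minus the sum over compound atoms and components of f * P_i(r)."""
    energy = 0.0
    for position, props in compound_atoms:
        for i, f in enumerate(props):
            energy -= f * apf_potential(position, template_atoms, i, width)
    return energy
```

For a one-atom template and an identical one-atom compound at the same position, the pseudo-energy equals minus the squared property value; moving the compound atom away makes the score less favorable.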

APF - 3D pharmacophoric potential

Accuracy of Flexible Ligand Superposition

Method       Score  ADA    CDK2   DHFR   ER     FXA    HIVRT  NA     P38    THR    TK     TRP    mean
                    (39)   (72)   (410)  (39)   (146)  (43)   (49)   (454)  (72)   (22)   (49)   (1100)
-------------------------------------------------------------------------------------------------------
Surflex-sim  2      12.82  12.5   44.39  56.41  4.11   18.6   18.37  9.69   4.17   68.18  40.82  23.15
             1      35.9   51.39  53.66  43.59  33.56  72.09  75.51  69.6   93.06  31.82  59.18  59.07
             0      51.28  36.11  1.95   0      62.33  9.3    6.12   20.7   2.78   0      0      17.78
ROCS         2      12.82  43.06  74.15  41.03  14.38  30.23  79.59  9.47   2.78   86.36  8.16   35.63
             1      20.51  36.11  14.39  56.41  28.77  34.88  14.29  41.19  69.44  9.09   81.63  32.83
             0      66.67  20.83  11.46  2.56   56.85  34.88  6.12   49.34  27.78  4.55   10.2   31.54
FlexS        2      15.38  25     56.1   48.72  35.62  16.28  36.73  14.98  30.56  81.82  18.37  33.48
             1      20.51  19.44  11.71  43.59  13.7   46.51  57.14  74.01  5.56   13.64  2.04   35.77
             0      64.1   55.56  32.2   7.69   50.68  37.21  6.12   11.01  63.89  4.55   79.59  30.75
ICM/APF      2      46.15  12.5   86.83  51.28  70.55  18.6   75.51  20.04  88.89  90.91  69.39  54.48
             1      23.08  68.06  11.95  46.15  16.44  46.51  14.29  68.28  9.72   9.09   28.57  36.49
             0      30.77  19.44  1.22   2.56   13.01  34.88  10.2   11.67  1.39   0      2.04   9.03

Entries are the percentage of ligands at each score. Assessment of superposition quality: 2/1/0 = 'good'/'acceptable'/'poor'.

Giganti et al. J Chem Inf Model 2010, 50, 992-1004

Independent broad benchmark: ligands without X-ray structures but similar chemotype to a solved complex. 11 targets from DUD (out of 40).

Ligand-Biased docking with APF

• MC docking simulations: APF potentials in addition to physical interaction term grids

• Pose ranking: composite score combining physics-based ICM VLS score and APF pseudo-energy

Lam PC, Abagyan R, Totrov M. Ligand-biased ensemble receptor docking (LigBEnD): a hybrid ligand/receptor structure-based approach. J Comput Aided Mol Des. 2018;32(1):187-198.

Visualization of APF used for ligand bias in Cathepsin S docking

D3R Cathepsin S: Pose prediction

Average RMSD, top pose    Average RMSD, best pose of 5

Median RMSD, best pose of 5    Median RMSD, top pose

1.06 Å    1.7 Å

2.82 Å    1.31 Å

CatS Ligands: RMSD for top 5 poses

- Most accurate pose of 5
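The four summary statistics (average and median RMSD, for the top pose and for the best pose of 5) can be computed along the following lines. This is a simplified sketch: it assumes poses and reference share the receptor frame and the same atom order, and it ignores symmetry-equivalent atoms, which a real evaluation would need to handle. All names are illustrative.

```python
import math
from statistics import mean, median


def rmsd(pose, reference):
    """Heavy-atom RMSD without superposition; atoms must be in the same order."""
    squared = sum(
        sum((a - b) ** 2 for a, b in zip(p, q)) for p, q in zip(pose, reference)
    )
    return math.sqrt(squared / len(reference))


def summarize(ranked_poses_per_ligand, references, n_best=5):
    """Average/median RMSD for the top-ranked pose and for the best of n_best."""
    top, best = [], []
    for poses, reference in zip(ranked_poses_per_ligand, references):
        values = [rmsd(pose, reference) for pose in poses[:n_best]]
        top.append(values[0])     # pose ranked first by the score
        best.append(min(values))  # most accurate pose among the first n_best
    return {
        "mean_top": mean(top),
        "mean_best": mean(best),
        "median_top": median(top),
        "median_best": median(best),
    }
```

By construction the best-of-5 value for any ligand can never exceed its top-pose value, so the best-of-5 averages are a lower bound on the top-pose averages.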

Apparent crystal contact effects

Superimposed answer X-rays nmxm (CatS_7), rpwj (CatS_9) and gabj (CatS_14). Extensive ligand-crystallographic neighbor contacts (~150 Å²) are visible.

Figure labels: primary 'receptor' Cathepsin, crystal neighbor Cathepsin, ligands.

Superimposed answer X-ray yrpk (CatS_16) and its top predicted pose. Also shown are top poses for CatS_7, CatS_9 and CatS_14 (thin wires).

Kinases and FXR: flexibility ensembles

• Ensemble construction:
- PDB structures collected/aligned via Pocketome database
- Up to 10 representative X-ray structures selected by iterative procedure to maximize number of compatible ligands
- For each receptor conformation, compatible ligands are used as APF templates in docking

Figure: bound ligand/receptor conformation compatibility matrix heatmap for VEGFR2 (axes: X-ray receptor conformation, X-ray ligand).
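The slides describe the selection of representative structures only as an "iterative procedure". One plausible reading, shown here purely as an assumption, is a greedy maximum-coverage pass over the ligand/receptor compatibility matrix: at each step, add the receptor conformation that accommodates the most ligands not yet covered. The function name and data layout are hypothetical.

```python
def select_representatives(compatible, max_structures=10):
    """Greedy choice of receptor conformations covering the most ligands.

    compatible maps a receptor structure name to the set of ligands it is
    compatible with (one row of the compatibility matrix).
    """
    covered = set()
    chosen = []
    remaining = dict(compatible)
    while remaining and len(chosen) < max_structures:
        # Structure adding the most new ligands; ties go to the first name.
        name = max(sorted(remaining), key=lambda r: len(remaining[r] - covered))
        gain = remaining.pop(name) - covered
        if not gain:
            break  # nothing left to gain from any remaining structure
        chosen.append(name)
        covered |= gain
    return chosen, covered
```

With this layout, the ligands compatible with each chosen conformation are also the ones that would serve as its APF templates in docking.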

Pocketome: comprehensive collection of ligand-binding pockets from PDB

• Instant access to all relevant PDB X-ray structures, optimally pre-aligned around the binding pocket.

Kufareva I, Ilatovskiy AV, Abagyan R. Pocketome: an encyclopedia of small-molecule binding sites in 4D. Nucleic Acids Res. 2012;40:D535-40.

FXR (GC2) pose prediction results

Average RMSD, top pose    Average RMSD, best pose of 5

1.95 Å    1.69 Å

1.95 Å    1.95 Å

Median RMSD, top pose    Median RMSD, best pose of 5

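Affinity prediction approaches

• Docking to generate aligned poses

• Receptor/physics-based approach, ICM VLS score:

$\Delta G = \alpha_1 \Delta E_{FF} + \alpha_2 \Delta E_{GB} + \alpha_3 \Delta E_{HP} + \alpha_4 \Delta E_{HB} + \alpha_5 \Delta E_{PD} + \alpha_6 T\Delta S_{TO}$

• Ligand/APF-based approach:

$\Delta G \approx \sum_m \sum_i f_m^i P_i^{APF\text{-}QSAR}(\mathbf{r}_m)$

$P_i^{APF\text{-}QSAR}(\mathbf{r}) = \sum_k \sum_j w_k^i f_j^i \exp\left(-(\mathbf{r}-\mathbf{r}_j)^2 / l^2\right)$

- $7 \cdot N_{train}$ weights $w_k^i$ for the contributions of each molecule $k$ in the training set into each APF component $i$
- For a molecule $l$, with $m$ running over its atoms and $j$ over the atoms of training molecule $k$:

$\Delta G_l \approx \sum_k \sum_i w_k^i E^{APF,i}_{kl}$, where $E^{APF,i}_{kl} = \sum_m \sum_j f_m^i f_j^i \exp\left(-(\mathbf{r}_m-\mathbf{r}_j)^2 / l^2\right)$

- Partial Least Squares (PLS) to determine weights $w_k^i$

Totrov M. Atomic Property Fields: generalized 3D pharmacophoric potential for automated ligand superposition, pharmacophore elucidation and 3D QSAR. Chem Biol Drug Des. 2008;71(1):15-27.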
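To make the ligand-based model concrete, the sketch below builds the per-component overlap terms between a query molecule and each training molecule, then combines them linearly with a given weight table. The weights are taken as input here; in the approach described they come from a PLS fit, which is not reproduced. The molecule representation (a list of position/property pairs), the function names and the default width are assumptions for illustration.

```python
import math


def pair_overlap(mol_l, mol_k, component, width=1.0):
    """Overlap term for one APF component between molecules l and k."""
    total = 0.0
    for r_m, f_m in mol_l:
        for r_j, f_j in mol_k:
            d2 = sum((a - b) ** 2 for a, b in zip(r_m, r_j))
            total += f_m[component] * f_j[component] * math.exp(-d2 / width ** 2)
    return total


def descriptors(mol_l, training, n_components, width=1.0):
    """Table of overlap terms: one row per training molecule, one column per component."""
    return [
        [pair_overlap(mol_l, mol_k, i, width) for i in range(n_components)]
        for mol_k in training
    ]


def predict(mol_l, training, weights, width=1.0):
    """Linear model: sum over training molecules and components of weight * overlap."""
    table = descriptors(mol_l, training, len(weights[0]), width)
    return sum(
        w * e
        for w_row, e_row in zip(weights, table)
        for w, e in zip(w_row, e_row)
    )
```

With seven property components, the table has 7 columns per training molecule, which is where the 7·Ntrain weight count comes from.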

Training APF 3D QSAR: Cathepsin S

• 302 related compounds from ChEMBL v2.3 docked

• 3D poses used to build APF 3D-QSAR model

Visualization of APF fields of pKd model for Cathepsin S

Training sets of activity data

• Source: ChEMBL v2.3

• Varying number and relevance:

Target    N of data points
CathS     1754
VEGFR2    5733
JAK2      1618
p38a      4183

Figure: distributions of Tanimoto distances to the closest training set compound for each challenge compound (panels: Cath.S, VEGFR2, JAK2, p38a).
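The distance-to-closest-training-compound measure can be sketched as follows. The slides do not say which fingerprints were used, so this example assumes each compound is represented simply as a set of "on" fingerprint bits; the function names are illustrative.

```python
def tanimoto_distance(a, b):
    """1 - |A & B| / |A | B| for two sets of fingerprint bits."""
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(a & b) / union


def nearest_training_distance(challenge_fp, training_fps):
    """Tanimoto distance from one challenge compound to its closest training compound."""
    return min(tanimoto_distance(challenge_fp, fp) for fp in training_fps)


def distance_distribution(challenge_fps, training_fps):
    """One nearest-neighbor distance per challenge compound, ready for a histogram."""
    return [nearest_training_distance(fp, training_fps) for fp in challenge_fps]
```

A distribution concentrated near 0 means the challenge compounds have close analogs in the training data; a distribution shifted toward 1 signals that the model is being asked to extrapolate.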

Training/Testing Set Generation

• LOO cross-validation or simple N-fold random test subsets don't reflect realistic challenge adequately

• Stringent 3-fold cluster cross-validation:
- Cluster full training set (APF 3D chemical distance, 0.25 cutoff)
- Randomly assign clusters to three groups
- Use any 2 groups to train, 3rd group for test (Q2/RMSE)
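The cluster-based split can be sketched as below, starting from cluster labels that are assumed to have been computed already (the APF 3D distance clustering itself is not reproduced). The Q2 formula used, one minus the ratio of squared prediction error to the variance of the test observations about their mean, is a common convention and an assumption here, since the slides do not define it.

```python
import math
import random


def cluster_folds(cluster_ids, n_groups=3, seed=0):
    """Yield (train, test) index lists; whole clusters stay on one side."""
    rng = random.Random(seed)
    clusters = sorted(set(cluster_ids))
    rng.shuffle(clusters)
    # Deal the shuffled clusters out to the groups in turn.
    group_of = {c: i % n_groups for i, c in enumerate(clusters)}
    for g in range(n_groups):
        test = [i for i, c in enumerate(cluster_ids) if group_of[c] == g]
        train = [i for i, c in enumerate(cluster_ids) if group_of[c] != g]
        yield train, test


def q2_and_rmse(observed, predicted):
    """Q2 = 1 - PRESS / SS_total and RMSE, both on the held-out group."""
    mean_obs = sum(observed) / len(observed)
    press = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    total = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - press / total, math.sqrt(press / len(observed))
```

Because no cluster is ever split between training and test, each test compound is at least one clustering cutoff away from everything the model was trained on, which is what makes this split stricter than random N-fold.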

Improving upon APF 3D QSAR:

• Combining APF and 'physics' based terms:
- APF produces better models provided sufficient training data. Physics-based terms are typically noisier, but more general.
- Can the two combine for better performance?
- Investigate single and staged models combining 'chemical' and 'physical' terms. PLS and/or RFR

• Dynamic 'focused' models:
- Some evidence that large training set 'dilutes' local activity trends
- Investigate focused models trained on subsets of data related to challenge molecules

Dynamic/Focused Model Training

1. Dock ligands
2. Cluster by 3D poses in APF
3. For each cluster: find ~300 nearest known ligands as training set
4. Train a model for each cluster

Figure labels: Challenge Ligands, ChEMBL Ligands.
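Step 3 above, picking the nearest known ligands for a cluster of challenge ligands, might look like the following sketch. How "nearest to a cluster" was defined is not stated in the slides; taking the minimum distance to any cluster member is an assumption, as are the function name and the pluggable `distance` callable.

```python
def focused_training_set(cluster_members, known_ligands, distance, size=300):
    """Return the `size` known ligands closest to a cluster of challenge ligands.

    `distance` is any callable giving a dissimilarity between two ligands;
    closeness to the cluster is taken as the minimum over its members.
    """
    def distance_to_cluster(ligand):
        return min(distance(ligand, member) for member in cluster_members)

    return sorted(known_ligands, key=distance_to_cluster)[:size]
```

One model would then be trained per cluster on its own focused set, so each challenge ligand is predicted by a model built from its closest known analogs.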

Kinase models cross-validation

Training sets    Terms              Method       VEGFR2 Q2    VEGFR2 RMSE  JAK2 Q2      JAK2 RMSE    P38a Q2      P38a RMSE
--------------------------------------------------------------------------------------------------------------------------
-                Physics/VLS-Score  No training  0.12         NA           0.23         NA           0.13         NA
Full/Static      Physics only       PLS          0.13         1.2          0.30         1.1          0.25         1.0
Full/Static      Physics only       RFR          0.12         1.2          0.23         1.2          0.18         1.1
Full/Static      APF only           PLS          0.22         1.4          0.30         1.2          0.29         1.1
Full/Static      Physics+APF        1 Stage PLS  0.26         1.3          0.36         1.2          0.29         1.1
Full/Static      Physics+APF        2 Stage PLS  0.22         1.4          0.32         1.2          0.30         1.1
Full/Static      Physics+APF        PLS/RFR      0.25         1.2          0.33         1.1          0.33         1.0
Focused/Dynamic  APF only           PLS          0.26         1.2          0.35         1.2          0.32         1.0
Focused/Dynamic  Physics+APF        PLS/RFR      0.28         1.1          0.40         1.1          0.33         1.0

RMSE is shown in pKd units

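Challenge Set Performance

Training set     Terms        Method   VEGFR2 Corr R  VEGFR2 RMSE  JAK2 Corr R  JAK2 RMSE  P38a Corr R  P38a RMSE
------------------------------------------------------------------------------------------------------------------
Static           APF          PLS      0.54           1.4          0.55         1.2        0.55         1.1
Static           Physics+APF  PLS      0.61           1.2          0.65         1.0        0.56         1.1
Static           Physics+APF  PLS/RFR  0.68           1.0          0.61         1.0        0.63         1.0
Focused/Dynamic  APF          PLS      0.53           1.5          0.53         1.2        0.51         1.3
Focused/Dynamic  Physics+APF  PLS/RFR  0.67           1.0          0.59         1.0        0.56         1.3

Per-target annotations: VEGFR2: QC3F=0.53, <dmin>=0.2; JAK2: QC3F=0.63, <dmin>=0.3; P38a: QC3F=0.57, <dmin>=0.27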

D3R affinity prediction: ligand ranking, Kendall τ

Cathepsin S stage 1: τ = 0.45

VEGFR2: τ = 0.45

JAK2_SC2: τ = 0.47

p38a: τ = 0.41
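For reference, Kendall τ compares the ordering of predicted and experimental affinities pair by pair. The sketch below implements the tau-b variant, which corrects for ties; whether the challenge evaluation used exactly this variant is an assumption, and the function name is illustrative.

```python
import math
from itertools import combinations


def kendall_tau(x, y):
    """Kendall tau-b between two equal-length sequences of numbers."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both rankings: counted nowhere
        if dx == 0:
            ties_x += 1
        elif dy == 0:
            ties_y += 1
        elif dx * dy > 0:
            concordant += 1  # pair ordered the same way in both rankings
        else:
            discordant += 1
    denom = math.sqrt(
        (concordant + discordant + ties_x) * (concordant + discordant + ties_y)
    )
    return (concordant - discordant) / denom if denom else 0.0
```

A value of 1 means the predicted ranking matches the experimental one exactly, 0 means no rank association, and -1 means the ranking is fully reversed.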

Conclusions

• Ligand-biased docking (ICM dock + APF) consistently produces good pose accuracy

• Atomic Property Field-based 3D QSAR activity models outperform physical-term-based models

• Cluster cross-validation is adequate to assess model quality

• Using dynamic/focused training sets did not result in consistently better predictions

• Composite models, and in particular PLS(APF)/RFR(Phys), are consistently most predictive

Acknowledgments

• Polo Lam
• Eugene Rausch
• Ruben Abagyan
• D3R organizers