Pose and affinity prediction by ICM in D3R GC3 property fields (APF) Totrov M. Atomic property...
TRANSCRIPT
Pose prediction method: ICM-dock
• ICM-dock:
- pre-sampling of ligand conformers
- multiple-trajectory Monte Carlo with gradient minimization in internal coordinates
- receptor represented by grid potentials
- multiple receptor conformations used when needed to address flexibility
- pose re-ranking with ICM VLS score
- optional chemical biasing by APF from available experimental ligand structures/templates
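The sampling idea in the list above (random move, local gradient minimization, Metropolis test, several independent trajectories) can be illustrated on a toy problem. This is only a sketch under stated assumptions: the torsional energy function, the step size, `kT`, and all function names are hypothetical and are not taken from ICM.

```python
import math
import random

def energy(angles):
    # Hypothetical toy torsional energy; stands in for the real grid + force-field energy.
    return sum(1.0 + math.cos(3.0 * a) + 0.5 * (1.0 - math.cos(a - 1.0)) for a in angles)

def gradient(angles):
    # Analytical derivative of the toy energy with respect to each torsion angle.
    return [-3.0 * math.sin(3.0 * a) + 0.5 * math.sin(a - 1.0) for a in angles]

def minimize(angles, step=0.02, n_steps=200):
    # Plain gradient descent in internal (torsion) coordinates.
    x = list(angles)
    for _ in range(n_steps):
        x = [a - step * g for a, g in zip(x, gradient(x))]
    return x

def mc_minimize(start, n_moves=200, kT=0.6, seed=0):
    # One trajectory: random torsion change, local minimization, Metropolis acceptance.
    rng = random.Random(seed)
    current = minimize(start)
    e_cur = energy(current)
    best, e_best = current, e_cur
    for _ in range(n_moves):
        trial = list(current)
        trial[rng.randrange(len(trial))] = rng.uniform(-math.pi, math.pi)
        trial = minimize(trial)
        e_try = energy(trial)
        if e_try <= e_cur or rng.random() < math.exp(-(e_try - e_cur) / kT):
            current, e_cur = trial, e_try
            if e_cur < e_best:
                best, e_best = current, e_cur
    return best, e_best

def dock_toy(start, n_trajectories=3):
    # Multiple independent trajectories; keep the lowest-energy result.
    runs = [mc_minimize(start, seed=s) for s in range(n_trajectories)]
    return min(runs, key=lambda run: run[1])
```

Calling `dock_toy([0.0, 0.0, 0.0])` returns the lowest-energy torsion set found and its energy; in a real docking run the energy would also include the receptor grid potentials.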
Atomic Property Fields (APF)
Totrov M. Atomic Property Fields: generalized 3D pharmacophoric potential for automated ligand superposition, pharmacophore elucidation and 3D QSAR. Chem Biol Drug Des. 2008;71(1):15-27.
3D pharmacophore: arrangement of molecular properties in space that confers activity.
Generalization of the point pharmacophore concept:
- Discrete pharmacophoric points → continuous distributions
- Moieties represented as Ph4 types → vectors of atomic properties
- $f_j^i$: component $i$ of the property vector of atom $j$
- Atomic similarity measure: dot product of property vectors, $\sum_i f_j^i f_k^i$
Atomic Property Field (APF), a continuous 3D potential (sum over template atoms $j$): $P_i(\mathbf{r}) = \sum_j f_j^i \exp\left(-(\mathbf{r}-\mathbf{r}_j)^2 / l_{APF}^2\right)$
Pseudo-energy (score) of a compound in the APF (sum over the compound's atoms $j$): $E_{APF} = -\sum_j \sum_i f_j^i P_i(\mathbf{r}_j)$
Implementation:
- on a 3D (multi)grid
- continuous derivatives (spline)
- fast potential for molecular mechanics/optimization in combination with force-field energy
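A direct (grid-free) evaluation of the two APF formulas above can be sketched as follows. Assumptions: atoms are `(position, property_vector)` tuples, the property vectors have only two toy components, and the length parameter `length=1.0` is a placeholder; none of these values come from the slides, and the grid/spline implementation is not reproduced.

```python
import math

def apf_potential(r, template_atoms, length=1.0):
    # P_i(r) = sum_j f_j^i * exp(-|r - r_j|^2 / l^2), returned for every component i.
    n_props = len(template_atoms[0][1])
    field = [0.0] * n_props
    for pos, props in template_atoms:
        d2 = sum((a - b) ** 2 for a, b in zip(r, pos))
        g = math.exp(-d2 / length ** 2)
        for i, f in enumerate(props):
            field[i] += f * g
    return field

def apf_pseudo_energy(ligand_atoms, template_atoms, length=1.0):
    # E_APF = -sum_j sum_i f_j^i * P_i(r_j), summed over the ligand atoms j.
    total = 0.0
    for pos, props in ligand_atoms:
        field = apf_potential(pos, template_atoms, length)
        total -= sum(f * p for f, p in zip(props, field))
    return total

# Toy template: two atoms with orthogonal (hypothetical) property vectors.
template = [((0.0, 0.0, 0.0), (1.0, 0.0)), ((2.0, 0.0, 0.0), (0.0, 1.0))]
matched = list(template)
swapped = [((0.0, 0.0, 0.0), (0.0, 1.0)), ((2.0, 0.0, 0.0), (1.0, 0.0))]
```

In this toy case the pose whose properties match the template scores lower (more favorable) than the pose with the two property types swapped, which is the behavior the pseudo-energy is meant to reward.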
APF - 3D pharmacophoric potential
Accuracy of Flexible Ligand Superposition
Values are percentages of ligands receiving each superposition score (2/1/0).
Method       Score     ADA   CDK2   DHFR     ER    FXA  HIVRT     NA    P38    THR     TK    TRP   mean
                      (39)   (72)  (410)   (39)  (146)   (43)   (49)  (454)   (72)   (22)   (49) (1100)
-------------------------------------------------------------------------------------------------------
Surflex-sim  2       12.82   12.5  44.39  56.41   4.11   18.6  18.37   9.69   4.17  68.18  40.82  23.15
Surflex-sim  1        35.9  51.39  53.66  43.59  33.56  72.09  75.51   69.6  93.06  31.82  59.18  59.07
Surflex-sim  0       51.28  36.11   1.95      0  62.33    9.3   6.12   20.7   2.78      0      0  17.78
ROCS         2       12.82  43.06  74.15  41.03  14.38  30.23  79.59   9.47   2.78  86.36   8.16  35.63
ROCS         1       20.51  36.11  14.39  56.41  28.77  34.88  14.29  41.19  69.44   9.09  81.63  32.83
ROCS         0       66.67  20.83  11.46   2.56  56.85  34.88   6.12  49.34  27.78   4.55   10.2  31.54
FlexS        2       15.38     25   56.1  48.72  35.62  16.28  36.73  14.98  30.56  81.82  18.37  33.48
FlexS        1       20.51  19.44  11.71  43.59   13.7  46.51  57.14  74.01   5.56  13.64   2.04  35.77
FlexS        0        64.1  55.56   32.2   7.69  50.68  37.21   6.12  11.01  63.89   4.55  79.59  30.75
ICM/APF      2       46.15   12.5  86.83  51.28  70.55   18.6  75.51  20.04  88.89  90.91  69.39  54.48
ICM/APF      1       23.08  68.06  11.95  46.15  16.44  46.51  14.29  68.28   9.72   9.09  28.57  36.49
ICM/APF      0       30.77  19.44   1.22   2.56  13.01  34.88   10.2  11.67   1.39      0   2.04   9.03
Giganti et al. J Chem Inf Model 2010, 50, 992-1004
Independent broad benchmark: ligands without X-ray structures but with a chemotype similar to a solved complex. Superposition quality scored 2/1/0 ('good'/'acceptable'/'poor'). 11 targets from DUD (out of 40).
Ligand-Biased docking with APF
• MC docking simulations: APF potentials in addition to physical interaction term grids
• Pose ranking: composite score combining physics-based ICM VLS score and APF pseudo-energy
Lam PC, Abagyan R, Totrov M. Ligand-biased ensemble receptor docking (LigBEnD): a hybrid ligand/receptor structure-based approach. J Comput Aided Mol Des. 2018;32(1):187-198.
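The pose-ranking bullet above names the two ingredients of the composite score but not how they are combined. The sketch below assumes a simple weighted sum; the functional form, the `apf_weight` value, and the dictionary keys are all hypothetical and should not be read as the LigBEnD formula.

```python
def composite_score(vls_score, apf_energy, apf_weight=0.5):
    # Assumed form: physics-based score plus a weighted APF pseudo-energy (lower is better).
    return vls_score + apf_weight * apf_energy

def rank_poses(poses, apf_weight=0.5):
    # poses: list of dicts with hypothetical keys "vls" and "apf"; best pose first.
    return sorted(poses, key=lambda p: composite_score(p["vls"], p["apf"], apf_weight))

poses = [
    {"name": "pose_1", "vls": -30.0, "apf": -2.0},
    {"name": "pose_2", "vls": -28.0, "apf": -10.0},
]
```

With `apf_weight=0.0` the ranking reduces to the physics-based score alone, which makes the effect of the ligand bias easy to inspect.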
Visualization of APF used for ligand bias in Cathepsin S docking
D3R Cathepsin S: Pose prediction
Average RMSD, top pose | Average RMSD, best pose of 5
Median RMSD, best pose of 5 | Median RMSD, top pose
1.06 Å | 1.7 Å
2.82 Å | 1.31 Å
Apparent crystal contact effects
Superimposed answer X-rays nmxm (CatS_7), rpwj (CatS_9) and gabj (CatS_14). Extensive ligand-crystallographic neighbor contacts (~150 Å²) are visible.
Figure labels: primary 'receptor' Cathepsin; ligands; crystal neighbor Cathepsin
Superimposed answer X-ray yrpk (CatS_16), and its top predicted pose. Also shown are top poses for CatS_7, CatS_9 and CatS_14 (thin wires).
Kinases and FXR: flexibility ensembles
• Ensemble construction:
- PDB structures collected/aligned via Pocketome database
- Up to 10 representative X-ray structures selected by iterative procedure to maximize number of compatible ligands
- For each receptor conformation, compatible ligands are used as APF templates in docking
Figure: bound ligand/receptor conformation compatibility matrix heatmap for VEGFR2 (axes: X-ray receptor conformation, X-ray ligand)
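The iterative selection of representative structures is not specified further in the slides. One plausible reading is a greedy coverage procedure over the compatibility matrix, sketched below; the greedy rule, the tie-breaking by identifier, and the data layout (`compat` as a dict of sets) are assumptions.

```python
def select_ensemble(compat, max_size=10):
    # compat: receptor conformation id -> set of ligand ids it is compatible with.
    chosen, covered = [], set()
    remaining = dict(compat)
    while remaining and len(chosen) < max_size:
        # Pick the conformation that adds the most not-yet-covered ligands
        # (ties resolved by sorted id order).
        best = max(sorted(remaining), key=lambda rec: len(remaining[rec] - covered))
        gain = remaining[best] - covered
        if not gain:
            break  # no remaining conformation adds a new compatible ligand
        chosen.append(best)
        covered |= gain
        del remaining[best]
    return chosen, covered

# Toy compatibility matrix with hypothetical ids.
compat = {"A": {1, 2, 3}, "B": {3, 4}, "C": {1, 2}, "D": {5}}
```

For the toy matrix, conformation "C" is never selected because every ligand it accommodates is already covered by "A".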
Pocketome: comprehensive collection of ligand-binding pockets from PDB
• Instant access to all relevant PDB X-ray structures, optimally pre-aligned around the binding pocket.
Kufareva I, Ilatovskiy AV, Abagyan R. Pocketome: an encyclopedia of small-molecule binding sites in 4D. Nucleic Acids Res. 2012;40:D535-40.
FXR (GC2) pose prediction results
Average RMSD, top pose | Average RMSD, best pose of 5
1.95 Å | 1.69 Å
1.95 Å | 1.95 Å
Median RMSD, top pose | Median RMSD, best pose of 5
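Affinity prediction approaches
• Docking to generate aligned poses
• Receptor/physics-based approach: ICM VLS score
$\Delta G = \alpha_1 \Delta E_{FF} + \alpha_2 \Delta E_{GB} + \alpha_3 \Delta E_{HP} + \alpha_4 \Delta E_{HB} + \alpha_5 \Delta E_{PD} + \alpha_6 T\Delta S_{TO}$
• Ligand/APF-based approach ($m$: atoms of the scored ligand $l$; $j$: atoms of training molecule $k$):
$\Delta G \approx \sum_i \sum_m f_m^i P_i^{APF\text{-}QSAR}(\mathbf{r}_m)$
$P_i^{APF\text{-}QSAR}(\mathbf{r}) = \sum_k \sum_j w_k^i f_j^i \exp\left(-(\mathbf{r}-\mathbf{r}_j)^2 / l^2\right)$
- $7 \cdot N_{train}$ weights $w_k^i$ for the contribution of each training-set molecule $k$ to each APF component $i$
- $\Delta G_l \approx \sum_k \sum_i w_k^i E_{kl}^{APF,i}$, with $E_{kl}^{APF,i} = \sum_m \sum_j f_m^i f_j^i \exp\left(-(\mathbf{r}_m-\mathbf{r}_j)^2 / l^2\right)$
- Partial Least Squares (PLS) to determine the weights $w_k^i$
Totrov M. Atomic Property Fields: generalized 3D pharmacophoric potential for automated ligand superposition, pharmacophore elucidation and 3D QSAR. Chem Biol Drug Des. 2008;71(1):15-27.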
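The APF-QSAR expressions above amount to building, for each docked ligand, one descriptor per (training molecule, APF component) pair and then taking a weighted sum. A minimal sketch, assuming atoms are `(position, property_vector)` tuples, two toy property components instead of seven, a placeholder `length=1.0`, and weights that are simply supplied rather than fitted:

```python
import math

def apf_overlap(ligand, train_mol, length=1.0):
    # E_kl^i = sum_m sum_j f_m^i f_j^i exp(-|r_m - r_j|^2 / l^2), one value per component i.
    n_props = len(ligand[0][1])
    e = [0.0] * n_props
    for r_m, f_m in ligand:
        for r_j, f_j in train_mol:
            d2 = sum((a - b) ** 2 for a, b in zip(r_m, r_j))
            g = math.exp(-d2 / length ** 2)
            for i in range(n_props):
                e[i] += f_m[i] * f_j[i] * g
    return e

def descriptor_row(ligand, training_set, length=1.0):
    # Concatenate E_kl^i over all training molecules k: n_props * N_train descriptors.
    row = []
    for train_mol in training_set:
        row.extend(apf_overlap(ligand, train_mol, length))
    return row

def predict(ligand, training_set, weights, length=1.0):
    # Delta G_l ~ sum_k sum_i w_k^i * E_kl^i, with weights in descriptor_row order.
    return sum(w * x for w, x in zip(weights, descriptor_row(ligand, training_set, length)))

# Toy data (hypothetical): one-atom molecules with orthogonal property vectors.
mol_a = [((0.0, 0.0, 0.0), (1.0, 0.0))]
mol_b = [((0.0, 0.0, 0.0), (0.0, 1.0))]
query = [((0.0, 0.0, 0.0), (1.0, 0.0))]
```

In practice the weights would come from a PLS fit of measured activities against the descriptor rows of the training compounds, for example with scikit-learn's `PLSRegression`; that fitting step is omitted here.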
Training APF 3D QSAR: Cathepsin S
• 302 related compounds from ChEMBL v2.3 docked
• 3D poses used to build APF 3D-QSAR model
Visualization of APF fields of pKd model for Cathepsin S
Training sets of activity data
• Source: ChEMBL v2.3
• Varying number and relevance:
Target    N of data points
CathS     1754
VEGFR2    5733
JAK2      1618
p38a      4183
Figure: distributions of Tanimoto distances to the closest training set compound for each challenge compound (panels: Cath.S, VEGFR2, JAK2, p38a)
Training/Testing Set Generation
• LOO cross-validation or simple N-fold random test subsets don't reflect a realistic challenge adequately
• Stringent 3-fold cluster cross-validation:
- Cluster full training set (APF 3D chemical distance, 0.25 cutoff)
- Randomly assign clusters to three groups
- Use any 2 groups to train, 3rd group for test (Q²/RMSE)
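The cluster cross-validation steps above can be sketched as follows, assuming the clustering has already been done and is available as a mapping from compound id to cluster id. The shuffle-then-round-robin assignment, the fixed seed, and the Q² definition (1 - PRESS/SS about the test-set mean) are assumptions; the slides do not state them.

```python
import random

def cluster_folds(cluster_of, n_folds=3, seed=0):
    # cluster_of: compound id -> cluster id. Whole clusters go to one group,
    # so no test compound shares a cluster with any training compound.
    clusters = sorted(set(cluster_of.values()))
    random.Random(seed).shuffle(clusters)
    group_of = {c: idx % n_folds for idx, c in enumerate(clusters)}
    splits = []
    for fold in range(n_folds):
        test = sorted(m for m, c in cluster_of.items() if group_of[c] == fold)
        train = sorted(m for m, c in cluster_of.items() if group_of[c] != fold)
        splits.append((train, test))
    return splits

def q2_and_rmse(y_true, y_pred):
    # Q2 = 1 - PRESS / SS_total (assumed definition); RMSE in the units of y.
    n = len(y_true)
    mean = sum(y_true) / n
    press = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_total = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - press / ss_total, (press / n) ** 0.5

# Toy clustering with hypothetical compound and cluster ids.
cluster_of = {"a": 0, "b": 0, "c": 1, "d": 2, "e": 2, "f": 3}
```

Each `(train, test)` pair would be used to fit a model on `train` and evaluate `q2_and_rmse` on `test`.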
Improving upon APF 3D QSAR:
• Combining APF and 'physics'-based terms:
- APF produces better models provided sufficient training data. Physics-based terms are typically noisier, but more general.
- Can the two combine for better performance?
- Investigate single and staged models combining 'chemical' and 'physical' terms. PLS and/or RFR
• Dynamic 'focused' models:
- Some evidence that a large training set 'dilutes' local activity trends
- Investigate focused models trained on subsets of data related to challenge molecules
Dynamic/Focused Model Training
1. Dock ligands
2. Cluster by 3D poses in APF
3. For each cluster: find ~300 nearest known ligands as training set
4. Train a model for each cluster
Figure labels: Challenge Ligands; ChEMBL Ligands
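Step 3 of the procedure above (picking the nearest known ligands for a cluster of challenge compounds) can be sketched as below. How "nearest" is measured against a whole cluster is not stated; this sketch assumes the minimum distance to any cluster member, and the `distance` callable stands in for the APF 3D distance.

```python
def focused_training_set(cluster_members, known_ligands, distance, n_nearest=300):
    # Rank known ligands by their smallest distance to any challenge compound in
    # the cluster (assumed criterion) and keep the closest n_nearest.
    ranked = sorted(
        known_ligands,
        key=lambda lig: min(distance(lig, member) for member in cluster_members),
    )
    return ranked[:n_nearest]

# Toy example: ligands are points on a line, distance is the absolute difference.
cluster = [0.0, 1.0]
known = [5.0, 0.2, 3.0, 1.1, -4.0]
nearest = focused_training_set(cluster, known, lambda a, b: abs(a - b), n_nearest=2)
```

A separate model is then trained on each cluster's focused set (step 4).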
Kinase models cross-validation
Training set     Terms              Method       VEGFR2 Q²  VEGFR2 RMSE  JAK2 Q²  JAK2 RMSE  P38a Q²  P38a RMSE
-                Physics/VLS-Score  No training  0.12       NA           0.23     NA         0.13     NA
Full/Static      Physics only       PLS          0.13       1.2          0.30     1.1        0.25     1.0
Full/Static      Physics only       RFR          0.12       1.2          0.23     1.2        0.18     1.1
Full/Static      APF only           PLS          0.22       1.4          0.30     1.2        0.29     1.1
Full/Static      Physics+APF        1-stage PLS  0.26       1.3          0.36     1.2        0.29     1.1
Full/Static      Physics+APF        2-stage PLS  0.22       1.4          0.32     1.2        0.30     1.1
Full/Static      Physics+APF        PLS/RFR      0.25       1.2          0.33     1.1        0.33     1.0
Focused/Dynamic  APF only           PLS          0.26       1.2          0.35     1.2        0.32     1.0
Focused/Dynamic  Physics+APF        PLS/RFR      0.28       1.1          0.40     1.1        0.33     1.0
RMSE is shown in pKd units
Challenge Set Performance
Training set     Terms        Method   VEGFR2 Corr R  VEGFR2 RMSE  JAK2 Corr R  JAK2 RMSE  P38a Corr R  P38a RMSE
Static           APF          PLS      0.54           1.4          0.55         1.2        0.55         1.1
Static           Physics+APF  PLS      0.61           1.2          0.65         1.0        0.56         1.1
Static           Physics+APF  PLS/RFR  0.68           1.0          0.61         1.0        0.63         1.0
Focused/Dynamic  APF          PLS      0.53           1.5          0.53         1.2        0.51         1.3
Focused/Dynamic  Physics+APF  PLS/RFR  0.67           1.0          0.59         1.0        0.56         1.3
Per-target annotations: VEGFR2 QC3F=0.53, <dmin>=0.2; JAK2 QC3F=0.63, <dmin>=0.3; P38a QC3F=0.57, <dmin>=0.27
D3R affinity prediction: ligand ranking, Kendall τ
Cathepsin S stage 1: τ = 0.45
VEGFR2: τ = 0.45
JAK2_SC2: τ = 0.47
p38a: τ = 0.41
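For reference, Kendall τ measures how well predicted affinities reproduce the experimental rank order. The sketch below implements the simple tau-a variant (tied pairs count as neither concordant nor discordant); whether the D3R evaluation used this variant or a tie-corrected one is not stated in the slides, so treat the choice as an assumption.

```python
from itertools import combinations

def kendall_tau(predicted, experimental):
    # tau-a: (concordant pairs - discordant pairs) / total number of pairs.
    concordant = discordant = 0
    for (p1, e1), (p2, e2) in combinations(zip(predicted, experimental), 2):
        sign = (p1 - p2) * (e1 - e2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n = len(predicted)
    return (concordant - discordant) / (n * (n - 1) / 2)
```

A perfectly ordered prediction gives τ = 1, a fully reversed one gives τ = -1, and a single swapped neighbor pair among four compounds gives τ = 2/3.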
Conclusions
• Ligand-biased docking (ICM dock + APF) consistently produces good pose accuracy
• Atomic Property Field-based 3D QSAR activity models outperform physical-term-based models
• Cluster cross-validation is adequate to assess model quality
• Using dynamic/focused training sets did not result in consistently better predictions
• Composite models and in particular PLS(APF)/RFR(Phys) are consistently most predictive