multi target bioactivity models in pipeline pilot
Post on 16-Apr-2017
Multi-target bioactivity models in Pipeline Pilot
Using ligand and target information
Gerard JP van Westen
Pipeline Pilot UGM (17-1-2013)
Cool things to do with PP
• Multi-target bioactivity models
  ▫ The why…
  ▫ The how…
  ▫ The results… (time permitting)
The why… a target is never alone…
• Drug targets often have similar paralogs
  ▫ Selectivity is required
• Viral targets often mutate, leading to resistance
  ▫ Broad activity is required
• Non-similar proteins have been shown to share ligands
  ▫ E.g. acetylcholine and serotonin
[Figure: Emtricitabine, FTC (NRTI)]

Sequence Similarity
               Phenylalanine  Tyrosine  Arginine
Phenylalanine  1.0            0.9       0.3
Tyrosine       0.9            1.0       0.4
Arginine       0.3            0.4       1.0
[Figure: Emtricitabine, FTC (NRTI)]

Sequence Similarity
      FYI  IYF  WTF
FYI   1.0  0.9  0.3
IYF   0.9  1.0  0.4
WTF   0.3  0.4  1.0
The how… what is PCM?
• Proteochemometric modeling (PCM) combines both a ligand descriptor and a target descriptor
[Figure label: Bio-Informatics]
GJP van Westen, JK Wegner et al. MedChemComm (2011), 16-30, doi:10.1039/C0MD00165A
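The core idea of combining a ligand descriptor with a target descriptor can be sketched in a few lines of Python. This is a minimal illustration, not the Pipeline Pilot protocol from the talk: the function names, compound and target identifiers, and all descriptor values are hypothetical. Each ligand-target pair becomes one feature row made of the ligand bits followed by the target descriptor, and the resulting rows would be handed to a machine learning method together with the measured activity for each pair.

```python
from typing import Dict, List, Tuple


def pcm_row(ligand_bits: List[int], target_desc: List[float]) -> List[float]:
    """Concatenate one ligand descriptor and one target descriptor."""
    return [float(bit) for bit in ligand_bits] + list(target_desc)


def build_pcm_matrix(
    pairs: List[Tuple[str, str]],
    ligands: Dict[str, List[int]],
    targets: Dict[str, List[float]],
) -> List[List[float]]:
    """Build one feature row per (ligand, target) pair."""
    return [pcm_row(ligands[lig], targets[tgt]) for lig, tgt in pairs]


# Hypothetical toy data: 4-bit ligand fingerprints and 3-value target descriptors.
ligands = {"cmpd_A": [1, 0, 1, 1], "cmpd_B": [0, 1, 0, 1]}
targets = {"wild_type": [0.5, -1.2, 0.3], "mutant_1": [0.5, 0.8, -0.4]}
pairs = [("cmpd_A", "wild_type"), ("cmpd_A", "mutant_1"), ("cmpd_B", "mutant_1")]

X = build_pcm_matrix(pairs, ligands, targets)
for pair, row in zip(pairs, X):
    print(pair, row)
```

Because the same compound appears with different targets (and the same target with different compounds), a model trained on such rows can in principle learn from both sides of the interaction.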
PCM using Pipeline Pilot
• For this work we use mostly:
  ▫ Chemistry (circular fingerprints)
  ▫ Data modeling
  ▫ R statistics components (machine learning)
• Lacking was a protein descriptor type component…
• (In addition I missed some validation components…)
  ▫ Matthews Correlation Coefficient
  ▫ R2 to a line through the origin (R2 zero)
Target descriptors
• Simple way to derive protein descriptors
  1. Select the binding pocket
  2. Align the relevant residues
  3. Convert to physicochemical properties
• PP component can create different protein descriptors
  1. ProtFP Feature: J. Med. Chem. 2012, 55, 7010-7020; BMC Bioinformatics 2012, Submitted
  2. ProtFP PCA: BMC Bioinformatics 2012, Submitted
  3. Z-Scales: J. Med. Chem. 1998, 41, 2481-2491
  4. VHSE: Biopolymers 2005, 80, 775-786
  5. ST-Scales: Amino Acids 2010, 38, 805-816
  6. T-Scales: J. Mol. Struct. 2007, 830, 106-115
  7. MS-WHIM: J. Chem. Inf. Comp. Sci. 1999, 39, 525-533
  8. FASGAI: Eur. J. Med. Chem. 2009, 44, 1144-1154
  9. Blosum62: J. Comp. Biol. 2009, 16, 5, 703-723
Target Descriptors
Revised version of paper to be submitted
Target Descriptors
• The example uses the Z-scales by Sandberg et al.
• Uses a PCA to derive 5 principal components that describe amino acid similarity
  ▫ Based on side chain physicochemical properties
• We use the first 3
  ▫ 1 – Lipophilicity
  ▫ 2 – Size
  ▫ 3 – Charge / Polarity
M Sandberg, L Eriksson J Med Chem (1998) 41: 2481-2491
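The conversion of aligned binding-site residues into a numeric target descriptor can be sketched as a per-residue lookup. This is only an illustration under stated assumptions: the function name and the lookup table are hypothetical, and the numbers in `TOY_SCALES` are placeholders, NOT the published Z-scale values. Real values would have to be taken from the Sandberg et al. paper.

```python
from typing import Dict, List, Sequence

# PLACEHOLDER values for illustration only; these are NOT the published Z-scales.
# Each residue maps to (component 1, component 2, component 3).
TOY_SCALES: Dict[str, Sequence[float]] = {
    "F": (-1.0, 1.0, 0.0),
    "Y": (-0.5, 1.2, 0.2),
    "R": (1.0, 0.8, 1.0),
    "-": (0.0, 0.0, 0.0),  # alignment gap (how gaps are encoded is an assumption)
}


def describe_site(
    aligned_residues: str,
    scales: Dict[str, Sequence[float]],
    n_components: int = 3,
) -> List[float]:
    """Turn aligned binding-site residues into one flat descriptor vector.

    Every alignment position contributes `n_components` values, so all
    sequences aligned to the same positions yield vectors of equal length.
    """
    descriptor: List[float] = []
    for residue in aligned_residues:
        if residue not in scales:
            raise KeyError(f"no scale values for residue {residue!r}")
        descriptor.extend(scales[residue][:n_components])
    return descriptor


print(describe_site("FYR-", TOY_SCALES))
```

Keeping the residues aligned matters: position 2 of every descriptor vector then always refers to the same pocket position, which is what lets a model attribute an effect to a specific residue.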
Example Data set
• Data set provided by Tibotec and Virco
• Antivirogram® assay
• Patient data
• Reverse Transcriptase and Protease sequences
• Fold Change in –logIC50

Target                 Amino acids  Binding Site  Drug Class  Drugs  Mutant Sequences  Data points
Reverse Transcriptase  400*         Orthosteric   NRTI        8      10,501            72,727
Reverse Transcriptase  400*         Allosteric    NNRTI       4      10,723            35,249
Protease               99           Orthosteric   PI          9      27,081            180,162

GJP van Westen, A Hendriks et al. PLoS Comp Biol (2013) Accepted / In press
Feature Importance
• What is important to our models?
• What residue position?
• What mutation is present at that position?
• How much is contributed to resistance?
• Bioactivity spectra can be obtained from these models
Data sets
• Currently we have applied the technique using PP to:
  • Adenosine receptors (human + rat)
  • HIV inhibitors (preclinical lead optimization)
  • HIV inhibitors (clinical drugs)
  • OATP1 inhibitors
  • Aminergic GPCRs
  • …
Acknowledgements
• Ad IJzerman
• Andreas Bender
• Alwin Hendriks
• Herman van Vlijmen
• Joerg Wegner
• Anik Peeters
• John Overington
• George Papadatos
Multi-target bioactivity models in Pipeline Pilot
Using ligand and target information
Gerard JP van Westen
www.gjpvanwesten.nl
Pipeline Pilot UGM (17-1-2013)
Model validation (classification)
• PP lacked a component to calculate correlation coefficients between two properties in the data stream in (binary) classification.
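For binary classification, the Matthews Correlation Coefficient mentioned earlier in the talk is one such correlation measure. The sketch below uses the standard confusion-matrix formula in plain Python. The function name and the convention of returning 0.0 when the denominator is zero are choices made here for illustration, not a description of the Pipeline Pilot component.

```python
import math
from typing import Sequence


def matthews_corrcoef(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Matthews Correlation Coefficient for binary labels (0 or 1).

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth and pred:
            tp += 1
        elif not truth and not pred:
            tn += 1
        elif pred:
            fp += 1
        else:
            fn += 1
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumed convention: undefined cases (e.g. only one class predicted) give 0.
        return 0.0
    return (tp * tn - fp * fn) / denominator


print(matthews_corrcoef([1, 1, 0, 0], [1, 0, 0, 0]))
```

The value ranges from -1 (every prediction wrong) through 0 (no better than chance) to +1 (every prediction correct), and unlike plain accuracy it stays informative when the two classes are unbalanced.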
Model validation (regression)
• PP lacked a component to calculate correlation coefficients between two properties in the data stream in regression (R2 zero, etc.).
A. Tropsha; Predictive Quantitative Structure-Activity Relationships Modeling; in Handbook of Chemoinformatics Algorithms (2010) J. Faulon and A. Bender; Editors.
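The slide does not spell out how R2 zero is computed, so the following is a sketch of one common formulation rather than the author's component: fit a line through the origin, observed = k × predicted, and report the coefficient of determination of that fit. The function name and the choice of regressing observed on predicted are assumptions. The cited chapter also discusses the reverse regression, which generally gives a different value.

```python
from typing import Sequence, Tuple


def r2_through_origin(
    y_obs: Sequence[float], y_pred: Sequence[float]
) -> Tuple[float, float]:
    """Return (k, R2_0) for the least-squares line y_obs = k * y_pred.

    k     = sum(obs * pred) / sum(pred^2)
    R2_0  = 1 - sum((obs - k * pred)^2) / sum((obs - mean(obs))^2)
    """
    if len(y_obs) != len(y_pred) or len(y_obs) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    sum_pred_sq = sum(p * p for p in y_pred)
    if sum_pred_sq == 0:
        raise ValueError("all predictions are zero; slope is undefined")
    k = sum(o * p for o, p in zip(y_obs, y_pred)) / sum_pred_sq
    mean_obs = sum(y_obs) / len(y_obs)
    ss_res = sum((o - k * p) ** 2 for o, p in zip(y_obs, y_pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in y_obs)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R2 is undefined")
    return k, 1.0 - ss_res / ss_tot


slope, r2_zero = r2_through_origin([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
print(slope, r2_zero)
```

A slope far from 1 signals a systematic scaling error in the predictions even when the fit through the origin itself is good, so the slope is worth reporting next to the R2 value.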
Ligand Descriptors
• SciTegic Circular Fingerprints
  ▫ Circular, substructure-based fingerprints
  ▫ Maximal radius of 3 bonds from central atom
  ▫ Each substructure is converted to a molecular feature
[Figure labels: Carbon, Oxygen, Substructure]
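The idea of growing a substructure around each central atom and turning it into a feature can be sketched with a simplified, Morgan-style iteration. This is not the SciTegic algorithm: it ignores bond orders, charges and the duplicate-removal step of real implementations, and the function name, graph encoding and use of CRC32 as the hash are all assumptions made for illustration.

```python
import zlib
from typing import Dict, List, Set, Tuple


def circular_features(
    atoms: List[str], bonds: List[Tuple[int, int]], radius: int = 3
) -> Set[int]:
    """Collect hashed atom-environment features up to `radius` bonds.

    atoms: element symbols, indexed by position (heavy atoms only).
    bonds: pairs of atom indices; bond order is ignored in this sketch.
    """
    neighbours: Dict[int, List[int]] = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        neighbours[a].append(b)
        neighbours[b].append(a)

    # Radius 0: each atom is identified by its element symbol alone.
    ids = [zlib.crc32(symbol.encode()) for symbol in atoms]
    features = set(ids)

    # Each iteration extends every atom's environment by one more bond.
    for _ in range(radius):
        new_ids = []
        for i in range(len(atoms)):
            # Sorting neighbour identifiers makes the result independent of atom order.
            environment = (ids[i], tuple(sorted(ids[j] for j in neighbours[i])))
            new_ids.append(zlib.crc32(repr(environment).encode()))
        ids = new_ids
        features.update(ids)
    return features


# Ethanol heavy atoms: C-C-O
ethanol = circular_features(["C", "C", "O"], [(0, 1), (1, 2)])
print(len(ethanol), "features")
```

Each integer in the returned set stands for one atom-centred substructure. Real fingerprints usually fold such identifiers into a fixed-length bit vector before modeling.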