SideFinder: Predicting Drug Side Effects
Aqeel Ahmed
Insight Data Science Fellow
Prof. Heather A. Carlson Group, College of Pharmacy, University of Michigan
Motivations: Drug Side Effects Prediction
Developing a new drug is challenging
  Requires billions of dollars
  Often takes more than 13 years of effort
Side effects are one of the main causes of
  Drug failure
  Drug withdrawal
Predictions can save time and money
  Selecting better drug candidates
Side effects are a major public health concern
  Estimated: 100,000 deaths per year
DEMO: www.SideFinder.info
SIDER: http://sideeffects.embl.de
DSigDB: http://tanlab.ucdenver.edu/DSigDB
Data Resources
DSigDB: drug-protein interactions
  From: ChEMBL and PubChem
SIDER: drug side effects
  From: public documents and package inserts
Data Representation
Each drug is represented by a ~700-dimensional interaction vector
  Each element encodes one protein interaction (1: interaction, 0: no interaction)
Each drug is also associated with an output side effect profile
The aim is to predict the side effect profile for a new drug
[Figure: two binary matrices with one row per drug. Protein interactions: columns 1 to 705. Side effects: columns 1 to 1000.]
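The binary representation described above can be sketched in a few lines. This is a minimal illustration, not the pipeline used for SideFinder: the drug, protein, and side effect identifiers are hypothetical placeholders, and the helper name `build_binary_matrix` is an assumption. Real input would come from DSigDB (interactions) and SIDER (side effects).

```python
def build_binary_matrix(pairs, row_keys, col_keys):
    """Return a list of 0/1 rows; an entry is 1 when (row, col) occurs in pairs."""
    row_index = {r: i for i, r in enumerate(row_keys)}
    col_index = {c: j for j, c in enumerate(col_keys)}
    matrix = [[0] * len(col_keys) for _ in row_keys]
    for r, c in pairs:
        matrix[row_index[r]][col_index[c]] = 1
    return matrix


# Hypothetical toy identifiers (not real DSigDB or SIDER records)
drugs = ["drug_A", "drug_B", "drug_C"]
proteins = ["P1", "P2", "P3", "P4"]
side_effects = ["SE1", "SE2"]

# Hypothetical drug-protein interaction pairs
interactions = [
    ("drug_A", "P1"), ("drug_A", "P3"),
    ("drug_B", "P2"),
    ("drug_C", "P1"), ("drug_C", "P4"),
]
# Hypothetical drug-side-effect pairs
reported = [
    ("drug_A", "SE1"),
    ("drug_B", "SE2"),
    ("drug_C", "SE1"), ("drug_C", "SE2"),
]

X = build_binary_matrix(interactions, drugs, proteins)   # feature vectors
Y = build_binary_matrix(reported, drugs, side_effects)   # side effect profiles

for drug, features, profile in zip(drugs, X, Y):
    print(drug, features, profile)
```

Each row of `X` plays the role of one drug's interaction vector, and the matching row of `Y` is its side effect profile.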
Machine learning
Built 1000 models (one for each side effect)
  Random forest classifier
  Logistic regression classifier
Validation
  5-fold cross validation
  Distributions of ROC AUC and accuracy
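The one-model-per-side-effect setup with 5-fold cross validation might look like the sketch below. It is dependency-free and uses a small hand-written logistic regression as a stand-in; the slides do not say which library, hyperparameters, or fold assignment were used, so the function names, the contiguous fold split, the learning rate, and the toy data are all assumptions. A random forest would slot into the same loop.

```python
import math


def k_fold_indices(n_samples, k=5):
    """Split range(n_samples) into k contiguous, near-equal test folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def predict_proba(w, x):
    """Logistic model; the last entry of w is the bias term."""
    z = w[-1] + sum(wj * xj for wj, xj in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, epochs=200, lr=0.5):
    """Fit weights by batch gradient descent on the log loss."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
            grad[-1] += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def cross_validated_accuracy(X, y, k=5):
    """Mean test-fold accuracy of the logistic model over k folds."""
    scores = []
    for test_idx in k_fold_indices(len(X), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        w = train_logistic([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum((predict_proba(w, X[i]) >= 0.5) == bool(y[i]) for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


# Hypothetical toy data: 10 drugs, 3 protein-interaction features
X = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1], [1, 0, 1],
     [0, 1, 1], [1, 1, 1], [0, 0, 0], [1, 0, 1], [0, 1, 0]]
# Hypothetical rule: side effect 0 follows protein 0, side effect 1 follows protein 2
Y = [[x[0], x[2]] for x in X]

# One independent binary model per side effect (per column of Y)
cv_accuracy = [
    cross_validated_accuracy(X, [row[j] for row in Y])
    for j in range(len(Y[0]))
]
print(cv_accuracy)
```

Training a separate binary classifier per side effect keeps each problem simple and lets the per-model scores be summarized as a distribution.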
[Figure: drug associations linking protein interactions (feature vectors for predictions) to side effects]
[Figure: accuracy and ROC area under the curve (AUC) plots]
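For reference, the two reported metrics can be computed as follows. This is a minimal sketch under stated assumptions: the function names, the 0.5 decision threshold, and the example labels and scores are illustrative and do not come from the slides. The AUC here is the fraction of positive/negative pairs in which the positive example receives the higher score, with ties counted as half.

```python
def roc_auc(labels, scores):
    """ROC AUC as the probability that a positive outranks a negative."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of examples whose thresholded score matches the label."""
    correct = sum((s >= threshold) == (l == 1) for l, s in zip(labels, scores))
    return correct / len(labels)


# Illustrative labels and predicted probabilities for one side effect
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

print(roc_auc(labels, scores))   # 0.75
print(accuracy(labels, scores))  # 0.75
```

AUC does not depend on a threshold, which makes it a useful companion to accuracy when a side effect is rare and most labels are 0.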
Literature support for the predicted side effect
Mefenamic Acid: Treats pain, including menstrual pain
  Ref: Drug.com
Lovastatin: Lowers cholesterol
  Ref: Safety of statins. Indian J Endocrinol Metab. 2013; 17(4): 636–646
Summary
Predicting side effects in the drug development process
  Challenging
  Very important
  Can save time and money
  Improve health care
Developed a machine learning approach
  Used drug-protein interactions
  Can predict many side effects
Future directions
Incorporate other biological data
  Drug structure features
  Gene expression profiles
Integrate models in a consensus-based approach
Group side effects into classes based on similarity and severity levels
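One simple way a consensus could be formed is to average the probabilities from several models for the same drug and side effect. The slides list this only as a future direction and do not specify a scheme, so the sketch below is purely an assumption: the function name, the weighting, and the example probabilities are hypothetical.

```python
def consensus_probability(model_probs, weights=None):
    """Weighted average of per-model probabilities for one drug/side-effect pair."""
    if weights is None:
        weights = [1.0] * len(model_probs)  # default: every model counts equally
    if len(weights) != len(model_probs):
        raise ValueError("need exactly one weight per model")
    return sum(w * p for w, p in zip(weights, model_probs)) / sum(weights)


# Hypothetical outputs of two models for the same drug and side effect
rf_prob, lr_prob = 0.75, 0.25

equal_vote = consensus_probability([rf_prob, lr_prob])
weighted_vote = consensus_probability([rf_prob, lr_prob], weights=[3.0, 1.0])

print(equal_vote)     # 0.5
print(weighted_vote)  # 0.625
```

Weights could, for example, reflect each model's cross-validated AUC, although that choice is also an assumption.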