new interpretable ai - princeton university · 2018. 10. 29. · interpretable ai i ai and...
Interpretable AI
Dimitris Bertsimas
MIT
September 2018
1 / 24
Interpretable AI
- AI, and especially deep learning, has made significant progress in computer vision, automatic translation, and voice recognition, all of which are affecting society.
- Deep learning suffers from a lack of interpretability.
- A driverless car is involved in an accident with loss of life. Who is at fault? Can society tolerate not understanding?
- A student is not selected for freshman admissions. Is it an adequate response that an algorithm made the decision?
- Interpretability matters.
2 / 24
Goal: Develop AI algorithms that are interpretable and provide state-of-the-art performance.
[Figure: the same patient record (Age: 30, Gender: male, Albumin: 2.8 g/dL, Sepsis: none, INR: 1.1, Diabetic: yes, …) is scored by a black-box model and by an interpretable model. Both report a mortality risk of 26.4%. The interpretable model is a tree that first asks "Age < 25?" (leaf: 13.2%) and then "Male?" (leaves: 26.4% and 18.3%).]
3 / 24
Leo Breiman: On Interpretability, Trees Receive an A+
- Leo Breiman et al. (1984) introduced CART, a heuristic approach to make predictions (either binary or continuous) from data.
- Widespread use in academia and industry (~37,000 citations!)
- The Iris flower data set was introduced by Fisher (1936) to classify flowers based on four measurements: petal length/width and sepal length/width.
4 / 24
The Iris data set
[Figure: scatter plot of sepal width (y-axis, 2.0 to 4.5) against sepal length (x-axis, 4 to 8), with points colored by species. Legend: setosa, virginica.]
5 / 24
The Tree Representation
Node 1: Sepal length < 5.75?
  Yes: Node 2: Sepal width < 2.7?
    Yes: V
    No (Sepal width ≥ 2.7): S
  No (Sepal length ≥ 5.75): V
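The tree on this slide can be read as a nested rule. A minimal sketch in Python follows; the function name `classify_iris` is hypothetical, and the leaf letters are returned as shown on the slide (the plot legend suggests V is virginica and S is setosa, but that mapping is an assumption).

```python
def classify_iris(sepal_length: float, sepal_width: float) -> str:
    """Follow the two splits of the tree on the slide and return the leaf letter."""
    if sepal_length < 5.75:        # node 1
        if sepal_width < 2.7:      # node 2
            return "V"
        return "S"
    return "V"


# Illustrative inputs only; these are not rows taken from the Iris data.
print(classify_iris(5.0, 3.4))  # prints S
print(classify_iris(6.5, 3.0))  # prints V
```

Every prediction can be explained by at most two comparisons, which is the sense in which the tree is interpretable.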
6 / 24
Leo again ....
- CART is fundamentally greedy: it makes a series of locally optimal decisions, but the final tree could be far from optimal.
- "Finally, another problem frequently mentioned (by others, not by us) is that the tree procedure is only one-step optimal and not overall optimal. . . . If one could search all possible partitions . . . the two results might be quite different. We do not address this problem. At this stage of computer technology, an overall optimal tree growing procedure does not appear feasible for any reasonably sized data set." (Breiman et al., 1984)
- On interpretability, trees receive an A+.
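The gap between one-step optimal and overall optimal can be seen on a toy problem. The sketch below is an illustration under stated assumptions, not CART itself and not the method discussed later in the talk: the data set is synthetic (label = x1 XOR x2, plus a third feature that agrees with the label 75% of the time), all features are binary, splits are scored by misclassification count, and the function names are hypothetical.

```python
from itertools import product


def leaf_errors(labels):
    """Errors made by predicting the majority label in a leaf."""
    return len(labels) - max(labels.count(0), labels.count(1))


def split(rows, feature):
    """Partition (features, label) rows on a binary feature."""
    left = [r for r in rows if r[0][feature] == 0]
    right = [r for r in rows if r[0][feature] == 1]
    return left, right


def errors_after_split(rows, feature):
    """Total errors if each side of the split predicts its majority label."""
    return sum(leaf_errors([y for _, y in side]) for side in split(rows, feature))


def greedy_depth2(rows, n_features):
    """Commit to the one-step best root, then pick the best split in each child."""
    root = min(range(n_features), key=lambda f: errors_after_split(rows, f))
    total = sum(
        min(errors_after_split(child, f) for f in range(n_features))
        for child in split(rows, root)
    )
    return root, total


def optimal_depth2(rows, n_features):
    """Try every root and keep the tree with the fewest total errors."""
    best = None
    for root in range(n_features):
        total = sum(
            min(errors_after_split(child, f) for f in range(n_features))
            for child in split(rows, root)
        )
        if best is None or total < best[1]:
            best = (root, total)
    return best


# Synthetic data: 4 points per (x1, x2) cell; x3 matches the label in 3 of them.
rows = []
for x1, x2 in product([0, 1], repeat=2):
    y = x1 ^ x2
    rows += [((x1, x2, y), y)] * 3
    rows += [((x1, x2, 1 - y), y)]

print(greedy_depth2(rows, 3))   # (2, 4): root on x3, 4 of 16 points misclassified
print(optimal_depth2(rows, 3))  # (0, 0): root on x1, no errors
```

The greedy procedure picks x3 at the root because it is the only feature that helps in one step, and no second-level split can then recover the XOR structure. Exhaustive search is only feasible here because the problem is tiny.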
7 / 24
B. + Dunn, "Optimal Trees", Machine Learning, 2017
- Use Mixed-Integer Optimization (MIO) and local search to consider the entire decision tree problem at once and solve to obtain the Optimal Tree for both regression and classification.
- The algorithms scale to n = 1,000,000 and p = 10,000.
- Motivation: MIO is the natural form for the Optimal Tree problem:
  - Decisions: which variable to split on, which label to predict for a region
  - Outcomes: which region a point ends up in, whether a point is correctly classified
8 / 24
OCT-H
[Figure: scatter plot of the Iris data, sepal width (y-axis, 2.0 to 4.5) against sepal length (x-axis, 4 to 8), with points colored by species. Legend: setosa, versicolor, virginica.]
9 / 24
Performance of Optimal Classification Trees
- Average out-of-sample accuracy across 60 real-world datasets:
[Figure: out-of-sample accuracy (y-axis, ticks from 70 to 85) against maximum depth of tree (x-axis, 2 to 10). Legend: CART, OCT, OCT-H, Random Forest, XGBoost.]
10 / 24
How do trees compare with Deep Learning?
- B. + Mazumder + Sobiesk, 2018
- Theorem: Optimal classification and regression trees with hyperplanes are as powerful as classification and regression (feedforward, convolutional, and recurrent) neural networks; that is, given a NN we can find an OCT-H (or ORT-H) that has the same in-sample performance.
- Out-of-sample performance is very comparable on 7 popular data sets between NNs and OCT-Hs.
- Why is this result important?
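The simplest instance of this correspondence can be checked directly. The sketch below is not the construction used in the theorem; it only shows that a single neuron with a step activation and a depth-1 tree with one hyperplane split make identical predictions. The weights, bias, data, and function names are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))        # synthetic inputs

# Hypothetical single-neuron "network": class = 1 if w.x + b >= 0, else 0.
w = np.array([1.5, -2.0, 0.5])
b = 0.25


def network_predict(X):
    """Step-activation neuron."""
    return (X @ w + b >= 0).astype(int)


def hyperplane_tree_predict(X):
    """Depth-1 tree: points with w.x < -b go left (class 0), the rest go right (class 1)."""
    go_left = X @ w < -b
    return np.where(go_left, 0, 1)


# The two models agree on every point.
print(np.array_equal(network_predict(X), hyperplane_tree_predict(X)))  # prints True
```

Deeper networks need deeper trees and more splits; the general argument is in the cited work and is not reproduced here.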
11 / 24
Surgical Outcomes Prediction - used at MGH
Figure: Decision tree for predicting any complication post surgery.
12 / 24
Surgical Outcomes Prediction - App
Figure: Surgical outcome prediction questionnaire based on Optimal Trees.
13 / 24
Mortality Prediction in Cancer Patients - used at Dana-Farber
Figure: Decision tree for predicting 60-day mortality in breast cancer patients.
14 / 24
Mortality Prediction in Cancer Patients - App
Figure: Cancer mortality prediction questionnaire based on Optimal Trees.
15 / 24
Saving Lives in Liver Transplantation
Using OCT, we designed a new system for prioritizing liver transplantation recipients that averts 400 deaths per year in the US compared to current practice.
16 / 24
Critical Brain Injury
Using OCT, we can identify critical brain injury in children using 40% fewer CT scans than CART while missing only 5 children (out of 337), instead of 9 for CART.
17 / 24
Designing financial plans from transactions
- Using OCT, we can accurately predict whether a person is likely to buy a house or open an educational account based on transactional data (payroll, credit cards, ...).
- Based on these predictions, we create a financial plan that maximizes the probability of achieving the person's goals.
18 / 24
Optimal Prescriptive Trees
- B. + Dunn + Mundru, Optimal Prescriptive Trees, 2018.
- Consider a healthcare setting (personalized medicine; there are many other applications).
- Historical observational data $(X_i, z_i, Y_i)$, $i = 1, \dots, n$.
  - $X_i \in \mathbb{R}^d$: features of patient $i$.
  - $z_i \in \{1, 2, \dots, m\}$: treatment assigned to patient $i$ by the doctor.
  - $Y_i \in \mathbb{R}$: recorded outcome of patient $i$ (the lower the better).
- Question: When a new patient comes in with features $x$, what treatment $\tau(x) \in \{1, 2, \dots, m\}$ is best for this person?
19 / 24
Can we use Machine Learning?
- For each patient $x_i$: if we knew the best treatment (the treatment out of the $m$ options that leads to the best outcome), then it would be a standard multiclass classification problem.
- We could learn a classifier that predicts in $\{1, \dots, m\}$ given $x \in \mathbb{R}^d$ using this historical data.
- KEY CHALLENGE: We only know the outcome for $z_i$ (the historically given treatment) and not for the others.
- We do not know what would have happened ("counterfactuals") to patient $i$ under the other $(m - 1)$ treatments.
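The missing-data structure behind this challenge can be made concrete with a small simulation. Everything here is synthetic and assumed for illustration: the sizes `n` and `m`, the outcome distribution, and the random treatment assignment. Treatments are indexed from 0 in the code rather than from 1 as on the slide.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 6, 3                                      # hypothetical: 6 patients, 3 treatments

# Potential outcomes Y_i(t) for every treatment; never fully available in practice.
potential = rng.normal(7.0, 1.0, size=(n, m))
z = rng.integers(0, m, size=n)                   # treatment each patient actually received

# Historical data reveal exactly one cell per patient; the rest are counterfactuals.
observed = np.full((n, m), np.nan)
observed[np.arange(n), z] = potential[np.arange(n), z]

# With full knowledge, the classification label would be the outcome-minimizing treatment.
best_if_known = potential.argmin(axis=1)

print(np.isnan(observed).sum(axis=1))            # prints [2 2 2 2 2 2], i.e. m - 1 gaps each
```

The array `best_if_known` is what a multiclass classifier would need as labels, and it cannot be computed from `observed` alone.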
20 / 24
Optimal Prescriptive Trees
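- Objective: Determine $\tau(x)$ to minimize

  $\mu \cdot (\text{mean outcome}) + (1 - \mu) \cdot (\text{prediction error}), \quad 0 < \mu < 1$,

  that is,

  $$\mu \sum_{i=1}^{n} \Big( y_i \, \mathbb{I}[\tau(x_i) = z_i] + \sum_{t \neq z_i} \hat{y}_i(t) \, \mathbb{I}[\tau(x_i) = t] \Big) + (1 - \mu) \Big[ \sum_{i=1}^{n} \big( y_i - \hat{y}_i(z_i) \big)^2 \Big],$$

  where $\hat{y}_i(t)$ is the estimated outcome of patient $i$ under treatment $t$.
- Need to predict counterfactuals:
  1. For each subject $i$: if he/she received treatment 1, we know $Y_i = Y_i(1)$.
  2. Estimate $Y_i(0)$ as the average outcome of patients in that leaf who received treatment 0.
  3. Can also use linear regression.
- Use the B. + Dunn OCT or ORT algorithms.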
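To make the objective concrete, the sketch below evaluates it for a fixed partition into leaves, using the leaf-average estimate of the counterfactuals described on this slide. It is an illustration under stated assumptions, not the authors' implementation: the partition and policy are given rather than optimized, the data are made up, the function name `opt_objective` is hypothetical, and every leaf is assumed to contain at least one patient for each treatment that is looked up.

```python
import numpy as np


def opt_objective(leaf, z, y, tau_leaf, mu=0.5):
    """Return mu * (outcome term) + (1 - mu) * (squared prediction error).

    leaf[i]     : leaf index of patient i (partition taken as given)
    z[i], y[i]  : treatment received and outcome observed for patient i
    tau_leaf[l] : treatment prescribed to every patient in leaf l
    """
    leaf, z, y = np.asarray(leaf), np.asarray(z), np.asarray(y, dtype=float)
    outcome_term = 0.0
    prediction_error = 0.0
    for i in range(len(y)):
        in_leaf = leaf == leaf[i]

        def y_hat(t):
            # Leaf-average estimate: mean outcome of leaf-mates who received t.
            return y[in_leaf & (z == t)].mean()

        prescribed = tau_leaf[int(leaf[i])]
        # Use the observed outcome when the prescription matches history,
        # and the estimated counterfactual otherwise.
        outcome_term += y[i] if prescribed == z[i] else y_hat(prescribed)
        prediction_error += (y[i] - y_hat(z[i])) ** 2
    return mu * outcome_term + (1 - mu) * prediction_error


# Made-up data: two leaves, two treatments (0 and 1), lower outcome is better.
leaf = [0, 0, 0, 0, 1, 1, 1, 1]
z = [0, 0, 1, 1, 0, 0, 1, 1]
y = [8.0, 9.0, 6.0, 7.0, 6.0, 6.0, 9.0, 7.0]

print(opt_objective(leaf, z, y, {0: 1, 1: 0}))  # prints 26.5
print(opt_objective(leaf, z, y, {0: 0, 1: 1}))  # prints 34.5
```

The first policy prescribes in each leaf the treatment with the lower leaf-average outcome, so its objective value is smaller. The prediction-error term is the same for both policies because the partition is held fixed; it changes only when the leaves themselves change.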
21 / 24
Personalized Diabetes Management
- Data from the Boston Medical Center, from 1999 to 2014.
- 100,000 patient visits for type 2 diabetes.
- 13 possible treatment options (regimens).
- Patient features include demographic information (sex, race, gender, etc.), treatment history, and diabetes progression.
- Outcome of interest: HbA1c level; lower is better.
- Varied the number of training samples from 1,000 to 50,000 to examine the effect on out-of-sample performance. Averaged this process over ten different splits of the data.
22 / 24
OPT has a Performance and Interpretability Edge
[Figure: three panels plotted against training size (x-axis, 10^3 to 10^4.5): mean HbA1c change (y-axis, −0.6 to 0.0), conditional HbA1c change (y-axis, −0.6 to 0.0), and proportion differing from SOC (y-axis, 0% to 100%). Legend: Baseline, Oracle, RC-kNN, RC-LASSO, RC-RF, OPT.]
23 / 24
Conclusions
- OCT and OCT-H provide interpretable, state-of-the-art predictions.
- OPT provides state-of-the-art prescriptions directly from data.
- Exciting applications in medicine and many other fields: computer security, financial services, and drug discovery, among many others.
- Rethink how we teach optimization.
- New class: Machine Learning and Personalized Medicine
24 / 24