A stability upgrade for data-driven models
[Figure: graph with nodes Y, X1, X2, X3, X4.]
Jonas Peters, University of Copenhagen
AI Seminar, 14th June 2019
Jonas Peters (Univ. of Copenhagen) A stability upgrade for data-driven models 14 June 2019
Stefan Bauer, Peter Bühlmann, Rune Christiansen, Martin Jakobsen, Nicolai Meinshausen, Niklas Pfister, Dominik Rothenhäusler, Mads Thøisen
Given yeast data...
this is the predictor with the strongest correlation:
data from: Kemmeren et al. 2014
After an intervention on that predictor: (it’s a non-causal predictor!)
data from: Kemmeren et al. 2014
The Problem of Causal Discovery:
observed data:
Y      X2     X3     X4     X5
3.4    −0.3   5.8    −2.1   2.2
1.7    −0.2   7.0    −1.2   0.4
−2.4   −0.1   4.3    −0.7   3.5
2.3    −0.3   5.5    −1.1   −4.4
3.5    −0.2   3.9    −0.9   −3.9
...    ...    ...    ...    ...
?−→ causal model
[Figure: graph over the target Y and the predictors X2, X3, X4, X5.]
Here: Find the direct causes of Y!
(Different from: variable adjustment, propensity score matching, doubly robust methods, IVs, ...)
Fundamental assumption: X1,X4 → Y is invariant under interventions.
[Figure: two copies of the causal graph over Y, X2, X3, X4, X5.]
cf. modularity, autonomy, Haavelmo 1944, Aldrich 1989, Pearl 2009, . . .
Key idea: Use data from different environments and search for invariant models.
set          ∅    {2}   {3}   {4}   ···   {2, 3}   {2, 4}   ···   {2, 3, 4}
invariance   ✗    ✗     ✗     ✗     ···   ✓        ✗        ···   ✓
$\hat{S} := \bigcap_{S \text{ invariant}} S$
$\hat{S} = \{1, 4\}$
JP, Bühlmann, Meinshausen, JRSS-B 2016 (with discussion): $P(\hat{S} \subseteq S^*) \geq 1 - \alpha$.
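The search over sets and the final intersection can be sketched as follows. This is only an illustration: the function name, the toy oracle and the choice of true parents {1, 4} are assumptions made here; in practice the oracle would be replaced by a statistical test of invariance at level α.

```python
from itertools import combinations

def invariant_causal_prediction(predictors, is_invariant):
    """Intersect all predictor subsets that are accepted as invariant."""
    accepted = [
        set(subset)
        for size in range(len(predictors) + 1)
        for subset in combinations(predictors, size)
        if is_invariant(set(subset))
    ]
    if not accepted:
        # No set is accepted: nothing is claimed to be causal.
        return set()
    return set.intersection(*accepted)

# Hypothetical oracle: a set is invariant iff it contains the true parents.
TRUE_PARENTS = {1, 4}

def toy_oracle(subset):
    return TRUE_PARENTS <= subset

estimate = invariant_causal_prediction([1, 2, 3, 4], toy_oracle)
print(estimate)
```

Because every accepted set must contain the true parents whenever the test does not falsely reject them, the intersection can only err by being too small, which is the intuition behind the coverage statement.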
[Figure: scatter plot of Y against X4, showing data from environment A = 1 and data from environment A = −1; {X4} is a non-invariant set.]
A stands for anchor.
$E[A(Y - X^4 b)] \neq 0$
Thus: $X^4$ is a non-invariant model.
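A sample version of this moment check can be sketched on simulated data. Everything in the simulation is an assumption for illustration: X1 is a cause of Y, X4 is an effect of Y that is also shifted by the anchor, and regressions are fitted without intercept on roughly centered variables.

```python
import numpy as np

def anchor_moment(x, y, a):
    """Return the least-squares slope b and the sample mean of a * (y - x * b)."""
    b = float(x @ y / (x @ x))          # single predictor, no intercept
    residual = y - x * b
    return b, float(np.mean(a * residual))

rng = np.random.default_rng(0)
n = 5000
A = rng.choice([-1.0, 1.0], size=n)          # two environments, coded as +1 / -1
X1 = rng.normal(size=n)                      # hypothetical cause of Y
Y = X1 + 0.5 * rng.normal(size=n)
X4 = Y + A + 0.5 * rng.normal(size=n)        # hypothetical effect of Y, shifted by A

_, moment_x1 = anchor_moment(X1, Y, A)
_, moment_x4 = anchor_moment(X4, Y, A)
print(f"X1: {moment_x1:.3f}   X4: {moment_x4:.3f}")
```

In this toy setup the moment for X1 is close to zero, while the moment for X4 is clearly different from zero, so the model based on X4 alone would be flagged as non-invariant.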
Predictors that are inferred to be causal by ICP...
[Figure: scatter plot of YMR103C against YMR104C.]
...and the corresponding interventions on the predictor
[Figure: four scatter plots of the interventional data: YMR103C against YMR104C, YMR321C against YPL273W, YCL042W against YCL040W, and YLL020C against YLL019C.]
Kemmeren et al. 2014
Discrete environments (gene data): JP, Meinshausen, Bühlmann: Causal inference using invariant prediction: identification and confidence intervals, JRSSB 2016
No environments (finance data): = time; Pfister, Bühlmann, JP: Invariant causal prediction for sequential data, JASA 2018
Nonlinear relations (fertility data): Heinze-Deml, JP, Meinshausen: Invariant Causal Prediction for Nonlinear Models, Journal of Causal Inference 2018
Discrete hidden variables (Earth system data): Christiansen, JP: Invariance-based Causal Discovery in the Presence of Discrete Hidden Variables, arXiv:1808.05541
[Figure: three panels. Map (longitude against latitude) of the IGBP land cover classification (ENF, CRO); map of the classification based on reconstruction of H (H = 1, H = 2); scatter plot of Y against X2, titled "Fluorescence yield", with groups H = 1 and H = 2.]
Survival data (registry data): Laksafoss, JP: Causal Methods for Survival Analysis, work in progress
[Figure: invariance and prediction]
Find a trade-off between
• invariance with respect to
AND • predictive power
Idea 1: Anchor regression
$Y \in \mathbb{R}^1$: target
$X \in \mathbb{R}^{1 \times d}$: predictors
$A \in \mathbb{R}^{1 \times q}$: anchors, $E[A^t A] = \mathrm{Id}$
$b^\gamma := \operatorname{argmin}_b \; \underbrace{E(Y - Xb)^2}_{\text{prediction}} + \gamma \, \underbrace{\|E[A^t (Y - Xb)]\|_2^2}_{\text{invariance}}$
$\gamma \to 0$: OLS
$\gamma \to \infty$: IV solution (if identifiable)
$\gamma \to \infty$: best invariant predictor (if not identifiable)
Anchor regression minimizes worst case prediction error under shift interventions. Rothenhäusler, Bühlmann, Meinshausen, JP (arXiv:1801.06229)
The finite sample estimator is known as a k-class estimator for the IV solution. Theil (1958), Nagar (1959) – joint work with Martin Emil Jakobsen
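A finite-sample version of the objective above can be sketched in a few lines. It is an illustration under stated assumptions, not the implementation from the paper: variables are taken to be centered, the anchors are treated as whitened in-sample (so the penalty becomes a projection onto the column space of A), and the function name and simulated data are hypothetical. The parameterization of γ follows the slide and may differ from the one used in the paper.

```python
import numpy as np

def anchor_regression(X, Y, A, gamma):
    """Sketch: minimize mean((Y - X b)^2) + gamma * ||A^T (Y - X b) / n||^2.

    With A^T A / n = Id the penalty equals (gamma / n) * ||P_A (Y - X b)||^2,
    where P_A is the projection onto the column space of A.
    """
    X, Y, A = (np.asarray(M, dtype=float) for M in (X, Y, A))

    def project(V):
        # P_A V, computed by least squares so that no n-by-n matrix is formed.
        coef, *_ = np.linalg.lstsq(A, V, rcond=None)
        return A @ coef

    # W = Id + c * P_A satisfies W^2 = Id + gamma * P_A, so the penalized
    # problem is ordinary least squares of W Y on W X.
    c = np.sqrt(1.0 + gamma) - 1.0
    b, *_ = np.linalg.lstsq(X + c * project(X), Y + c * project(Y), rcond=None)
    return b

# Toy data (assumed): the anchor shifts X, a hidden variable H confounds X and Y.
rng = np.random.default_rng(0)
n = 5000
A = rng.choice([-1.0, 1.0], size=(n, 1))
H = rng.normal(size=n)
X = (A[:, 0] + H + rng.normal(size=n)).reshape(-1, 1)
Y = 1.0 * X[:, 0] + 2.0 * H + rng.normal(size=n)

for gamma in (0.0, 1.0, 10.0, 1e6):
    print(gamma, anchor_regression(X, Y, A, gamma))
```

At γ = 0 the function returns the OLS coefficient; for very large γ it approaches the instrumental-variable estimate in this identifiable toy setting, matching the two limits listed above.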
Idea 2:
N. Pfister, S. Bauer, JP:
Learning stable structures in kinetic systems: benefits of a causal approach
arXiv:1810.11776
Example: Maillard reaction
Glu, Mel, C5, ForAc, Triose, Cn, AcAc, Amad, lysR, Fru, AMP
$\frac{d}{dt}[\mathrm{Glu}]_t = -\theta_1 [\mathrm{Glu}]_t + \theta_2 [\mathrm{Fru}]_t - \theta_3 [\mathrm{lysR}]_t [\mathrm{Glu}]_t$
$\frac{d}{dt}[\mathrm{Mel}]_t = \theta_4 [\mathrm{AMP}]_t$
. . .
If the network structure is unknown, can we recover it from data?
WANTED: system of ODEs: $\dot{X}_t = g_\theta(X_t, t)$, $X_0 = x_0$.
adding noise: $\tilde{X}_t := X_t + \varepsilon_t$.
GIVEN: discrete time measurements (w/ repetitions): $\tilde{X}_{t_1}, \dots, \tilde{X}_{t_m}$.
Brands et al. (2002): "Kinetic modeling of reactions in heated monosaccharide-casein systems", J. of Agricultural and Food Chemistry.
King et al, Nature 2004, Bongard et al, PNAS 2007, Calderhead et al, NIPS 2009, Schmidt et al, Science 2009, Oates, Mukherjee, AoAS 2012, Babtie et al, PNAS 2014, Hill et al, Nature Methods 2016, Chen et al, JASA 2016,
Rudy et al, Science Advances 2017, Brunton et al, Nature Comm 2017, Mikkelsen, Hansen, arXiv 1710.09308, many more...
nonlinear least squares:
$\operatorname{argmin}_\theta \sum_{t \in \{t_1, \dots, t_m\}} \|\tilde{X}_t - X_t\|_2^2$, where $\dot{X}_t = g_\theta(X_t, t)$, $X_0 = x_0$
[Figure: causality and prediction]
$\dot{Y}_t = g_\theta(X_t, t)$, $X_0 = x_0$
causal models are invariant (Haavelmo 1944, Aldrich 1989, Pearl 2009, ...)
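The nonlinear least squares baseline can be sketched on a one-dimensional toy system. All specifics are assumptions for illustration: the decay model, the true rate 0.3, the noise level, and the grid search, which stands in for a proper numerical optimizer.

```python
import math
import random

def solve_ode(g, theta, x0, times, steps_per_unit=50):
    """Integrate dx/dt = g(x, t, theta) with classical RK4; return x at the given times."""
    values, x, t = [], x0, 0.0
    for t_next in times:
        n_steps = max(1, int(math.ceil((t_next - t) * steps_per_unit)))
        h = (t_next - t) / n_steps
        for _ in range(n_steps):
            k1 = g(x, t, theta)
            k2 = g(x + 0.5 * h * k1, t + 0.5 * h, theta)
            k3 = g(x + 0.5 * h * k2, t + 0.5 * h, theta)
            k4 = g(x + h * k3, t + h, theta)
            x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
            t += h
        values.append(x)
    return values

def decay(x, t, theta):
    # Hypothetical first-order kinetics: dx/dt = -theta * x.
    return -theta * x

def nls_fit(g, x0, times, observations, grid):
    """Pick the grid value of theta with the smallest sum of squared errors."""
    def loss(theta):
        predicted = solve_ode(g, theta, x0, times)
        return sum((obs - pred) ** 2 for obs, pred in zip(observations, predicted))
    return min(grid, key=loss)

rng = random.Random(0)
times = [0.5 * k for k in range(1, 21)]
clean = solve_ode(decay, 0.3, 5.0, times)
noisy = [x + rng.gauss(0.0, 0.05) for x in clean]    # noisy measurements
grid = [0.01 * k for k in range(1, 101)]
theta_hat = nls_fit(decay, 5.0, times, noisy, grid)
print(theta_hat)
```

This fits the whole trajectory for prediction only; it does not check whether the fitted equation stays the same across experimental conditions, which is the point the slide contrasts it with.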
N. Pfister, S. Bauer, JP: Identifying Causal Structure in Large-Scale Kinetic Systems, arXiv:1810.11776
[Figure: (a) trajectories of Y_t against time t; (b) dY_t/dt against X^8_t.]
How invariant is $\dot{Y}_t = \theta_1 X^8_t$?
1. For each repetition $i$, smooth target trajectory: $\hat{y}_a^{(i)}$
2. Obtain fitted values for target derivatives: $\hat{\xi}^{(i)}_{t_1}, \dots, \hat{\xi}^{(i)}_{t_m}$
3. Smooth target trajectory, with constraints on derivatives: $\hat{y}_b^{(i)}$
4. Score for model ranking: $\sum_{i=1}^n [RSS_b^{(i)} - RSS_a^{(i)}] / [RSS_a^{(i)}]$, where $RSS_*^{(i)} := \sum_\ell (\hat{y}^{(i)}_{*, t_\ell} - Y^{(i)}_{t_\ell})^2$.
5. Turn the score for models into a score for variables/complexes.
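The score in step 4 can be computed as below once the two smoothers are available. This sketch assumes that the first RSS in each bracket belongs to the derivative-constrained fit of step 3 and the second to the unconstrained fit of step 1; the function names and the numbers are made up for illustration, and the smoothing itself is not implemented here.

```python
def rss(fitted, observed):
    """Residual sum of squares between a fitted trajectory and the measurements."""
    return sum((f - o) ** 2 for f, o in zip(fitted, observed))

def stability_score(observed, fits_unconstrained, fits_constrained):
    """Sum over repetitions of the relative RSS increase caused by the constraints."""
    score = 0.0
    for y, fit_a, fit_b in zip(observed, fits_unconstrained, fits_constrained):
        rss_a = rss(fit_a, y)   # step 1: plain smoother
        rss_b = rss(fit_b, y)   # step 3: smoother constrained by the model derivatives
        score += (rss_b - rss_a) / rss_a
    return score

# One hypothetical repetition with three time points.
observed = [[1.0, 2.0, 3.0]]
fits_unconstrained = [[1.1, 1.9, 3.0]]
fits_constrained = [[1.2, 1.8, 3.1]]
score = stability_score(observed, fits_unconstrained, fits_constrained)
print(score)
```

A small score means that forcing the trajectory to obey the candidate model's derivatives costs little fit in every repetition, so the candidate is ranked as more stable.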
[Figure: (a) ROC curves (true positive rate against false positive rate) for the methods CausalKinetiX, DM, GM and random; (b) frequency of the number of false discoveries (0, 1, 2, 3, 4, 5+) before complete model recovery, for each method.]
N. Pfister, S. Bauer, JP: Identifying Causal Structure in Large-Scale Kinetic Systems, arXiv:1810.11776
a) In-sample plot
b) Out-of-sample plot
N. Pfister, S. Bauer, JP: Identifying Causal Structure in Large-Scale Kinetic Systems, arXiv:1810.11776
Idea 3: Exploiting Invariance in Reinforcement Learning (joint work with Mads Thøisen)
1. Q-learning: 1st epoch. Sutton and Barto (2015)
2. Q-learning: 20,000th epoch Sutton and Barto (2015)
3. Learning deadlocks from smaller levels.
4. Q-learning: 20,000th epoch – exploiting deadlocks
Conclusions
Causal models ≠ predictive models.
Causal models are invariant → causal discovery.
Combining invariance and predictability → distributional robustness (anchor regression, metabolic networks, reinforcement learning).
[Figure: invariance and prediction]
Tusind tak! (Many thanks!)
Book: JP, D. Janzing, B. Schölkopf: Elements of Causal Inference: Foundations and Learning Algorithms, MIT Press 2017
JP, P. Bühlmann, N. Meinshausen: Causal inference using invariant prediction: identification and confidence intervals (with discussion), JRSSB 2016
R. Christiansen and JP: Switching Regression Models and Causal Inference in the Presence of Latent Variables, arXiv:1808.05541
N. Pfister, P. Bühlmann, JP: Invariant causal prediction for sequential data, JASA 2018
D. Rothenhäusler, P. Bühlmann, N. Meinshausen, JP: Anchor regression: heterogeneous data meets causality, arXiv:1801.06229
N. Pfister, S. Bauer, JP: Identifying Causal Structure in Large-Scale Kinetic Systems, arXiv:1810.11776