bayesx and inla - opponents or partners?...thomas kneib conditionally gaussian hierarchical models...
TRANSCRIPT
![Page 1: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/1.jpg)
BayesX and INLA - Opponents or Partners?
Thomas Kneib
Institut fur MathematikCarl von Ossietzky Universitat Oldenburg
Monia Mahling
Institut fur StatistikLudwig-Maximilians-Universitat Munchen
Trondheim, 15.5.2009
![Page 2: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/2.jpg)
Thomas Kneib Outline
Outline
• Conditionally Gaussian hierarchical models.
• MCMC inference in conditionally Gaussian models.
• BayesX.
• Credit Scoring Data.
• Summary and Discussion.
BayesX and INLA - Opponents or Partners? 1
![Page 3: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/3.jpg)
Thomas Kneib Conditionally Gaussian Hierarchical Models
Conditionally Gaussian Hierarchical Models
• Hierarchical models with conditionally Gaussian priors for regression coefficients definea large class of flexible regression models.
• We will consider regression models with predictors of the form
ηi = x′iβ + f1(zi1) + . . . + fr(zir),
where x and β are potentially high-dimensional vectors of covariates and parameters,while the generic functions f1, . . . , fr represent different types of nonlinear regressioneffects.
BayesX and INLA - Opponents or Partners? 2
![Page 4: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/4.jpg)
Thomas Kneib Conditionally Gaussian Hierarchical Models
• Examples:
– Nonlinear, smooth effects of continuous covariates x where fj(zj) = f(x).
– Interaction surfaces of two continuous covariates or coordinates x1, x2 wherefj(zj) = f(x1, x2).
– Spatial effects based on discrete spatial, i.e. regional information s ∈ {1, . . . , S}where fj(zj) = fspat(s).
– Varying coefficient models where fj(zj) = x1f(x2).
– Random effects where fj(zj) = xbc with a cluster index c.
BayesX and INLA - Opponents or Partners? 3
![Page 5: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/5.jpg)
Thomas Kneib Conditionally Gaussian Hierarchical Models
• Model the generic functions with basis function approaches:
fj(zj) =K∑
k=1
γjkBjk(zj).
• Yields a vector-matrix representation of the predictor:
η = Xβ + Z1γ1 + . . . + Zrγr
• Conditionally Gaussian priors:
β|ϑ0 ∼ N(b, B) and γj|ϑj ∼ N(gj, Gj)
where b = b(ϑ0), B = B(ϑ0), gj = gj(ϑj), Gj = Gj(ϑj).
BayesX and INLA - Opponents or Partners? 4
![Page 6: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/6.jpg)
Thomas Kneib Conditionally Gaussian Hierarchical Models
• Most prominent examples of conditionally Gaussian priors in the context of estimatingsmooth effects are of the (intrinsic) Gaussian Markov random field type where
p(γj|δ2j ) ∝
(1δ2j
)rank(Kj)
2
exp
(− 1
2δ2j
γ′jKjγj
),
i.e. gj = 0 and G−1j = δ2
jKj.
BayesX and INLA - Opponents or Partners? 5
![Page 7: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/7.jpg)
Thomas Kneib Conditionally Gaussian Hierarchical Models
• Example 1: Bayesian P-Splines
f(x) =K∑
k=1
γkBk(x).
where Bk(x) are B-spline basis functions of degree l and γ follows a random walkprior such as
γk = γk−1 + uk, uk|δ2 ∼ N(0, δ2)
orγk = 2γk−1 − γk−2 + uk, uk|δ2 ∼ N(0, δ2).
BayesX and INLA - Opponents or Partners? 6
![Page 8: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/8.jpg)
Thomas Kneib Conditionally Gaussian Hierarchical Models
BayesX and INLA - Opponents or Partners? 7
![Page 9: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/9.jpg)
Thomas Kneib Conditionally Gaussian Hierarchical Models
δ2
j−1 j
E(γ j|γ j−1) = γ j−1
δ2
j−1 j
E(γ j|γ j−1) = γ j−1
• Usually, an inverse gamma prior is assigned to the smoothing variance:
δ2 ∼ IG(a, b).
• Bayesian P-splines include simple random walks as special cases (degree zero, knotsat each distinct observed covariate value).
BayesX and INLA - Opponents or Partners? 8
![Page 10: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/10.jpg)
Thomas Kneib Conditionally Gaussian Hierarchical Models
• Bayesian P-splines can be made more adaptive by replacing the homoscedasticrandom walk with a heteroscedastic version:
γk = γk−1 + uk, uk|δ2k ∼ N(0, δ2
k).
• Joint distribution of the regression coefficients becomes
p(γ|δ) ∝ exp(−1
2γ′D∆Dγ
)
where ∆ = diag(δ22, . . . , δ
2k).
• Different types of hyperpriors for ∆:
– I.i.d. hyperpriors, e.g. δ2k i.i.d. IG(a, b, ).
– Functional hyperpriors, e.g. δ2k = g(k) with a smooth function g(k) modeled again
as a P-spline.
• Conditional on ∆ the prior for γ remains of the same type and an MCMC updateswould not require changes.
BayesX and INLA - Opponents or Partners? 9
![Page 11: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/11.jpg)
Thomas Kneib Conditionally Gaussian Hierarchical Models
• Example 2: Markov random fields for regional spatial effects:
γs|γr, r ∈ N(s) ∼ N
1|N(s)|
∑
r∈N(s)
γr,δ2
|N(s)|
.
• Based on the notion of spatial adjacency:
• Again, a hyperprior can be assigned to the smoothing variance but the jointdistribution of the spatial effects remains conditionally Gaussian.
BayesX and INLA - Opponents or Partners? 10
![Page 12: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/12.jpg)
Thomas Kneib Conditionally Gaussian Hierarchical Models
• For regularised estimation of high-dimensional regression effects β we are consideringconditionally independent priors, i.e.
β|ϑ0 ∼ N(b, B)
with b = 0 and B = diag(τ21 , . . . , τ2
q ).
• While allowing for different variances, hyperpriors for τ2j will typically be identical.
BayesX and INLA - Opponents or Partners? 11
![Page 13: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/13.jpg)
Thomas Kneib Conditionally Gaussian Hierarchical Models
• Example 1: Bayesian ridge regression
βj|τ2j ∼ N(0, τ2
j ), τ2j ∼ IG(a, b).
• Note that the log-prior log p(βj|τ2j ) equals the ridge penalty β2
j up to an additiveconstant.
• Induces a marginal t-distribution with 2a degrees of freedom and scale parameter√a/b.
BayesX and INLA - Opponents or Partners? 12
![Page 14: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/14.jpg)
Thomas Kneib Conditionally Gaussian Hierarchical Models
• Informative priors provide the Bayesian analogon to frequentist regularisation.
• Example: Multiple linear model
y = Xβ + ε, ε ∼ N(0, σ2I).
• For high-dimensional covariate vectors, least squares estimation becomes increasinglyunstable.
⇒ Add a penalty term to the least squares criterion, for example a ridge penalty
LSpen(β) = (y −Xβ)′(y −Xβ) + λ
p∑
j=1
β2j → min
β.
• Closed form solution: Penalised least squares estimate
β = (X ′X + λI)−1X ′y.
BayesX and INLA - Opponents or Partners? 13
![Page 15: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/15.jpg)
Thomas Kneib Conditionally Gaussian Hierarchical Models
• Bayesian version of the linear model:
y = Xβ + ε, β ∼ N(0, τ2I).
• Yields the posterior
p(β|y) ∝ exp(− 1
2σ2(y −Xβ)′(y −Xβ)
)exp
(− 1
2τ2β′β
)
• Maximising the posterior is equivalent to minimising the penalised least squarescriterion
(y −Xβ)′(y −Xβ) + λβ′β
where the smoothing parameter is given by the signal-to-noise ratio
λ =σ2
τ2.
BayesX and INLA - Opponents or Partners? 14
![Page 16: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/16.jpg)
Thomas Kneib Conditionally Gaussian Hierarchical Models
• The posterior mode coincides with the penalised least squares estimate (for givensmoothing parameter).
• More generally:
– Penalised likelihoodlpen(β) = l(β)− pen(β).
– Posterior:p(β|y) = p(y|β)p(β).
• In terms of the prior distribution
Penalty ≡ log-prior.
BayesX and INLA - Opponents or Partners? 15
![Page 17: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/17.jpg)
Thomas Kneib Conditionally Gaussian Hierarchical Models
• Example 2: Bayesian lasso prior:
βj|τ2j , λ ∼ N(0, τ2
j ), τ2j ∼ Exp
(λ2
2
).
• Marginally, βj follows a Laplace prior
p(βj) ∝ exp(−λ|βj|).
• Hierarchical (scale mixture of normals) representation:
&%'$
&%'$
&%'$
&%'$
&%'$
- - -λ β λ τ2 βvs.
Lap(λ) Exp(0.5λ2) N(0, τ2)
• A further hyperprior can be assigned to the smoothing parameter such as a gammadistribution λ2 ∼ Ga(a, b).
BayesX and INLA - Opponents or Partners? 16
![Page 18: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/18.jpg)
Thomas Kneib Conditionally Gaussian Hierarchical Models
• Marginal Bayesian ridge and marginal Bayesian lasso:
−4 −2 0 2 4
−5
−4
−3
−2
−1
01
log−
prio
r
−4 −2 0 2 4
−5
−4
−3
−2
−1
01
log−
prio
r
BayesX and INLA - Opponents or Partners? 17
![Page 19: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/19.jpg)
Thomas Kneib Conditionally Gaussian Hierarchical Models
• Example 3: General Lp priors
p(βj|λ) ∝ exp(−λ|βj|p)
with 0 < p < 2 (power exponential prior).
• Note that
exp(−|βj|p) ∝∫ ∞
0
exp
(− β2
j
2τ2j
)1τ6j
sp/2
(1
2τ2j
)dτ2
j
where sp(·) is the density of the positive stable distribution with index p.
BayesX and INLA - Opponents or Partners? 18
![Page 20: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/20.jpg)
Thomas Kneib MCMC Inference in Conditionally Gaussian models
MCMC Inference in Conditionally Gaussian models
• The general structure of conditionally Gaussian models enables the construction ofgeneral MCMC samplers.
• The conditionally Gaussian prior makes inference tractable in situations which aredifficult with direct estimation (such as the lasso).
• Suitable hyperpriors enable inference and uncertainty assessment for all modelparameters.
• MCMC fully exploits the hierarchical nature of the models through the considerationof full conditionals.
BayesX and INLA - Opponents or Partners? 19
![Page 21: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/21.jpg)
Thomas Kneib MCMC Inference in Conditionally Gaussian models
• For (latent) Gaussian responses, we obtain Gibbs sampling steps for the regressioncoefficients.
• For example, β|· ∼ N(µβ,Σβ) with
µβ = Σβ1σ2
X ′(y − η−β) + B−1b, Σβ =(
1σ2
X ′X + B−1
)−1
,
• For non-Gaussian responses, construct adaptive proposal densities based on iterativelyweighted least squares approximations to the full conditionals.
• For example, β is proposed from a multivariate Gaussian distribution with expectationand covariance matrix
µβ = ΣβX ′W (y − η−β) + B−1b, Σβ =(X ′WX + B−1
)−1.
where W and y are the usual generalised linear model weights and working responses.
BayesX and INLA - Opponents or Partners? 20
![Page 22: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/22.jpg)
Thomas Kneib MCMC Inference in Conditionally Gaussian models
• Full conditionals for hyperparameters are independent of the observation model.
• Bayesian ridge:
τ2j | · ∼ IG
(a +
q
2, b +
12β2
j
)
• Bayesian lasso:
1τ2j
∣∣∣∣∣ · ∼ InvGauss
( |λ||βj|, λ
2
), λ2| · ∼ Ga
a + q, b +
12
q∑
j=1
τ2j
.
• Smoothing variances:
δ2j | · ∼ IG
(aj +
rank(Kj)2
, bj +12γjKjγj
).
BayesX and INLA - Opponents or Partners? 21
![Page 23: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/23.jpg)
Thomas Kneib BayesX
BayesX
• Markov chain Monte Carlo approaches for conditionally Gaussian regression modelsare implemented in BayesX.
BayesX and INLA - Opponents or Partners? 22
![Page 24: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/24.jpg)
Thomas Kneib BayesX
• Available from
http://www.stat.uni-muenchen.de/~bayesx
• Numerical efficient implementation employing sparse matrix operations.
• Also contains mixed model based inference for the same class of models (comparableto INLAs Gaussian approximation).
BayesX and INLA - Opponents or Partners? 23
![Page 25: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/25.jpg)
Thomas Kneib Credit Scoring Data
Credit Scoring Data
• Data on the defaults of 1,000 consumer credits from a German bank.
• Response variable is a binary indicator yi that specifies whether the credit has beenpaid back (yi = 1, credit-worthy) or not (yi = 0, not credit-worthy).
• Covariates include age of the client, credit amount and duration of the credit.
• Consider binary logit models with nonparametric effects of these three covariates.
• Compare different approximations available in INLA with MCMC-based estimation inBayesX.
BayesX and INLA - Opponents or Partners? 24
![Page 26: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/26.jpg)
Thomas Kneib Credit Scoring Data
• Effects of amount obtained with the complete data:
INLA Gaussian Approximation
−50
00
500
0 5000 10000 15000
INLA Simplified Laplace
−10
00−
500
050
0
0 5000 10000 15000
BayesX RW
−2
−1
01
0 5000 10000 15000
BayesX P−Splines
−2
−1
01
0 5000 10000 15000
BayesX and INLA - Opponents or Partners? 25
![Page 27: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/27.jpg)
Thomas Kneib Credit Scoring Data
• Effects of age obtained with one outlier excluded:
INLA Gaussian Approximation
−1.
5−
1.0
−0.
50.
00.
51.
01.
52.
0
20 30 40 50 60 70
INLA Simplified Laplace
−1.
5−
1.0
−0.
50.
00.
51.
01.
52.
0
20 30 40 50 60 70
BayesX RW
−1.
5−
1.0
−0.
50.
00.
51.
01.
52.
0
20 30 40 50 60 70
BayesX P−Splines
−1.
5−
1.0
−0.
50.
00.
51.
01.
52.
0
20 30 40 50 60 70
BayesX and INLA - Opponents or Partners? 26
![Page 28: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/28.jpg)
Thomas Kneib Credit Scoring Data
• Effects of duration obtained with one outlier excluded:
INLA Gaussian Approximation
−10
−5
05
10 20 30 40 50 60 70
INLA Simplified Laplace
−10
−5
05
10 20 30 40 50 60 70
BayesX RW
−10
−5
05
10 20 30 40 50 60 70
BayesX P−Splines
−10
−5
05
10 20 30 40 50 60 70
BayesX and INLA - Opponents or Partners? 27
![Page 29: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/29.jpg)
Thomas Kneib Credit Scoring Data
• Effects of amount obtained with one outlier excluded:
INLA Gaussian Approximation
−40
−20
020
40
0 5000 10000 15000
INLA Simplified Laplace
−80
−60
−40
−20
020
0 5000 10000 15000
BayesX RW
−2
−1
01
0 5000 10000 15000
BayesX P−Splines
−2
−1
01
0 5000 10000 15000
BayesX and INLA - Opponents or Partners? 28
![Page 30: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/30.jpg)
Thomas Kneib Credit Scoring Data
• Effects of amount based on rounded data with one outlier excluded:
INLA Simplified Laplace
−80
−60
−40
−20
020
40
0 5000 10000 15000
INLA Laplace
−40
−30
−20
−10
010
20
0 5000 10000 15000
BayesX RW
−2
−1
01
0 5000 10000 15000
BayesX P−Splines
−2
−1
01
0 5000 10000 15000
BayesX and INLA - Opponents or Partners? 29
![Page 31: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/31.jpg)
Thomas Kneib Credit Scoring Data
• Effects of amount after standardising covariates with one outlier excluded:
INLA Gaussian Approximation, RW1, a=b=0.001
−2.
0−
1.5
−1.
0−
0.5
0.0
0.5
1.0
−1 0 1 2 3 4
INLA Simplified Laplace
−2.
0−
1.5
−1.
0−
0.5
0.0
0.5
1.0
−1 0 1 2 3 4
INLA Laplace
−2.
0−
1.5
−1.
0−
0.5
0.0
0.5
1.0
−1 0 1 2 3 4
BayesX and INLA - Opponents or Partners? 30
![Page 32: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/32.jpg)
Thomas Kneib Credit Scoring Data
• Effects of age after standardising covariates with one outlier excluded:
INLA Gaussian Approximation, RW1, a=b=0.001
−1.
5−
1.0
−0.
50.
00.
51.
01.
52.
0
−1 0 1 2 3
Model1r_s_r
−1.
5−
1.0
−0.
50.
00.
51.
01.
52.
0
−1 0 1 2 3
Model1r_l_r
−1.
5−
1.0
−0.
50.
00.
51.
01.
52.
0
−1 0 1 2 3
BayesX and INLA - Opponents or Partners? 31
![Page 33: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/33.jpg)
Thomas Kneib Credit Scoring Data
• Effects of duration after standardising covariates with one outlier excluded:
Model3_g
−10
−5
05
−1 0 1 2 3 4
Model1r_s_r
−10
−5
05
−1 0 1 2 3 4
Model1r_l_r
−10
−5
05
−1 0 1 2 3 4
BayesX and INLA - Opponents or Partners? 32
![Page 34: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/34.jpg)
Thomas Kneib Credit Scoring Data
• Computing times for some selected models (in seconds, very rough estimates):
– INLA with Gaussian approximation: 200s.
– INLA with simplified Laplace: 240s.
– INLA with Laplace (amount rounded): 2540s.
– BayesX with RW prior and 12,000 iterations: 60s.
– BayesX with RW prior and 103,000 iterations: 510s.
– BayesX with P-spline prior and 12,000 iterations: 90s.
– BayesX with P-spline prior and 103,000 iterations: 790s.
BayesX and INLA - Opponents or Partners? 33
![Page 35: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/35.jpg)
Thomas Kneib Credit Scoring Data
• Effects of age obtained with one outlier excluded: Different random walk orders andhyperparameters for Gaussian Approximation
INLA Gaussian Approximation, RW1, a=b=0.001−
1.5
−1.
0−
0.5
0.0
0.5
1.0
1.5
2.0
20 30 40 50 60 70
INLA Gaussian Approximation, RW1, a=1, b=0.001
−1.
5−
1.0
−0.
50.
00.
51.
01.
52.
0
20 30 40 50 60 70
INLA Gaussian Approximation, RW2, a=b=0.001
−1.
5−
1.0
−0.
50.
00.
51.
01.
52.
0
20 30 40 50 60 70
INLA Gaussian Approximation, RW2, a=1, b=0.001
−1.
5−
1.0
−0.
50.
00.
51.
01.
52.
0
20 30 40 50 60 70
BayesX and INLA - Opponents or Partners? 34
![Page 36: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/36.jpg)
Thomas Kneib Credit Scoring Data
• Effects of amount obtained with one outlier excluded: Different random walk ordersand hyperparameters for Gaussian Approximation
INLA Gaussian Approximation, RW1, a=b=0.001−
2−
10
1
0 5000 10000 15000
INLA Gaussian Approximation, RW1, a=1, b=0.001
−40
−20
020
40
0 5000 10000 15000
INLA Gaussian Approximation, RW2, a=b=0.001
−40
−20
020
40
0 5000 10000 15000
INLA Gaussian Approximation, RW2, a=1, b=0.001
−40
−20
020
40
0 5000 10000 15000
BayesX and INLA - Opponents or Partners? 35
![Page 37: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/37.jpg)
Thomas Kneib Credit Scoring Data
• Effects of age obtained with one outlier excluded: Different random walk orders andhyperparameters for Simplified Laplace
INLA Simplified Laplace, RW1, a=b=0.001−
1.5
−1.
0−
0.5
0.0
0.5
1.0
1.5
2.0
20 30 40 50 60 70
INLA Simplified Laplace, RW1, a=1, b=0.001
−1.
5−
1.0
−0.
50.
00.
51.
01.
52.
0
20 30 40 50 60 70
INLA Simplified Laplace, RW2, a=b=0.001
−1.
5−
1.0
−0.
50.
00.
51.
01.
52.
0
20 30 40 50 60 70
INLA Simplified Laplace, RW2, a=1, b=0.001
−1.
5−
1.0
−0.
50.
00.
51.
01.
52.
0
20 30 40 50 60 70
BayesX and INLA - Opponents or Partners? 36
![Page 38: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/38.jpg)
Thomas Kneib Credit Scoring Data
• Effects of amount obtained with one outlier excluded: Different random walk ordersand hyperparameters for Simplified Laplace
INLA Simplified Laplace, RW1, a=b=0.001−
80−
60−
40−
200
20
0 5000 10000 15000
INLA Simplified Laplace, RW1, a=1, b=0.001
−80
−60
−40
−20
020
0 5000 10000 15000
INLA Simplified Laplace, RW2, a=b=0.001
−80
−60
−40
−20
020
0 5000 10000 15000
INLA Simplified Laplace, RW2, a=1, b=0.001
−80
−60
−40
−20
020
0 5000 10000 15000
BayesX and INLA - Opponents or Partners? 37
![Page 39: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/39.jpg)
Thomas Kneib Credit Scoring Data
• Effects of age obtained with one outlier excluded: Different random walk orders andhyperparameters for BayesX
BayesX, RW1, a=b=0.001−
1.5
−1.
0−
0.5
0.0
0.5
1.0
1.5
2.0
20 30 40 50 60 70
BayesX, RW1, a=1, b=0.001
−1.
5−
1.0
−0.
50.
00.
51.
01.
52.
0
20 30 40 50 60 70
BayesX, RW2, a=b=0.001
−1.
5−
1.0
−0.
50.
00.
51.
01.
52.
0
20 30 40 50 60 70
BayesX, RW2, a=1, b=0.001
−1.
5−
1.0
−0.
50.
00.
51.
01.
52.
0
20 30 40 50 60 70
BayesX and INLA - Opponents or Partners? 38
![Page 40: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/40.jpg)
Thomas Kneib Credit Scoring Data
• Effects of amount obtained with one outlier excluded: Different random walk ordersand hyperparameters for BayesX
BayesX, P−Spline RW1, a=b=0.001−
2−
10
1
0 5000 10000 15000
BayesX, P−Spline RW1, a=1, b=0.001
−2
−1
01
0 5000 10000 15000
BayesX, P−Spline RW2, a=b=0.001
−2
−1
01
0 5000 10000 15000
BayesX, P−Spline RW2, a=1, b=0.001
−2
−1
01
0 5000 10000 15000
BayesX and INLA - Opponents or Partners? 39
![Page 41: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/41.jpg)
Thomas Kneib Summary and Discussion
Summary and Discussion
• Conditionally Gaussian models provide a rich class of regression models.
• BayesX and INLA provide comparable estimates in well-behaved examples but resultsmay differ substantially in difficult situations.
• In particular, covariates with outliers seem to yield highly variable estimates withINLA.
• Differences in computing times not always as expected (full Laplace approximationmay be slow).
• In particular, covariates with a large number of different covariate values yield longcomputing times.
BayesX and INLA - Opponents or Partners? 40
![Page 42: BayesX and INLA - Opponents or Partners?...Thomas Kneib Conditionally Gaussian Hierarchical Models † Bayesian P-splines can be made more adaptive by replacing the homoscedastic random](https://reader036.vdocument.in/reader036/viewer/2022071412/6109ad6593864a1232658a8b/html5/thumbnails/42.jpg)
Thomas Kneib Summary and Discussion
• Suggestions for improving INLA:
– Provide characterisations for “difficult” data sets?
– Implement Bayesian P-splines instead of random walk priors (faster and morestable)?
– Revise default prior choice for hyperparameters?
• Further questions:
– Flexibility in terms of hyperprior choices (further hierarchical levels)?
– Partial impropriety of the conditionally Gaussian priors and model choice quantities.
BayesX and INLA - Opponents or Partners? 41