Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least Squares Support Vector Regression with
Applications to Large-Scale Data a Statistical
Approach
Kris De Brabanter
Public Defense
April 27 2011
Promotor Prof dr ir B De MoorCo-Promotor Prof dr ir J Suykens
139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Goal of the Thesis
Goal of the Thesis
Study the properties of Least Squares Support Vector Machines forregression with an emphasis on statistical aspects and develop aframework for large scale data
439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Overview
Least Squares
Support Vector
Machines
Introduction
(Chapter 1)
Model
Building
(Chapter 2)
Model
Selection
(Chapter 3)Large Scale
Data Sets
(Chapter 4)
Robustness
(Chapter 5)
Correlated
Errors
(Chapter 6)
Confidence
Intervals
(Chapter 7)
Applications
amp
Case Studies
(Chapter 8)
Conclusions
amp
Further
Research
(Chapter 9)
539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
Y =
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
Y =
PARAMETRIC FORM IS NOT ALWAYS EASY TO FIND
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
mn(x) =nsum
i=1
K(xminusXih )
sumnj=1K(
xminusXjh )
Yi
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Other nonparametric regression estimates
Local constant regression (Nadaraya 1964 Watson 1964)
Regression trees (Breiman et al 1984)
Wavelets (Daubechies 1992)
Nearest Neighbors (Devroye et al 1994)
Local linear regression (Fan amp Gijbels 1996)
Support vector machines (Vapnik 1995)
Splines (Wahba 1990 Eubank 1999)
Partitioning estimates (Gyorfi et al 2002)
Least squares support vector machines (Suykens et al 2002)
939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVM solution + model selection
Dn = (Xk Yk) Xk isin Rd Yk isin R k = 1 n
iidsim (XY )
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Dual formulation(
0 1Tn
1n Ω+ Inγ
)
(
b
α
)
=
(
0
Y
)
Ωkl = ϕ(Xk)Tϕ(Xl) = K(XkXl) = (2π)minusd2 exp(minusXkminusXl2
2h2 )
K has to be positive definite ieint
exp(minusjωx)K(x) dx ge 0
Model in dual space mn(x) =sumn
k=1 αkK(xXk) + b
γ and h tuning parameters rArr cross-validation1139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
Model selection criteria are ABSOLUTELY needed
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Estimation in Primal Space
LS-SVM formulation for regression
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
Can we solve the LS-SVM in primal space instead of dual
Approximation of feature map ϕ needed
Is it possible to compute such a mapping
ϕ can be infinite dimensional
Solutionuse a fixed size m of support vectors to approximate ϕ
Solve the above as primal ridge regression
1439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
rArr Solution Matrix Approximations (Nystrom 1930)
ϕi(x)m≪nasymp
radicm
λ(m)i
summk=1K(Xk x)u
(m)ki
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Fixed Size LS-SVM formulation
Given approximation to the feature map
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wT ϕ(Xk) + b+ ek = Yk k = 1 n
Solution(
wb
)
=(
ΦTe Φe +
Im+1
γ
)minus1ΦTe Y
with
Φe =
ϕ1(X1) middot middot middot ϕm(X1) 1
ϕ1(Xn) middot middot middot ϕm(Xn) 1
1639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Goal of the Thesis
Goal of the Thesis
Study the properties of Least Squares Support Vector Machines forregression with an emphasis on statistical aspects and develop aframework for large scale data
439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Overview
Least Squares
Support Vector
Machines
Introduction
(Chapter 1)
Model
Building
(Chapter 2)
Model
Selection
(Chapter 3)Large Scale
Data Sets
(Chapter 4)
Robustness
(Chapter 5)
Correlated
Errors
(Chapter 6)
Confidence
Intervals
(Chapter 7)
Applications
amp
Case Studies
(Chapter 8)
Conclusions
amp
Further
Research
(Chapter 9)
539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
Y =
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
Y =
PARAMETRIC FORM IS NOT ALWAYS EASY TO FIND
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
mn(x) =nsum
i=1
K(xminusXih )
sumnj=1K(
xminusXjh )
Yi
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Other nonparametric regression estimates
Local constant regression (Nadaraya 1964 Watson 1964)
Regression trees (Breiman et al 1984)
Wavelets (Daubechies 1992)
Nearest Neighbors (Devroye et al 1994)
Local linear regression (Fan amp Gijbels 1996)
Support vector machines (Vapnik 1995)
Splines (Wahba 1990 Eubank 1999)
Partitioning estimates (Gyorfi et al 2002)
Least squares support vector machines (Suykens et al 2002)
939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVM solution + model selection
Dn = (Xk Yk) Xk isin Rd Yk isin R k = 1 n
iidsim (XY )
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Dual formulation(
0 1Tn
1n Ω+ Inγ
)
(
b
α
)
=
(
0
Y
)
Ωkl = ϕ(Xk)Tϕ(Xl) = K(XkXl) = (2π)minusd2 exp(minusXkminusXl2
2h2 )
K has to be positive definite ieint
exp(minusjωx)K(x) dx ge 0
Model in dual space mn(x) =sumn
k=1 αkK(xXk) + b
γ and h tuning parameters rArr cross-validation1139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
Model selection criteria are ABSOLUTELY needed
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Estimation in Primal Space
LS-SVM formulation for regression
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
Can we solve the LS-SVM in primal space instead of dual
Approximation of feature map ϕ needed
Is it possible to compute such a mapping
ϕ can be infinite dimensional
Solutionuse a fixed size m of support vectors to approximate ϕ
Solve the above as primal ridge regression
1439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
rArr Solution Matrix Approximations (Nystrom 1930)
ϕi(x)m≪nasymp
radicm
λ(m)i
summk=1K(Xk x)u
(m)ki
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Fixed Size LS-SVM formulation
Given approximation to the feature map
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wT ϕ(Xk) + b+ ek = Yk k = 1 n
Solution(
wb
)
=(
ΦTe Φe +
Im+1
γ
)minus1ΦTe Y
with
Φe =
ϕ1(X1) middot middot middot ϕm(X1) 1
ϕ1(Xn) middot middot middot ϕm(Xn) 1
1639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Goal of the Thesis
Goal of the Thesis
Study the properties of Least Squares Support Vector Machines forregression with an emphasis on statistical aspects and develop aframework for large scale data
439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Overview
Least Squares
Support Vector
Machines
Introduction
(Chapter 1)
Model
Building
(Chapter 2)
Model
Selection
(Chapter 3)Large Scale
Data Sets
(Chapter 4)
Robustness
(Chapter 5)
Correlated
Errors
(Chapter 6)
Confidence
Intervals
(Chapter 7)
Applications
amp
Case Studies
(Chapter 8)
Conclusions
amp
Further
Research
(Chapter 9)
539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
Y =
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
Y =
PARAMETRIC FORM IS NOT ALWAYS EASY TO FIND
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
mn(x) =nsum
i=1
K(xminusXih )
sumnj=1K(
xminusXjh )
Yi
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Other nonparametric regression estimates
Local constant regression (Nadaraya 1964 Watson 1964)
Regression trees (Breiman et al 1984)
Wavelets (Daubechies 1992)
Nearest Neighbors (Devroye et al 1994)
Local linear regression (Fan amp Gijbels 1996)
Support vector machines (Vapnik 1995)
Splines (Wahba 1990 Eubank 1999)
Partitioning estimates (Gyorfi et al 2002)
Least squares support vector machines (Suykens et al 2002)
939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVM solution + model selection
Dn = (Xk Yk) Xk isin Rd Yk isin R k = 1 n
iidsim (XY )
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Dual formulation(
0 1Tn
1n Ω+ Inγ
)
(
b
α
)
=
(
0
Y
)
Ωkl = ϕ(Xk)Tϕ(Xl) = K(XkXl) = (2π)minusd2 exp(minusXkminusXl2
2h2 )
K has to be positive definite ieint
exp(minusjωx)K(x) dx ge 0
Model in dual space mn(x) =sumn
k=1 αkK(xXk) + b
γ and h tuning parameters rArr cross-validation1139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
Model selection criteria are ABSOLUTELY needed
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Estimation in Primal Space
LS-SVM formulation for regression
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
Can we solve the LS-SVM in primal space instead of dual
Approximation of feature map ϕ needed
Is it possible to compute such a mapping
ϕ can be infinite dimensional
Solutionuse a fixed size m of support vectors to approximate ϕ
Solve the above as primal ridge regression
1439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
rArr Solution Matrix Approximations (Nystrom 1930)
ϕi(x)m≪nasymp
radicm
λ(m)i
summk=1K(Xk x)u
(m)ki
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Fixed Size LS-SVM formulation
Given approximation to the feature map
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wT ϕ(Xk) + b+ ek = Yk k = 1 n
Solution(
wb
)
=(
ΦTe Φe +
Im+1
γ
)minus1ΦTe Y
with
Φe =
ϕ1(X1) middot middot middot ϕm(X1) 1
ϕ1(Xn) middot middot middot ϕm(Xn) 1
1639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Goal of the Thesis
Goal of the Thesis
Study the properties of Least Squares Support Vector Machines forregression with an emphasis on statistical aspects and develop aframework for large scale data
439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Overview
Least Squares
Support Vector
Machines
Introduction
(Chapter 1)
Model
Building
(Chapter 2)
Model
Selection
(Chapter 3)Large Scale
Data Sets
(Chapter 4)
Robustness
(Chapter 5)
Correlated
Errors
(Chapter 6)
Confidence
Intervals
(Chapter 7)
Applications
amp
Case Studies
(Chapter 8)
Conclusions
amp
Further
Research
(Chapter 9)
539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
Y =
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
Y =
PARAMETRIC FORM IS NOT ALWAYS EASY TO FIND
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
mn(x) =nsum
i=1
K(xminusXih )
sumnj=1K(
xminusXjh )
Yi
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Other nonparametric regression estimates
Local constant regression (Nadaraya 1964 Watson 1964)
Regression trees (Breiman et al 1984)
Wavelets (Daubechies 1992)
Nearest Neighbors (Devroye et al 1994)
Local linear regression (Fan amp Gijbels 1996)
Support vector machines (Vapnik 1995)
Splines (Wahba 1990 Eubank 1999)
Partitioning estimates (Gyorfi et al 2002)
Least squares support vector machines (Suykens et al 2002)
939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVM solution + model selection
Dn = (Xk Yk) Xk isin Rd Yk isin R k = 1 n
iidsim (XY )
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Dual formulation(
0 1Tn
1n Ω+ Inγ
)
(
b
α
)
=
(
0
Y
)
Ωkl = ϕ(Xk)Tϕ(Xl) = K(XkXl) = (2π)minusd2 exp(minusXkminusXl2
2h2 )
K has to be positive definite ieint
exp(minusjωx)K(x) dx ge 0
Model in dual space mn(x) =sumn
k=1 αkK(xXk) + b
γ and h tuning parameters rArr cross-validation1139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
Model selection criteria are ABSOLUTELY needed
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Estimation in Primal Space
LS-SVM formulation for regression
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
Can we solve the LS-SVM in primal space instead of dual
Approximation of feature map ϕ needed
Is it possible to compute such a mapping
ϕ can be infinite dimensional
Solutionuse a fixed size m of support vectors to approximate ϕ
Solve the above as primal ridge regression
1439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
rArr Solution Matrix Approximations (Nystrom 1930)
ϕi(x)m≪nasymp
radicm
λ(m)i
summk=1K(Xk x)u
(m)ki
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Fixed Size LS-SVM formulation
Given approximation to the feature map
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wT ϕ(Xk) + b+ ek = Yk k = 1 n
Solution(
wb
)
=(
ΦTe Φe +
Im+1
γ
)minus1ΦTe Y
with
Φe =
ϕ1(X1) middot middot middot ϕm(X1) 1
ϕ1(Xn) middot middot middot ϕm(Xn) 1
1639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Overview
Least Squares
Support Vector
Machines
Introduction
(Chapter 1)
Model
Building
(Chapter 2)
Model
Selection
(Chapter 3)Large Scale
Data Sets
(Chapter 4)
Robustness
(Chapter 5)
Correlated
Errors
(Chapter 6)
Confidence
Intervals
(Chapter 7)
Applications
amp
Case Studies
(Chapter 8)
Conclusions
amp
Further
Research
(Chapter 9)
539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
Y =
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
Y =
PARAMETRIC FORM IS NOT ALWAYS EASY TO FIND
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
mn(x) =nsum
i=1
K(xminusXih )
sumnj=1K(
xminusXjh )
Yi
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Other nonparametric regression estimates
Local constant regression (Nadaraya 1964 Watson 1964)
Regression trees (Breiman et al 1984)
Wavelets (Daubechies 1992)
Nearest Neighbors (Devroye et al 1994)
Local linear regression (Fan amp Gijbels 1996)
Support vector machines (Vapnik 1995)
Splines (Wahba 1990 Eubank 1999)
Partitioning estimates (Gyorfi et al 2002)
Least squares support vector machines (Suykens et al 2002)
939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVM solution + model selection
Dn = (Xk Yk) Xk isin Rd Yk isin R k = 1 n
iidsim (XY )
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Dual formulation(
0 1Tn
1n Ω+ Inγ
)
(
b
α
)
=
(
0
Y
)
Ωkl = ϕ(Xk)Tϕ(Xl) = K(XkXl) = (2π)minusd2 exp(minusXkminusXl2
2h2 )
K has to be positive definite ieint
exp(minusjωx)K(x) dx ge 0
Model in dual space mn(x) =sumn
k=1 αkK(xXk) + b
γ and h tuning parameters rArr cross-validation1139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
Model selection criteria are ABSOLUTELY needed
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Estimation in Primal Space
LS-SVM formulation for regression
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
Can we solve the LS-SVM in primal space instead of dual
Approximation of feature map ϕ needed
Is it possible to compute such a mapping
ϕ can be infinite dimensional
Solutionuse a fixed size m of support vectors to approximate ϕ
Solve the above as primal ridge regression
1439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
rArr Solution Matrix Approximations (Nystrom 1930)
ϕi(x)m≪nasymp
radicm
λ(m)i
summk=1K(Xk x)u
(m)ki
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Fixed Size LS-SVM formulation
Given approximation to the feature map
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wT ϕ(Xk) + b+ ek = Yk k = 1 n
Solution(
wb
)
=(
ΦTe Φe +
Im+1
γ
)minus1ΦTe Y
with
Φe =
ϕ1(X1) middot middot middot ϕm(X1) 1
ϕ1(Xn) middot middot middot ϕm(Xn) 1
1639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
Y =
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
Y =
PARAMETRIC FORM IS NOT ALWAYS EASY TO FIND
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
mn(x) =nsum
i=1
K(xminusXih )
sumnj=1K(
xminusXjh )
Yi
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Other nonparametric regression estimates
Local constant regression (Nadaraya 1964 Watson 1964)
Regression trees (Breiman et al 1984)
Wavelets (Daubechies 1992)
Nearest Neighbors (Devroye et al 1994)
Local linear regression (Fan amp Gijbels 1996)
Support vector machines (Vapnik 1995)
Splines (Wahba 1990 Eubank 1999)
Partitioning estimates (Gyorfi et al 2002)
Least squares support vector machines (Suykens et al 2002)
939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVM solution + model selection
Dn = (Xk Yk) Xk isin Rd Yk isin R k = 1 n
iidsim (XY )
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Dual formulation(
0 1Tn
1n Ω+ Inγ
)
(
b
α
)
=
(
0
Y
)
Ωkl = ϕ(Xk)Tϕ(Xl) = K(XkXl) = (2π)minusd2 exp(minusXkminusXl2
2h2 )
K has to be positive definite ieint
exp(minusjωx)K(x) dx ge 0
Model in dual space mn(x) =sumn
k=1 αkK(xXk) + b
γ and h tuning parameters rArr cross-validation1139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
Model selection criteria are ABSOLUTELY needed
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Estimation in Primal Space
LS-SVM formulation for regression
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
Can we solve the LS-SVM in primal space instead of dual
Approximation of feature map ϕ needed
Is it possible to compute such a mapping
ϕ can be infinite dimensional
Solutionuse a fixed size m of support vectors to approximate ϕ
Solve the above as primal ridge regression
1439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
rArr Solution Matrix Approximations (Nystrom 1930)
ϕi(x)m≪nasymp
radicm
λ(m)i
summk=1K(Xk x)u
(m)ki
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Fixed Size LS-SVM formulation
Given approximation to the feature map
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wT ϕ(Xk) + b+ ek = Yk k = 1 n
Solution(
wb
)
=(
ΦTe Φe +
Im+1
γ
)minus1ΦTe Y
with
Φe =
ϕ1(X1) middot middot middot ϕm(X1) 1
ϕ1(Xn) middot middot middot ϕm(Xn) 1
1639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
Y =
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
Y =
PARAMETRIC FORM IS NOT ALWAYS EASY TO FIND
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
mn(x) =nsum
i=1
K(xminusXih )
sumnj=1K(
xminusXjh )
Yi
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Other nonparametric regression estimates
Local constant regression (Nadaraya 1964 Watson 1964)
Regression trees (Breiman et al 1984)
Wavelets (Daubechies 1992)
Nearest Neighbors (Devroye et al 1994)
Local linear regression (Fan amp Gijbels 1996)
Support vector machines (Vapnik 1995)
Splines (Wahba 1990 Eubank 1999)
Partitioning estimates (Gyorfi et al 2002)
Least squares support vector machines (Suykens et al 2002)
939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVM solution + model selection
Dn = (Xk Yk) Xk isin Rd Yk isin R k = 1 n
iidsim (XY )
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Dual formulation(
0 1Tn
1n Ω+ Inγ
)
(
b
α
)
=
(
0
Y
)
Ωkl = ϕ(Xk)Tϕ(Xl) = K(XkXl) = (2π)minusd2 exp(minusXkminusXl2
2h2 )
K has to be positive definite ieint
exp(minusjωx)K(x) dx ge 0
Model in dual space mn(x) =sumn
k=1 αkK(xXk) + b
γ and h tuning parameters rArr cross-validation1139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
Model selection criteria are ABSOLUTELY needed
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Estimation in Primal Space
LS-SVM formulation for regression
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
Can we solve the LS-SVM in primal space instead of dual
Approximation of feature map ϕ needed
Is it possible to compute such a mapping
ϕ can be infinite dimensional
Solutionuse a fixed size m of support vectors to approximate ϕ
Solve the above as primal ridge regression
1439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
rArr Solution Matrix Approximations (Nystrom 1930)
ϕi(x)m≪nasymp
radicm
λ(m)i
summk=1K(Xk x)u
(m)ki
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Fixed Size LS-SVM formulation
Given approximation to the feature map
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wT ϕ(Xk) + b+ ek = Yk k = 1 n
Solution(
wb
)
=(
ΦTe Φe +
Im+1
γ
)minus1ΦTe Y
with
Φe =
ϕ1(X1) middot middot middot ϕm(X1) 1
ϕ1(Xn) middot middot middot ϕm(Xn) 1
1639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
Y =
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
Y =
PARAMETRIC FORM IS NOT ALWAYS EASY TO FIND
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
mn(x) =nsum
i=1
K(xminusXih )
sumnj=1K(
xminusXjh )
Yi
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Other nonparametric regression estimates
Local constant regression (Nadaraya 1964 Watson 1964)
Regression trees (Breiman et al 1984)
Wavelets (Daubechies 1992)
Nearest Neighbors (Devroye et al 1994)
Local linear regression (Fan amp Gijbels 1996)
Support vector machines (Vapnik 1995)
Splines (Wahba 1990 Eubank 1999)
Partitioning estimates (Gyorfi et al 2002)
Least squares support vector machines (Suykens et al 2002)
939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVM solution + model selection
Dn = (Xk Yk) Xk isin Rd Yk isin R k = 1 n
iidsim (XY )
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Dual formulation(
0 1Tn
1n Ω+ Inγ
)
(
b
α
)
=
(
0
Y
)
Ωkl = ϕ(Xk)Tϕ(Xl) = K(XkXl) = (2π)minusd2 exp(minusXkminusXl2
2h2 )
K has to be positive definite ieint
exp(minusjωx)K(x) dx ge 0
Model in dual space mn(x) =sumn
k=1 αkK(xXk) + b
γ and h tuning parameters rArr cross-validation1139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
Model selection criteria are ABSOLUTELY needed
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Estimation in Primal Space
LS-SVM formulation for regression
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
Can we solve the LS-SVM in primal space instead of dual
Approximation of feature map ϕ needed
Is it possible to compute such a mapping
ϕ can be infinite dimensional
Solutionuse a fixed size m of support vectors to approximate ϕ
Solve the above as primal ridge regression
1439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
rArr Solution Matrix Approximations (Nystrom 1930)
ϕi(x)m≪nasymp
radicm
λ(m)i
summk=1K(Xk x)u
(m)ki
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Fixed Size LS-SVM formulation
Given approximation to the feature map
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wT ϕ(Xk) + b+ ek = Yk k = 1 n
Solution(
wb
)
=(
ΦTe Φe +
Im+1
γ
)minus1ΦTe Y
with
Φe =
ϕ1(X1) middot middot middot ϕm(X1) 1
ϕ1(Xn) middot middot middot ϕm(Xn) 1
1639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
Y =
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
Y =
PARAMETRIC FORM IS NOT ALWAYS EASY TO FIND
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
mn(x) =nsum
i=1
K(xminusXih )
sumnj=1K(
xminusXjh )
Yi
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Other nonparametric regression estimates
Local constant regression (Nadaraya 1964 Watson 1964)
Regression trees (Breiman et al 1984)
Wavelets (Daubechies 1992)
Nearest Neighbors (Devroye et al 1994)
Local linear regression (Fan amp Gijbels 1996)
Support vector machines (Vapnik 1995)
Splines (Wahba 1990 Eubank 1999)
Partitioning estimates (Gyorfi et al 2002)
Least squares support vector machines (Suykens et al 2002)
939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVM solution + model selection
Dn = (Xk Yk) Xk isin Rd Yk isin R k = 1 n
iidsim (XY )
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Dual formulation(
0 1Tn
1n Ω+ Inγ
)
(
b
α
)
=
(
0
Y
)
Ωkl = ϕ(Xk)Tϕ(Xl) = K(XkXl) = (2π)minusd2 exp(minusXkminusXl2
2h2 )
K has to be positive definite ieint
exp(minusjωx)K(x) dx ge 0
Model in dual space mn(x) =sumn
k=1 αkK(xXk) + b
γ and h tuning parameters rArr cross-validation1139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
Model selection criteria are ABSOLUTELY needed
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Estimation in Primal Space
LS-SVM formulation for regression
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
Can we solve the LS-SVM in primal space instead of dual
Approximation of feature map ϕ needed
Is it possible to compute such a mapping
ϕ can be infinite dimensional
Solutionuse a fixed size m of support vectors to approximate ϕ
Solve the above as primal ridge regression
1439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
rArr Solution Matrix Approximations (Nystrom 1930)
ϕi(x)m≪nasymp
radicm
λ(m)i
summk=1K(Xk x)u
(m)ki
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Fixed Size LS-SVM formulation
Given approximation to the feature map
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wT ϕ(Xk) + b+ ek = Yk k = 1 n
Solution(
wb
)
=(
ΦTe Φe +
Im+1
γ
)minus1ΦTe Y
with
Φe =
ϕ1(X1) middot middot middot ϕm(X1) 1
ϕ1(Xn) middot middot middot ϕm(Xn) 1
1639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
Y =
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
Y =
PARAMETRIC FORM IS NOT ALWAYS EASY TO FIND
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
mn(x) =nsum
i=1
K(xminusXih )
sumnj=1K(
xminusXjh )
Yi
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Other nonparametric regression estimates
Local constant regression (Nadaraya 1964 Watson 1964)
Regression trees (Breiman et al 1984)
Wavelets (Daubechies 1992)
Nearest Neighbors (Devroye et al 1994)
Local linear regression (Fan amp Gijbels 1996)
Support vector machines (Vapnik 1995)
Splines (Wahba 1990 Eubank 1999)
Partitioning estimates (Gyorfi et al 2002)
Least squares support vector machines (Suykens et al 2002)
939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVM solution + model selection
Dn = (Xk Yk) Xk isin Rd Yk isin R k = 1 n
iidsim (XY )
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Dual formulation(
0 1Tn
1n Ω+ Inγ
)
(
b
α
)
=
(
0
Y
)
Ωkl = ϕ(Xk)Tϕ(Xl) = K(XkXl) = (2π)minusd2 exp(minusXkminusXl2
2h2 )
K has to be positive definite ieint
exp(minusjωx)K(x) dx ge 0
Model in dual space mn(x) =sumn
k=1 αkK(xXk) + b
γ and h tuning parameters rArr cross-validation1139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
Model selection criteria are ABSOLUTELY needed
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Estimation in Primal Space
LS-SVM formulation for regression
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
Can we solve the LS-SVM in primal space instead of dual
Approximation of feature map ϕ needed
Is it possible to compute such a mapping
ϕ can be infinite dimensional
Solutionuse a fixed size m of support vectors to approximate ϕ
Solve the above as primal ridge regression
1439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
rArr Solution Matrix Approximations (Nystrom 1930)
ϕi(x)m≪nasymp
radicm
λ(m)i
summk=1K(Xk x)u
(m)ki
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Fixed Size LS-SVM formulation
Given approximation to the feature map
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wT ϕ(Xk) + b+ ek = Yk k = 1 n
Solution(
wb
)
=(
ΦTe Φe +
Im+1
γ
)minus1ΦTe Y
with
Φe =
ϕ1(X1) middot middot middot ϕm(X1) 1
ϕ1(Xn) middot middot middot ϕm(Xn) 1
1639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
Y =
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
Y =
PARAMETRIC FORM IS NOT ALWAYS EASY TO FIND
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
mn(x) =nsum
i=1
K(xminusXih )
sumnj=1K(
xminusXjh )
Yi
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Other nonparametric regression estimates
Local constant regression (Nadaraya 1964 Watson 1964)
Regression trees (Breiman et al 1984)
Wavelets (Daubechies 1992)
Nearest Neighbors (Devroye et al 1994)
Local linear regression (Fan amp Gijbels 1996)
Support vector machines (Vapnik 1995)
Splines (Wahba 1990 Eubank 1999)
Partitioning estimates (Gyorfi et al 2002)
Least squares support vector machines (Suykens et al 2002)
939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVM solution + model selection
Dn = (Xk Yk) Xk isin Rd Yk isin R k = 1 n
iidsim (XY )
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Dual formulation(
0 1Tn
1n Ω+ Inγ
)
(
b
α
)
=
(
0
Y
)
Ωkl = ϕ(Xk)Tϕ(Xl) = K(XkXl) = (2π)minusd2 exp(minusXkminusXl2
2h2 )
K has to be positive definite ieint
exp(minusjωx)K(x) dx ge 0
Model in dual space mn(x) =sumn
k=1 αkK(xXk) + b
γ and h tuning parameters rArr cross-validation1139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
Model selection criteria are ABSOLUTELY needed
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Estimation in Primal Space
LS-SVM formulation for regression
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
Can we solve the LS-SVM in primal space instead of dual
Approximation of feature map ϕ needed
Is it possible to compute such a mapping
ϕ can be infinite dimensional
Solutionuse a fixed size m of support vectors to approximate ϕ
Solve the above as primal ridge regression
1439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
rArr Solution Matrix Approximations (Nystrom 1930)
ϕi(x)m≪nasymp
radicm
λ(m)i
summk=1K(Xk x)u
(m)ki
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Fixed Size LS-SVM formulation
Given approximation to the feature map
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wT ϕ(Xk) + b+ ek = Yk k = 1 n
Solution(
wb
)
=(
ΦTe Φe +
Im+1
γ
)minus1ΦTe Y
with
Φe =
ϕ1(X1) middot middot middot ϕm(X1) 1
ϕ1(Xn) middot middot middot ϕm(Xn) 1
1639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
Y =
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
Y =
PARAMETRIC FORM IS NOT ALWAYS EASY TO FIND
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
mn(x) =nsum
i=1
K(xminusXih )
sumnj=1K(
xminusXjh )
Yi
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Other nonparametric regression estimates
Local constant regression (Nadaraya 1964 Watson 1964)
Regression trees (Breiman et al 1984)
Wavelets (Daubechies 1992)
Nearest Neighbors (Devroye et al 1994)
Local linear regression (Fan amp Gijbels 1996)
Support vector machines (Vapnik 1995)
Splines (Wahba 1990 Eubank 1999)
Partitioning estimates (Gyorfi et al 2002)
Least squares support vector machines (Suykens et al 2002)
939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVM solution + model selection
Dn = (Xk Yk) Xk isin Rd Yk isin R k = 1 n
iidsim (XY )
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Dual formulation(
0 1Tn
1n Ω+ Inγ
)
(
b
α
)
=
(
0
Y
)
Ωkl = ϕ(Xk)Tϕ(Xl) = K(XkXl) = (2π)minusd2 exp(minusXkminusXl2
2h2 )
K has to be positive definite ieint
exp(minusjωx)K(x) dx ge 0
Model in dual space mn(x) =sumn
k=1 αkK(xXk) + b
γ and h tuning parameters rArr cross-validation1139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
Model selection criteria are ABSOLUTELY needed
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Estimation in Primal Space
LS-SVM formulation for regression
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
Can we solve the LS-SVM in primal space instead of dual
Approximation of feature map ϕ needed
Is it possible to compute such a mapping
ϕ can be infinite dimensional
Solutionuse a fixed size m of support vectors to approximate ϕ
Solve the above as primal ridge regression
1439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
rArr Solution Matrix Approximations (Nystrom 1930)
ϕi(x)m≪nasymp
radicm
λ(m)i
summk=1K(Xk x)u
(m)ki
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Fixed Size LS-SVM formulation
Given approximation to the feature map
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wT ϕ(Xk) + b+ ek = Yk k = 1 n
Solution(
wb
)
=(
ΦTe Φe +
Im+1
γ
)minus1ΦTe Y
with
Φe =
ϕ1(X1) middot middot middot ϕm(X1) 1
ϕ1(Xn) middot middot middot ϕm(Xn) 1
1639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
A simple example
minus5 0 5minus30
minus20
minus10
0
10
20
30
Y = aX + b
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
Y =
PARAMETRIC FORM IS NOT ALWAYS EASY TO FIND
739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
mn(x) =nsum
i=1
K(xminusXih )
sumnj=1K(
xminusXjh )
Yi
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Other nonparametric regression estimates
Local constant regression (Nadaraya 1964 Watson 1964)
Regression trees (Breiman et al 1984)
Wavelets (Daubechies 1992)
Nearest Neighbors (Devroye et al 1994)
Local linear regression (Fan amp Gijbels 1996)
Support vector machines (Vapnik 1995)
Splines (Wahba 1990 Eubank 1999)
Partitioning estimates (Gyorfi et al 2002)
Least squares support vector machines (Suykens et al 2002)
939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVM solution + model selection
Dn = (Xk Yk) Xk isin Rd Yk isin R k = 1 n
iidsim (XY )
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Dual formulation(
0 1Tn
1n Ω+ Inγ
)
(
b
α
)
=
(
0
Y
)
Ωkl = ϕ(Xk)Tϕ(Xl) = K(XkXl) = (2π)minusd2 exp(minusXkminusXl2
2h2 )
K has to be positive definite ieint
exp(minusjωx)K(x) dx ge 0
Model in dual space mn(x) =sumn
k=1 αkK(xXk) + b
γ and h tuning parameters rArr cross-validation1139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
Model selection criteria are ABSOLUTELY needed
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Estimation in Primal Space
LS-SVM formulation for regression
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
Can we solve the LS-SVM in primal space instead of dual
Approximation of feature map ϕ needed
Is it possible to compute such a mapping
ϕ can be infinite dimensional
Solutionuse a fixed size m of support vectors to approximate ϕ
Solve the above as primal ridge regression
1439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
rArr Solution Matrix Approximations (Nystrom 1930)
ϕi(x)m≪nasymp
radicm
λ(m)i
summk=1K(Xk x)u
(m)ki
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Fixed Size LS-SVM formulation
Given approximation to the feature map
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wT ϕ(Xk) + b+ ek = Yk k = 1 n
Solution(
wb
)
=(
ΦTe Φe +
Im+1
γ
)minus1ΦTe Y
with
Φe =
ϕ1(X1) middot middot middot ϕm(X1) 1
ϕ1(Xn) middot middot middot ϕm(Xn) 1
1639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
mn(x) =nsum
i=1
K(xminusXih )
sumnj=1K(
xminusXjh )
Yi
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Other nonparametric regression estimates
Local constant regression (Nadaraya 1964 Watson 1964)
Regression trees (Breiman et al 1984)
Wavelets (Daubechies 1992)
Nearest Neighbors (Devroye et al 1994)
Local linear regression (Fan amp Gijbels 1996)
Support vector machines (Vapnik 1995)
Splines (Wahba 1990 Eubank 1999)
Partitioning estimates (Gyorfi et al 2002)
Least squares support vector machines (Suykens et al 2002)
939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVM solution + model selection
Dn = (Xk Yk) Xk isin Rd Yk isin R k = 1 n
iidsim (XY )
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Dual formulation(
0 1Tn
1n Ω+ Inγ
)
(
b
α
)
=
(
0
Y
)
Ωkl = ϕ(Xk)Tϕ(Xl) = K(XkXl) = (2π)minusd2 exp(minusXkminusXl2
2h2 )
K has to be positive definite ieint
exp(minusjωx)K(x) dx ge 0
Model in dual space mn(x) =sumn
k=1 αkK(xXk) + b
γ and h tuning parameters rArr cross-validation1139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
Model selection criteria are ABSOLUTELY needed
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Estimation in Primal Space
LS-SVM formulation for regression
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
Can we solve the LS-SVM in primal space instead of dual
Approximation of feature map ϕ needed
Is it possible to compute such a mapping
ϕ can be infinite dimensional
Solutionuse a fixed size m of support vectors to approximate ϕ
Solve the above as primal ridge regression
1439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
rArr Solution Matrix Approximations (Nystrom 1930)
ϕi(x)m≪nasymp
radicm
λ(m)i
summk=1K(Xk x)u
(m)ki
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Fixed Size LS-SVM formulation
Given approximation to the feature map
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wT ϕ(Xk) + b+ ek = Yk k = 1 n
Solution(
wb
)
=(
ΦTe Φe +
Im+1
γ
)minus1ΦTe Y
with
Φe =
ϕ1(X1) middot middot middot ϕm(X1) 1
ϕ1(Xn) middot middot middot ϕm(Xn) 1
1639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
mn(x) =nsum
i=1
K(xminusXih )
sumnj=1K(
xminusXjh )
Yi
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Other nonparametric regression estimates
Local constant regression (Nadaraya 1964 Watson 1964)
Regression trees (Breiman et al 1984)
Wavelets (Daubechies 1992)
Nearest Neighbors (Devroye et al 1994)
Local linear regression (Fan amp Gijbels 1996)
Support vector machines (Vapnik 1995)
Splines (Wahba 1990 Eubank 1999)
Partitioning estimates (Gyorfi et al 2002)
Least squares support vector machines (Suykens et al 2002)
939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVM solution + model selection
Dn = (Xk Yk) Xk isin Rd Yk isin R k = 1 n
iidsim (XY )
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Dual formulation(
0 1Tn
1n Ω+ Inγ
)
(
b
α
)
=
(
0
Y
)
Ωkl = ϕ(Xk)Tϕ(Xl) = K(XkXl) = (2π)minusd2 exp(minusXkminusXl2
2h2 )
K has to be positive definite ieint
exp(minusjωx)K(x) dx ge 0
Model in dual space mn(x) =sumn
k=1 αkK(xXk) + b
γ and h tuning parameters rArr cross-validation1139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
Model selection criteria are ABSOLUTELY needed
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Estimation in Primal Space
LS-SVM formulation for regression
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
Can we solve the LS-SVM in primal space instead of dual
Approximation of feature map ϕ needed
Is it possible to compute such a mapping
ϕ can be infinite dimensional
Solutionuse a fixed size m of support vectors to approximate ϕ
Solve the above as primal ridge regression
1439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
rArr Solution Matrix Approximations (Nystrom 1930)
ϕi(x)m≪nasymp
radicm
λ(m)i
summk=1K(Xk x)u
(m)ki
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Fixed Size LS-SVM formulation
Given approximation to the feature map
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wT ϕ(Xk) + b+ ek = Yk k = 1 n
Solution(
wb
)
=(
ΦTe Φe +
Im+1
γ
)minus1ΦTe Y
with
Φe =
ϕ1(X1) middot middot middot ϕm(X1) 1
ϕ1(Xn) middot middot middot ϕm(Xn) 1
1639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
mn(x) =nsum
i=1
K(xminusXih )
sumnj=1K(
xminusXjh )
Yi
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Other nonparametric regression estimates
Local constant regression (Nadaraya 1964 Watson 1964)
Regression trees (Breiman et al 1984)
Wavelets (Daubechies 1992)
Nearest Neighbors (Devroye et al 1994)
Local linear regression (Fan amp Gijbels 1996)
Support vector machines (Vapnik 1995)
Splines (Wahba 1990 Eubank 1999)
Partitioning estimates (Gyorfi et al 2002)
Least squares support vector machines (Suykens et al 2002)
939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVM solution + model selection
Dn = (Xk Yk) Xk isin Rd Yk isin R k = 1 n
iidsim (XY )
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Dual formulation(
0 1Tn
1n Ω+ Inγ
)
(
b
α
)
=
(
0
Y
)
Ωkl = ϕ(Xk)Tϕ(Xl) = K(XkXl) = (2π)minusd2 exp(minusXkminusXl2
2h2 )
K has to be positive definite ieint
exp(minusjωx)K(x) dx ge 0
Model in dual space mn(x) =sumn
k=1 αkK(xXk) + b
γ and h tuning parameters rArr cross-validation1139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
Model selection criteria are ABSOLUTELY needed
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Estimation in Primal Space
LS-SVM formulation for regression
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
Can we solve the LS-SVM in primal space instead of dual
Approximation of feature map ϕ needed
Is it possible to compute such a mapping
ϕ can be infinite dimensional
Solutionuse a fixed size m of support vectors to approximate ϕ
Solve the above as primal ridge regression
1439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
rArr Solution Matrix Approximations (Nystrom 1930)
ϕi(x)m≪nasymp
radicm
λ(m)i
summk=1K(Xk x)u
(m)ki
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Fixed Size LS-SVM formulation
Given approximation to the feature map
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wT ϕ(Xk) + b+ ek = Yk k = 1 n
Solution(
wb
)
=(
ΦTe Φe +
Im+1
γ
)minus1ΦTe Y
with
Φe =
ϕ1(X1) middot middot middot ϕm(X1) 1
ϕ1(Xn) middot middot middot ϕm(Xn) 1
1639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
mn(x) =nsum
i=1
K(xminusXih )
sumnj=1K(
xminusXjh )
Yi
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Other nonparametric regression estimates
Local constant regression (Nadaraya 1964 Watson 1964)
Regression trees (Breiman et al 1984)
Wavelets (Daubechies 1992)
Nearest Neighbors (Devroye et al 1994)
Local linear regression (Fan amp Gijbels 1996)
Support vector machines (Vapnik 1995)
Splines (Wahba 1990 Eubank 1999)
Partitioning estimates (Gyorfi et al 2002)
Least squares support vector machines (Suykens et al 2002)
939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVM solution + model selection
Dn = (Xk Yk) Xk isin Rd Yk isin R k = 1 n
iidsim (XY )
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Dual formulation(
0 1Tn
1n Ω+ Inγ
)
(
b
α
)
=
(
0
Y
)
Ωkl = ϕ(Xk)Tϕ(Xl) = K(XkXl) = (2π)minusd2 exp(minusXkminusXl2
2h2 )
K has to be positive definite ieint
exp(minusjωx)K(x) dx ge 0
Model in dual space mn(x) =sumn
k=1 αkK(xXk) + b
γ and h tuning parameters rArr cross-validation1139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
Model selection criteria are ABSOLUTELY needed
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Estimation in Primal Space
LS-SVM formulation for regression
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
Can we solve the LS-SVM in primal space instead of dual
Approximation of feature map ϕ needed
Is it possible to compute such a mapping
ϕ can be infinite dimensional
Solutionuse a fixed size m of support vectors to approximate ϕ
Solve the above as primal ridge regression
1439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
rArr Solution Matrix Approximations (Nystrom 1930)
ϕi(x)m≪nasymp
radicm
λ(m)i
summk=1K(Xk x)u
(m)ki
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Fixed Size LS-SVM formulation
Given approximation to the feature map
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wT ϕ(Xk) + b+ ek = Yk k = 1 n
Solution(
wb
)
=(
ΦTe Φe +
Im+1
γ
)minus1ΦTe Y
with
Φe =
ϕ1(X1) middot middot middot ϕm(X1) 1
ϕ1(Xn) middot middot middot ϕm(Xn) 1
1639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
mn(x) =nsum
i=1
K(xminusXih )
sumnj=1K(
xminusXjh )
Yi
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Other nonparametric regression estimates
Local constant regression (Nadaraya 1964 Watson 1964)
Regression trees (Breiman et al 1984)
Wavelets (Daubechies 1992)
Nearest Neighbors (Devroye et al 1994)
Local linear regression (Fan amp Gijbels 1996)
Support vector machines (Vapnik 1995)
Splines (Wahba 1990 Eubank 1999)
Partitioning estimates (Gyorfi et al 2002)
Least squares support vector machines (Suykens et al 2002)
939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVM solution + model selection
Dn = (Xk Yk) Xk isin Rd Yk isin R k = 1 n
iidsim (XY )
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Dual formulation(
0 1Tn
1n Ω+ Inγ
)
(
b
α
)
=
(
0
Y
)
Ωkl = ϕ(Xk)Tϕ(Xl) = K(XkXl) = (2π)minusd2 exp(minusXkminusXl2
2h2 )
K has to be positive definite ieint
exp(minusjωx)K(x) dx ge 0
Model in dual space mn(x) =sumn
k=1 αkK(xXk) + b
γ and h tuning parameters rArr cross-validation1139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
Model selection criteria are ABSOLUTELY needed
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Estimation in Primal Space
LS-SVM formulation for regression
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
Can we solve the LS-SVM in primal space instead of dual
Approximation of feature map ϕ needed
Is it possible to compute such a mapping
ϕ can be infinite dimensional
Solutionuse a fixed size m of support vectors to approximate ϕ
Solve the above as primal ridge regression
1439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
rArr Solution Matrix Approximations (Nystrom 1930)
ϕi(x)m≪nasymp
radicm
λ(m)i
summk=1K(Xk x)u
(m)ki
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Fixed Size LS-SVM formulation
Given approximation to the feature map
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wT ϕ(Xk) + b+ ek = Yk k = 1 n
Solution(
wb
)
=(
ΦTe Φe +
Im+1
γ
)minus1ΦTe Y
with
Φe =
ϕ1(X1) middot middot middot ϕm(X1) 1
ϕ1(Xn) middot middot middot ϕm(Xn) 1
1639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Construction of a nonparametric estimate NW smoother
0 02 04 06 08 1
minus1
minus05
0
05
1
15
x
mn(x)
mn(x) =nsum
i=1
K(xminusXih )
sumnj=1K(
xminusXjh )
Yi
839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Other nonparametric regression estimates
Local constant regression (Nadaraya 1964 Watson 1964)
Regression trees (Breiman et al 1984)
Wavelets (Daubechies 1992)
Nearest Neighbors (Devroye et al 1994)
Local linear regression (Fan amp Gijbels 1996)
Support vector machines (Vapnik 1995)
Splines (Wahba 1990 Eubank 1999)
Partitioning estimates (Gyorfi et al 2002)
Least squares support vector machines (Suykens et al 2002)
939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVM solution + model selection
Dn = (Xk Yk) Xk isin Rd Yk isin R k = 1 n
iidsim (XY )
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Dual formulation(
0 1Tn
1n Ω+ Inγ
)
(
b
α
)
=
(
0
Y
)
Ωkl = ϕ(Xk)Tϕ(Xl) = K(XkXl) = (2π)minusd2 exp(minusXkminusXl2
2h2 )
K has to be positive definite ieint
exp(minusjωx)K(x) dx ge 0
Model in dual space mn(x) =sumn
k=1 αkK(xXk) + b
γ and h tuning parameters rArr cross-validation1139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
Model selection criteria are ABSOLUTELY needed
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Estimation in Primal Space
LS-SVM formulation for regression
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
Can we solve the LS-SVM in primal space instead of dual
Approximation of feature map ϕ needed
Is it possible to compute such a mapping
ϕ can be infinite dimensional
Solutionuse a fixed size m of support vectors to approximate ϕ
Solve the above as primal ridge regression
1439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
rArr Solution Matrix Approximations (Nystrom 1930)
ϕi(x)m≪nasymp
radicm
λ(m)i
summk=1K(Xk x)u
(m)ki
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Fixed Size LS-SVM formulation
Given approximation to the feature map
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wT ϕ(Xk) + b+ ek = Yk k = 1 n
Solution(
wb
)
=(
ΦTe Φe +
Im+1
γ
)minus1ΦTe Y
with
Φe =
ϕ1(X1) middot middot middot ϕm(X1) 1
ϕ1(Xn) middot middot middot ϕm(Xn) 1
1639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Other nonparametric regression estimates
Local constant regression (Nadaraya 1964 Watson 1964)
Regression trees (Breiman et al 1984)
Wavelets (Daubechies 1992)
Nearest Neighbors (Devroye et al 1994)
Local linear regression (Fan amp Gijbels 1996)
Support vector machines (Vapnik 1995)
Splines (Wahba 1990 Eubank 1999)
Partitioning estimates (Gyorfi et al 2002)
Least squares support vector machines (Suykens et al 2002)
939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVM solution + model selection
Dn = (Xk Yk) Xk isin Rd Yk isin R k = 1 n
iidsim (XY )
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Dual formulation(
0 1Tn
1n Ω+ Inγ
)
(
b
α
)
=
(
0
Y
)
Ωkl = ϕ(Xk)Tϕ(Xl) = K(XkXl) = (2π)minusd2 exp(minusXkminusXl2
2h2 )
K has to be positive definite ieint
exp(minusjωx)K(x) dx ge 0
Model in dual space mn(x) =sumn
k=1 αkK(xXk) + b
γ and h tuning parameters rArr cross-validation1139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
Model selection criteria are ABSOLUTELY needed
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Estimation in Primal Space
LS-SVM formulation for regression
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
Can we solve the LS-SVM in primal space instead of dual
Approximation of feature map ϕ needed
Is it possible to compute such a mapping
ϕ can be infinite dimensional
Solutionuse a fixed size m of support vectors to approximate ϕ
Solve the above as primal ridge regression
1439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
rArr Solution Matrix Approximations (Nystrom 1930)
ϕi(x)m≪nasymp
radicm
λ(m)i
summk=1K(Xk x)u
(m)ki
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Fixed Size LS-SVM formulation
Given approximation to the feature map
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wT ϕ(Xk) + b+ ek = Yk k = 1 n
Solution(
wb
)
=(
ΦTe Φe +
Im+1
γ
)minus1ΦTe Y
with
Φe =
ϕ1(X1) middot middot middot ϕm(X1) 1
ϕ1(Xn) middot middot middot ϕm(Xn) 1
1639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVM solution + model selection
Dn = (Xk Yk) Xk isin Rd Yk isin R k = 1 n
iidsim (XY )
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Dual formulation(
0 1Tn
1n Ω+ Inγ
)
(
b
α
)
=
(
0
Y
)
Ωkl = ϕ(Xk)Tϕ(Xl) = K(XkXl) = (2π)minusd2 exp(minusXkminusXl2
2h2 )
K has to be positive definite ieint
exp(minusjωx)K(x) dx ge 0
Model in dual space mn(x) =sumn
k=1 αkK(xXk) + b
γ and h tuning parameters rArr cross-validation1139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
Model selection criteria are ABSOLUTELY needed
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Estimation in Primal Space
LS-SVM formulation for regression
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
Can we solve the LS-SVM in primal space instead of dual
Approximation of feature map ϕ needed
Is it possible to compute such a mapping
ϕ can be infinite dimensional
Solutionuse a fixed size m of support vectors to approximate ϕ
Solve the above as primal ridge regression
1439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
rArr Solution Matrix Approximations (Nystrom 1930)
ϕi(x)m≪nasymp
radicm
λ(m)i
summk=1K(Xk x)u
(m)ki
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Fixed Size LS-SVM formulation
Given approximation to the feature map
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wT ϕ(Xk) + b+ ek = Yk k = 1 n
Solution(
wb
)
=(
ΦTe Φe +
Im+1
γ
)minus1ΦTe Y
with
Φe =
ϕ1(X1) middot middot middot ϕm(X1) 1
ϕ1(Xn) middot middot middot ϕm(Xn) 1
1639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVM solution + model selection
Dn = (Xk Yk) Xk isin Rd Yk isin R k = 1 n
iidsim (XY )
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Dual formulation(
0 1Tn
1n Ω+ Inγ
)
(
b
α
)
=
(
0
Y
)
Ωkl = ϕ(Xk)Tϕ(Xl) = K(XkXl) = (2π)minusd2 exp(minusXkminusXl2
2h2 )
K has to be positive definite ieint
exp(minusjωx)K(x) dx ge 0
Model in dual space mn(x) =sumn
k=1 αkK(xXk) + b
γ and h tuning parameters rArr cross-validation1139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
Model selection criteria are ABSOLUTELY needed
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Estimation in Primal Space
LS-SVM formulation for regression
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
Can we solve the LS-SVM in primal space instead of dual
Approximation of feature map ϕ needed
Is it possible to compute such a mapping
ϕ can be infinite dimensional
Solutionuse a fixed size m of support vectors to approximate ϕ
Solve the above as primal ridge regression
1439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
rArr Solution Matrix Approximations (Nystrom 1930)
ϕi(x)m≪nasymp
radicm
λ(m)i
summk=1K(Xk x)u
(m)ki
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Fixed Size LS-SVM formulation
Given approximation to the feature map
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wT ϕ(Xk) + b+ ek = Yk k = 1 n
Solution(
wb
)
=(
ΦTe Φe +
Im+1
γ
)minus1ΦTe Y
with
Φe =
ϕ1(X1) middot middot middot ϕm(X1) 1
ϕ1(Xn) middot middot middot ϕm(Xn) 1
1639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVM solution + model selection
Dn = (Xk Yk) Xk isin Rd Yk isin R k = 1 n
iidsim (XY )
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Dual formulation(
0 1Tn
1n Ω+ Inγ
)
(
b
α
)
=
(
0
Y
)
Ωkl = ϕ(Xk)Tϕ(Xl) = K(XkXl) = (2π)minusd2 exp(minusXkminusXl2
2h2 )
K has to be positive definite ieint
exp(minusjωx)K(x) dx ge 0
Model in dual space mn(x) =sumn
k=1 αkK(xXk) + b
γ and h tuning parameters rArr cross-validation1139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
Model selection criteria are ABSOLUTELY needed
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Estimation in Primal Space
LS-SVM formulation for regression
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
Can we solve the LS-SVM in primal space instead of dual
Approximation of feature map ϕ needed
Is it possible to compute such a mapping
ϕ can be infinite dimensional
Solutionuse a fixed size m of support vectors to approximate ϕ
Solve the above as primal ridge regression
1439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
rArr Solution Matrix Approximations (Nystrom 1930)
ϕi(x)m≪nasymp
radicm
λ(m)i
summk=1K(Xk x)u
(m)ki
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Fixed Size LS-SVM formulation
Given approximation to the feature map
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wT ϕ(Xk) + b+ ek = Yk k = 1 n
Solution(
wb
)
=(
ΦTe Φe +
Im+1
γ
)minus1ΦTe Y
with
Φe =
ϕ1(X1) middot middot middot ϕm(X1) 1
ϕ1(Xn) middot middot middot ϕm(Xn) 1
1639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVM solution + model selection
Dn = (Xk Yk) Xk isin Rd Yk isin R k = 1 n
iidsim (XY )
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Dual formulation(
0 1Tn
1n Ω+ Inγ
)
(
b
α
)
=
(
0
Y
)
Ωkl = ϕ(Xk)Tϕ(Xl) = K(XkXl) = (2π)minusd2 exp(minusXkminusXl2
2h2 )
K has to be positive definite ieint
exp(minusjωx)K(x) dx ge 0
Model in dual space mn(x) =sumn
k=1 αkK(xXk) + b
γ and h tuning parameters rArr cross-validation1139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
Model selection criteria are ABSOLUTELY needed
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Estimation in Primal Space
LS-SVM formulation for regression
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
Can we solve the LS-SVM in primal space instead of dual
Approximation of feature map ϕ needed
Is it possible to compute such a mapping
ϕ can be infinite dimensional
Solutionuse a fixed size m of support vectors to approximate ϕ
Solve the above as primal ridge regression
1439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
rArr Solution Matrix Approximations (Nystrom 1930)
ϕi(x)m≪nasymp
radicm
λ(m)i
summk=1K(Xk x)u
(m)ki
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Fixed Size LS-SVM formulation
Given approximation to the feature map
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wT ϕ(Xk) + b+ ek = Yk k = 1 n
Solution(
wb
)
=(
ΦTe Φe +
Im+1
γ
)minus1ΦTe Y
with
Φe =
ϕ1(X1) middot middot middot ϕm(X1) 1
ϕ1(Xn) middot middot middot ϕm(Xn) 1
1639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Least squares support vector machines
Primal formulation (LS-SVM formulation for regression)
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆⋆
input space feature space
⋆
⋆
⋆
⋆⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
⋆
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
ϕ( )
⋆
ϕ(middot)
1039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVM solution + model selection
Dn = (Xk Yk) Xk isin Rd Yk isin R k = 1 n
iidsim (XY )
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Dual formulation(
0 1Tn
1n Ω+ Inγ
)
(
b
α
)
=
(
0
Y
)
Ωkl = ϕ(Xk)Tϕ(Xl) = K(XkXl) = (2π)minusd2 exp(minusXkminusXl2
2h2 )
K has to be positive definite ieint
exp(minusjωx)K(x) dx ge 0
Model in dual space mn(x) =sumn
k=1 αkK(xXk) + b
γ and h tuning parameters rArr cross-validation1139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
Model selection criteria are ABSOLUTELY needed
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Estimation in Primal Space
LS-SVM formulation for regression
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
Can we solve the LS-SVM in primal space instead of dual
Approximation of feature map ϕ needed
Is it possible to compute such a mapping
ϕ can be infinite dimensional
Solutionuse a fixed size m of support vectors to approximate ϕ
Solve the above as primal ridge regression
1439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
rArr Solution Matrix Approximations (Nystrom 1930)
ϕi(x)m≪nasymp
radicm
λ(m)i
summk=1K(Xk x)u
(m)ki
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Fixed Size LS-SVM formulation
Given approximation to the feature map
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wT ϕ(Xk) + b+ ek = Yk k = 1 n
Solution(
wb
)
=(
ΦTe Φe +
Im+1
γ
)minus1ΦTe Y
with
Φe =
ϕ1(X1) middot middot middot ϕm(X1) 1
ϕ1(Xn) middot middot middot ϕm(Xn) 1
1639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVM solution + model selection
Dn = (Xk Yk) Xk isin Rd Yk isin R k = 1 n
iidsim (XY )
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Dual formulation(
0 1Tn
1n Ω+ Inγ
)
(
b
α
)
=
(
0
Y
)
Ωkl = ϕ(Xk)Tϕ(Xl) = K(XkXl) = (2π)minusd2 exp(minusXkminusXl2
2h2 )
K has to be positive definite ieint
exp(minusjωx)K(x) dx ge 0
Model in dual space mn(x) =sumn
k=1 αkK(xXk) + b
γ and h tuning parameters rArr cross-validation1139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
Model selection criteria are ABSOLUTELY needed
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Estimation in Primal Space
LS-SVM formulation for regression
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
Can we solve the LS-SVM in primal space instead of dual
Approximation of feature map ϕ needed
Is it possible to compute such a mapping
ϕ can be infinite dimensional
Solutionuse a fixed size m of support vectors to approximate ϕ
Solve the above as primal ridge regression
1439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
rArr Solution Matrix Approximations (Nystrom 1930)
ϕi(x)m≪nasymp
radicm
λ(m)i
summk=1K(Xk x)u
(m)ki
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Fixed Size LS-SVM formulation
Given approximation to the feature map
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wT ϕ(Xk) + b+ ek = Yk k = 1 n
Solution(
wb
)
=(
ΦTe Φe +
Im+1
γ
)minus1ΦTe Y
with
Φe =
ϕ1(X1) middot middot middot ϕm(X1) 1
ϕ1(Xn) middot middot middot ϕm(Xn) 1
1639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
Model selection criteria are ABSOLUTELY needed
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Estimation in Primal Space
LS-SVM formulation for regression
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
Can we solve the LS-SVM in primal space instead of dual
Approximation of feature map ϕ needed
Is it possible to compute such a mapping
ϕ can be infinite dimensional
Solutionuse a fixed size m of support vectors to approximate ϕ
Solve the above as primal ridge regression
1439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
rArr Solution Matrix Approximations (Nystrom 1930)
ϕi(x)m≪nasymp
radicm
λ(m)i
summk=1K(Xk x)u
(m)ki
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Fixed Size LS-SVM formulation
Given approximation to the feature map
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wT ϕ(Xk) + b+ ek = Yk k = 1 n
Solution(
wb
)
=(
ΦTe Φe +
Im+1
γ
)minus1ΦTe Y
with
Φe =
ϕ1(X1) middot middot middot ϕm(X1) 1
ϕ1(Xn) middot middot middot ϕm(Xn) 1
1639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
Model selection criteria are ABSOLUTELY needed
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Estimation in Primal Space
LS-SVM formulation for regression
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
Can we solve the LS-SVM in primal space instead of dual
Approximation of feature map ϕ needed
Is it possible to compute such a mapping
ϕ can be infinite dimensional
Solutionuse a fixed size m of support vectors to approximate ϕ
Solve the above as primal ridge regression
1439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
rArr Solution Matrix Approximations (Nystrom 1930)
ϕi(x)m≪nasymp
radicm
λ(m)i
summk=1K(Xk x)u
(m)ki
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Fixed Size LS-SVM formulation
Given approximation to the feature map
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wT ϕ(Xk) + b+ ek = Yk k = 1 n
Solution(
wb
)
=(
ΦTe Φe +
Im+1
γ
)minus1ΦTe Y
with
Φe =
ϕ1(X1) middot middot middot ϕm(X1) 1
ϕ1(Xn) middot middot middot ϕm(Xn) 1
1639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
Model selection criteria are ABSOLUTELY needed
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Estimation in Primal Space
LS-SVM formulation for regression
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
Can we solve the LS-SVM in primal space instead of dual
Approximation of feature map ϕ needed
Is it possible to compute such a mapping
ϕ can be infinite dimensional
Solutionuse a fixed size m of support vectors to approximate ϕ
Solve the above as primal ridge regression
1439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
rArr Solution Matrix Approximations (Nystrom 1930)
ϕi(x)m≪nasymp
radicm
λ(m)i
summk=1K(Xk x)u
(m)ki
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Fixed Size LS-SVM formulation
Given approximation to the feature map
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wT ϕ(Xk) + b+ ek = Yk k = 1 n
Solution(
wb
)
=(
ΦTe Φe +
Im+1
γ
)minus1ΦTe Y
with
Φe =
ϕ1(X1) middot middot middot ϕm(X1) 1
ϕ1(Xn) middot middot middot ϕm(Xn) 1
1639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
Model selection criteria are ABSOLUTELY needed
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Estimation in Primal Space
LS-SVM formulation for regression
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
Can we solve the LS-SVM in primal space instead of dual
Approximation of feature map ϕ needed
Is it possible to compute such a mapping
ϕ can be infinite dimensional
Solutionuse a fixed size m of support vectors to approximate ϕ
Solve the above as primal ridge regression
1439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
rArr Solution Matrix Approximations (Nystrom 1930)
ϕi(x)m≪nasymp
radicm
λ(m)i
summk=1K(Xk x)u
(m)ki
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Fixed Size LS-SVM formulation
Given approximation to the feature map
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wT ϕ(Xk) + b+ ek = Yk k = 1 n
Solution(
wb
)
=(
ΦTe Φe +
Im+1
γ
)minus1ΦTe Y
with
Φe =
ϕ1(X1) middot middot middot ϕm(X1) 1
ϕ1(Xn) middot middot middot ϕm(Xn) 1
1639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Effect of the tuning parameters
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too small
0 10 20 30 40 50 60minus150
minus100
minus50
0
50
100
h too large
Model selection criteria are ABSOLUTELY needed
1239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Estimation in Primal Space
LS-SVM formulation for regression
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
Can we solve the LS-SVM in primal space instead of dual
Approximation of feature map ϕ needed
Is it possible to compute such a mapping
ϕ can be infinite dimensional
Solutionuse a fixed size m of support vectors to approximate ϕ
Solve the above as primal ridge regression
1439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
rArr Solution Matrix Approximations (Nystrom 1930)
ϕi(x)m≪nasymp
radicm
λ(m)i
summk=1K(Xk x)u
(m)ki
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Fixed Size LS-SVM formulation
Given approximation to the feature map
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wT ϕ(Xk) + b+ ek = Yk k = 1 n
Solution(
wb
)
=(
ΦTe Φe +
Im+1
γ
)minus1ΦTe Y
with
Φe =
ϕ1(X1) middot middot middot ϕm(X1) 1
ϕ1(Xn) middot middot middot ϕm(Xn) 1
1639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Estimation in Primal Space
LS-SVM formulation for regression
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
Can we solve the LS-SVM in primal space instead of dual
Approximation of feature map ϕ needed
Is it possible to compute such a mapping
ϕ can be infinite dimensional
Solutionuse a fixed size m of support vectors to approximate ϕ
Solve the above as primal ridge regression
1439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
rArr Solution Matrix Approximations (Nystrom 1930)
ϕi(x)m≪nasymp
radicm
λ(m)i
summk=1K(Xk x)u
(m)ki
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Fixed Size LS-SVM formulation
Given approximation to the feature map
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wT ϕ(Xk) + b+ ek = Yk k = 1 n
Solution(
wb
)
=(
ΦTe Φe +
Im+1
γ
)minus1ΦTe Y
with
Φe =
ϕ1(X1) middot middot middot ϕm(X1) 1
ϕ1(Xn) middot middot middot ϕm(Xn) 1
1639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Estimation in Primal Space
LS-SVM formulation for regression
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wTϕ(Xk) + b+ ek = Yk k = 1 n
Can we solve the LS-SVM in primal space instead of dual
Approximation of feature map ϕ needed
Is it possible to compute such a mapping
ϕ can be infinite dimensional
Solutionuse a fixed size m of support vectors to approximate ϕ
Solve the above as primal ridge regression
1439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
rArr Solution Matrix Approximations (Nystrom 1930)
ϕi(x)m≪nasymp
radicm
λ(m)i
summk=1K(Xk x)u
(m)ki
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Fixed Size LS-SVM formulation
Given approximation to the feature map
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wT ϕ(Xk) + b+ ek = Yk k = 1 n
Solution(
wb
)
=(
ΦTe Φe +
Im+1
γ
)minus1ΦTe Y
with
Φe =
ϕ1(X1) middot middot middot ϕm(X1) 1
ϕ1(Xn) middot middot middot ϕm(Xn) 1
1639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
rArr Solution Matrix Approximations (Nystrom 1930)
ϕi(x)m≪nasymp
radicm
λ(m)i
summk=1K(Xk x)u
(m)ki
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Fixed Size LS-SVM formulation
Given approximation to the feature map
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wT ϕ(Xk) + b+ ek = Yk k = 1 n
Solution(
wb
)
=(
ΦTe Φe +
Im+1
γ
)minus1ΦTe Y
with
Φe =
ϕ1(X1) middot middot middot ϕm(X1) 1
ϕ1(Xn) middot middot middot ϕm(Xn) 1
1639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with Large Scale Data
1 Calculation andor storage kernel matrix Ω
N = 1000 rArr Ω rArr 8 MBN = 10000 rArr Ω rArr 763 MBN = 20000 rArr Ω rArr 3051 MB
2 If possible to compute how long would it take
rArr Solution Matrix Approximations (Nystrom 1930)
ϕi(x)m≪nasymp
radicm
λ(m)i
summk=1K(Xk x)u
(m)ki
1539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Fixed Size LS-SVM formulation
Given approximation to the feature map
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wT ϕ(Xk) + b+ ek = Yk k = 1 n
Solution(
wb
)
=(
ΦTe Φe +
Im+1
γ
)minus1ΦTe Y
with
Φe =
ϕ1(X1) middot middot middot ϕm(X1) 1
ϕ1(Xn) middot middot middot ϕm(Xn) 1
1639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Fixed Size LS-SVM formulation
Given approximation to the feature map
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st wT ϕ(Xk) + b+ ek = Yk k = 1 n
Solution(
wb
)
=(
ΦTe Φe +
Im+1
γ
)minus1ΦTe Y
with
Φe =
ϕ1(X1) middot middot middot ϕm(X1) 1
ϕ1(Xn) middot middot middot ϕm(Xn) 1
1639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Selection of support vectors Renyi Entropy
Maximize quadratic Renyi entropy HmR2 = minus log
int
f(x)2 dx
Theorem (Maximizing Entropy)
The Renyi entropy on a closed interval [a b] with a b isin R and no
additional moment constraints is maximized for the uniform
density 1(bminus a)
(Selecting SV) (Wrong Bandwidth)
1739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Identification of a pilot scale distillation column
Joint work with Bart Huyck (CIT)
1839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Identification of a pilot scale distillation column
Joint work with Bart Huyck (CIT)
Task identify bottom temperature column with LS-SVM
1839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Identification of a pilot scale distillation column
Joint work with Bart Huyck (CIT)
Task identify bottom temperature column with LS-SVM
1839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Identification of a pilot scale distillation column
Joint work with Bart Huyck (CIT)
Task identify bottom temperature column with LS-SVM
1839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Identification of a pilot scale distillation column
Joint work with Bart Huyck (CIT)
Task identify bottom temperature column with LS-SVM
0 1000 2000 3000 4000 5000 6000 7000 800077
775
78
785
79
795
Tem
p
Time (s) 1839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
LS principle is NOT robust
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
LS principle is NOT robust
Solution LAD (a b) = argminabisinR2
1n
sumnk=1 |Yk minus (aXk + b)|
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
f(X)
X
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X)
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X)
LS principle rArr sensitive to outliers (and leverage points)
Linearpolynomial kernel rArr non-robust methods
Using appropriate CV rArr robust CV
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Huber Hampel Logistic Myriad
V (r)
1 if |r| lt ββ|r| if |r| ge β
1 if |r| lt b1 b2minus|r|b2minusb1
if b1 le |r| le b2
0 if |r| gt b2
tanh(r)r
δ2
δ2+r2
ψ(r)
L(r)
r2 if |r| lt β
β|r| minus 1
2β2 if |r| ge β
r2 if |r| lt b1b2r2minus|r3|
b2minusb1 if b1 le |r| le b2
0 if |r| gt b2
r tanh(r) log(δ2+r2)
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Huber Hampel Logistic Myriad
V (r)
1 if |r| lt ββ|r| if |r| ge β
1 if |r| lt b1 b2minus|r|b2minusb1
if b1 le |r| le b2
0 if |r| gt b2
tanh(r)r
δ2
δ2+r2
ψ(r)
L(r)
r2 if |r| lt β
β|r| minus 1
2β2 if |r| ge β
r2 if |r| lt b1b2r2minus|r3|
b2minusb1 if b1 le |r| le b2
0 if |r| gt b2
r tanh(r) log(δ2+r2)
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
δ = 20
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
0 02 04 06 08 1
minus2
minus1
0
1
2
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions2539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
VIOLATION OF IID ASSUMPTION2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
Example 3
In our toy example ei+1 was affected by the value of ei and soon
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Identification of a pilot scale distillation column
Joint work with Bart Huyck (CIT)
Task identify bottom temperature column with LS-SVM
1839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Identification of a pilot scale distillation column
Joint work with Bart Huyck (CIT)
Task identify bottom temperature column with LS-SVM
1839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Identification of a pilot scale distillation column
Joint work with Bart Huyck (CIT)
Task identify bottom temperature column with LS-SVM
1839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Identification of a pilot scale distillation column
Joint work with Bart Huyck (CIT)
Task identify bottom temperature column with LS-SVM
0 1000 2000 3000 4000 5000 6000 7000 800077
775
78
785
79
795
Tem
p
Time (s) 1839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
LS principle is NOT robust
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
LS principle is NOT robust
Solution LAD (a b) = argminabisinR2
1n
sumnk=1 |Yk minus (aXk + b)|
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
f(X)
X
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X)
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X)
LS principle rArr sensitive to outliers (and leverage points)
Linearpolynomial kernel rArr non-robust methods
Using appropriate CV rArr robust CV
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Huber Hampel Logistic Myriad
V (r)
1 if |r| lt ββ|r| if |r| ge β
1 if |r| lt b1 b2minus|r|b2minusb1
if b1 le |r| le b2
0 if |r| gt b2
tanh(r)r
δ2
δ2+r2
ψ(r)
L(r)
r2 if |r| lt β
β|r| minus 1
2β2 if |r| ge β
r2 if |r| lt b1b2r2minus|r3|
b2minusb1 if b1 le |r| le b2
0 if |r| gt b2
r tanh(r) log(δ2+r2)
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Huber Hampel Logistic Myriad
V (r)
1 if |r| lt ββ|r| if |r| ge β
1 if |r| lt b1 b2minus|r|b2minusb1
if b1 le |r| le b2
0 if |r| gt b2
tanh(r)r
δ2
δ2+r2
ψ(r)
L(r)
r2 if |r| lt β
β|r| minus 1
2β2 if |r| ge β
r2 if |r| lt b1b2r2minus|r3|
b2minusb1 if b1 le |r| le b2
0 if |r| gt b2
r tanh(r) log(δ2+r2)
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
δ = 20
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
0 02 04 06 08 1
minus2
minus1
0
1
2
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions2539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
VIOLATION OF IID ASSUMPTION2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
Example 3
In our toy example ei+1 was affected by the value of ei and soon
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Identification of a pilot scale distillation column
Joint work with Bart Huyck (CIT)
Task identify bottom temperature column with LS-SVM
1839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Identification of a pilot scale distillation column
Joint work with Bart Huyck (CIT)
Task identify bottom temperature column with LS-SVM
1839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Identification of a pilot scale distillation column
Joint work with Bart Huyck (CIT)
Task identify bottom temperature column with LS-SVM
0 1000 2000 3000 4000 5000 6000 7000 800077
775
78
785
79
795
Tem
p
Time (s) 1839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
LS principle is NOT robust
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
LS principle is NOT robust
Solution LAD (a b) = argminabisinR2
1n
sumnk=1 |Yk minus (aXk + b)|
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
f(X)
X
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X)
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X)
LS principle rArr sensitive to outliers (and leverage points)
Linearpolynomial kernel rArr non-robust methods
Using appropriate CV rArr robust CV
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Huber Hampel Logistic Myriad
V (r)
1 if |r| lt ββ|r| if |r| ge β
1 if |r| lt b1 b2minus|r|b2minusb1
if b1 le |r| le b2
0 if |r| gt b2
tanh(r)r
δ2
δ2+r2
ψ(r)
L(r)
r2 if |r| lt β
β|r| minus 1
2β2 if |r| ge β
r2 if |r| lt b1b2r2minus|r3|
b2minusb1 if b1 le |r| le b2
0 if |r| gt b2
r tanh(r) log(δ2+r2)
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Huber Hampel Logistic Myriad
V (r)
1 if |r| lt ββ|r| if |r| ge β
1 if |r| lt b1 b2minus|r|b2minusb1
if b1 le |r| le b2
0 if |r| gt b2
tanh(r)r
δ2
δ2+r2
ψ(r)
L(r)
r2 if |r| lt β
β|r| minus 1
2β2 if |r| ge β
r2 if |r| lt b1b2r2minus|r3|
b2minusb1 if b1 le |r| le b2
0 if |r| gt b2
r tanh(r) log(δ2+r2)
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
δ = 20
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
0 02 04 06 08 1
minus2
minus1
0
1
2
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions2539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
VIOLATION OF IID ASSUMPTION2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
Example 3
In our toy example ei+1 was affected by the value of ei and soon
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Identification of a pilot scale distillation column
Joint work with Bart Huyck (CIT)
Task identify bottom temperature column with LS-SVM
1839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Identification of a pilot scale distillation column
Joint work with Bart Huyck (CIT)
Task identify bottom temperature column with LS-SVM
0 1000 2000 3000 4000 5000 6000 7000 800077
775
78
785
79
795
Tem
p
Time (s) 1839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
LS principle is NOT robust
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
LS principle is NOT robust
Solution LAD (a b) = argminabisinR2
1n
sumnk=1 |Yk minus (aXk + b)|
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
f(X)
X
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X)
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X)
LS principle rArr sensitive to outliers (and leverage points)
Linearpolynomial kernel rArr non-robust methods
Using appropriate CV rArr robust CV
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Huber Hampel Logistic Myriad
V (r)
1 if |r| lt ββ|r| if |r| ge β
1 if |r| lt b1 b2minus|r|b2minusb1
if b1 le |r| le b2
0 if |r| gt b2
tanh(r)r
δ2
δ2+r2
ψ(r)
L(r)
r2 if |r| lt β
β|r| minus 1
2β2 if |r| ge β
r2 if |r| lt b1b2r2minus|r3|
b2minusb1 if b1 le |r| le b2
0 if |r| gt b2
r tanh(r) log(δ2+r2)
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Huber Hampel Logistic Myriad
V (r)
1 if |r| lt ββ|r| if |r| ge β
1 if |r| lt b1 b2minus|r|b2minusb1
if b1 le |r| le b2
0 if |r| gt b2
tanh(r)r
δ2
δ2+r2
ψ(r)
L(r)
r2 if |r| lt β
β|r| minus 1
2β2 if |r| ge β
r2 if |r| lt b1b2r2minus|r3|
b2minusb1 if b1 le |r| le b2
0 if |r| gt b2
r tanh(r) log(δ2+r2)
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
δ = 20
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
0 02 04 06 08 1
minus2
minus1
0
1
2
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions2539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
VIOLATION OF IID ASSUMPTION2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
Example 3
In our toy example ei+1 was affected by the value of ei and soon
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Identification of a pilot scale distillation column
Joint work with Bart Huyck (CIT)
Task identify bottom temperature column with LS-SVM
0 1000 2000 3000 4000 5000 6000 7000 800077
775
78
785
79
795
Tem
p
Time (s) 1839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
LS principle is NOT robust
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
LS principle is NOT robust
Solution LAD (a b) = argminabisinR2
1n
sumnk=1 |Yk minus (aXk + b)|
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
f(X)
X
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X)
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X)
LS principle rArr sensitive to outliers (and leverage points)
Linearpolynomial kernel rArr non-robust methods
Using appropriate CV rArr robust CV
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Huber Hampel Logistic Myriad
V (r)
1 if |r| lt ββ|r| if |r| ge β
1 if |r| lt b1 b2minus|r|b2minusb1
if b1 le |r| le b2
0 if |r| gt b2
tanh(r)r
δ2
δ2+r2
ψ(r)
L(r)
r2 if |r| lt β
β|r| minus 1
2β2 if |r| ge β
r2 if |r| lt b1b2r2minus|r3|
b2minusb1 if b1 le |r| le b2
0 if |r| gt b2
r tanh(r) log(δ2+r2)
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Huber Hampel Logistic Myriad
V (r)
1 if |r| lt ββ|r| if |r| ge β
1 if |r| lt b1 b2minus|r|b2minusb1
if b1 le |r| le b2
0 if |r| gt b2
tanh(r)r
δ2
δ2+r2
ψ(r)
L(r)
r2 if |r| lt β
β|r| minus 1
2β2 if |r| ge β
r2 if |r| lt b1b2r2minus|r3|
b2minusb1 if b1 le |r| le b2
0 if |r| gt b2
r tanh(r) log(δ2+r2)
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
δ = 20
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
0 02 04 06 08 1
minus2
minus1
0
1
2
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions2539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
VIOLATION OF IID ASSUMPTION2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
Example 3
In our toy example ei+1 was affected by the value of ei and soon
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions1939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
LS principle is NOT robust
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
LS principle is NOT robust
Solution LAD (a b) = argminabisinR2
1n
sumnk=1 |Yk minus (aXk + b)|
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
f(X)
X
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X)
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X)
LS principle rArr sensitive to outliers (and leverage points)
Linearpolynomial kernel rArr non-robust methods
Using appropriate CV rArr robust CV
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Huber Hampel Logistic Myriad
V (r)
1 if |r| lt ββ|r| if |r| ge β
1 if |r| lt b1 b2minus|r|b2minusb1
if b1 le |r| le b2
0 if |r| gt b2
tanh(r)r
δ2
δ2+r2
ψ(r)
L(r)
r2 if |r| lt β
β|r| minus 1
2β2 if |r| ge β
r2 if |r| lt b1b2r2minus|r3|
b2minusb1 if b1 le |r| le b2
0 if |r| gt b2
r tanh(r) log(δ2+r2)
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Huber Hampel Logistic Myriad
V (r)
1 if |r| lt ββ|r| if |r| ge β
1 if |r| lt b1 b2minus|r|b2minusb1
if b1 le |r| le b2
0 if |r| gt b2
tanh(r)r
δ2
δ2+r2
ψ(r)
L(r)
r2 if |r| lt β
β|r| minus 1
2β2 if |r| ge β
r2 if |r| lt b1b2r2minus|r3|
b2minusb1 if b1 le |r| le b2
0 if |r| gt b2
r tanh(r) log(δ2+r2)
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
δ = 20
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
0 02 04 06 08 1
minus2
minus1
0
1
2
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions2539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
VIOLATION OF IID ASSUMPTION2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
Example 3
In our toy example ei+1 was affected by the value of ei and soon
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
LS principle is NOT robust
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
LS principle is NOT robust
Solution LAD (a b) = argminabisinR2
1n
sumnk=1 |Yk minus (aXk + b)|
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
f(X)
X
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X)
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X)
LS principle rArr sensitive to outliers (and leverage points)
Linearpolynomial kernel rArr non-robust methods
Using appropriate CV rArr robust CV
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Huber Hampel Logistic Myriad
V (r)
1 if |r| lt ββ|r| if |r| ge β
1 if |r| lt b1 b2minus|r|b2minusb1
if b1 le |r| le b2
0 if |r| gt b2
tanh(r)r
δ2
δ2+r2
ψ(r)
L(r)
r2 if |r| lt β
β|r| minus 1
2β2 if |r| ge β
r2 if |r| lt b1b2r2minus|r3|
b2minusb1 if b1 le |r| le b2
0 if |r| gt b2
r tanh(r) log(δ2+r2)
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Huber Hampel Logistic Myriad
V (r)
1 if |r| lt ββ|r| if |r| ge β
1 if |r| lt b1 b2minus|r|b2minusb1
if b1 le |r| le b2
0 if |r| gt b2
tanh(r)r
δ2
δ2+r2
ψ(r)
L(r)
r2 if |r| lt β
β|r| minus 1
2β2 if |r| ge β
r2 if |r| lt b1b2r2minus|r3|
b2minusb1 if b1 le |r| le b2
0 if |r| gt b2
r tanh(r) log(δ2+r2)
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
δ = 20
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
0 02 04 06 08 1
minus2
minus1
0
1
2
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions2539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
VIOLATION OF IID ASSUMPTION2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
Example 3
In our toy example ei+1 was affected by the value of ei and soon
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
LS principle is NOT robust
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
LS principle is NOT robust
Solution LAD (a b) = argminabisinR2
1n
sumnk=1 |Yk minus (aXk + b)|
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
f(X)
X
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X)
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X)
LS principle rArr sensitive to outliers (and leverage points)
Linearpolynomial kernel rArr non-robust methods
Using appropriate CV rArr robust CV
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Huber Hampel Logistic Myriad
V (r)
1 if |r| lt ββ|r| if |r| ge β
1 if |r| lt b1 b2minus|r|b2minusb1
if b1 le |r| le b2
0 if |r| gt b2
tanh(r)r
δ2
δ2+r2
ψ(r)
L(r)
r2 if |r| lt β
β|r| minus 1
2β2 if |r| ge β
r2 if |r| lt b1b2r2minus|r3|
b2minusb1 if b1 le |r| le b2
0 if |r| gt b2
r tanh(r) log(δ2+r2)
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Huber Hampel Logistic Myriad
V (r)
1 if |r| lt ββ|r| if |r| ge β
1 if |r| lt b1 b2minus|r|b2minusb1
if b1 le |r| le b2
0 if |r| gt b2
tanh(r)r
δ2
δ2+r2
ψ(r)
L(r)
r2 if |r| lt β
β|r| minus 1
2β2 if |r| ge β
r2 if |r| lt b1b2r2minus|r3|
b2minusb1 if b1 le |r| le b2
0 if |r| gt b2
r tanh(r) log(δ2+r2)
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
δ = 20
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
0 02 04 06 08 1
minus2
minus1
0
1
2
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions2539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
VIOLATION OF IID ASSUMPTION2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
Example 3
In our toy example ei+1 was affected by the value of ei and soon
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
LS principle is NOT robust
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
LS principle is NOT robust
Solution LAD (a b) = argminabisinR2
1n
sumnk=1 |Yk minus (aXk + b)|
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
f(X)
X
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X)
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X)
LS principle rArr sensitive to outliers (and leverage points)
Linearpolynomial kernel rArr non-robust methods
Using appropriate CV rArr robust CV
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Huber Hampel Logistic Myriad
V (r)
1 if |r| lt ββ|r| if |r| ge β
1 if |r| lt b1 b2minus|r|b2minusb1
if b1 le |r| le b2
0 if |r| gt b2
tanh(r)r
δ2
δ2+r2
ψ(r)
L(r)
r2 if |r| lt β
β|r| minus 1
2β2 if |r| ge β
r2 if |r| lt b1b2r2minus|r3|
b2minusb1 if b1 le |r| le b2
0 if |r| gt b2
r tanh(r) log(δ2+r2)
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Huber Hampel Logistic Myriad
V (r)
1 if |r| lt ββ|r| if |r| ge β
1 if |r| lt b1 b2minus|r|b2minusb1
if b1 le |r| le b2
0 if |r| gt b2
tanh(r)r
δ2
δ2+r2
ψ(r)
L(r)
r2 if |r| lt β
β|r| minus 1
2β2 if |r| ge β
r2 if |r| lt b1b2r2minus|r3|
b2minusb1 if b1 le |r| le b2
0 if |r| gt b2
r tanh(r) log(δ2+r2)
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
δ = 20
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
0 02 04 06 08 1
minus2
minus1
0
1
2
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions2539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
VIOLATION OF IID ASSUMPTION2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
Example 3
In our toy example ei+1 was affected by the value of ei and soon
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
LS principle is NOT robust
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
LS principle is NOT robust
Solution LAD (a b) = argminabisinR2
1n
sumnk=1 |Yk minus (aXk + b)|
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
f(X)
X
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X)
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X)
LS principle rArr sensitive to outliers (and leverage points)
Linearpolynomial kernel rArr non-robust methods
Using appropriate CV rArr robust CV
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Huber Hampel Logistic Myriad
V (r)
1 if |r| lt ββ|r| if |r| ge β
1 if |r| lt b1 b2minus|r|b2minusb1
if b1 le |r| le b2
0 if |r| gt b2
tanh(r)r
δ2
δ2+r2
ψ(r)
L(r)
r2 if |r| lt β
β|r| minus 1
2β2 if |r| ge β
r2 if |r| lt b1b2r2minus|r3|
b2minusb1 if b1 le |r| le b2
0 if |r| gt b2
r tanh(r) log(δ2+r2)
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Huber Hampel Logistic Myriad
V (r)
1 if |r| lt ββ|r| if |r| ge β
1 if |r| lt b1 b2minus|r|b2minusb1
if b1 le |r| le b2
0 if |r| gt b2
tanh(r)r
δ2
δ2+r2
ψ(r)
L(r)
r2 if |r| lt β
β|r| minus 1
2β2 if |r| ge β
r2 if |r| lt b1b2r2minus|r3|
b2minusb1 if b1 le |r| le b2
0 if |r| gt b2
r tanh(r) log(δ2+r2)
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
δ = 20
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
0 02 04 06 08 1
minus2
minus1
0
1
2
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions2539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
VIOLATION OF IID ASSUMPTION2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
Example 3
In our toy example ei+1 was affected by the value of ei and soon
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in parametric regression
model Yk = aXk + b+ ek k = 1 n
(a b) estimated from data
LS principle (a b) = argminabisinR2
1n
sumnk=1 [Yk minus (aXk + b)]2
0 1 2 3 4 5minus10
minus5
0
5
10
15
20
25
30
35
40
X
Y
LS principle is NOT robust
Solution LAD (a b) = argminabisinR2
1n
sumnk=1 |Yk minus (aXk + b)|
2039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
f(X)
X
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X)
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X)
LS principle rArr sensitive to outliers (and leverage points)
Linearpolynomial kernel rArr non-robust methods
Using appropriate CV rArr robust CV
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Huber Hampel Logistic Myriad
V (r)
1 if |r| lt ββ|r| if |r| ge β
1 if |r| lt b1 b2minus|r|b2minusb1
if b1 le |r| le b2
0 if |r| gt b2
tanh(r)r
δ2
δ2+r2
ψ(r)
L(r)
r2 if |r| lt β
β|r| minus 1
2β2 if |r| ge β
r2 if |r| lt b1b2r2minus|r3|
b2minusb1 if b1 le |r| le b2
0 if |r| gt b2
r tanh(r) log(δ2+r2)
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Huber Hampel Logistic Myriad
V (r)
1 if |r| lt ββ|r| if |r| ge β
1 if |r| lt b1 b2minus|r|b2minusb1
if b1 le |r| le b2
0 if |r| gt b2
tanh(r)r
δ2
δ2+r2
ψ(r)
L(r)
r2 if |r| lt β
β|r| minus 1
2β2 if |r| ge β
r2 if |r| lt b1b2r2minus|r3|
b2minusb1 if b1 le |r| le b2
0 if |r| gt b2
r tanh(r) log(δ2+r2)
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
δ = 20
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
0 02 04 06 08 1
minus2
minus1
0
1
2
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions2539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
VIOLATION OF IID ASSUMPTION2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
Example 3
In our toy example ei+1 was affected by the value of ei and soon
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
f(X)
X
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X)
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X)
LS principle rArr sensitive to outliers (and leverage points)
Linearpolynomial kernel rArr non-robust methods
Using appropriate CV rArr robust CV
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Huber Hampel Logistic Myriad
V (r)
1 if |r| lt ββ|r| if |r| ge β
1 if |r| lt b1 b2minus|r|b2minusb1
if b1 le |r| le b2
0 if |r| gt b2
tanh(r)r
δ2
δ2+r2
ψ(r)
L(r)
r2 if |r| lt β
β|r| minus 1
2β2 if |r| ge β
r2 if |r| lt b1b2r2minus|r3|
b2minusb1 if b1 le |r| le b2
0 if |r| gt b2
r tanh(r) log(δ2+r2)
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Huber Hampel Logistic Myriad
V (r)
1 if |r| lt ββ|r| if |r| ge β
1 if |r| lt b1 b2minus|r|b2minusb1
if b1 le |r| le b2
0 if |r| gt b2
tanh(r)r
δ2
δ2+r2
ψ(r)
L(r)
r2 if |r| lt β
β|r| minus 1
2β2 if |r| ge β
r2 if |r| lt b1b2r2minus|r3|
b2minusb1 if b1 le |r| le b2
0 if |r| gt b2
r tanh(r) log(δ2+r2)
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
δ = 20
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
0 02 04 06 08 1
minus2
minus1
0
1
2
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions2539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
VIOLATION OF IID ASSUMPTION2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
Example 3
In our toy example ei+1 was affected by the value of ei and soon
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X)
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X)
LS principle rArr sensitive to outliers (and leverage points)
Linearpolynomial kernel rArr non-robust methods
Using appropriate CV rArr robust CV
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Huber Hampel Logistic Myriad
V (r)
1 if |r| lt ββ|r| if |r| ge β
1 if |r| lt b1 b2minus|r|b2minusb1
if b1 le |r| le b2
0 if |r| gt b2
tanh(r)r
δ2
δ2+r2
ψ(r)
L(r)
r2 if |r| lt β
β|r| minus 1
2β2 if |r| ge β
r2 if |r| lt b1b2r2minus|r3|
b2minusb1 if b1 le |r| le b2
0 if |r| gt b2
r tanh(r) log(δ2+r2)
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Huber Hampel Logistic Myriad
V (r)
1 if |r| lt ββ|r| if |r| ge β
1 if |r| lt b1 b2minus|r|b2minusb1
if b1 le |r| le b2
0 if |r| gt b2
tanh(r)r
δ2
δ2+r2
ψ(r)
L(r)
r2 if |r| lt β
β|r| minus 1
2β2 if |r| ge β
r2 if |r| lt b1b2r2minus|r3|
b2minusb1 if b1 le |r| le b2
0 if |r| gt b2
r tanh(r) log(δ2+r2)
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
δ = 20
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
0 02 04 06 08 1
minus2
minus1
0
1
2
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions2539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
VIOLATION OF IID ASSUMPTION2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
Example 3
In our toy example ei+1 was affected by the value of ei and soon
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Problems with outliers in nonparametric regression
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X)
LS principle rArr sensitive to outliers (and leverage points)
Linearpolynomial kernel rArr non-robust methods
Using appropriate CV rArr robust CV
2139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Huber Hampel Logistic Myriad
V (r)
1 if |r| lt ββ|r| if |r| ge β
1 if |r| lt b1 b2minus|r|b2minusb1
if b1 le |r| le b2
0 if |r| gt b2
tanh(r)r
δ2
δ2+r2
ψ(r)
L(r)
r2 if |r| lt β
β|r| minus 1
2β2 if |r| ge β
r2 if |r| lt b1b2r2minus|r3|
b2minusb1 if b1 le |r| le b2
0 if |r| gt b2
r tanh(r) log(δ2+r2)
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Huber Hampel Logistic Myriad
V (r)
1 if |r| lt ββ|r| if |r| ge β
1 if |r| lt b1 b2minus|r|b2minusb1
if b1 le |r| le b2
0 if |r| gt b2
tanh(r)r
δ2
δ2+r2
ψ(r)
L(r)
r2 if |r| lt β
β|r| minus 1
2β2 if |r| ge β
r2 if |r| lt b1b2r2minus|r3|
b2minusb1 if b1 le |r| le b2
0 if |r| gt b2
r tanh(r) log(δ2+r2)
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
δ = 20
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
0 02 04 06 08 1
minus2
minus1
0
1
2
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions2539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
VIOLATION OF IID ASSUMPTION2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
Example 3
In our toy example ei+1 was affected by the value of ei and soon
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 e
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Huber Hampel Logistic Myriad
V (r)
1 if |r| lt ββ|r| if |r| ge β
1 if |r| lt b1 b2minus|r|b2minusb1
if b1 le |r| le b2
0 if |r| gt b2
tanh(r)r
δ2
δ2+r2
ψ(r)
L(r)
r2 if |r| lt β
β|r| minus 1
2β2 if |r| ge β
r2 if |r| lt b1b2r2minus|r3|
b2minusb1 if b1 le |r| le b2
0 if |r| gt b2
r tanh(r) log(δ2+r2)
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Huber Hampel Logistic Myriad
V (r)
1 if |r| lt ββ|r| if |r| ge β
1 if |r| lt b1 b2minus|r|b2minusb1
if b1 le |r| le b2
0 if |r| gt b2
tanh(r)r
δ2
δ2+r2
ψ(r)
L(r)
r2 if |r| lt β
β|r| minus 1
2β2 if |r| ge β
r2 if |r| lt b1b2r2minus|r3|
b2minusb1 if b1 le |r| le b2
0 if |r| gt b2
r tanh(r) log(δ2+r2)
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
δ = 20
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
0 02 04 06 08 1
minus2
minus1
0
1
2
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions2539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
VIOLATION OF IID ASSUMPTION2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
Example 3
In our toy example ei+1 was affected by the value of ei and soon
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Huber Hampel Logistic Myriad
V (r)
1 if |r| lt ββ|r| if |r| ge β
1 if |r| lt b1 b2minus|r|b2minusb1
if b1 le |r| le b2
0 if |r| gt b2
tanh(r)r
δ2
δ2+r2
ψ(r)
L(r)
r2 if |r| lt β
β|r| minus 1
2β2 if |r| ge β
r2 if |r| lt b1b2r2minus|r3|
b2minusb1 if b1 le |r| le b2
0 if |r| gt b2
r tanh(r) log(δ2+r2)
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Huber Hampel Logistic Myriad
V (r)
1 if |r| lt ββ|r| if |r| ge β
1 if |r| lt b1 b2minus|r|b2minusb1
if b1 le |r| le b2
0 if |r| gt b2
tanh(r)r
δ2
δ2+r2
ψ(r)
L(r)
r2 if |r| lt β
β|r| minus 1
2β2 if |r| ge β
r2 if |r| lt b1b2r2minus|r3|
b2minusb1 if b1 le |r| le b2
0 if |r| gt b2
r tanh(r) log(δ2+r2)
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
δ = 20
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
0 02 04 06 08 1
minus2
minus1
0
1
2
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions2539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
VIOLATION OF IID ASSUMPTION2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
Example 3
In our toy example ei+1 was affected by the value of ei and soon
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Huber Hampel Logistic Myriad
V (r)
1 if |r| lt ββ|r| if |r| ge β
1 if |r| lt b1 b2minus|r|b2minusb1
if b1 le |r| le b2
0 if |r| gt b2
tanh(r)r
δ2
δ2+r2
ψ(r)
L(r)
r2 if |r| lt β
β|r| minus 1
2β2 if |r| ge β
r2 if |r| lt b1b2r2minus|r3|
b2minusb1 if b1 le |r| le b2
0 if |r| gt b2
r tanh(r) log(δ2+r2)
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Huber Hampel Logistic Myriad
V (r)
1 if |r| lt ββ|r| if |r| ge β
1 if |r| lt b1 b2minus|r|b2minusb1
if b1 le |r| le b2
0 if |r| gt b2
tanh(r)r
δ2
δ2+r2
ψ(r)
L(r)
r2 if |r| lt β
β|r| minus 1
2β2 if |r| ge β
r2 if |r| lt b1b2r2minus|r3|
b2minusb1 if b1 le |r| le b2
0 if |r| gt b2
r tanh(r) log(δ2+r2)
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
δ = 20
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
0 02 04 06 08 1
minus2
minus1
0
1
2
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions2539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
VIOLATION OF IID ASSUMPTION2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
Example 3
In our toy example ei+1 was affected by the value of ei and soon
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Iterative Reweighting amp Weight Functions
Primal formulation
minwbe
JP (w e) =12w
Tw + γ2
sumnk=1 vke
2k
st Yk = wTϕ(Xk) + b+ ek k = 1 n
Huber Hampel Logistic Myriad
V (r)
1 if |r| lt ββ|r| if |r| ge β
1 if |r| lt b1 b2minus|r|b2minusb1
if b1 le |r| le b2
0 if |r| gt b2
tanh(r)r
δ2
δ2+r2
ψ(r)
L(r)
r2 if |r| lt β
β|r| minus 1
2β2 if |r| ge β
r2 if |r| lt b1b2r2minus|r3|
b2minusb1 if b1 le |r| le b2
0 if |r| gt b2
r tanh(r) log(δ2+r2)
2239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
δ = 20
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
0 02 04 06 08 1
minus2
minus1
0
1
2
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions2539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
VIOLATION OF IID ASSUMPTION2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
Example 3
In our toy example ei+1 was affected by the value of ei and soon
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
δ = 20
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
0 02 04 06 08 1
minus2
minus1
0
1
2
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions2539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
VIOLATION OF IID ASSUMPTION2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
Example 3
In our toy example ei+1 was affected by the value of ei and soon
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
δ = 20
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
0 02 04 06 08 1
minus2
minus1
0
1
2
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions2539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
VIOLATION OF IID ASSUMPTION2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
Example 3
In our toy example ei+1 was affected by the value of ei and soon
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
δ = 20
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
0 02 04 06 08 1
minus2
minus1
0
1
2
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions2539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
VIOLATION OF IID ASSUMPTION2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
Example 3
In our toy example ei+1 was affected by the value of ei and soon
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
δ = 20
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
0 02 04 06 08 1
minus2
minus1
0
1
2
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions2539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
VIOLATION OF IID ASSUMPTION2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
Example 3
In our toy example ei+1 was affected by the value of ei and soon
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Properties of the Myriad
if δ rarr infin =rArr Myriad converges to sample mean
if δ rarr 0 =rArr Myriad converges to the sample mode
X1 X2X3X4 X5
δ = 075
δ = 2
δ = 20
2339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
0 02 04 06 08 1
minus2
minus1
0
1
2
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions2539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
VIOLATION OF IID ASSUMPTION2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
Example 3
In our toy example ei+1 was affected by the value of ei and soon
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
0 02 04 06 08 1
minus2
minus1
0
1
2
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions2539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
VIOLATION OF IID ASSUMPTION2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
Example 3
In our toy example ei+1 was affected by the value of ei and soon
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
0 02 04 06 08 1
minus2
minus1
0
1
2
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions2539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
VIOLATION OF IID ASSUMPTION2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
Example 3
In our toy example ei+1 was affected by the value of ei and soon
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
To obtain a fully robust solution
robust smootherbounded kernel
robust CV rArr Lprime bounded
RCV (θ) =1
n
nsum
i=1
L(
Yi minus m(minusi)n (Xi θ)
)
0 02 04 06 08 1minus15
minus1
minus05
0
05
1
15
2
25
outliers
X
f(X
)
0 02 04 06 08 1
minus2
minus1
0
1
2
X
f(X
)
2439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions2539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
VIOLATION OF IID ASSUMPTION2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
Example 3
In our toy example ei+1 was affected by the value of ei and soon
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions2539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
VIOLATION OF IID ASSUMPTION2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
Example 3
In our toy example ei+1 was affected by the value of ei and soon
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
VIOLATION OF IID ASSUMPTION2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
Example 3
In our toy example ei+1 was affected by the value of ei and soon
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
VIOLATION OF IID ASSUMPTION2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
Example 3
In our toy example ei+1 was affected by the value of ei and soon
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
VIOLATION OF IID ASSUMPTION2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
Example 3
In our toy example ei+1 was affected by the value of ei and soon
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
We have a problem
0 02 04 06 08 1minus2
minus1
0
1
2
3
4
5
6
x
m(x)m
n(x)
VIOLATION OF IID ASSUMPTION2639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
Example 3
In our toy example ei+1 was affected by the value of ei and soon
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
Example 3
In our toy example ei+1 was affected by the value of ei and soon
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
Example 3
In our toy example ei+1 was affected by the value of ei and soon
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
Example 3
In our toy example ei+1 was affected by the value of ei and soon
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What went wrong
Yi = m(xi) + ei model selection =rArr Cov[ei ej ] = 0
In previous example Cov[ei ej ] 6= 0
correlation (covariance) Strength of relationship between ei and ej
Example 1
Suppose there are two technology stocks If they are affected bythe same industry trends their prices will tend to rise or falltogether
Example 2
Housing prices Sensitive to offer and demand
Example 3
In our toy example ei+1 was affected by the value of ei and soon
2739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Removing correlation effects main theorem
Breakdown bandwidth selection procedures smoother stays consistent
Theorem
Assume x equiv in x isin [0 1] E[e] = 0 Cov[ei ei+k] = E[eiei+k] = γk and γk sim kminusa
for some a gt 2 Assume that Yi = m(xi) + ei and
(C1) K is Lipschitz continuous at x = 0
(C2)intK(u) du = 1 lim|u|rarrinfin |uK(u)| = 0
int|K(u)| du lt infin supu |K(u)| lt infin
(C3)int|k(u)| du lt infin and K is symmetric
Further assume that boundary effects are ignored and that h rarr 0 as n rarr infin such
that nh2 rarr infin then for the NW smoother it follows that
E[CV(h)]=1
nE
nsum
i=1
[
m(xi)minusm(minusi)n (xi)
]2
+σ2minus
4K(0)
nhminusK(0)
infinsum
k=1
γk+o(nminus1hminus1)
No prior knowledge about correlation structure needed
2839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Suitable kernels amp drawback
minus08 minus06 minus04 minus02 0 02 04 06 080
05
1
15
2
25
minus3 minus2 minus1 0 1 2 30
005
01
015
02
025
03
035
04
045
minus5 0 50
002
004
006
008
01
012
014
016
018
02
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
0 02 04 06 08 1minus1
0
1
2
3
4
5
6
rArrrArr Decreased Mean Squared Error rArrrArr2939
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Some real life examples
Beveridge index of wheat prices
1500 1550 1600 1650 1700 1750 1800 1850 19002
25
3
35
4
45
5
55
6
Yearlog(B
everidge
wheatprices)
US monthly birth rate
1940 1941 1942 1943 1944 1945 1946 1947 19481800
2000
2200
2400
2600
2800
3000
Year
Birth
rate
3039
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3139
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimates
Can we say something about the true function m given mn
We want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level α
Pointwise vs simultaneousuniform confidence intervals
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
What are confidence intervals
How accurate are our nonparametric estimatesCan we say something about the true function m given mnWe want something of the formLn(x) le m(x) le Un(x) forallx for some confidence level αPointwise vs simultaneousuniform confidence intervals
minus1 minus05 0 05 1minus05
0
05
1
15
2
25
X
Ym
n(X
)
3239
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
In practice
In general
These intervals give the user the ability to see how well a certainmodel explains the true underlying process while taking statisticalproperties of the estimator into account
Fault detection
In fault detection CI are used for reducing the number of falsealarms
3339
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Outline
1 Goal amp Overview
2 IntroductionParametric vs nonparametric regressionNonparametric regression estimates an overview
3 Fixed-Size Least Squares Support Vector MachinesFixed Size LS-SVM formulationSelection of Support VectorsPractical identification problem
4 Robust Nonparametric MethodsProblems with outliersRobust nonparametric regression
5 Correlated ErrorsProblems with correlation in nonparametric regressionRemoving correlation effects
6 Confidence Intervals
7 Conclusions3439
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Conclusions
Goal of the Thesis
Study the properties of LS-SVM for regression with an emphasison statistical aspects and develop a framework for large scale data
Main Achievements
Framework for large data sets
Method for minimizing model selection criteria score functions
Robustification of kernel based method
Weight function with attractive properties
Framework for correlated errors based on bimodal kernels
Asymptotic normality of linear smoothers
Bias amp variance estimators for LS-SVM
Pointwise amp simultaneous CI + comparison bootstrap
3539
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
LS-SVMLab software
Free available (for research purposes) Matlab toolboxhttpwwwesatkuleuvenacbesistalssvmlab
Userrsquos guide with applications (p 113 De Brabanter et al 2010)
3639
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
Falck T Dreesen P De Brabanter K Pelckmans K De Moor B SuykensJAK Least-Squares Support Vector Machines for the Identification ofWiener-Hammerstein Systems Submitted 2011
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression in the Presence of Correlated Errors Submitted 2011
De Brabanter K Karsmakers P De Brabanter J Suykens JAK De Moor BConfidence Bands for Least Squares Support Vector Machine Classifiers ARegression Approach Submitted 2010
De Brabanter K De Brabanter J Suykens JAK De Moor B ApproximateConfidence and Prediction Intervals for Least Squares Support VectorRegression IEEE Transactions on Neural Networks 22(1)110ndash120 2011
Sahhaf S De Brabanter K Degraeve R Suykens JAK De Moor BGroeseneken G Modelling of Charge TrappingDe-trapping Induced VoltageInstability in High-k Gate Dielectrics Submitted 2010
Karsmakers P Pelckmans K De Brabanter K Van Hamme H SuykensJAK Sparse Conjugate Directions Pursuit with Application to Fixed-sizeKernel Models Submitted 2010
Sahhaf S Degraeve R Cho M De Brabanter K Roussel PhJ Zahid MBGroeseneken G Detailed Analysis of Charge Pumping and Id minus Vg Hysteresis forProfiling Traps in SiO2HfSiO(N) Microelectronic Engineering87(12)2614ndash2619 2010
3739
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B OptimizedFixed-Size Kernel Models for Large Data Sets Computational Statistics amp DataAnalysis 54(6)1484ndash1504 2010
Lopez J De Brabanter K Dorronsoro JR Suykens JAK Sparse LS-SVMswith L0-Norm Minimization Accepted for publication in Proc of the 19thEuropean Symposium on Artificial Neural Networks Computational Intelligenceand Machine Learning (ESANN) Brugge (Belgium) 2011
Huyck B De Brabanter K Logist F De Brabanter J Van Impe J De MoorB Identification of a Pilot Scale Distillation Column A Kernel Based ApproachAccepted for publication in 18th World Congress of the International Federationof Automatic Control (IFAC) 2011
De Brabanter K Karsmakers P De Brabanter J Pelckmans K SuykensJAK De Moor B On Robustness in Kernel Based Regression NIPS 2010Robust Statistical Learning (ROBUSTML) (NIPS 2010) Whistler CanadaDecember 2010
De Brabanter K Sahhaf S Karsmakers P De Brabanter J Suykens JAKDe Moor B Nonparametric Comparison of Densities Based on StatisticalBootstrap in Proc of the Fourth European Conference on the Use of ModernInformation and Communication Technologies (ECUMICT) Gent BelgiumMarch 2010 pp 179ndash190
3839
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-
Goal amp Overview Introduction FS-LSSVM Robust Nonparametric Methods Correlated Errors Confidence Intervals Conclusions
Publications
De Brabanter K De Brabanter J Suykens JAK De Moor B KernelRegression with Correlated Errors in Proc of the the 11th InternationalSymposium on Computer Applications in Biotechnology (CAB) LeuvenBelgium July 2010 pp 13ndash18
De Brabanter K Pelckmans K De Brabanter J Debruyne M SuykensJAK Hubert M De Moor B Robustness of Kernel Based Regression aComparison of Iterative Weighting Schemes in Proc of the 19th InternationalConference on Artificial Neural Networks (ICANN) Limassol Cyprus September2009 pp 100ndash110
De Brabanter K Dreesen P Karsmakers P Pelckmans K De Brabanter JSuykens JAK De Moor B Fixed-Size LS-SVM Applied to theWiener-Hammerstein Benchmark in Proc of the 15th IFAC Symposium onSystem Identification (SYSID 2009) Saint-Malo France July 2009 pp826ndash831
De Brabanter K Karsmakers P Ojeda F Alzate C De Brabanter JPelckmans K De Moor B Vandewalle J Suykens JAK LS-SVMlab ToolboxUserrsquos Guide version 17rdquo Internal Report 10-146 ESAT-SISTA KULeuven(Leuven Belgium) 2010
3939
- Goal amp Overview
- Introduction
-
- Parametric vs nonparametric regression
- Nonparametric regression estimates an overview
-
- Fixed-Size Least Squares Support Vector Machines
-
- Fixed Size LS-SVM formulation
- Selection of Support Vectors
- Practical identification problem
-
- Robust Nonparametric Methods
-
- Problems with outliers
- Robust nonparametric regression
-
- Correlated Errors
-
- Problems with correlation in nonparametric regression
- Removing correlation effects
-
- Confidence Intervals
- Conclusions
-