introduction to kernel smoothing - max planck...
TRANSCRIPT
Introduction to Kernel Smoothing
Wilcoxon score
Den
sity
700 800 900 1000 1100 1200 1300
0.00
00.
002
0.00
40.
006
M. P. Wand & M. C. Jones
Kernel Smoothing
Monographs on Statistics and Applied Probability
Chapman & Hall, 1995.
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 1
Introduction
Histogram of some p−values
p−values
Den
sity
0.0 0.2 0.4 0.6 0.8 1.0
02
46
810
12
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 2
Introduction
– Estimation of functions such as regression functions or probability
density functions.
– Kernel-based methods are most popular non-parametric estimators.
– Can uncover structural features in the data which a parametric
approach might not reveal.
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 3
Univariate kernel density estimator
Given a random sample X1, . . . , Xn with a continuous, univariate
density f . The kernel density estimator is
f̂(x, h) =1nh
n∑i=1
K
(x−Xi
h
)with kernel K and bandwidth h. Under mild conditions (h
must decrease with increasing n) the kernel estimate converges in
probability to the true density.
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 4
The kernel K
– Can be a proper pdf. Usually chosen to be unimodal and symmetric
about zero.
⇒ Center of kernel is placed right over each data point.
⇒ Influence of each data point is spread about its neighborhood.
⇒ Contribution from each point is summed to overall estimate.
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 5
● ● ● ● ● ● ● ● ● ●
0 5 10 15
0.0
0.5
1.0
1.5
Gaussian kernel density estimate
● ● ● ● ● ● ● ● ● ●
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 6
The bandwidth h
– Scaling factor.
– Controls how wide the probability mass is spread around a point.
– Controls the smoothness or roughness of a density estimate.
⇒ Bandwidth selection bears danger of under- or oversmoothing.
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 7
From over- to undersmoothing
KDE with b=0.1
p−values
Den
sity
0.0 0.2 0.4 0.6 0.8 1.0
02
46
810
12
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 8
From over- to undersmoothing
KDE with b=0.05
p−values
Den
sity
0.0 0.2 0.4 0.6 0.8 1.0
02
46
810
12
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 9
From over- to undersmoothing
KDE with b=0.02
p−values
Den
sity
0.0 0.2 0.4 0.6 0.8 1.0
02
46
810
12
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 10
From over- to undersmoothing
KDE with b=0.005
p−values
Den
sity
0.0 0.2 0.4 0.6 0.8 1.0
02
46
810
12
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 11
Some kernels
−2 −1 0 1 2
0.0
0.1
0.2
0.3
0.4
0.5
Uniform
−2 −1 0 1 2
0.0
0.2
0.4
0.6
Epanechnikov
−2 −1 0 1 2
0.0
0.2
0.4
0.6
0.8
Biweight
−4 −2 0 2 4
0.0
0.1
0.2
0.3
0.4
Gauss
−2 −1 0 1 2
0.0
0.2
0.4
0.6
0.8
1.0
Triangular
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 12
Some kernels
K(x, p) =(1− x2)p
22p+1B(p+ 1, p+ 1)1{|x|<1}
with B(a, b) = Γ(a)Γ(b)/Γ(a+ b).
– p = 0: Uniform kernel.
– p = 1: Epanechnikov kernel.
– p = 2: Biweight kernel.
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 13
Kernel efficiency
– Perfomance of kernel is measured by MISE (mean integrated
squared error) or AMISE (asymptotic MISE).
– Epanechnikov kernel minimizes AMISE and is therefore optimal.
– Kernel efficiency is measured in comparison to Epanechnikov kernel.
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 14
Kernel Efficiency
Epanechnikov 1.000
Biweight 0.994
Triangular 0.986
Normal 0.951
Uniform 0.930
⇒ Choice of kernel is not as important as choice of bandwidth.
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 15
Modified KDEs
– Local KDE: Bandwidth depends on x.
– Variable KDE: Smooth out the influence of points in sparse regions.
– Transformation KDE: If f is difficult to estimate (highly skewed,
high kurtosis), transform data to gain a pdf that is easier to
estimate.
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 16
Bandwidth selection
– Simple versus high-tech selection rules.
– Objective function: MISE/AMISE.
– R-function density offers several selection rules.
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 17
bw.nrd0, bw.nrd
– Normal scale rule.
– Assumes f to be normal and calculates the AMISE-optimal
bandwidth in this setting.
– First guess but oversmoothes if f is multimodal or otherwise not
normal.
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 18
bw.ucv
– Unbiased (or least squares) cross-validation.
– Estimates part of MISE by leave-one-out KDE and minimizes this
estimator with respect to h.
– Problems: Several local minima, high variability.
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 19
bw.bcv
– Biased cross-validation.
– Estimation is based on optimization of AMISE instead of MISE (as
bw.ucv does).
– Lower variance but reasonable bias.
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 20
bw.SJ(method=c("ste", "dpi"))
– The AMISE optimization involves the estimation of density
functionals like integrated squared density derivatives.
– dpi: Direct plug-in rule. Estimates the needed functionals by KDE.
Problem: Choice of pilot bandwidth.
– ste: Solve-the-equation rule. The pilot bandwidth depends on h.
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 21
Comparison of bandwidth selectors
– Simulation results depend on selected true densities.
– Selectors with pilot bandwidths perform quite well but rely on
asymptotics ⇒ less accurate for densities with “sharp features”
(e.g. multiple modes).
– UCV has high variance but does not depend on asymptotics.
– BCV performs bad in several simulations.
– Authors’ recommendation: DPI or STE better than UCV or BCV.
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 22
KDE with Epanechnikov kernel and DPI rule
p−values
Den
sity
0.0 0.2 0.4 0.6 0.8 1.0
02
46
810
12
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 23