
Page 1

Non-Parametric Learning

Prof. A.L. Yuille

Stat 231. Fall 2004.

Chp 4.1 – 4.3.

Page 2

Parametric versus Non-Parametric

• Previous lectures on MLE learning assumed a functional form for the probability distribution.

• We now consider an alternative non-parametric method based on window function.

Page 3

Non-Parametric

• It is hard to develop probability models for some data.

• Example: estimate the distribution of annual rainfall in the U.S.A. We want to model p(x,y), the probability density that a raindrop lands at position (x,y).

• Problems: (i) multi-modal density is difficult for parametric models, (ii) difficult/impossible to collect enough data at each point (x,y).

Page 4

Intuition

• Assume that the probability density is locally smooth.

• Goal: estimate the class density model p(x) from data

• Method 1: Windows based on points x in space.

Page 5

Windows

• For each point x, form a window centred at x with volume V_n, and count the number k_n of samples that fall in the window.

• The probability density is then estimated as p_n(x) = k_n / (n V_n).
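As a minimal sketch of this counting estimate (assuming the standard formula p_n(x) = k_n/(n V_n) with a hypercube window; the function name and sanity check are mine, not from the course):

```python
import numpy as np

def window_density(x, samples, h):
    """Estimate p(x) as k/(n*V): the fraction of samples falling in a
    hypercube of side h centred at x, divided by the cube's volume h**d."""
    samples = np.atleast_2d(samples)          # shape (n, d)
    x = np.atleast_1d(x)                      # shape (d,)
    n, d = samples.shape
    inside = np.all(np.abs(samples - x) <= h / 2, axis=1)
    k = int(inside.sum())
    return k / (n * h ** d)

# Sanity check against a known density: N(0,1) has p(0) near 0.399.
rng = np.random.default_rng(0)
data = rng.standard_normal((10_000, 1))
p_hat = window_density([0.0], data, h=0.5)
```

Note that the estimate is a step function of x: it only changes when a sample enters or leaves the window.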

Page 6

Non-Parametric

• Goal: design a sequence of windows so that at each point x the estimate p_n(x) converges to f(x) as n → ∞ (f(x) is the true density).

• Conditions for window design:

(i) V_n → 0 (increasing spatial resolution),

(ii) k_n → ∞ (many samples at each point),

(iii) k_n / n → 0 (each window captures a vanishing fraction of the data).

Page 7

Two Design Methods

• Parzen window: fix the window volume as a function of n, e.g. V_n = V_1/√n.

• K-NN: fix the number of samples in the window, e.g. k_n = √n.
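The two schedules can be written down concretely; the √n rates below are the usual textbook choices (an assumption here, since the slide omits the formulas):

```python
import numpy as np

# Parzen window: fix the volume V_n as a function of n; the number of
# samples captured then adapts to the local density.
def parzen_volume(n, v1=1.0):
    return v1 / np.sqrt(n)            # V_n -> 0 as n -> inf

# K-NN: fix the number of samples k_n; the window volume then adapts.
def knn_count(n, k1=1.0):
    return k1 * np.sqrt(n)            # k_n -> inf, but k_n / n -> 0
```

Both schedules satisfy the three design conditions above: the volume shrinks, the expected count per window grows, yet each window holds a vanishing fraction of all samples.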

Page 8

Parzen Window

• The Parzen window uses a window function φ(u).

• Example:

• (i) Unit hypercube: φ(u) = 1 if |u_j| ≤ 1/2 for every component j, and 0 otherwise.

• (ii) Gaussian in d-dimensions.
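The two example window functions, written out (a sketch consistent with the definitions above; the function names are mine):

```python
import numpy as np

def phi_hypercube(u):
    """Unit-hypercube window: 1 if every component satisfies |u_j| <= 1/2,
    and 0 otherwise."""
    u = np.atleast_1d(u)
    return 1.0 if np.all(np.abs(u) <= 0.5) else 0.0

def phi_gaussian(u):
    """Standard Gaussian window in d dimensions; integrates to 1."""
    u = np.atleast_1d(u)
    d = u.size
    return float(np.exp(-0.5 * u @ u) / (2 * np.pi) ** (d / 2))
```

Any non-negative φ that integrates to 1 yields a valid density estimate; the hypercube recovers the simple counting estimate, while the Gaussian gives a smooth one.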

Page 9

Parzen Windows

• The number of samples in the hypercube centred at x is k_n = Σ_i φ((x − x_i)/h_n).

• Volume: V_n = h_n^d.

• The estimate of the distribution is: p_n(x) = (1/n) Σ_i (1/V_n) φ((x − x_i)/h_n).

• More generally, the window interpolates the data.
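Putting the pieces together, the estimator p_n(x) = (1/n) Σ_i (1/V_n) φ((x − x_i)/h_n) can be sketched as follows (names and the sanity check are mine, not from the slides):

```python
import numpy as np

def parzen_estimate(x, samples, h, phi):
    """p_n(x) = (1/n) * sum_i (1/V_n) * phi((x - x_i)/h), with V_n = h**d."""
    samples = np.atleast_2d(samples)      # shape (n, d)
    x = np.atleast_1d(x)
    n, d = samples.shape
    vn = h ** d
    return sum(phi((x - xi) / h) for xi in samples) / (n * vn)

def gauss(u):
    # Normalized d-dimensional Gaussian window.
    d = u.size
    return float(np.exp(-0.5 * u @ u) / (2 * np.pi) ** (d / 2))

# With N(0,1) data and a Gaussian window, the estimate at 0 should sit
# near the smoothed density value 1/sqrt(2*pi*(1 + h**2)).
rng = np.random.default_rng(3)
data = rng.standard_normal((500, 1))
p_hat = parzen_estimate([0.0], data, h=0.4, phi=gauss)
```

Each sample contributes one copy of the window centred on it, which is exactly the sense in which the window "interpolates" the data.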

Page 10

Parzen Window Example

• Estimate a density with five modes using Gaussian windows at scales h = 1, 0.5, 0.2.
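The slide's figure is not reproduced here, but the effect of the scale h can be checked on a hypothetical five-mode mixture (the specific mixture and all names below are mine):

```python
import numpy as np

# A hypothetical five-mode density: equal-weight Gaussians (std 0.25)
# centred at -4, -2, 0, 2, 4.
rng = np.random.default_rng(1)
centres = np.array([-4.0, -2.0, 0.0, 2.0, 4.0])
data = rng.choice(centres, size=2000) + 0.25 * rng.standard_normal(2000)

def parzen_1d(grid, samples, h):
    """Gaussian-window Parzen estimate evaluated at each grid point."""
    u = (grid[:, None] - samples[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (len(samples) * h * np.sqrt(2 * np.pi))

grid = np.linspace(-6.0, 6.0, 601)

def n_modes(y):
    """Count strict interior local maxima of a sampled curve."""
    return int(np.sum((y[1:-1] > y[:-2]) & (y[1:-1] > y[2:])))

# Large h over-smooths and merges modes; small h resolves them.
modes_found = {h: n_modes(parzen_1d(grid, data, h)) for h in (1.0, 0.5, 0.2)}
```

At h = 1 neighbouring modes blur together, while at h = 0.2 the five modes are clearly separated, matching the qualitative behaviour the slide illustrates.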

Page 11

Convergence Proof.

• We will show that the Parzen window estimator converges to the true density at each point x with increasing number of samples.

Page 12

Proof Strategy.

• The Parzen distribution is a random variable which depends on the samples used to estimate it.

• We have to take the expectation of the distribution with respect to the samples.

• We show that the expected value of the Parzen distribution is the true distribution, and that the variance of the Parzen distribution tends to 0 as the number of samples gets large.

Page 13

Convergence of the Mean

• Taking the expectation of the estimate over the samples, and letting the window shrink to a delta function, the result follows.
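The computation omitted from the slide is the standard one; a reconstruction under the usual notation (p_n for the estimate, f for the true density, bar for expectation over the samples):

```latex
\bar p_n(x)
  \;=\; E\!\left[\frac{1}{n}\sum_{i=1}^{n}\frac{1}{V_n}\,
        \varphi\!\Big(\frac{x-x_i}{h_n}\Big)\right]
  \;=\; \int \frac{1}{V_n}\,\varphi\!\Big(\frac{x-v}{h_n}\Big)\, f(v)\,dv .
```

As V_n → 0, the normalized window (1/V_n)φ(·/h_n) approaches a delta function centred at x, so \bar p_n(x) → f(x) at every continuity point of f.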

Page 14

Convergence of Variance

• Variance: we bound σ_n²(x), the variance of the estimate over the samples, and show that it tends to 0.
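The bound omitted from the slide is, in the standard treatment (reconstructed, not transcribed from the original):

```latex
\sigma_n^2(x) \;\le\; \frac{\big(\sup_u \varphi(u)\big)\,\bar p_n(x)}{n\,V_n},
```

so the variance vanishes whenever n V_n → ∞. This is compatible with V_n → 0, e.g. the schedule V_n = V_1/\sqrt{n} satisfies both requirements.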

Page 15

Example of Parzen Window

• Underlying density is Gaussian. Window volume decreases as V_n = V_1/√n.

Page 16

Example of Parzen Window

• Underlying density is bi-modal.

Page 17

Parzen Window and Interpolation.

• In practice, we do not have an infinite number of samples.

• The choice of window shape is important: the window effectively interpolates between the data points.

• If the window shape fits the local structure of the density, then Parzen windows are effective.
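A small experiment (mine, not from the slides) makes the interpolation point concrete: with only a few samples, a hard-edged hypercube window produces a staircase estimate, while a Gaussian window interpolates smoothly between the samples.

```python
import numpy as np

rng = np.random.default_rng(2)
data = rng.standard_normal(20)            # deliberately few samples
grid = np.linspace(-3.0, 3.0, 601)
h = 0.5

def estimate(kernel):
    """1-d Parzen estimate on the grid for a given window function."""
    u = (grid[:, None] - data[None, :]) / h
    return kernel(u).sum(axis=1) / (len(data) * h)

box = estimate(lambda u: (np.abs(u) <= 0.5).astype(float))           # hypercube
gauss = estimate(lambda u: np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi))  # Gaussian

# The box estimate jumps by 1/(n*h) at every window edge; the Gaussian
# estimate varies gradually between neighbouring grid points.
max_jump_box = float(np.max(np.abs(np.diff(box))))
max_jump_gauss = float(np.max(np.abs(np.diff(gauss))))
```

The contrast between the two maximum jumps is one way to quantify how much the window shape, and not just its scale, controls the smoothness of the estimate.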