Transcript
Page 1

Lecture 6

Bootstraps

Maximum Likelihood Methods

Page 2

Bootstrapping

A way to generate empirical probability distributions

Very handy for making estimates of uncertainty

Page 3

100 realizations of a normal distribution p(y) with

ȳ = 50 and σ_y = 100

Page 4

What is the distribution of

ȳ_est = N^(-1) Σ_i yi  ?

Page 5

We know this should be a normal distribution with

expectation = ȳ = 50 and variance = σ_y²/N (standard deviation σ_y/√N = 10)

[Figure: the distributions p(y) vs. y and p(ȳ_est) vs. ȳ_est]

Page 6

Here’s an empirical way of determining the distribution

called

bootstrapping

Page 7

[Table: the N original data y1, y2, y3, …, yN; a column of random integers in the range 1–N (here 4, 3, 7, 11, 4, 1, 9, …, 6); and the N resampled data y′1, y′2, y′3, …, y′N picked out by those integers]

Compute the estimate ȳ′_est = N^(-1) Σ_i y′_i

Now repeat a gazillion times and examine the resulting distribution of estimates
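To make the recipe concrete, here is a minimal MATLAB sketch of that loop for the sample mean. The data-generation line, the variable names, and the 10⁵ resample count are illustrative assumptions; unidrnd is the Statistics Toolbox call used later in the lecture.

N = 100;
y = 50 + 100*randn(N,1);           % original data: 100 normal realizations (mean 50, sigma 100)
Nboot = 1e5;                       % number of bootstrap resamples
yest = zeros(Nboot,1);
for k = 1:Nboot
    yprime = y(unidrnd(N,N,1));    % resample the data with replacement
    yest(k) = mean(yprime);        % estimate of the mean for this resample
end
% the spread of yest approximates the distribution of the estimate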

Page 8

Note that we are doing

random sampling with replacement

of the original dataset y

to create a new dataset y’

Note: the same datum, yi, may appear several times in the new dataset, y’

Page 9

pot of an infinite number of y's with distribution p(y)

cup of N y's drawn from the pot

Does a cup drawn from the pot capture the statistical behavior of what's in the pot?

Page 10

More or less the same thing in the 2 pots?

Take 1 cup

Duplicate the cup an infinite number of times

Pour into new pot

[Figure: the original pot, labeled p(y), and the new pot filled from the duplicated cups]

Page 11

Random sampling is easy to code in MatLab:

yprime = y(unidrnd(N,N,1));

unidrnd(N,N,1) is a vector of N random integers between 1 and N; y is the original data and yprime is the resampled data
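As an aside, the same resampling can be written with the base-MATLAB function randi, which avoids the Statistics Toolbox dependency of unidrnd; this is an equivalent alternative, not what appears on the slide:

yprime = y(randi(N,N,1));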

Page 12

The theoretical and bootstrap results match pretty well!

[Figure: the theoretical distribution overlaid on a bootstrap with 10⁵ realizations]

Page 13

Obviously, bootstrapping is of limited utility when we know the theoretical

distribution

(as in the previous example)

Page 14

but it can be very useful when we don’t

for example

what's the distribution of σ_est

where (σ_est)² = [1/(N−1)] Σ_i (yi − ȳ_est)²

and ȳ_est = (1/N) Σ_i yi

(Yes, I know a statistician would know it follows Student’s T-distribution …)

Page 15

To do the bootstrap

we calculate

ȳ′_est = (1/N) Σ_i y′_i

(σ′_est)² = [1/(N−1)] Σ_i (y′_i − ȳ′_est)²

and σ′_est = √[(σ′_est)²]

many times – say 10⁵ times
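A minimal MATLAB sketch of that loop, assuming y and N as before (the variable names are illustrative; std(yprime) would give the same result, since it also divides by N−1):

Nboot = 1e5;
sigest = zeros(Nboot,1);
for k = 1:Nboot
    yprime = y(unidrnd(N,N,1));                     % resample with replacement
    ybar = mean(yprime);                            % y'_est for this resample
    sigest(k) = sqrt(sum((yprime-ybar).^2)/(N-1));  % sigma'_est for this resample
end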

Page 16

Here's the bootstrap result …

[Figure: bootstrap with 10⁵ realizations; p(σ_est) plotted against σ_est, with σ_true marked]

I numerically calculate an expected value of 92.8 and a variance of 6.2

Note that the distribution is not quite centered about the true value of 100

This is random variation. The original N=100 data are not quite representative of an infinite ensemble of normally-distributed values

Page 17

So we would be justified saying

σ ≈ 92.6 ± 12.4

that is, ±2 × 6.2, the 95% confidence interval
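The interval can also be read directly off the bootstrap samples, with no normality assumption about their distribution. A minimal sketch, assuming the sigest vector from the loop above and the Statistics Toolbox function prctile:

center = mean(sigest);              % bootstrap expected value
ci = prctile(sigest, [2.5 97.5]);   % empirical 95% confidence interval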

Page 18

The Maximum Likelihood Method

A way to fit parameterized probability distributions

to data

very handy when you have good reason to believe the data follow a particular

distribution

Page 19

Likelihood Function, L

The logarithm of the probable-ness of a given dataset

Page 20

N data y are all drawn from the same distribution p(y)

the probable-ness of a single measurement yi is p(yi)

So the probable-ness of the whole dataset is

p(y1) p(y2) … p(yN) = Π_i p(yi)

L = ln Π_i p(yi) = Σ_i ln p(yi)

Page 21

Now imagine that the distribution p(y) is known up to a vector m of unknown parameters

write p(y; m) with semicolon as a reminder

that it's not a joint probability

L is then a function of m:

L(m) = Σ_i ln p(yi; m)

Page 22

The Principle of Maximum Likelihood

Choose m so that it maximizes L(m)

∂L/∂m_i = 0

the dataset that was in fact observed is the most probable one that could have been observed

Page 23

Example – normal distribution of unknown mean ȳ and variance σ²

p(yi) = (2π)^(-1/2) σ^(-1) exp{ -½ σ^(-2) (yi − ȳ)² }

L = Σ_i ln p(yi) =

-½ N ln(2π) − N ln(σ) − ½ σ^(-2) Σ_i (yi − ȳ)²

∂L/∂ȳ = 0 = σ^(-2) Σ_i (yi − ȳ)

∂L/∂σ = 0 = −N σ^(-1) + σ^(-3) Σ_i (yi − ȳ)²

N's arise because the sum is from 1 to N

Page 24

Solving for ȳ and σ

0 = σ^(-2) Σ_i (yi − ȳ)   ⟹   ȳ = N^(-1) Σ_i yi

0 = −N σ^(-1) + σ^(-3) Σ_i (yi − ȳ)²   ⟹   σ² = N^(-1) Σ_i (yi − ȳ)²

Page 25

ȳ = N^(-1) Σ_i yi

σ² = N^(-1) Σ_i (yi − ȳ)²

Interpreting the results

Sample mean is the maximum likelihood estimate of the expected value of the normal distribution

Sample variance (more-or-less*) is the maximum likelihood estimate of the variance of the normal distribution

*issue of N vs. N−1 in the formula
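In MATLAB these two estimates are one line each; a minimal sketch (note that var(y) normalizes by N−1, so the N-normalized maximum likelihood variance needs var(y,1) or an explicit sum):

ybar_ml = mean(y);       % maximum likelihood estimate of the expected value
sigma2_ml = var(y,1);    % maximum likelihood estimate of the variance (divides by N)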

Page 26

Example – 100 data drawn from a normal distribution

true ȳ = 50, σ = 100

Page 27

[Figure: the likelihood surface L(ȳ, σ) plotted against ȳ and σ]

max at ȳ = 62, σ = 107
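A minimal MATLAB sketch of how such a likelihood surface can be tabulated on a grid and its peak located, assuming y holds the N data values from the previous slide. The grid ranges and spacing are illustrative assumptions, not values from the slide:

ybar_grid = 0:1:100;                    % trial values of the mean
sigma_grid = 50:1:200;                  % trial values of sigma
L = zeros(length(ybar_grid), length(sigma_grid));
for i = 1:length(ybar_grid)
    for j = 1:length(sigma_grid)
        s = sigma_grid(j);
        L(i,j) = -0.5*N*log(2*pi) - N*log(s) - 0.5*sum((y-ybar_grid(i)).^2)/s^2;
    end
end
[~, imax] = max(L(:));                  % linear index of the peak
[irow, jcol] = ind2sub(size(L), imax);
ybar_ml = ybar_grid(irow);              % location of the maximum
sigma_ml = sigma_grid(jcol);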

Page 28

Another Example – exponential distribution

p(yi) = ½ σ^(-1) exp{ −σ^(-1) |yi − ȳ| }

Check normalization … use z = yi − ȳ

∫ p(yi) dyi = ½ σ^(-1) ∫_{−∞}^{+∞} exp{ −σ^(-1) |yi − ȳ| } dyi

= ½ σ^(-1) · 2 ∫_{0}^{+∞} exp{ −σ^(-1) z } dz

= σ^(-1) (−σ) exp{ −σ^(-1) z } |_{0}^{+∞} = 1

Is the parameter ȳ really the expectation?

Is the parameter σ² really the variance?

Page 29

Is ȳ the expectation?

E(yi) = ∫_{−∞}^{+∞} yi ½ σ^(-1) exp{ −σ^(-1) |yi − ȳ| } dyi

use z = yi − ȳ

E(yi) = ½ σ^(-1) ∫_{−∞}^{+∞} (z + ȳ) exp{ −σ^(-1) |z| } dz

= ½ σ^(-1) · 2 ȳ ∫_{0}^{+∞} exp{ −σ^(-1) z } dz

= −ȳ exp{ −σ^(-1) z } |_{0}^{+∞}

= ȳ

(z exp(−σ^(-1)|z|) is an odd function times an even function, so its integral is zero)

YES!

Page 30

Is σ² the variance?

var(yi) = ∫_{−∞}^{+∞} (yi − ȳ)² ½ σ^(-1) exp{ −σ^(-1) |yi − ȳ| } dyi

use z = σ^(-1)(yi − ȳ)

var(yi) = ½ σ^(-1) ∫_{−∞}^{+∞} σ² z² exp{ −|z| } σ dz

= σ² ∫_{0}^{+∞} z² exp{ −z } dz

= 2 σ²

(the CRC Math Handbook gives this integral as equal to 2)

Not quite …

Page 31

Maximum likelihood estimate

L = N ln(½) − N ln(σ) − σ^(-1) Σ_i |yi − ȳ|

∂L/∂ȳ = 0 = σ^(-1) Σ_i sgn(yi − ȳ)

∂L/∂σ = 0 = −N σ^(-1) + σ^(-2) Σ_i |yi − ȳ|

ȳ such that Σ_i sgn(yi − ȳ) = 0

[Figure: |x| plotted against x, and its derivative d|x|/dx, which is +1 for x > 0 and −1 for x < 0]

Zero when half of the yi's are bigger than ȳ and half are smaller: ȳ is the median of the yi's

Page 32

Once ȳ is known then …

∂L/∂σ = 0 = −N σ^(-1) + σ^(-2) Σ_i |yi − ȳ|

σ = N^(-1) Σ_i |yi − ȳ|   with   ȳ = median(y)

Note that when N is even, ȳ is not unique,

but can be anything between the two middle values in a sorted list of yi’s
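A minimal MATLAB sketch of these two estimates for the exponential case (variable names are illustrative):

ybar_ml = median(y);                % maximum likelihood estimate of the expected value
sigma_ml = mean(abs(y - ybar_ml));  % N^(-1) times the sum of |yi - ybar|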

Page 33

Comparison

Normal distribution:

best estimate of expected value is sample mean

Exponential distribution

best estimate of expected value is sample median

Page 34

Comparison

Normal distribution:
short-tailed; outliers are extremely uncommon; the expected value should be chosen to make outliers have as small a deviation as possible

Exponential distribution:
relatively long-tailed; outliers are relatively common; the expected value should ignore the actual value of outliers

[Figure: two number lines of yi values, one containing an outlier, with the median and mean marked on each]

Page 35

another important distribution: the Gutenberg-Richter distribution

(e.g. earthquake magnitudes)

for earthquakes greater than some threshold magnitude m0, the probability that the earthquake will have a magnitude greater than m is

P(m) = 10^{−b (m − m0)}

or P(m) = exp{ −log(10) b (m − m0) }

= exp{ −b′ (m − m0) }   with   b′ = log(10) b

Page 36

This is a cumulative (exceedance) distribution, thus the probability that the magnitude is greater than m0 is unity

P(m0) = exp{ −b′ (m0 − m0) } = exp{0} = 1

The probability density function is minus its derivative (P(m) is the probability of exceeding m, so it decreases with m)

p(m) = b′ exp{ −b′ (m − m0) }

Page 37

Maximum likelihood estimate of b’ is

L(b′) = N ln(b′) − b′ Σ_i (mi − m0)

∂L/∂b′ = 0 = N/b′ − Σ_i (mi − m0)

b′ = N / Σ_i (mi − m0)
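A minimal MATLAB sketch, assuming a vector mag of magnitudes already restricted to those at or above the threshold m0 (the names mag, bprime, and b are illustrative):

bprime = length(mag) / sum(mag - m0);   % maximum likelihood estimate of b'
b = bprime / log(10);                   % corresponding Gutenberg-Richter b (log is the natural log)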

Page 38

Originally Gutenberg & Richter made a mistake …

[Figure: log10 P(m) plotted against magnitude m, with a least-squares fit line of slope −b]

… by estimating the slope b using least-squares, and not the maximum likelihood formula

Page 39

yet another important distribution: the Fisher distribution on a sphere

(e.g. paleomagnetic directions)

given unit vectors xi that scatter around some mean direction x̄, the probability distribution for the angle θ between xi and x̄ (that is, cos(θ) = xi · x̄) is

p(θ) = κ sin(θ) exp{ κ cos(θ) } / (2 sinh(κ))

κ is called the "precision parameter"

Page 40

Rationale for functional form

p(θ) ∝ exp{ κ cos(θ) }

For θ close to zero, cos(θ) ≈ 1 − ½ θ², so

p(θ) ∝ exp{ κ cos(θ) } ≈ exp{κ} exp{ −½ κ θ² }

which is a Gaussian

Page 41

I’ll let you figure out the

maximum likelihood estimate of

the central direction, x̄,

and the precision parameter, κ