# 17 sampling dist

Hadley Wickham

Stat310: Sampling distributions

Monday, 22 March 2010

Quiz

• Pick up quiz on your way in

• Start at 1pm

• Finish at 1:10pm

• Closed book

1. Quiz

2. CLT & approximations

3. Sampling distributions

4. Example

5. More theory

CLT

Central limit theorem.

The distribution of a sample mean is approximately normal when n gets big.

Approximation

This implies that if n is big then ...

Sampling distributions

Random experiment

“A random experiment is an experiment, trial, or observation that can be repeated numerous times under the same conditions... It must in no way be affected by any previous outcome and cannot be predicted with certainty.” (http://cnx.org/content/m13470/latest/)

i.e. it is uncertain (we don’t know ahead of time what the answer will be) and repeatable (ideally).

Where we are

Univariate random variables: an experiment with one output

Bivariate random variables: an experiment with two outputs

Sequences of random variables: an experiment performed repeatedly. Repeatable = i.i.d.

A sampling distribution: summary statistics from a repeated experiment

Definitions

Sample = results of n random experiments.

Random sample = result of a random experiment repeated n times. Therefore, the observations are iid.

Both are sequences of random variables.

Statistic = A function of random variables with no unknown parameters.
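
To make the definition of a statistic concrete, here is a minimal Python sketch. The function names and the three data values are illustrative assumptions, not taken from the slides. The first two functions depend only on the observed values, so they are statistics. The third needs a parameter `mu`, so it is not a statistic whenever `mu` is unknown.

```python
from statistics import mean


def sample_mean(xs):
    """A statistic: computed from the observed values alone."""
    return mean(xs)


def sample_range(xs):
    """Another statistic: largest observation minus smallest."""
    return max(xs) - min(xs)


def centred_mean(xs, mu):
    """Not a statistic if mu is an unknown parameter:
    it cannot be evaluated from the data alone."""
    return mean(xs) - mu


# Illustrative data (hypothetical values, chosen only for the example).
xs = [2, 4, 9]
print(sample_mean(xs), sample_range(xs))
```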

Example

Spin a bottle and record the angle, in degrees, at which it points. Repeat.

How would you write this mathematically?
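
One possible way to write it down, assuming every angle is equally likely and the spins do not influence each other (the slides do not state the model explicitly, so the uniform distribution here is an assumption):

```latex
X_1, X_2, \ldots, X_n \overset{\text{iid}}{\sim} \mathrm{Uniform}(0, 360),
\qquad E[X_i] = 180,
\qquad \mathrm{Var}(X_i) = \frac{360^2}{12} = 10800 .
```

Under this assumption the standard deviation of a single spin is about 103.9 degrees.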

First time

x1 = 205, x2 = 256, x3 = 86, x4 = 119, x5 = 16, x6 = 278, x7 = 55, x8 = 16, x9 = 295, x10 = 341, x11 = 299, x12 = 270, x13 = 118, x14 = 360, x15 = 97, x16 = 282, x17 = 42, x18 = 283, x19 = 259, x20 = 326

Second time

x1 = 184, x2 = 344, x3 = 118, x4 = 226, x5 = 208, x6 = 106, x7 = 332, x8 = 310, x9 = 339, x10 = 95, x11 = 7, x12 = 274, x13 = 120, x14 = 346, x15 = 211, x16 = 166, x17 = 84, x18 = 102, x19 = 32, x20 = 128
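
As a small illustration of a summary statistic changing from one repetition to the next, the sketch below computes the sample mean of each of the two recorded runs. The values are copied from the two lists above. The variable names are assumptions made for this example.

```python
from statistics import mean

# Values copied from the "First time" and "Second time" runs.
first = [205, 256, 86, 119, 16, 278, 55, 16, 295, 341,
         299, 270, 118, 360, 97, 282, 42, 283, 259, 326]
second = [184, 344, 118, 226, 208, 106, 332, 310, 339, 95,
          7, 274, 120, 346, 211, 166, 84, 102, 32, 128]

# The same statistic (the sample mean) applied to each run.
mean_first = mean(first)
mean_second = mean(second)
print(mean_first, mean_second)
```

The two means differ even though the experiment is the same. That run-to-run variation is what a sampling distribution describes.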

[Figure: dot plot of the recorded values for 20 repetitions of the experiment, one row per repetition. x-axis: Value (50 to 350); y-axis: Experiment (5 to 20).]

[Figure: dot plot of values by experiment, with one extra point marked per experiment (presumably its mean). x-axis: Value (50 to 350); y-axis: Experiment (5 to 20).]

[Figure: histogram. x-axis: samp (140 to 200); y-axis: count (0 to 4).]

[Figure: histogram. x-axis: V1 (100 to 250); y-axis: count (0 to 8000).]

What will happen as I vary the number of samples I average over? (What theorem applies here?)
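
One way to explore this question is by simulation. The sketch below assumes each spin is Uniform(0, 360), which the slides do not state explicitly. The function name `simulate_means`, the number of repetitions and the seed are arbitrary choices for illustration. It prints the centre and spread of the simulated means for several values of n.

```python
import random
from statistics import mean, stdev


def simulate_means(n, reps=2000, seed=0):
    """Return `reps` sample means, each averaging n simulated spins.

    Assumption: a spin is Uniform(0, 360).
    """
    rng = random.Random(seed)
    return [sum(rng.uniform(0, 360) for _ in range(n)) / n
            for _ in range(reps)]


for n in (1, 10, 100):
    means = simulate_means(n)
    # Centre stays near 180; spread shrinks roughly like 1 / sqrt(n).
    print(n, round(mean(means), 1), round(stdev(means), 1))
```

Under these assumptions the means stay centred near 180 while their spread narrows as n grows, which is the behaviour the law of large numbers describes.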

[Figure: histograms of the mean in panels labelled 1, 2, 3, 4 and 5. x-axis: mean (0 to 350); y-axis: count (0 to 400).]

[Figure: histograms of the mean in panels labelled 1, 10, 100 and 1000. x-axis: mean (0 to 350); y-axis: count (0 to 4000).]

How can I transform this random variable to make it comparable? (What theorem applies here?)

[Figure: histograms of (mean − 180) * sqrt(n) in panels labelled 1, 2, 3, 4, 5, 10, 20, 100, 1000 and 10000. x-axis: (mean − 180) * sqrt(n) (−400 to 400); y-axis: count (0 to 800).]
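
The rescaling (mean − 180) * sqrt(n) can also be tried in simulation. This is a minimal sketch assuming Uniform(0, 360) spins. The names `MU`, `SIGMA` and `scaled_means` are introduced here for illustration only. If the central limit theorem applies, the spread of the rescaled means should stay close to the standard deviation of a single spin for every n.

```python
import math
import random
from statistics import stdev

MU = 180.0                     # mean of Uniform(0, 360), an assumed model
SIGMA = 360.0 / math.sqrt(12)  # sd of Uniform(0, 360), about 103.9


def scaled_means(n, reps=2000, seed=1):
    """Return `reps` values of (sample mean - MU) * sqrt(n)."""
    rng = random.Random(seed)
    out = []
    for _ in range(reps):
        xbar = sum(rng.uniform(0, 360) for _ in range(n)) / n
        out.append((xbar - MU) * math.sqrt(n))
    return out


for n in (1, 5, 20, 100):
    # The spread should be roughly SIGMA regardless of n.
    print(n, round(stdev(scaled_means(n)), 1))
```

Multiplying by sqrt(n) undoes the 1/sqrt(n) shrinkage of the mean, so the histograms for different n end up on a common scale and can be compared directly.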

[Figure: histograms of sqrt(var) in panels labelled 2, 3, 4 and 5. x-axis: sqrt(var) (0 to 250); y-axis: count (0 to 1000).]

We can do the same thing for other statistics...
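
As a sketch of what "the same thing" might look like for another statistic, the code below simulates the sampling distribution of the sample standard deviation. It again assumes Uniform(0, 360) spins. `simulate_sds` and its defaults are hypothetical names and settings chosen for this example.

```python
import random
from statistics import mean, stdev


def simulate_sds(n, reps=1000, seed=2):
    """Return `reps` sample standard deviations, each from n simulated spins.

    Assumption: a spin is Uniform(0, 360). Requires n >= 2.
    """
    rng = random.Random(seed)
    return [stdev([rng.uniform(0, 360) for _ in range(n)])
            for _ in range(reps)]


for n in (2, 10, 100):
    sds = simulate_sds(n)
    # Centre and spread of the simulated sampling distribution.
    print(n, round(mean(sds), 1), round(stdev(sds), 1))
```

Under the uniform assumption the simulated values concentrate around 360 / sqrt(12), roughly 103.9, as n grows.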

[Figure: histograms of sqrt(var) in panels labelled 2, 3, 4, 5, 10, 20, 100, 1000 and 10000, each panel with its own axis scales. x-axis: sqrt(var); y-axis: count.]

Theory

We’ll start with the mean of normally distributed random variables, then try to extend in various ways.

Your turn

X1, X2, ... are iid N(μ, σ²)

Find their mgfs. What do you notice?

Hint:

$S_n = \sum_{i=1}^{n} X_i, \qquad \bar{X}_n = \frac{S_n}{n}$

$M_X(t) = \exp\left(\mu t + \frac{\sigma^2 t^2}{2}\right)$
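
One way to work the exercise is sketched below. It uses the standard normal mgf, $M_X(t) = \exp(\mu t + \sigma^2 t^2 / 2)$, and the independence of the $X_i$. It is offered as a possible solution path, not necessarily the one intended in the lecture.

```latex
\begin{align*}
M_{S_n}(t) &= E\left[e^{t(X_1 + \cdots + X_n)}\right]
            = \prod_{i=1}^{n} M_{X_i}(t)
            = \exp\left(n\mu t + \frac{n\sigma^2 t^2}{2}\right), \\
M_{\bar{X}_n}(t) &= M_{S_n}\!\left(\frac{t}{n}\right)
            = \exp\left(\mu t + \frac{\sigma^2 t^2}{2n}\right).
\end{align*}
```

Both results have the form of a normal mgf, which suggests $S_n \sim N(n\mu, n\sigma^2)$ and $\bar{X}_n \sim N(\mu, \sigma^2 / n)$. For normal data, then, the mean is exactly normal for every n, not just approximately normal for large n.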

Reading

4.2, 4.2.1

4.2.2, 4.4
