sampling distribution. 2 introduction in real life calculating parameters of populations is usually...

Post on 24-Dec-2015

218 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

SAMPLING DISTRIBUTION

2

Introduction

• In real life calculating parameters of populations is usually impossible because populations are very large.

• Rather than investigating the whole population, we take a sample, calculate a statistic related to the parameter of interest, and make an inference.

3

STATISTIC

• Let X1, X2,…,Xn be a r.s. of size n from a population and let T(x1,x2,…,xn) be a function which does not depend on any unknown parameters. Then, the r.v. or a random vector Y=T(X1, X2,…,Xn) is called a statistic.

4

STATISTIC• The sample mean is the arithmetic average of

the values in a r.s.1 2

1

1 nn

ii

X X XX X

n n

• The sample variance is the statistic defined by

22

1

11

n

ii

S X Xn

• The sample standard deviation is the statistic defined by S.

SAMPLING DISTRIBUTION

• A statistic is also a random variable. Its distribution depends on the distribution of the random sample and the form of the function Y=T(X1, X2,…,Xn). The probability distribution of a statistic Y is called the sampling distribution of Y.

6

Sampling Distribution of the Mean• An example– A die is thrown infinitely many times. Let X

represent the number of spots showing on any throw.

– The probability distribution of X is

x 1 2 3 4 5 6p(x) 1/6 1/6 1/6 1/6 1/6 1/6

E(X) = 1(1/6) +2(1/6) + 3(1/6)+………………….= 3.5

V(X) = (1-3.5)2(1/6) + (2-3.5)2(1/6) + …………. …= 2.92

7

• Suppose we want to estimate the mean of a population from the mean of a sample, , of size n = 2.

• What is the distribution of ?X

Throwing a die twice – sample mean

X

8

Sample Mean Sample Mean Sample Mean1 1,1 1 13 3,1 2 25 5,1 32 1,2 1.5 14 3,2 2.5 26 5,2 3.53 1,3 2 15 3,3 3 27 5,3 44 1,4 2.5 16 3,4 3.5 28 5,4 4.55 1,5 3 17 3,5 4 29 5,5 56 1,6 3.5 18 3,6 4.5 30 5,6 5.57 2,1 1.5 19 4,1 2.5 31 6,1 3.58 2,2 2 20 4,2 3 32 6,2 49 2,3 2.5 21 4,3 3.5 33 6,3 4.510 2,4 3 22 4,4 4 34 6,4 511 2,5 3.5 23 4,5 4.5 35 6,5 5.512 2,6 4 24 4,6 5 36 6,6 6

Sample Mean Sample Mean Sample Mean1 1,1 1 13 3,1 2 25 5,1 32 1,2 1.5 14 3,2 2.5 26 5,2 3.53 1,3 2 15 3,3 3 27 5,3 44 1,4 2.5 16 3,4 3.5 28 5,4 4.55 1,5 3 17 3,5 4 29 5,5 56 1,6 3.5 18 3,6 4.5 30 5,6 5.57 2,1 1.5 19 4,1 2.5 31 6,1 3.58 2,2 2 20 4,2 3 32 6,2 49 2,3 2.5 21 4,3 3.5 33 6,3 4.510 2,4 3 22 4,4 4 34 6,4 511 2,5 3.5 23 4,5 4.5 35 6,5 5.512 2,6 4 24 4,6 5 36 6,6 6

Throwing a die twice – sample mean

9

XThe distribution of when n = 2 Sample Mean Sample Mean Sample Mean

1 1,1 1 13 3,1 2 25 5,1 32 1,2 1.5 14 3,2 2.5 26 5,2 3.53 1,3 2 15 3,3 3 27 5,3 44 1,4 2.5 16 3,4 3.5 28 5,4 4.55 1,5 3 17 3,5 4 29 5,5 56 1,6 3.5 18 3,6 4.5 30 5,6 5.57 2,1 1.5 19 4,1 2.5 31 6,1 3.58 2,2 2 20 4,2 3 32 6,2 49 2,3 2.5 21 4,3 3.5 33 6,3 4.510 2,4 3 22 4,4 4 34 6,4 511 2,5 3.5 23 4,5 4.5 35 6,5 5.512 2,6 4 24 4,6 5 36 6,6 6

Sample Mean Sample Mean Sample Mean1 1,1 1 13 3,1 2 25 5,1 32 1,2 1.5 14 3,2 2.5 26 5,2 3.53 1,3 2 15 3,3 3 27 5,3 44 1,4 2.5 16 3,4 3.5 28 5,4 4.55 1,5 3 17 3,5 4 29 5,5 56 1,6 3.5 18 3,6 4.5 30 5,6 5.57 2,1 1.5 19 4,1 2.5 31 6,1 3.58 2,2 2 20 4,2 3 32 6,2 49 2,3 2.5 21 4,3 3.5 33 6,3 4.510 2,4 3 22 4,4 4 34 6,4 511 2,5 3.5 23 4,5 4.5 35 6,5 5.512 2,6 4 24 4,6 5 36 6,6 6

1 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0

6/365/36

4/36

3/36

2/36

1/36

x

E( ) =1.0(1/36)+1.5(2/36)+….=3.5

V( ) = (1.0-3.5)2(1/36)+(1.5-3.5)2(2/36)... = 1.46

x

22:

2x

x x xNote and

22:

2x

x x xNote and

x

10

6

)5

(5833.

5.35n

2x2

x

x

)10

(2917.

5.310n

2x2

x

x

)25

(1167.

5.325n

2x2

x

x

Sampling Distribution of the Mean

11

Sampling Distribution of the Mean

)5

(5833.

5.35n

2x2

x

x

)10

(2917.

5.310n

2x2

x

x

)25

(1167.

5.325n

2x2

x

x

Notice that is smaller than . The larger the sample size the smaller . Therefore, tends to fall closer to , as the sample size increases.

2x

x2x

Notice that is smaller than x. The larger the sample size the smaller . Therefore, tends to fall closer to , as the sample size increases.

2x

x2x

2

12

SAMPLING FROM THE NORMAL DISTRIBUTION

Properties of the Sample Mean and Sample Variance

• Let X1, X2,…,Xn be a r.s. of size n from a N(,2) distribution. Then, 2) and are independent rvs.a X S

2) ~ , /b X N n

22

12

1) ~ n

n Sc

13

SAMPLING FROM THE NORMAL DISTRIBUTION

• Let X1, X2,…,Xn be a r.s. of size n from a N(,2) distribution. Then,

~ 0,1/

XN

n

•Most of the time is unknown, so we use:

./

X

S n

14

SAMPLING FROM THE NORMAL DISTRIBUTION

In statistical inference, Student’s t distribution is very important.

15

SAMPLING FROM THE NORMAL DISTRIBUTION

• Let X1, X2,…,Xn be a r.s. of size n from a N(X,X

2) distribution and let Y1,Y2,…,Ym be a r.s. of size m from an independent N(Y,Y

2).

• If we are interested in comparing the variability of the populations, one quantity of interest would be the ratio

2 2 2 2/ /X Y X YS S

16

SAMPLING FROM THE NORMAL DISTRIBUTION

• The F distribution allows us to compare these quantities by giving the distribution of

2 2 2 2

1, 12 2 2 2

/ /~

/ /X Y X X

n m

X Y Y Y

S S SF

S

• If X~Fp,q, then 1/X~Fq,p.

• If X~tq, then X2~F1,q.

17

CENTRAL LIMIT THEOREMIf a random sample is drawn from any population, the sampling

distribution of the sample mean is approximately normal for a sufficiently large sample size. The larger the sample size, the more closely the sampling distribution of will resemble a normal distribution.

Random Sample

(X1, X2, X3, …,Xn)

Sample Mean Distribution

XX

Random Variable (Population) Distribution

as n

X

18

Sampling Distribution of the Sample Mean

If X is normal, is normal.

If X is non-normal, is approximately normally distributed for sample size greater than or equal to 30.

X

2

2 or X Xn n

X

X

2 XX ~ N( , / n ) Z ~ N(0,1)

/ n

19

• The amount of soda pop in each bottle is normally distributed with a mean of 32.2 ounces and a standard deviation of 0.3 ounces.– Find the probability that a bottle bought by a customer

will contain more than 32 ounces.– Solution• The random variable X is the

amount of soda in a bottle.

= 32.2

0.7486

x = 327486.0)67.z(P

)3.

2.3232x(P)32x(P

x

EXAMPLE 1

20 = 32.2

0.7486

x = 32

• Find the probability that a carton of four bottles will have a mean of more than 32 ounces of soda per bottle.

• Solution– Define the random variable as the mean amount of soda per

bottle.

9082.0)33.1z(P

)43.

2.3232x(P)32x(P

x

32x

0.9082

2.32x

EXAMPLE 1 (contd.)

21

The estimate of p = The estimate of p =

• The parameter of interest for nominal data is the proportion of times a particular outcome (success) occurs.

• To estimate the population proportion ‘p’ we use the sample proportion.

Sampling Distribution of a Proportion

pp̂̂ == XXnn

The number of successes

22

• Since X is binomial, probabilities about can be calculated from the binomial distribution.

• Yet, for inference about we prefer to use normal approximation to the binomial whenever this approximation is appropriate.

pp̂̂

Sampling Distribution of a Proportion

pp̂̂

23

Approximate Sampling Distribution of a Sample Proportion

• From the laws of expected value and variance, it can be shown that E( ) = p and V( )=p(1-p)/n

• If both np ≥ 5 and n(1-p) ≥ 5, then

• Z is approximately standard normally distributed.

ˆ

(1 )

p pz

p p

n

ˆ

(1 )

p pz

p p

n

p̂ p̂

24

EXAMPLE– A state representative received 52% of the

votes in the last election.– One year later the representative wanted to

study his popularity.– If his popularity has not changed, what is the

probability that more than half of a sample of 300 voters would vote for him?

25

EXAMPLE (contd.)

Solution• The number of respondents who prefer the representative is

binomial with n = 300 and p = .52. Thus, np = 300(.52) = 156 andn(1-p) = 300(1-.52) = 144 (both greater than 5)

7549.300)52.1)(52(.

52.50.

)1(

ˆ)50.ˆ(

npp

ppPpP

top related