Chapter 12: Inference About a Population

Post on 21-Dec-2015


Page 1

Chapter 12

Inference About a Population

Page 2

Introduction

• In this chapter we use the approach developed before to describe a population:
– Identify the parameter to be estimated or tested.
– Specify the parameter’s estimator and its sampling distribution.
– Construct a confidence interval estimator or perform a hypothesis test.

Page 3

Introduction

• We shall develop techniques to estimate and test three population parameters:
– The expected value $\mu$
– The variance $\sigma^2$
– The population proportion $p$ (for qualitative data)

Page 4

12.1 Inference About a Population Mean When the Population Standard Deviation is Unknown

• Recall: by the central limit theorem, when $\sigma^2$ is known, $\bar{x}$ is normally distributed if:
– the sample is drawn from a normal population, or
– the population is not normal but the sample is sufficiently large.
• When $\sigma^2$ is unknown, another random variable describes the distribution of $\bar{x}$.

Page 5

The t-Statistic

When $\sigma$ is unknown, we use $s$ instead, and the Z statistic is then replaced by the t-statistic:

$$Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \quad\longrightarrow\quad t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$

When the sampled population is normally distributed, the statistic t is Student t distributed (see the next slide).

Page 6

The t-Statistic

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$

The Student t distribution is mound-shaped and symmetrical around zero. The degrees of freedom determine the distribution shape.

[Figure: two Student t density curves centered at 0, one with degrees of freedom = n1 and one with degrees of freedom = n2, where n1 < n2]

Using the t-table

Page 7

Testing $\mu$ when $\sigma$ is unknown

• Example 12.1 - Productivity of newly hired trainees

Page 8

Testing $\mu$ when $\sigma$ is unknown

• Example 12.1
– In order to determine the number of workers required to meet demand, the productivity of newly hired trainees is studied.
– It is believed that trainees can process and distribute more than 450 packages per hour within one week of hiring.
– Can we conclude that this belief is correct, based on productivity observations of 50 trainees? (The raw data are presented later, in the file Xm12-01.)

Page 9

Testing $\mu$ when $\sigma$ is unknown

• Example 12.1 – Solution
– The problem objective is to describe the population of the number of packages processed in one hour.
– The data is quantitative.
– The hypotheses are:
H0: $\mu$ = 450
H1: $\mu$ > 450 (We want to prove that the trainees reach 90% productivity of experienced workers.)
– The t statistic:

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}, \qquad \text{d.f.} = n - 1 = 49$$

Page 10

Testing $\mu$ when $\sigma$ is unknown

• Solution - continued

Observe: H1 has the form $\mu > \mu_0$, thus the rejection region is $\bar{x} > \bar{x}_L$.

After transforming $\bar{x}$ into a t-statistic we express the rejection region in terms of the statistic t: $t > t_{\alpha, n-1}$.

Page 11

Testing $\mu$ when $\sigma$ is unknown

• Solution continued (solving by hand)

The rejection region is $t > t_{\alpha, n-1}$.

The critical value (table entry): $t_{\alpha, n-1} = t_{.05, 49} \approx t_{.05, 50} = 1.676$

You can use the Excel function =tinv to obtain the critical value. This function takes a two-tail probability. That is, for a two-tail test with significance level $\alpha$, it returns the critical value $t_{\alpha/2, n-1}$. Since our test is one-tail, we use $2\alpha$ instead of $\alpha$: 2(.05) = .1. Thus, type in =tinv(.1,49) to obtain the result 1.676551.

Page 12

Testing $\mu$ when $\sigma$ is unknown

The test statistic is calculated based on the data provided in Xm12-01:

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{460.38 - 450}{38.83/\sqrt{50}} = 1.89$$

• Since 1.89 > 1.676 we reject the null hypothesis in favor of the alternative.
• Conclusion: There is sufficient evidence to infer that the mean productivity of trainees one week after being hired is greater than 450 packages at .05 significance level.

[Figure: t distribution with the rejection region to the right of 1.676; the test statistic 1.89 falls inside it]
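The hand calculation above can be retraced with a short sketch. It assumes only the summary statistics quoted on the slides (mean 460.38, standard deviation 38.83, n = 50) rather than the raw data in Xm12-01, and it takes the critical value 1.676 from the table instead of computing it. The function name `t_statistic` is illustrative.

```python
import math


def t_statistic(sample_mean, hypothesized_mean, sample_sd, n):
    """Return t = (x_bar - mu) / (s / sqrt(n))."""
    return (sample_mean - hypothesized_mean) / (sample_sd / math.sqrt(n))


# Summary statistics quoted on the slides for Example 12.1.
t_stat = t_statistic(460.38, 450, 38.83, 50)

# Table entry quoted on the slides (t_.05,50); not computed here.
t_critical = 1.676

# One-tail test: reject H0 when the statistic exceeds the critical value.
reject_h0 = t_stat > t_critical
print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
```

Keeping the critical value as an explicit input mirrors the table-lookup step of the slides.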

Page 13

Testing $\mu$ when $\sigma$ is unknown

Using Data Analysis Plus and the p-value approach to test the mean (Xm12-01.xls).

t-Test: Mean

Packages
Mean 460.38
Standard Deviation 38.8271
Hypothesized Mean 450
df 49
t Stat 1.8904
P(T<=t) one-tail 0.0323
t Critical one-tail 1.6766

[Figure: t distribution; the area to the right of t = 1.89 is the p-value .0323, which is smaller than the significance level .05]

Since .0323 < .05, we reject the null hypothesis in favor of the alternative. There is sufficient evidence to infer that the mean productivity of trainees one week after being hired is greater than 450 packages at .05 significance level.
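For readers without Data Analysis Plus, the one-tail p-value can be approximated with the standard library alone. The sketch below is not the add-in's method: it integrates the Student t density from 0 to the observed statistic with Simpson's rule and subtracts the result from 0.5, which is valid because the distribution is symmetric around zero. The function names and the step count are assumptions made for illustration.

```python
import math


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def upper_tail_p_value(t_stat, df, steps=2000):
    """Approximate P(T > t_stat) for t_stat >= 0.

    Simpson's rule on [0, t_stat]; steps must be even.
    """
    h = t_stat / steps
    total = t_pdf(0.0, df) + t_pdf(t_stat, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    # Half of the probability lies above zero, by symmetry.
    return 0.5 - area_zero_to_t


p_value = upper_tail_p_value(1.8904, 49)
print(f"one-tail p-value = {p_value:.4f}")
```

The result should land close to the 0.0323 reported in the output above.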

Page 14

Estimating $\mu$ when $\sigma$ is unknown

• Confidence interval estimator of $\mu$ when $\sigma^2$ is unknown:

$$\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}, \qquad \text{d.f.} = n - 1$$

Page 15

Estimating $\mu$ when $\sigma$ is unknown

• Example 12.2
– An investor is trying to estimate the return on investment in companies that won quality awards last year.
– A random sample of 83 such companies is selected, and the return he would have earned had he invested in them is calculated.
– Construct a 95% confidence interval for the mean return.

Page 16

Estimating $\mu$ when $\sigma$ is unknown

• Solution (solving by hand)
– The problem objective is to describe the population of annual returns from buying shares of quality award-winners.
– The data is quantitative.
– Solving by hand
• From the data we determine

$$\bar{x} = 15.02, \qquad s^2 = 68.98, \qquad s = \sqrt{68.98} = 8.31$$

$$\bar{x} \pm t_{\alpha/2, n-1} \frac{s}{\sqrt{n}} = 15.02 \pm 1.990 \frac{8.31}{\sqrt{83}} = 13.205,\ 16.835$$

where $t_{.025, 82} \approx t_{.025, 80} = 1.990$.
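As a quick check of the interval for Example 12.2, here is a minimal sketch that plugs the rounded summary statistics from the slide into the estimator. The critical value 1.990 is the table entry for 80 degrees of freedom used on the slide, and the function name is illustrative.

```python
import math


def mean_confidence_interval(sample_mean, sample_sd, n, t_critical):
    """Return (LCL, UCL) for x_bar +/- t * s / sqrt(n)."""
    margin = t_critical * sample_sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Rounded values from the slide: x_bar = 15.02, s = 8.31, n = 83.
lcl, ucl = mean_confidence_interval(15.02, 8.31, 83, 1.990)
print(f"LCL = {lcl:.3f}, UCL = {ucl:.3f}")
```

Software that works from the unrounded data and the exact critical value for 82 degrees of freedom gives slightly different limits.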

Page 17

Estimating $\mu$ when $\sigma$ is unknown

Using Data Analysis Plus

t-Estimate: Mean

Returns
Mean 15.0172
Standard Deviation 8.3054
LCL 13.2037
UCL 16.8307

Page 18

Checking the required conditions

• We need to check that the population is normally distributed, or at least not extremely non-normal.

• There are statistical methods that can be used to test for normality (to be introduced later in the book, but not discussed here).

• From the sample histograms we see…

Page 19

[Figure: A Histogram for XM-11-01, Packages (frequency 0 to 14; bins 400, 425, 450, 475, 500, 525, 550, 575, More)]

[Figure: A Histogram for XM-11-02, Returns (frequency 0 to 30; bins -4, 2, 8, 14, 22, 30, More)]

Page 20

12.2 Inference About a Population Variance

• Sometimes we are interested in making inferences about the variability of processes.

• Examples:
– The consistency of a production process for quality control purposes.
– To evaluate the risk associated with different investments.

• To draw inference about variability, the parameter of interest is $\sigma^2$.

Page 21

Inference About a Population Variance

• The population variance can be estimated or its value tested using the sample variance $s^2$.

• The sample variance $s^2$ is an unbiased, consistent and efficient point estimator for $\sigma^2$.

• The inference about $\sigma^2$ is made by using a sample statistic that incorporates $s^2$ and $\sigma^2$.

Page 22

Inference About a Population Variance

• This statistic is $\dfrac{(n-1)s^2}{\sigma^2}$.

• It has a distribution called Chi-squared, if the population is normally distributed.

$$\chi^2 = \frac{(n-1)s^2}{\sigma^2}, \qquad \text{d.f.} = n - 1$$

Page 23

Inference About a Population Variance

$$\chi^2 = \frac{(n-1)s^2}{\sigma^2}, \qquad \text{d.f.} = n - 1$$

[Figure: The Chi-squared distribution, density curves for DF = 5 and DF = 10 plotted over 0 to 25]

The degrees of freedom (df) determine the distribution shape.

Page 24

Testing the population variance – Left hand tail test

• Example 1 (operation management application)
– A container-filling machine is believed to fill 1 liter containers so consistently that the variance of the filling will be less than 1 cc (.001 liter).
– To test this belief a random sample of 25 1-liter fills was taken, and the results recorded (Xm12-03.xls).
– Do these data support the belief that the variance is less than 1 cc at 5% significance level?

Page 25

Testing the population variance

• Solution
– The problem objective is to describe the population of 1-liter fills from a filling machine.
– The data are quantitative, and we are interested in the variability of the fills.
– The two hypotheses are:
H0: $\sigma^2$ = 1
H1: $\sigma^2$ < 1 (We want to prove that the process is consistent.)

The rejection region has the form: $s^2$ < Critical Value

Page 26

Testing the population variance

• Solution
– The two hypotheses are:
H0: $\sigma^2$ = 1
H1: $\sigma^2$ < 1

The rejection region in terms of $\chi^2$ is:

$$\chi^2 < \chi^2_{1-\alpha, n-1}$$

Page 27

Testing the population variance

• Solving by hand
– Note that $(n-1)s^2 = \sum (x_i - \bar{x})^2 = \sum x_i^2 - (\sum x_i)^2/n$
– From the sample (data is presented in units of cc-1000 to avoid rounding) we can calculate $\sum x_i = 24{,}996.4$ and $\sum x_i^2 = 24{,}992{,}821.3$
– Then $(n-1)s^2 = 24{,}992{,}821.3 - (24{,}996.4)^2/25 = 20.78$

Page 28

Testing the population variance

Using the $\chi^2$ table:

$$\chi^2 = \frac{(n-1)s^2}{\sigma^2} = \frac{20.78}{1} = 20.78, \qquad \chi^2_{1-\alpha, n-1} = \chi^2_{.95, 25-1} = 13.8484$$

Since 20.78 > 13.8484, do not reject the null hypothesis.

[Figure: Chi-squared distribution with the rejection region to the left of 13.84; the test statistic 20.78 falls outside it]

There is insufficient evidence to reject the hypothesis that the variance is equal to 1 cc.
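The left-tail variance test can be retraced numerically. This is a minimal sketch that assumes only the two sums quoted on the slides and the table value 13.8484; the function name is illustrative and the raw data in Xm12-03.xls are not used.

```python
def sum_of_squared_deviations(sum_x, sum_x_sq, n):
    """Return (n - 1) * s^2 via the shortcut sum(x^2) - (sum(x))^2 / n."""
    return sum_x_sq - sum_x ** 2 / n


# Sums quoted on the slides for the 25 fills.
ss = sum_of_squared_deviations(24996.4, 24992821.3, 25)

sigma_sq_0 = 1.0            # variance under H0
chi_sq = ss / sigma_sq_0    # test statistic (n - 1) s^2 / sigma^2

# Table entry chi^2_{.95, 24} quoted on the slides; not computed here.
chi_sq_critical = 13.8484

# Left-tail test: reject H0 only if the statistic falls below the critical value.
reject_h0 = chi_sq < chi_sq_critical
print(f"chi-squared = {chi_sq:.2f}, reject H0: {reject_h0}")
```

The shortcut formula subtracts two large, nearly equal numbers, so with less benign data it would be safer to sum the squared deviations from the mean directly.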

Page 29

Testing the population variance – Right hand tail test; Two tail test

• A right hand tail test:
H0: $\sigma^2$ = value
H1: $\sigma^2$ > value

• Rejection region:

$$\chi^2 > \chi^2_{\alpha, n-1}$$

Page 30

Testing the population variance – Right hand tail test; Two tail test

• A right hand tail test:
– H0: $\sigma^2$ = value
H1: $\sigma^2$ > value
– Rejection region:

$$\chi^2 > \chi^2_{\alpha, n-1}$$

• A two tail test:
– H0: $\sigma^2$ = value
H1: $\sigma^2 \neq$ value
– Rejection region:

$$\chi^2 < \chi^2_{1-\alpha/2, n-1} \quad \text{or} \quad \chi^2 > \chi^2_{\alpha/2, n-1}$$

Page 31

Estimating the population variance

From the following probability statement

$$P(\chi^2_{1-\alpha/2} < \chi^2 < \chi^2_{\alpha/2}) = 1 - \alpha$$

we have (by substituting $\chi^2 = (n-1)s^2/\sigma^2$)

$$\frac{(n-1)s^2}{\chi^2_{\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$$

This is the confidence interval for $\sigma^2$ with a $100(1-\alpha)\%$ confidence level.

Page 32

Estimating the population variance

• Example 2
– Estimate the variance of fills in Example 12.3 with 99% confidence.
• Solution
– We have $(n-1)s^2 = 20.78$. From the Chi-squared table we have

$\chi^2_{\alpha/2, n-1} = \chi^2_{.005, 24} = 45.5585$

$\chi^2_{1-\alpha/2, n-1} = \chi^2_{.995, 24} = 9.88623$

Page 33

Estimating the population variance

• The confidence interval is

$$\frac{(n-1)s^2}{\chi^2_{\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$$

$$\frac{20.78}{45.5585} < \sigma^2 < \frac{20.78}{9.88623}$$

$$.456 < \sigma^2 < 2.10$$
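The arithmetic for this interval is short enough to check in a few lines. The sketch assumes the value $(n-1)s^2 = 20.78$ and the two table entries quoted on the slides; the function and parameter names are illustrative.

```python
def variance_confidence_interval(ss, chi_sq_upper_tail, chi_sq_lower_tail):
    """Return (LCL, UCL) for sigma^2.

    ss is (n - 1) * s^2; chi_sq_upper_tail is chi^2_{alpha/2, n-1} and
    chi_sq_lower_tail is chi^2_{1-alpha/2, n-1}, both taken from a table.
    """
    return ss / chi_sq_upper_tail, ss / chi_sq_lower_tail


# 99% confidence, 24 degrees of freedom (table values from the slides).
lcl, ucl = variance_confidence_interval(20.78, 45.5585, 9.88623)
print(f"{lcl:.3f} < sigma^2 < {ucl:.2f}")
```

Note that the larger critical value produces the lower limit, because it appears in the denominator.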

Page 34

12.4 Inference About a Population Proportion

• When the population consists of nominal or categorical data, the only inference we can make is about the proportion of occurrence of a certain value.

• The parameter “p” was used before to calculate these proportions under the binomial distribution.

Page 35

12.4 Inference About a Population Proportion

• Statistic and sampling distribution
– The statistic used when making inference about ‘p’ is:

$$\hat{p} = \frac{x}{n}, \quad \text{where } x = \text{the number of successes and } n = \text{the sample size.}$$

– Under certain conditions, [np > 5 and n(1-p) > 5], $\hat{p}$ is approximately normally distributed, with $\mu = p$ and $\sigma^2 = p(1-p)/n$.

Page 36

Testing and estimating the Proportion

• Test statistic for p

$$Z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}}, \quad \text{where } np > 5 \text{ and } n(1-p) > 5$$

• Interval estimator for p ($1-\alpha$ confidence level)

$$\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}, \quad \text{provided } n\hat{p} > 5 \text{ and } n(1-\hat{p}) > 5$$

Page 37

Testing the Proportion

• Example 12.5 (Predicting the winner on election day)
– Voters are asked by a certain network to participate in an exit poll in order to predict the winner on election day.
– Based on the data presented in Xm12.5.xls (where 1=Democrat, and 2=Republican), can the network conclude that the Republican candidate will win the state college vote?

Page 38

Testing the Proportion

• Solution
– The problem objective is to describe the population of votes in the state.
– The parameter to be tested is ‘p’.
– Success is defined as “Republican vote”.
– The hypotheses are:
H0: p = .5
H1: p > .5 (More than 50% vote Republican)

Page 39

Testing the Proportion

– Solving by hand
• The rejection region is $z > z_{\alpha} = z_{.05} = 1.645$.
• From file Xm12.5.xls we count 407 successes. The number of voters participating is 765.
• The sample proportion is $\hat{p} = 407/765 = .532$
• The value of the test statistic is

$$Z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} = \frac{.532 - .5}{\sqrt{.5(1-.5)/765}} = 1.77$$

• The p-value is P(Z > 1.77) = .0382
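The same test can be sketched with Python's standard library, where `statistics.NormalDist` supplies the standard normal distribution. The counts 407 and 765 come from the slide; the function name is illustrative, and the unrounded sample proportion is used, so later decimals may differ slightly from the hand calculation.

```python
import math
from statistics import NormalDist


def proportion_z_test(successes, n, p0):
    """Return (p_hat, z, one-tail p-value) for H1: p > p0."""
    p_hat = successes / n
    # The standard error uses the hypothesized proportion p0, as in the test statistic.
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    p_value = 1 - NormalDist().cdf(z)
    return p_hat, z, p_value


p_hat, z, p_value = proportion_z_test(407, 765, 0.5)
reject_h0 = z > 1.645  # z_.05, the critical value quoted on the slide
print(f"p_hat = {p_hat:.3f}, z = {z:.2f}, p-value = {p_value:.4f}")
```

Comparing the p-value with .05 leads to the same decision as comparing z with 1.645.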

Page 40

Testing the Proportion

Using Data Analysis Plus we have:

z-Test: Proportion

Sample Proportion 0.5321
Observations 765
Hypothesized Proportion 0.5
z Stat 1.7739
P(Z<=z) one-tail 0.0382 (< 0.05)
z Critical one-tail 1.6449
P(Z<=z) two-tail 0.0764
z Critical two-tail 1.96

There is sufficient evidence to reject the null hypothesis in favor of the alternative hypothesis. At 5% significance level we can conclude that more than 50% voted Republican.

Page 41

Estimating the Proportion

• Example (marketing application)
– In a survey of 2000 TV viewers at 11.40 p.m. on a certain night, 226 indicated they watched “The Tonight Show”.
– Estimate the number of TVs tuned to the Tonight Show in a typical night, if there are 100 million potential television sets. Use 95% confidence level.
– Solution

$\hat{p} = 226/2000 = .113$ and $1 - \hat{p} = 1 - .113 = .887$

$$\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n} = .113 \pm 1.96\sqrt{.113(.887)/2000} = .113 \pm .014$$

Page 42

Estimating the Proportion

• Solution

Using Excel we have:

z-Estimate: Proportion

Sample Proportion 0.113
Observations 2000
LCL 0.0991
UCL 0.1269

A confidence interval for the number of viewers who watched the Tonight Show:

LCL = .0991(100,000,000) = 9.9 million
UCL = .1269(100,000,000) = 12.7 million
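A short sketch of the same estimate, using `statistics.NormalDist` from the standard library for the critical value instead of the rounded 1.96. The survey counts and the 100 million television sets come from the example; the function name is illustrative.

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence):
    """Return (LCL, UCL) for p_hat +/- z_{alpha/2} * sqrt(p_hat (1 - p_hat) / n)."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


lcl, ucl = proportion_confidence_interval(226, 2000, 0.95)

# Scale the proportion limits to the 100 million potential television sets.
tv_sets = 100_000_000
print(f"LCL = {lcl:.4f}, UCL = {ucl:.4f}")
print(f"{lcl * tv_sets / 1e6:.1f} to {ucl * tv_sets / 1e6:.1f} million sets")
```

Multiplying both limits by the population size converts the interval for the proportion into an interval for the number of sets.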

Page 43

Selecting the Sample Size to Estimate the Proportion

• Recall: The confidence interval for the proportion is

$$\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$$

• Thus, to estimate the proportion to within W, we can write

$$W = z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$$

Page 44

Selecting the Sample Size to Estimate the Proportion

• Solving $W = z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$ for n, the required sample size is

$$n = \left(\frac{z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})}}{W}\right)^2$$

Page 45

Sample Size to Estimate the Proportion

• Example
– Suppose we want to estimate the proportion of customers who prefer our company’s brand to within .03 with 95% confidence.
– Find the sample size needed to guarantee that this requirement is met.
– Solution: W = .03; $1 - \alpha = .95$, therefore $\alpha/2 = .025$, so $z_{.025} = 1.96$

$$n = \left(\frac{1.96\sqrt{\hat{p}(1-\hat{p})}}{.03}\right)^2$$

Since the sample has not yet been taken, the sample proportion is still unknown.

We proceed using either one of the following two methods:

Page 46

Sample Size to Estimate the Proportion

• Method 1:
– There is no knowledge about the value of $\hat{p}$
• Let $\hat{p} = .5$. This results in the largest possible n needed for a $1-\alpha$ confidence interval of the form $\hat{p} \pm .03$.
• If the sample proportion does not equal .5, the actual W will be narrower than .03 with the n obtained by the formula below.

$$n = \left(\frac{1.96\sqrt{.5(1-.5)}}{.03}\right)^2 = 1{,}068$$

• Method 2:
– There is some idea about the value of $\hat{p}$
• Use the value of $\hat{p}$ to calculate the sample size. For example, with $\hat{p} = .2$:

$$n = \left(\frac{1.96\sqrt{.2(1-.2)}}{.03}\right)^2 = 683$$
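Both methods reduce to the same formula with a different guess for the proportion. The sketch below assumes the rounded critical value 1.96 used on the slides and rounds the result up to the next whole observation; the function name and parameter names are illustrative.

```python
import math


def sample_size_for_proportion(p_guess, width, z=1.96):
    """Return the smallest n with z * sqrt(p_guess (1 - p_guess) / n) <= width."""
    raw = (z * math.sqrt(p_guess * (1 - p_guess)) / width) ** 2
    return math.ceil(raw)


n_method_1 = sample_size_for_proportion(0.5, 0.03)  # no prior knowledge
n_method_2 = sample_size_for_proportion(0.2, 0.03)  # prior guess of .2
print(n_method_1, n_method_2)
```

Because p(1 - p) is largest at p = .5, Method 1 always gives the more conservative (larger) sample size.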