9.estimation additional topics

Chapter 9

Estimation: Additional Topics

Statistics for Business and Economics

6th Edition

After completing this chapter, you should be able to:

Form confidence intervals for the mean difference from dependent samples

Form confidence intervals for the difference between two independent population means (standard deviations known or unknown)

Compute confidence interval limits for the difference between two independent population proportions

Create confidence intervals for a population variance Find chi-square values from the chi-square distribution table Determine the required sample size to estimate a mean or

proportion within a specified margin of error

Chapter Goals

Tests Means of 2 Related Populations Paired or matched samples Repeated measures (before/after) Use difference between paired values:

Eliminates Variation Among Subjects Assumptions:

Both Populations Are Normally Distributed

Dependent Samples

Dependent samples di = xi - yi

The ith paired difference is di , where

Mean Difference

di = xi - yi

The point estimate for the population mean paired difference is d : n

dd

n

1ii

n is the number of matched pairs in the sample

1n

)d(dS

n

1i

2i

d

The sample standard deviation is:

The confidence interval for difference between population means, μd , is

Confidence Interval forMean Difference

Where n = the sample size (number of matched pairs in the paired sample)

nStdμ

nStd d

α/21,ndd

α/21,n

Six people sign up for a weight loss program. You collect the following data:

Paired Samples Example

Weight: Person Before (x) After (y) Difference, di

1 136 125 11 2 205 195 10 3 157 150 7 4 138 140 - 2 5 175 165 10 6 166 160 6 42

d = di

n

4.82

1n)d(d

S2

id

= 7.0

For a 95% confidence level, the appropriate t value is tn-1,/2 = t5,.025 = 2.571

The 95% confidence interval for the difference between means, μd , is

12.06μ1.94

64.82(2.571)7μ

64.82(2.571)7

nStdμ

nStd

d

d

dα/21,nd

dα/21,n

Paired Samples Example

Since this interval contains zero, we cannot be 95% confident, given this limited data, that the weight loss program helps people lose weight

Different data sources Unrelated Independent

Sample selected from one population has no effect on the sample selected from the other population

The point estimate is the difference between the two sample means:

Difference Between Two Means

Population means, independent

samples

Goal: Form a confidence interval for the difference between two population means, μx – μy

x – y

σx2 and σy

2 Unknown,Assumed Equal


samples

Assumptions: Samples are randomly and independently drawn

Populations are normally distributed

Population variances are unknown but assumed equal*σx

2 and σy2

assumed equal

σx2 and σy

2 known

σx2 and σy

2 unknown

σx2 and σy

2 assumed unequal

σx2 and σy2 Unknown,

Assumed Equal


samples

The population variances are assumed equal, so use the two sample standard deviations and pool them to estimate σ

use a t value with (nx + ny – 2) degrees of freedom

*σx2 and σy

2 assumed equal

σx2 and σy

2 known

σx2 and σy

2 unknown

σx2 and σy

2 assumed unequal

2nn1)s(n1)s(n

syx

2yy

2xx2

p

Confidence Interval, σx

2 and σy2 Unknown, Equal

The confidence interval for μ1 – μ2 is:

*σx2 and σy

2 assumed equal

σx2 and σy

2 unknown

σx2 and σy

2 assumed unequal

x y

x y

2 2p p

n n 2,α/2 X Yx y

2 2p p

n n 2,α/2x y

s s(x y) t μ μ

n n

s s(x y) t

n n

You are testing two computer processors for speed. Form a confidence interval for the difference in CPU speed. You collect the following speed data (in Mhz):

CPUx CPUyNumber Tested 17 14Sample mean 3004 2538Sample std dev 74 56

Pooled Variance Example

Assume both populations are normal with equal variances, and use 95% confidence

Calculating the Pooled Variance

4427.031)141)-(17

56114741171)n(n

S1nS1nS

22

y

2yy

2xx2

p

(()1x

The pooled variance is:

The t value for a 95% confidence interval is:

2.045tt 0.025 , 29α/2 , 2nn yx

The 95% confidence interval is

Calculating the Confidence Limits

y

2p

x

2p

α/22,nnYXy

2p

x

2p

α/22,nn ns

ns

t)yx(μμns

ns

t)yx(yxyx

144427.03

174427.03(2.054)2538)(3004μμ

144427.03

174427.03(2.054)2538)(3004 YX

515.31μμ416.69 YX

We are 95% confident that the mean difference in CPU speed is between 416.69 and 515.31 Mhz.

Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc. Chap 9-15

Two Population Proportions

Goal: Form a confidence interval for the difference between two population proportions, Px – Py

The point estimate for the difference is

Population proportions

Assumptions: Both sample sizes are large (generally at least 40 observations in each sample)

yx pp ˆˆ

The random variable

is approximately normally distributed


Two Population Proportions


(continued)

y

yy

x

xx

yxyx

n)p(1p

n)p(1p

)p(p)pp(Z

ˆˆˆˆ

ˆˆ


Confidence Interval forTwo Population Proportions


The confidence limits for Px – Py are:

y

yy

x

xxyx n

)p(1pn

)p(1pZ )pp(ˆˆˆˆˆˆ

2/

Form a 90% confidence interval for the difference between the proportion of men and the proportion of women who have college degrees.

In a random sample, 26 of 50 men and 28 of 40 women had an earned college degree


Example: Two Population Proportions

Men:

Women:



0.101240

0.70(0.30)50

0.52(0.48)n

)p(1pn

)p(1p

y

yy

x

xx

ˆˆˆˆ

0.525026px ˆ

0.704028py ˆ

(continued)

For 90% confidence, Z/2 = 1.645

The confidence limits are:

so the confidence interval is

-0.3465 < Px – Py < -0.0135Since this interval does not contain zero we are 90% confident

that the two proportions are not equal Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc. Chap 9-20


(continued)

(0.1012)1.645.70)(.52

n)p(1p

n)p(1pZ)pp(

y

yy

x

xxα/2yx

ˆˆˆˆˆˆ

The required sample size can be found to reach a desired margin of error (ME) with a specified level of confidence (1 - )

The margin of error is also called sampling error the amount of imprecision in the estimate of the

population parameter the amount added and subtracted to the point

estimate to form the confidence interval


Margin of Error


Sample Size Determination

For the Mean

DeterminingSample Size

nσzx α/2

nσzME α/2

Margin of Error (sampling error)



For the Mean


nσzME α/2

(continued)

2

22α/2

MEσzn Now solve

for n to get

To determine the required sample size for the mean, you must know:

The desired level of confidence (1 - ), which determines the z/2 value

The acceptable margin of error (sampling error), ME

The standard deviation, σ


Sample Size Determination(continued)

If = 45, what sample size is needed to estimate the mean within ± 5 with 90% confidence?


Required Sample Size Example

(Always round up)

219.195

(45)(1.645)ME

σzn 2

22

2

22α/2

So the required sample size is n = 220



n)p(1pzp α/2

ˆˆˆ

n)p(1pzME α/2

ˆˆ


For theProportion

Margin of Error (sampling error)


Sample Size DeterminationDeterminingSample Size

For theProportion

2

2α/2

MEz 0.25n

Substitute 0.25 for and solve for n to get

(continued)

n)p(1pzME α/2

ˆˆ

cannot be larger than 0.25, when = 0.5

)p(1p ˆˆ

p̂)p(1p ˆˆ

The sample and population proportions, and P, are generally not known (since no sample has been taken yet)

P(1 – P) = 0.25 generates the largest possible margin of error (so guarantees that the resulting sample size will meet the desired level of confidence)

To determine the required sample size for the proportion, you must know: The desired level of confidence (1 - ), which determines

the critical z/2 value The acceptable sampling error (margin of error), ME Estimate P(1 – P) = 0.25Statistics for Business and Economics, 6e © 2007 Pearson

Education, Inc. Chap 9-28

Sample Size Determination(continued)

p̂

How large a sample would be necessary to estimate the true proportion defective in a large population within ±3%, with 95% confidence?





Solution:

For 95% confidence, use z0.025 = 1.96

ME = 0.03

Estimate P(1 – P) = 0.25

So use n = 1068

(continued)

1067.11(0.03)

6)(0.25)(1.9ME

z 0.25n 2

2

2

2α/2

9.estimation additional topics

Documents