chapter 10 statistical inference for two samples

Chapter 10 Statistical Inference for Two Samples

Upload: delphia-dickerson

Post on 24-Dec-2015


TRANSCRIPT

Page 1: Chapter 10 Statistical Inference for Two Samples

Chapter 10

Statistical Inference

for Two Samples

Page 2: Chapter 10 Statistical Inference for Two Samples

Learning Objectives

• Comparative experiments involving two samples

• Test hypotheses on the difference in means of two normal distributions

• Test hypotheses on the ratio of the variances or standard deviations of two normal distributions

• Test hypotheses on the difference in two population proportions

• Compute power, type II error probability, and make sample size decisions for two-sample tests

• Explain and use the relationship between confidence intervals and hypothesis tests

Page 3: Chapter 10 Statistical Inference for Two Samples

Assumptions

• Interest is in statistical inference on the difference in means of two normal distributions

• Populations represented by X1 and X2

• Expected value of the difference in sample means

$$E(\bar{X}_1 - \bar{X}_2) = \mu_1 - \mu_2$$

Page 4: Chapter 10 Statistical Inference for Two Samples

Assumptions

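• Quantity

$$Z = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

• Has an N(0, 1) distribution

• Used to form tests of hypotheses and confidence intervals on μ1-μ2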

Page 5: Chapter 10 Statistical Inference for Two Samples

Hypothesis Tests for a Difference in Means, Variances Known

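• Null hypothesis: the difference in means μ1-μ2 is equal to a specified value ∆0

– H0: μ1-μ2 = ∆0

– H1: μ1-μ2 ≠ ∆0

• Test statistic

$$Z_0 = \frac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$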

Page 6: Chapter 10 Statistical Inference for Two Samples

Hypothesis Tests for a Difference in Means, Variances Known

• Alternative hypothesis H1: μ1-μ2 ≠ ∆0

– Rejection criterion: z0 > zα/2 or z0 < -zα/2

• Alternative hypothesis H1: μ1-μ2 > ∆0

– Rejection criterion: z0 > zα

• Alternative hypothesis H1: μ1-μ2 < ∆0

– Rejection criterion: z0 < -zα
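
The three rejection criteria above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `two_sample_z_test` and its `alternative` argument are made up for the example, and the summary values are those of the bottle-filling example later in this chapter.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(xbar1, xbar2, sigma1, sigma2, n1, n2,
                      delta0=0.0, alternative="two-sided"):
    """Return (z0, p_value) for H0: mu1 - mu2 = delta0, variances known.

    Hypothetical helper for illustration only.
    """
    se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)   # std. error of xbar1 - xbar2
    z0 = (xbar1 - xbar2 - delta0) / se
    phi = NormalDist().cdf                        # standard normal CDF
    if alternative == "two-sided":                # H1: mu1 - mu2 != delta0
        p_value = 2.0 * (1.0 - phi(abs(z0)))
    elif alternative == "greater":                # H1: mu1 - mu2 > delta0
        p_value = 1.0 - phi(z0)
    elif alternative == "less":                   # H1: mu1 - mu2 < delta0
        p_value = phi(z0)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return z0, p_value


z0, p_value = two_sample_z_test(16.015, 16.005, 0.020, 0.025, 10, 10)
print(f"z0 = {z0:.2f}, P-value = {p_value:.2f}")
```

Rejecting H0 when the P-value is below α is equivalent to comparing z0 with the critical values listed above.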

Page 7: Chapter 10 Statistical Inference for Two Samples

Choice of Sample Size

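• Use of OC Curves

– Use OC curves in Appendix Charts VIa, VIb, VIc, and VId

– Abscissa scale of the OC curves

$$d = \frac{|\mu_1 - \mu_2 - \Delta_0|}{\sqrt{\sigma_1^2 + \sigma_2^2}} = \frac{|\Delta - \Delta_0|}{\sqrt{\sigma_1^2 + \sigma_2^2}}$$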

Page 8: Chapter 10 Statistical Inference for Two Samples

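Choice of Sample Size

• Two-sided Sample Size

– Sample size n = n1 = n2 required to detect a true difference in means of ∆ with power at least 1-β

$$n \simeq \frac{(z_{\alpha/2} + z_{\beta})^2 (\sigma_1^2 + \sigma_2^2)}{(\Delta - \Delta_0)^2}$$

– Where ∆ is the true difference in means of interest

• One-sided Sample Size

$$n = \frac{(z_{\alpha} + z_{\beta})^2 (\sigma_1^2 + \sigma_2^2)}{(\Delta - \Delta_0)^2}$$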
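
As a rough sketch of the two-sided sample size calculation, the snippet below evaluates (zα/2 + zβ)²(σ1² + σ2²)/(∆ - ∆0)² and rounds up. The function name and defaults are illustrative assumptions; the inputs σ1 = 0.020, σ2 = 0.025 and ∆ = 0.08 are the ones used in Solution-Part 5 of the bottle-filling example.

```python
from math import ceil
from statistics import NormalDist


def sample_size_two_sided(delta, sigma1, sigma2,
                          alpha=0.05, beta=0.05, delta0=0.0):
    """Equal sample size n = n1 = n2 for a two-sided z test (illustrative helper)."""
    z = NormalDist().inv_cdf
    z_half_alpha = z(1.0 - alpha / 2.0)   # upper alpha/2 percentage point
    z_beta = z(1.0 - beta)                # upper beta percentage point
    n = ((z_half_alpha + z_beta) ** 2 * (sigma1**2 + sigma2**2)
         / (delta - delta0) ** 2)
    return ceil(n)                        # round up to a whole observation


print(sample_size_two_sided(0.08, 0.020, 0.025))
```

Because ∆ enters squared in the denominator, halving the difference to be detected roughly quadruples the required sample size.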

Page 9: Chapter 10 Statistical Inference for Two Samples

Type II Error

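• Follows the single-sample case

• Two-sided alternative

$$\beta = \Phi\left(z_{\alpha/2} - \frac{\Delta - \Delta_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}\right) - \Phi\left(-z_{\alpha/2} - \frac{\Delta - \Delta_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}\right)$$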
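
The two-sided type II error probability can be evaluated numerically with the standard normal CDF. The sketch below is an illustration under stated assumptions: the helper name `type_ii_error` is hypothetical, and the inputs are taken from the bottle-filling example (σ1 = 0.020, σ2 = 0.025, n1 = n2 = 10). It is evaluated both at the 0.04 stated in the example question and at the 0.08 that appears in the worked figures.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(delta, sigma1, sigma2, n1, n2, alpha=0.05, delta0=0.0):
    """Beta for the two-sided z test when the true difference is delta."""
    nd = NormalDist()
    z_half_alpha = nd.inv_cdf(1.0 - alpha / 2.0)
    # Standardized shift of the test statistic under the true difference
    shift = (delta - delta0) / sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return nd.cdf(z_half_alpha - shift) - nd.cdf(-z_half_alpha - shift)


for delta in (0.04, 0.08):
    beta = type_ii_error(delta, 0.020, 0.025, 10, 10)
    print(f"delta = {delta}: beta = {beta:.4f}, power = {1.0 - beta:.4f}")
```

For ∆ = 0.08 the value of β is numerically indistinguishable from zero, which matches the power of 1 reported in the solution; for ∆ = 0.04 the power is still high but visibly below 1.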

Page 10: Chapter 10 Statistical Inference for Two Samples

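C.I. on a Difference in Means, Variances Known, and Choice of Sample Size

• Confidence Interval

– 100(1-α)% C.I. on the difference in two means μ1-μ2

$$\bar{x}_1 - \bar{x}_2 - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \le \mu_1 - \mu_2 \le \bar{x}_1 - \bar{x}_2 + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$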
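
A short sketch of this interval in Python, assuming known standard deviations. The function name `z_confidence_interval` is an illustrative choice, and the numbers come from the bottle-filling example in this chapter.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(xbar1, xbar2, sigma1, sigma2, n1, n2,
                          confidence=0.95):
    """Two-sided CI on mu1 - mu2 with known variances (illustrative helper)."""
    alpha = 1.0 - confidence
    z_half_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z_half_alpha * sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    diff = xbar1 - xbar2
    return diff - half_width, diff + half_width


lower, upper = z_confidence_interval(16.015, 16.005, 0.020, 0.025, 10, 10)
print(f"{lower:.4f} <= mu1 - mu2 <= {upper:.4f}")
```

An interval that contains ∆0 corresponds to not rejecting H0: μ1-μ2 = ∆0 in the two-sided test at the same α.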

Page 11: Chapter 10 Statistical Inference for Two Samples

Choice of Sample Size

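• Choice of Sample Size

– Sample size n = n1 = n2 such that the error in estimating μ1-μ2 by x̄1-x̄2 is less than E at 100(1-α)% confidence

$$n = \left(\frac{z_{\alpha/2}}{E}\right)^2 (\sigma_1^2 + \sigma_2^2)$$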

Page 12: Chapter 10 Statistical Inference for Two Samples

Example

• Two machines are used for filling plastic bottles with a net volume of 16.0 ounces

• The fill volume can be assumed normal, with standard deviation σ1 = 0.020 and σ2 = 0.025 ounces

• A member of the quality engineering staff suspects that both machines fill to the same mean net volume, whether or not this volume is 16.0 ounces. A random sample of 10 bottles is taken from the output of each machine as follows

Page 13: Chapter 10 Statistical Inference for Two Samples

Questions

1. Do you think the engineer is correct? Use α = 0.05

2. What is the P-value for this test?

3. What is the power of the test in part (1) for a true difference in means of 0.04?

4. Find a 95% confidence interval on the difference in means. Provide a practical interpretation of this interval.

5. Assuming equal sample sizes, what sample size should be used to assure that β = 0.05 if the true difference in means is 0.04? Assume that α = 0.05

Page 14: Chapter 10 Statistical Inference for Two Samples

Solution-Part 1

1. Parameter of interest is the difference in fill volume, μ1-μ2

2. H0: μ1-μ2 = 0 or μ1 = μ2

3. H1: μ1-μ2 ≠ 0 or μ1 ≠ μ2

4. α = 0.05

5. The test statistic is

$$z_0 = \frac{(\bar{x}_1 - \bar{x}_2) - \Delta_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

6. Reject H0 if z0 < -zα/2 = -1.96 or z0 > zα/2 = 1.96

7. x̄1 = 16.015, x̄2 = 16.005, ∆0 = 0, σ1 = 0.02, σ2 = 0.025, n1 = 10, and n2 = 10

$$z_0 = \frac{(16.015 - 16.005) - 0}{\sqrt{\dfrac{(0.02)^2}{10} + \dfrac{(0.025)^2}{10}}} = 0.99$$

8. Since -1.96 < 0.99 < 1.96, do not reject the null hypothesis

Page 15: Chapter 10 Statistical Inference for Two Samples

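Solution-Part 2 and 3

2. P-value

$$P\text{-value} = 2(1 - \Phi(0.99)) = 2(1 - 0.8389) = 0.3222$$

3. Type II error and power, worked with ∆ = 0.08

$$\beta = \Phi\left(z_{\alpha/2} - \frac{\Delta - \Delta_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}\right) - \Phi\left(-z_{\alpha/2} - \frac{\Delta - \Delta_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}\right)$$

$$= \Phi\left(1.96 - \frac{0.08}{\sqrt{\dfrac{(0.02)^2}{10} + \dfrac{(0.025)^2}{10}}}\right) - \Phi\left(-1.96 - \frac{0.08}{\sqrt{\dfrac{(0.02)^2}{10} + \dfrac{(0.025)^2}{10}}}\right)$$

$$= \Phi(1.96 - 7.9) - \Phi(-1.96 - 7.9) = \Phi(-5.94) - \Phi(-9.86) = 0 - 0 = 0$$

Hence, the power = 1 - 0 = 1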

Page 16: Chapter 10 Statistical Inference for Two Samples

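Solution-Part 4

4. Confidence interval

$$\bar{x}_1 - \bar{x}_2 - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \le \mu_1 - \mu_2 \le \bar{x}_1 - \bar{x}_2 + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

$$(16.015 - 16.005) - 1.96\sqrt{\frac{(0.02)^2}{10} + \frac{(0.025)^2}{10}} \le \mu_1 - \mu_2 \le (16.015 - 16.005) + 1.96\sqrt{\frac{(0.02)^2}{10} + \frac{(0.025)^2}{10}}$$

$$-0.0098 \le \mu_1 - \mu_2 \le 0.0298$$

With 95% confidence, we believe the true difference in the mean fill volumes is between -0.0098 and 0.0298. Since 0 is contained in this interval, we can conclude there is no significant difference between the means.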

Page 17: Chapter 10 Statistical Inference for Two Samples

Solution-Part 5

5. Assume the sample sizes are to be equal, use α = 0.05, β = 0.05, and ∆ = 0.08

$$n \simeq \frac{(z_{\alpha/2} + z_{\beta})^2 (\sigma_1^2 + \sigma_2^2)}{\Delta^2} = \frac{(1.96 + 1.645)^2 \left((0.02)^2 + (0.025)^2\right)}{(0.08)^2} = 2.08$$

Hence, n = 3, use n1 = n2 = 3

Page 18: Chapter 10 Statistical Inference for Two Samples

Hypotheses Tests for a Difference in Means, Variances Unknown

• Tests of hypotheses on the difference in means μ1-μ2 of two normal distributions

• If n1 and n2 exceed 40, use the CLT

• Otherwise, base the hypothesis tests and C.I.s on the t distribution

• Two cases for the variances

Page 19: Chapter 10 Statistical Inference for Two Samples

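Case I: σ1² = σ2² = σ²: Pooled Test

• Two normal populations with unknown means and unknown but equal variances

• Expected value

$$E(\bar{X}_1 - \bar{X}_2) = \mu_1 - \mu_2$$

• Form an estimator of σ²

• Pooled estimator of σ², denoted by S²p

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$

• Test statistic, which has a t distribution with n1+n2-2 degrees of freedom

$$T = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$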

Page 20: Chapter 10 Statistical Inference for Two Samples

Hypotheses Tests

• Test hypothesis

– H0: μ1-μ2 = ∆0

– H1: μ1-μ2 ≠ ∆0

• Test statistic

$$T_0 = \frac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

• Where Sp is the pooled estimator of σ

Page 21: Chapter 10 Statistical Inference for Two Samples

Critical Regions

• Alternative hypothesis H1: μ1-μ2 ≠ ∆0

– Rejection criterion: t0 > tα/2, n1+n2-2 or t0 < -tα/2, n1+n2-2

• Alternative hypothesis H1: μ1-μ2 > ∆0

– Rejection criterion: t0 > tα, n1+n2-2

• Alternative hypothesis H1: μ1-μ2 < ∆0

– Rejection criterion: t0 < -tα, n1+n2-2
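
The pooled test can be sketched from summary statistics alone. The code below is a minimal illustration, with a hypothetical helper name and the critical value supplied from a t table rather than computed (the Python standard library has no t distribution). The inputs are the steel-rod values used in the example that follows in this chapter.

```python
from math import sqrt


def pooled_t_statistic(xbar1, xbar2, var1, var2, n1, n2, delta0=0.0):
    """Return (s_p, t0, dof) for the equal-variance two-sample t test."""
    dof = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances
    sp2 = ((n1 - 1) * var1 + (n2 - 1) * var2) / dof
    sp = sqrt(sp2)
    t0 = (xbar1 - xbar2 - delta0) / (sp * sqrt(1.0 / n1 + 1.0 / n2))
    return sp, t0, dof


sp, t0, dof = pooled_t_statistic(8.73, 8.68, 0.35, 0.40, 15, 17)
t_crit = 2.042  # t_{0.025,30}, read from a t table
print(f"s_p = {sp:.3f}, t0 = {t0:.3f}, dof = {dof}")
print("reject H0" if abs(t0) > t_crit else "do not reject H0")
```

Passing the critical value in explicitly keeps the sketch dependency-free; a statistics package would normally supply both the quantile and the P-value.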

Page 22: Chapter 10 Statistical Inference for Two Samples

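Case 2: σ1² ≠ σ2²

• Not able to assume that the unknown variances σ1², σ2² are equal

• Test statistic

$$T_0^* = \frac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$$

• With v degrees of freedom

• Critical regions

– Identical to Case I

– Degrees of freedom n1+n2-2 are replaced by v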
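
For the unequal-variance case, the sketch below computes the test statistic together with one common choice for the degrees of freedom, the Welch-Satterthwaite approximation. That formula is an assumption here: the transcript does not preserve the slide's expression for v, and textbook editions differ in the exact form they use. The helper name is hypothetical and the inputs are the steel-rod summary statistics from the example in this chapter.

```python
from math import sqrt


def welch_t_statistic(xbar1, xbar2, var1, var2, n1, n2, delta0=0.0):
    """Return (t0_star, v) using the Welch-Satterthwaite degrees of freedom.

    Assumption: v is computed with the (n - 1) form of the approximation.
    """
    a = var1 / n1                      # estimated variance of xbar1
    b = var2 / n2                      # estimated variance of xbar2
    t0_star = (xbar1 - xbar2 - delta0) / sqrt(a + b)
    v = (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))
    return t0_star, v


t0_star, v = welch_t_statistic(8.73, 8.68, 0.35, 0.40, 15, 17)
print(f"t0* = {t0_star:.3f}, v = {v:.1f}")
```

In practice v is rounded down to an integer before looking up the critical value in a t table.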

Page 23: Chapter 10 Statistical Inference for Two Samples

Confidence Interval on the Difference in Means

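• Case σ1² = σ2²

– 100(1-α)% CI on the difference in means μ1-μ2

$$\bar{x}_1 - \bar{x}_2 - t_{\alpha/2,\,n_1+n_2-2}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \le \mu_1 - \mu_2 \le \bar{x}_1 - \bar{x}_2 + t_{\alpha/2,\,n_1+n_2-2}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

• Case σ1² ≠ σ2²

– 100(1-α)% CI on the difference in means μ1-μ2

$$\bar{x}_1 - \bar{x}_2 - t_{\alpha/2,\,v}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \le \mu_1 - \mu_2 \le \bar{x}_1 - \bar{x}_2 + t_{\alpha/2,\,v}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$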

Page 24: Chapter 10 Statistical Inference for Two Samples

Example

• The diameter of steel rods manufactured on two different extrusion machines is being investigated

• Two random samples of sizes n1 = 15 and n2 = 17 are selected, and the sample means and sample variances are x̄1 = 8.73, s1² = 0.35, x̄2 = 8.68, and s2² = 0.40, respectively

• Assume equal variances and that the data are drawn from normal distributions

– (1) Is there evidence to support the claim that the two machines produce rods with different mean diameters? Use α = 0.05 in arriving at this conclusion

– (2) Find the P-value for the t-statistic you calculated in part (1)

– (3) Construct a 95% confidence interval for the difference in mean rod diameter. Interpret this interval

Page 25: Chapter 10 Statistical Inference for Two Samples

Solution

1. Parameter of interest is the difference in mean rod diameter, μ1-μ2

2. H0: μ1-μ2 = 0 or μ1 = μ2

3. H1: μ1-μ2 ≠ 0 or μ1 ≠ μ2

4. α = 0.05

5. Test statistic is

$$t_0 = \frac{(\bar{x}_1 - \bar{x}_2) - \Delta_0}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

6. Reject the null hypothesis if t0 < -tα/2, n1+n2-2 where -t0.025,30 = -2.042, or t0 > tα/2, n1+n2-2 where t0.025,30 = 2.042

7. x̄1 = 8.73, x̄2 = 8.68, ∆0 = 0, s1² = 0.35, s2² = 0.40, n1 = 15, and n2 = 17

Page 26: Chapter 10 Statistical Inference for Two Samples

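Solution

$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{14(0.35) + 16(0.40)}{30}} = 0.614$$

$$t_0 = \frac{(8.73 - 8.68) - 0}{0.614\sqrt{\dfrac{1}{15} + \dfrac{1}{17}}} = 0.230$$

8. Since -2.042 < 0.230 < 2.042, do not reject the null hypothesis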

Page 27: Chapter 10 Statistical Inference for Two Samples

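Solution-Cont.

• P-value = 2P(t > 0.230) > 2(0.40), so P-value > 0.80

• 95% confidence interval, with t0.025,30 = 2.042

$$\bar{x}_1 - \bar{x}_2 - t_{\alpha/2,\,n_1+n_2-2}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \le \mu_1 - \mu_2 \le \bar{x}_1 - \bar{x}_2 + t_{\alpha/2,\,n_1+n_2-2}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

$$(8.73 - 8.68) - 2.042(0.614)\sqrt{\frac{1}{15} + \frac{1}{17}} \le \mu_1 - \mu_2 \le (8.73 - 8.68) + 2.042(0.614)\sqrt{\frac{1}{15} + \frac{1}{17}}$$

$$-0.394 \le \mu_1 - \mu_2 \le 0.494$$

• Since zero is contained in this interval, the data do not show, at the 95% confidence level, that machine 1 and machine 2 produce rods with different mean diameters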

Page 28: Chapter 10 Statistical Inference for Two Samples

Paired t Test

• Special case of the two-sample t-tests

• When the observations are collected in pairs

• Each pair of observations is taken under homogeneous conditions

• Conditions may change from one pair to another

• Testing

– H0: μD = ∆0

– H1: μD ≠ ∆0

Page 29: Chapter 10 Statistical Inference for Two Samples

Paired t Test

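• Test statistic

$$T_0 = \frac{\bar{D} - \Delta_0}{S_D/\sqrt{n}}$$

– D̄ is the sample average of the n differences and S_D is their sample standard deviation

• Rejection region

– t0 > tα/2, n-1 or t0 < -tα/2, n-1

• 100(1-α)% C.I. on the difference in means μD

$$\bar{d} - t_{\alpha/2,\,n-1}\frac{s_D}{\sqrt{n}} \le \mu_D \le \bar{d} + t_{\alpha/2,\,n-1}\frac{s_D}{\sqrt{n}}$$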

Page 30: Chapter 10 Statistical Inference for Two Samples

Example

• Ten individuals have participated in a diet-modification program to stimulate weight loss

• Their weight both before and after participation in the program is shown in the following list

– Is there evidence to support the claim that this particular diet-modification program is effective in producing a mean weight reduction? Use α = 0.05.

Subject   Before   After
1         195      187
2         213      195
3         247      221
4         201      190
5         187      175
6         210      197
7         215      199
8         246      221
9         294      278
10        310      285

Page 31: Chapter 10 Statistical Inference for Two Samples

Solution

1. Parameter of interest is the difference in mean weight, μd, where di = Weight Before - Weight After.

2. H0: μd = 0

3. H1: μd > 0

4. α = 0.05

5. Test statistic is

$$t_0 = \frac{\bar{d}}{s_d/\sqrt{n}}$$

6. Reject the null hypothesis if t0 > t0.05,9 where t0.05,9 = 1.833

7. d̄ = 17, s_d = 6.41, n = 10

$$t_0 = \frac{17}{6.41/\sqrt{10}} = 8.387$$

8. Since 8.387 > 1.833, reject the null hypothesis; the data support a mean weight reduction at α = 0.05
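
The paired computation can be reproduced directly from the table of weights. This is a small sketch using only the Python standard library; the helper name is made up for the example, and the critical value 1.833 is the table value quoted in the solution. Without intermediate rounding of s_d the statistic comes out near 8.38, marginally below the slide's 8.387.

```python
from math import sqrt
from statistics import mean, stdev

before = [195, 213, 247, 201, 187, 210, 215, 246, 294, 310]
after = [187, 195, 221, 190, 175, 197, 199, 221, 278, 285]


def paired_t_statistic(before, after, delta0=0.0):
    """Return (d_bar, s_d, t0) for the paired t test on before - after."""
    d = [b - a for b, a in zip(before, after)]   # within-pair differences
    n = len(d)
    d_bar = mean(d)
    s_d = stdev(d)                               # sample std. dev. (n - 1)
    t0 = (d_bar - delta0) / (s_d / sqrt(n))
    return d_bar, s_d, t0


d_bar, s_d, t0 = paired_t_statistic(before, after)
t_crit = 1.833  # t_{0.05,9}, one-sided, from a t table
print(f"d_bar = {d_bar}, s_d = {s_d:.2f}, t0 = {t0:.2f}")
print("reject H0" if t0 > t_crit else "do not reject H0")
```

Working on the differences removes subject-to-subject variation, which is why the statistic is so much larger than an unpaired comparison of the same data would give.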

Page 32: Chapter 10 Statistical Inference for Two Samples

Inferences on the Variances of Two Normal Populations

• Both populations are normal and independent

• Test the hypotheses

– H0: σ1² = σ2²

– H1: σ1² ≠ σ2²

• Requires a new probability distribution, the F distribution

Page 33: Chapter 10 Statistical Inference for Two Samples

The F Distribution

• Define the random variable F as the ratio of two independent chi-square random variables, W and Y, each divided by its number of degrees of freedom, u and v

• F = (W/u) / (Y/v)

• F follows the F distribution with u degrees of freedom in the numerator and v degrees of freedom in the denominator

• Usually abbreviated as Fu,v

Page 34: Chapter 10 Statistical Inference for Two Samples

The F Distribution

• Shape of the pdf depends on the two degrees of freedom

• Table V provides the percentage points of the F distribution

• Note that f1-α,u,v = 1/fα,v,u

Page 35: Chapter 10 Statistical Inference for Two Samples

Hypothesis Tests on the Ratio of Two Variances

• Suppose H0: σ1² = σ2²

• S1² and S2² are the sample variances

• Test statistic

$$F_0 = \frac{S_1^2}{S_2^2}$$

• Suppose H1: σ1² ≠ σ2²

• Rejection criterion

– f0 > fα/2, n1-1, n2-1 or f0 < f1-α/2, n1-1, n2-1

Page 36: Chapter 10 Statistical Inference for Two Samples

Example

• Two chemical companies can supply a raw material.

• The concentration of a particular element in this material is important.

• The mean concentration for both suppliers is the same, but we suspect that the variability in concentration may differ between the two companies

• The standard deviation of concentration in a random sample of n1 = 10 batches produced by company 1 is s1 = 4.7 grams per liter, while for company 2, a random sample of n2 = 16 batches yields s2 = 5.8 grams per liter.

• Is there sufficient evidence to conclude that the two population variances differ? Use α = 0.05.

Page 37: Chapter 10 Statistical Inference for Two Samples

Solution

1. Parameters of interest are the variances of concentration, σ1², σ2²

2. H0: σ1² = σ2²

3. H1: σ1² ≠ σ2²

4. α = 0.05

5. Test statistic is

$$f_0 = \frac{s_1^2}{s_2^2}$$

6. Reject the null hypothesis if f0 < f0.975,9,15 where f0.975,9,15 = 0.265, or f0 > f0.025,9,15 where f0.025,9,15 = 3.12

7. n1 = 10, n2 = 16, s1 = 4.7, and s2 = 5.8

$$f_0 = \frac{(4.7)^2}{(5.8)^2} = 0.657$$

8. Since 0.265 < 0.657 < 3.12, do not reject the null hypothesis
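
The variance-ratio decision can be sketched as follows. The helper name is illustrative, and the two critical values are passed in from the F table figures quoted in the solution (0.265 and 3.12) rather than computed, since the Python standard library has no F distribution.

```python
def variance_ratio_test(s1, s2, f_lower, f_upper):
    """Return (f0, reject) for the two-sided F test on sigma1^2 = sigma2^2.

    f_lower and f_upper are table values f_{1-alpha/2} and f_{alpha/2}
    with (n1 - 1, n2 - 1) degrees of freedom.
    """
    f0 = s1**2 / s2**2
    reject = f0 < f_lower or f0 > f_upper
    return f0, reject


# Company 1: s1 = 4.7 (n1 = 10); company 2: s2 = 5.8 (n2 = 16)
f0, reject = variance_ratio_test(4.7, 5.8, f_lower=0.265, f_upper=3.12)
print(f"f0 = {f0:.3f}")
print("reject H0" if reject else "do not reject H0")
```

The lower critical value follows from the reciprocal relation f1-α,u,v = 1/fα,v,u, so only upper-tail tables are needed.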

Page 38: Chapter 10 Statistical Inference for Two Samples

Hypothesis Tests on Two Population Proportions

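• Suppose two binomial parameters of interest, p1 and p2

• Large-sample test of H0: p1 = p2

• Test statistic

$$Z_0 = \frac{\hat{P}_1 - \hat{P}_2}{\sqrt{\hat{P}(1 - \hat{P})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}, \qquad \hat{P} = \frac{X_1 + X_2}{n_1 + n_2}$$

• Critical regions

– H1: p1 ≠ p2: reject if z0 > zα/2 or z0 < -zα/2

– H1: p1 > p2: reject if z0 > zα

– H1: p1 < p2: reject if z0 < -zα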

Page 39: Chapter 10 Statistical Inference for Two Samples

β-Error

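• If H1 is two-sided, the β-error is

$$\beta = \Phi\left[\frac{z_{\alpha/2}\sqrt{\bar{p}\bar{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)} - (p_1 - p_2)}{\sigma_{\hat{P}_1 - \hat{P}_2}}\right] - \Phi\left[\frac{-z_{\alpha/2}\sqrt{\bar{p}\bar{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)} - (p_1 - p_2)}{\sigma_{\hat{P}_1 - \hat{P}_2}}\right]$$

• Where

$$\bar{p} = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}, \qquad \bar{q} = \frac{n_1(1 - p_1) + n_2(1 - p_2)}{n_1 + n_2}, \qquad \sigma_{\hat{P}_1 - \hat{P}_2} = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$$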

Page 40: Chapter 10 Statistical Inference for Two Samples

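Confidence Interval on the Difference in Proportions

• Two-sided 100(1-α)% C.I. on the difference in the true proportions p1-p2

$$\hat{p}_1 - \hat{p}_2 - z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}} \le p_1 - p_2 \le \hat{p}_1 - \hat{p}_2 + z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$$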
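
A minimal sketch of a large-sample interval on p1-p2, using the usual unpooled standard error. The function name is hypothetical, and the counts (15 and 8 defectives out of 300 each) are borrowed from the injection-molding example in this chapter purely for illustration; the slides themselves do not work this interval out.

```python
from math import sqrt
from statistics import NormalDist


def proportion_diff_ci(x1, n1, x2, n2, confidence=0.95):
    """Large-sample two-sided CI on p1 - p2 (illustrative helper)."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    z_half_alpha = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    # Unpooled standard error: each sample uses its own estimated proportion
    se = sqrt(p1_hat * (1.0 - p1_hat) / n1 + p2_hat * (1.0 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - z_half_alpha * se, diff + z_half_alpha * se


lower, upper = proportion_diff_ci(15, 300, 8, 300)
print(f"{lower:.4f} <= p1 - p2 <= {upper:.4f}")
```

Note that the interval uses the unpooled standard error, whereas the test of H0: p1 = p2 pools the two samples, so the two procedures can disagree slightly in borderline cases.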

Page 41: Chapter 10 Statistical Inference for Two Samples

Example

• Two different types of injection-molding machines are used to form plastic parts. A part is considered defective if it has excessive shrinkage or is discolored

• Two random samples, each of size 300, are selected, and 15 defective parts are found in the sample from machine 1 while 8 defective parts are found in the sample from machine 2

• Is it reasonable to conclude that both machines produce the same fraction of defective parts, using α=0.05?

Page 42: Chapter 10 Statistical Inference for Two Samples

Solution

1. Parameters of interest are the proportions of defective parts, p1 and p2

2. H0: p1 = p2

3. H1: p1 ≠ p2

4. α = 0.05

5. Test statistic is

$$z_0 = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}, \qquad \hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$$

6. Reject the null hypothesis if z0 < -z0.025 where -z0.025 = -1.96, or z0 > z0.025 where z0.025 = 1.96

7. n1 = 300, n2 = 300, x1 = 15, x2 = 8, p̂1 = 0.05, p̂2 = 0.0267

$$\hat{p} = \frac{15 + 8}{300 + 300} = 0.0383$$

$$z_0 = \frac{0.05 - 0.0267}{\sqrt{0.0383(1 - 0.0383)\left(\dfrac{1}{300} + \dfrac{1}{300}\right)}} = 1.49$$

Page 43: Chapter 10 Statistical Inference for Two Samples

Solution-Cont

• Since -1.96 < 1.49 < 1.96, do not reject the null hypothesis

• P-value = 2(1 - P(z < 1.49)) = 0.13622
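
The injection-molding calculation can be checked with a short script. This is a sketch, not the slides' own code: the function name is made up, and the normal CDF comes from the Python standard library.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Return (z0, p_value) for the large-sample test of H0: p1 = p2."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)     # common proportion under H0
    se = sqrt(p_pooled * (1.0 - p_pooled) * (1.0 / n1 + 1.0 / n2))
    z0 = (p1_hat - p2_hat) / se
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z0)))   # two-sided
    return z0, p_value


z0, p_value = two_proportion_z_test(15, 300, 8, 300)
print(f"z0 = {z0:.2f}, P-value = {p_value:.3f}")
```

The P-value printed here is computed from the unrounded statistic, so it differs in the third decimal from the slide's 0.13622, which uses z0 rounded to 1.49.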