CH9: Testing the Difference Between Two Means or Two Proportions Santorico - Page 343
CH9: Testing the Difference Between Two Means, Two Proportions, and Two
Variances
Section 9-1 Testing the Difference Between Two Means: Using the Z Test
Suppose we are interested in determining if a certain medication relieves patients’ headaches. We give the drug/treatment to one group and give a placebo to a control group and compare the mean incidences of patient relief from the headache between the two groups. If the treatment group had a statistically significant improvement in headache symptoms over the control group, then we can conclude the drug works.
So our question might be, “Is the mean incidence of headache relief different for the two groups?” Let \(\mu_1\) = mean headache relief from treatment group and \(\mu_2\) = mean headache relief from control group. Then our hypotheses would be:
H0: \(\mu_1 = \mu_2\)
H1: \(\mu_1 \ne \mu_2\)
Alternatively, we could state the hypotheses as:
H0: \(\mu_1 - \mu_2 = 0\)
H1: \(\mu_1 - \mu_2 \ne 0\)
Assumptions for the Test to Determine the Difference Between Two Means
The samples must be independent of each other. That is, there can be no relationship between the subjects in each sample.
The populations from which the samples come must be (approximately) normally distributed or the sample sizes of both groups should be at least 30.
The standard deviations of both populations must be known.
We can compare the groups by the difference in their population means, \(\mu_1 - \mu_2\), where \(\mu_1\) is the population mean for group 1 and \(\mu_2\) is the population mean for group 2.
We estimate \(\mu_1 - \mu_2\) with \(\bar{x}_1 - \bar{x}_2\).
The standard deviation of \(\bar{x}_1 - \bar{x}_2\) is
\[ \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \]
When both populations are normally distributed or the sample size for each group is at least 30, then \(\bar{x}_1 - \bar{x}_2\) has a normal distribution.
Formula for the z test for Comparing Two Means from Independent Populations
H0: \(\mu_1 - \mu_2 = k\) (or \(\le k\) or \(\ge k\))
Note: We often use \(k = 0\), but it doesn’t have to be.
Test value:
\[ z^* = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{(\bar{x}_1 - \bar{x}_2) - k}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \]
The observed difference between the sample means may be due to chance, in which case the null hypothesis will not be rejected. If the difference is statistically significant, the null hypothesis is rejected and the researcher can conclude the population means are different. The same approach to finding critical values and P-values that was used in Section 8-2 will be used here (Table E or Table F with d.f. = ∞).
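The table lookups mentioned above can also be approximated in software. The following is a minimal Python sketch, assuming the standard library's `statistics.NormalDist` as a stand-in for Table E and the bottom row of Table F; the helper names `z_critical` and `z_p_value` are illustrative and do not come from the notes.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1


def z_critical(alpha, tails="two"):
    """Return the positive critical z value for significance level alpha."""
    tail_area = alpha / 2 if tails == "two" else alpha
    return std_normal.inv_cdf(1 - tail_area)


def z_p_value(z, tails="two"):
    """Return the P-value for test value z ('two', 'left', or 'right')."""
    if tails == "left":
        return std_normal.cdf(z)
    if tails == "right":
        return 1 - std_normal.cdf(z)
    return 2 * (1 - std_normal.cdf(abs(z)))


print(round(z_critical(0.05, "two"), 2))    # 1.96
print(round(z_critical(0.01, "right"), 2))  # 2.33
```

For a left- or right-tailed test, the critical value keeps the sign of the tail being tested; the helper returns only the positive magnitude.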
Example: Dr. Cribari would like to determine if there is a statistically significant difference between her two Math 2830 classes. To make this comparison she will compare the results from exam 1. Class one had 35 students take the exam with a mean of 82.6 and a population standard deviation of 1.41. Class two had 32 students take the exam with a mean of 84 and a population standard deviation of 3.63. Can Dr. Cribari conclude that there is a difference in the mean test grades between the two classes at α = 0.05?
Step 1 State the hypotheses and identify the claim.
H0: \(\mu_1 = \mu_2\)
H1: \(\mu_1 \ne \mu_2\) (CLAIM)
Step 2 Find the critical value(s) from the appropriate table.
As stated, the problem is giving the population standard deviations. This means that we will be doing a z-test.
Two-sided test with α = 0.05: critical values = ±1.96
Step 3 Compute the test value and determine the P-value.
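\[ z^* = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{(82.6 - 84) - 0}{\sqrt{\dfrac{1.41^2}{35} + \dfrac{3.63^2}{32}}} = -2.05 \]
p-value = 2(0.0202) = 0.0404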
Step 4 Make the decision to reject or not reject the null hypothesis.
Since the p-value is smaller than our α, the null hypothesis is rejected. [OR, since our test value, -2.05, falls within the rejection region, the null hypothesis is rejected.]
Step 5 Summarize the results.
That is, there is evidence to support the claim that the exam 1 grades differ between the two sections of MATH2830.
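As a cross-check on the arithmetic in this example, here is a small Python sketch of the z test value. The function name `two_sample_z` and its parameter names are illustrative assumptions; the numbers are the exam 1 figures from the example, and the P-value comes from `statistics.NormalDist` rather than Table E.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(xbar1, xbar2, sigma1, sigma2, n1, n2, k=0.0):
    """z test value for H0: mu1 - mu2 = k with known population SDs."""
    std_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return ((xbar1 - xbar2) - k) / std_error


z = two_sample_z(82.6, 84, 1.41, 3.63, 35, 32)
p_two_tailed = 2 * NormalDist().cdf(-abs(z))  # two-tailed P-value
print(round(z, 2))             # -2.05
print(round(p_two_tailed, 3))  # 0.041
```

The software P-value is slightly larger than the table-based 0.0404 because it uses the unrounded test value (about -2.045) instead of -2.05. The decision at α = 0.05 is the same.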
Example: A survey found that the average hotel room in New Orleans is $88.42 and the average room rate in Phoenix is $80.61. Assume that the data were obtained from two samples of 50 hotels each and that the (population) standard deviations were $5.62 and $4.83, respectively. At α = 0.01, can it be concluded that the average hotel room in New Orleans costs more than in Phoenix?
Step 1 State the hypotheses and identify the claim.
Step 2 Find the critical value(s) from the appropriate table.
Step 3 Compute the test value and determine the P-value.
Step 4 Make the decision to reject or not reject the null hypothesis.
Step 5 Summarize the results.
Formula for the z Confidence Interval for Difference Between Two Means
Assumptions:
1. The data for each group are independent random samples.
2. The data are from normally distributed populations and/or the sample sizes of the groups are at least 30.
3. The population standard deviations are (assumed) known.
Formula:
\[ (\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \]
Note: When \(n_1\) and \(n_2\) are at least 30, then \(s_1\) and \(s_2\) can be used in place of \(\sigma_1\) and \(\sigma_2\).
Example: Two brands of cigarettes are selected, and their nicotine content is compared. The data are shown below. Find the 95% confidence interval of the true difference in the means.
Brand A                      Brand B
\(\bar{X}_1\) = 28.6 mg      \(\bar{X}_2\) = 32.9 mg
\(\sigma_1\) = 5.1 mg        \(\sigma_2\) = 4.4 mg
\(n_1\) = 30                 \(n_2\) = 40
\[ (\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = (28.6 - 32.9) \pm 1.96\sqrt{\frac{5.1^2}{30} + \frac{4.4^2}{40}} = -4.3 \pm 2.278158 = (-6.58, -2.02) \]
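The interval above can be reproduced with a few lines of Python. This is only a sketch: `z_interval` is a hypothetical helper name, and the multiplier is taken from `statistics.NormalDist` (about 1.95996) rather than the rounded table value 1.96, which does not change the endpoints at two decimal places.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar1, xbar2, sigma1, sigma2, n1, n2, confidence=0.95):
    """Confidence interval for mu1 - mu2 with known population SDs."""
    z_mult = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_mult * sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    diff = xbar1 - xbar2
    return diff - margin, diff + margin


# Nicotine example: Brand A is group 1, Brand B is group 2.
low, high = z_interval(28.6, 32.9, 5.1, 4.4, 30, 40)
print(round(low, 2), round(high, 2))  # -6.58 -2.02
```

Because the interval does not contain 0, it is consistent with rejecting the hypothesis of equal means in a two-tailed test at the matching α = 0.05.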
At α = 0.05, is there convincing evidence that the mean amount of nicotine differs between the brands?
Example: For the hotel example, construct a 98% confidence interval of the true difference in the means.
Section 9-2: Testing the Difference Between Two Means of Independent Samples: Using the t Test
Many times the conditions set forth by the z test in Section 9-1 cannot be met (e.g., the population standard deviations are not known). In these cases, a t test is used to test the difference between means when the two samples are independent and when the samples are taken from two normally or approximately normally distributed populations.
Formula for the t Test for Testing the Difference Between Two Means: Independent Samples. Variances are assumed to be unequal:
\[ t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \]
where the degrees of freedom are equal to the smaller of \(n_1 - 1\) or \(n_2 - 1\). We will use Table F to find our critical values and our p-values.
WARNING: Your calculator will perform a 2-sample t-test (it is #4 under STAT then TESTS). However, it uses a complicated formula to determine the degrees of freedom, and that will ultimately affect how the calculator deals with confidence intervals and p-values. We will come back to this point at the end of the section.
Example: A real estate agent wishes to determine whether tax assessors and real estate appraisers agree on the values of homes. A random sample of the two groups appraised 10 homes. Is there a significant difference in the values of the homes for each group? Let α = 0.05. Assume the data are from normally distributed populations.
Real Estate Appraisers       Tax Assessors
\(\bar{X}_1\) = $83,256      \(\bar{X}_2\) = $88,354
\(s_1\) = $3256              \(s_2\) = $2341
\(n_1\) = 10                 \(n_2\) = 10
Step 1 State the hypotheses and identify the claim.
H0: \(\mu_1 = \mu_2\)
H1: \(\mu_1 \ne \mu_2\)
Sample standard deviations given! Use a t-test.
Step 2 Find the critical value(s) from the appropriate table.
T-test means use the t-table (Table F). We have 9 degrees of freedom since \(n_1 = 10\) and \(n_2 = 10\): the smallest of \(n_1 - 1\) and \(n_2 - 1\) is 9.
Information we need: two-tailed test, α = 0.05, df = 9.
The t critical values are ±2.262.
Step 3 Compute the test value and determine the P-value.
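\[ t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{(83{,}256 - 88{,}354) - 0}{\sqrt{\dfrac{3256^2}{10} + \dfrac{2341^2}{10}}} = \frac{-5098}{1268.141} = -4.02 \]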
[Figure: t distribution curve for the two-tailed test, with the critical region in both tails and the test value t* = -4.02 marked.]
p-value = 2 · P(t < -4.02) = 2(0.0015) = 0.003
Step 4 Make the decision to reject or not reject the null hypothesis.
The null hypothesis is rejected. This decision can be based on:
the fact that the test value (-4.02) is within the critical region since it is less than -2.262, or
the fact that the p-value (0.003) is smaller than α = 0.05.
Step 5 Summarize the results.
There is significant evidence that tax assessors and real estate appraisers disagree on the values of homes.
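The test value and degrees of freedom for this example can be checked with the Python sketch below. It also computes the Welch-Satterthwaite degrees of freedom, under the assumption (not stated in these notes) that this is the "complicated formula" the calculator uses for its unpooled 2-sample t-test. The function name `unequal_variance_t` is illustrative.

```python
from math import sqrt


def unequal_variance_t(xbar1, xbar2, s1, s2, n1, n2, k=0.0):
    """Return (t, conservative df, Welch-Satterthwaite df)."""
    v1, v2 = s1**2 / n1, s2**2 / n2        # squared standard error per group
    t = ((xbar1 - xbar2) - k) / sqrt(v1 + v2)
    df_conservative = min(n1 - 1, n2 - 1)  # rule used in these notes
    # Assumed calculator rule: Welch-Satterthwaite approximation.
    df_welch = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df_conservative, df_welch


t, df_small, df_welch = unequal_variance_t(83256, 88354, 3256, 2341, 10, 10)
print(round(t, 2), df_small, round(df_welch, 1))  # -4.02 9 16.3
```

The test value is identical under both rules; only the degrees of freedom differ. The smaller value (9) gives a larger critical value and a larger P-value, which makes the rule in these notes the more conservative of the two.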
Example: A researcher suggests that male nurses earn more than female nurses. A survey of 16 male nurses and 20 female nurses reports these data. Is there enough evidence to support the claim that male nurses earn more than female nurses? Use α = 0.01. Assume the data are from normally distributed populations.
Females                      Males
\(\bar{X}_1\) = $23,750      \(\bar{X}_2\) = $23,900
\(s_1\) = $250               \(s_2\) = $300
\(n_1\) = 20                 \(n_2\) = 16
Step 1 State the hypotheses and identify the claim.
Step 2 Find the critical value(s) from the appropriate table.
Step 3 Compute the test value and determine the P-value.
Step 4 Make the decision to reject or not reject the null hypothesis.
Step 5 Summarize the results.
Confidence Intervals for the Difference of Two Means: Small Independent Samples
Variances assumed to be unequal:
\[ (\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \]
where d.f. = smaller value of \(n_1 - 1\) or \(n_2 - 1\).
WARNING: The way our calculator determines the degrees of freedom is not the same as the book. So you will NOT be able to use your calculator's STAT/TESTS function to calculate your confidence interval, because you will get a VERY different confidence interval. This is because the t-multiplier the calculator uses is noticeably different from the one we get with the book's degrees of freedom.
Example: Let’s find the 95% Confidence Interval for the first problem.
\[ (\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (83256 - 88354) \pm 2.262\sqrt{\frac{3256^2}{10} + \frac{2341^2}{10}} = -5098 \pm 2868.535 = (-7967, -2229) \]
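A short Python sketch of the same interval follows. The helper `t_interval` is a hypothetical name, and the t-multiplier 2.262 (d.f. = 9, 95% confidence) is passed in by hand from Table F, since the Python standard library has no t distribution.

```python
from math import sqrt


def t_interval(xbar1, xbar2, s1, s2, n1, n2, t_mult):
    """Interval for mu1 - mu2; t_mult is read from Table F by the caller."""
    margin = t_mult * sqrt(s1**2 / n1 + s2**2 / n2)
    diff = xbar1 - xbar2
    return diff - margin, diff + margin


# Real estate example: appraisers are group 1, tax assessors are group 2.
low, high = t_interval(83256, 88354, 3256, 2341, 10, 10, t_mult=2.262)
print(round(low), round(high))  # -7967 -2229
```

Supplying a different multiplier, such as one based on the calculator's larger degrees of freedom, changes only the margin of error and not the center of the interval.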
Example: Let’s find the 99% Confidence Interval for the second problem.
Section 9-3: Testing the Difference Between Two Means: Dependent Samples
So far we have only compared two means when the samples were independent. Samples are considered to be dependent when the subjects are paired or matched in some way.
Examples of paired data:
Each person is measured twice where the 2 measurements measure the same thing but under different conditions
Similar individuals are paired prior to an experiment and each member of a pair receives a different treatment
Two different variables are measured for each individual and there is interest in the amount of difference between the 2 variables
When using paired data, you are interested primarily in the “difference” and not the data itself. When samples are dependent, a special t test for dependent means is used. The test uses the difference in the values of the matched pairs.
IMPORTANT: We cannot use the t test we learned for a difference in independent means.
To determine whether one set of observations tends to be larger than, or different from, the paired observations, we take the difference between the matched observations and perform the analysis on the differences.
A classic example would be studies of weight loss, comparing weight before with weight after.
We are interested in the CHANGE!
An aside: this study also used a placebo group. Why?
Hypotheses:
Right-tailed: H0: \(\mu_D = 0\)   H1: \(\mu_D > 0\)
Left-tailed: H0: \(\mu_D = 0\)   H1: \(\mu_D < 0\)
Two-tailed: H0: \(\mu_D = 0\)   H1: \(\mu_D \ne 0\)
\(\mu_D\) = population mean of differences = \(\mu_1 - \mu_2\)
Here, \(\mu_1\) is the mean of the population of the first set of measurements and \(\mu_2\) is the mean of the population of the second set of measurements.
Actually, you can also use \(\mu_D = \mu_2 - \mu_1\) as long as you are consistent with your statement of hypotheses and calculation of D.
Formulas for the t Test for Dependent Samples
\[ t = \frac{\bar{D} - \mu_D}{s_D / \sqrt{n}} \]
with d.f. = \(n - 1\) and where
\[ \bar{D} = \frac{\sum D}{n} \]
is the mean of the sample of differences and
\[ s_D = \sqrt{\frac{n\sum D^2 - \left(\sum D\right)^2}{n(n-1)}} \]
is the sample standard deviation of the sample of differences.
The good news is we can find the mean of the differences, \(\bar{D}\), and the standard deviation of the differences, \(s_D\), using the LIST and STAT functions in your TI-83/84.
1. Go to STAT -> EDIT -> Edit.
2. Enter the first set of observations under L1.
3. Enter the second set of observations under L2.
4. Highlight L3 in list, type L1 – L2 and hit enter. The set of differences should now be calculated.
5. Go to STAT -> CALC -> 1-Var Stats, hit enter. Type L3 (after 1-Var Stats on your screen) and hit enter. Your calculator will calculate the sample mean and standard deviation for you.
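The calculator steps above have a direct analogue in Python. The sketch below uses made-up measurements for five subjects (hypothetical data, not from the notes) and the variable names `list1`, `list2`, and `diffs` in place of L1, L2, and L3. It also checks that the shortcut formula for \(s_D\) agrees with the library's sample standard deviation.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical paired measurements for five subjects (not from the notes).
list1 = [210, 230, 182, 205, 262]  # plays the role of L1
list2 = [219, 236, 179, 204, 270]  # plays the role of L2

# D = L1 - L2 for each matched pair, like the list formula for L3.
diffs = [a - b for a, b in zip(list1, list2)]
n = len(diffs)

d_bar = mean(diffs)  # sample mean of the differences
s_d = stdev(diffs)   # sample standard deviation (n - 1 in the denominator)

# Shortcut formula for s_D from the notes; it should agree with stdev().
s_d_shortcut = sqrt((n * sum(d * d for d in diffs) - sum(diffs) ** 2) / (n * (n - 1)))

t = (d_bar - 0) / (s_d / sqrt(n))  # test value under H0: mu_D = 0
print(d_bar, round(s_d, 3), round(t, 3))
```

Whichever order is chosen for the subtraction, the hypotheses must be stated in terms of the same order.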
Example: A physical education director claims that by taking a special vitamin, a weight lifter can increase his strength. Eight athletes are selected and given a test of strength, using the standard bench press. After 2 weeks of regular training, supplemented with the vitamin, they are tested again. Test the effectiveness of the vitamin regimen at α = 0.05. Each value in these data represents the maximum number of pounds the athlete can bench press. Assume the variable is approximately normally distributed.
Step 1 State the hypotheses and identify the claim.
I will base my differences on: D = strength after – strength before.
H0: \(\mu_D = 0\)
H1: \(\mu_D > 0\)
Step 2 Find the critical value(s) from the appropriate table.
We have 8 lifters, which gives 7 degrees of freedom. Our α = 0.05. We have a right-tailed test, so the critical value will be positive. t critical value = 1.895 (from Table F)
Step 3 Compute the test value and determine the P-value.
Use your calculator to get the mean difference and standard deviation of the differences.
\[ t = \frac{\bar{D} - \mu_D}{s_D / \sqrt{n}} = \frac{2.375 - 0}{4.838 / \sqrt{8}} = 1.388 \]
p-value = P(t > 1.388) = 0.104. Found using tcdf(1.388,E99,7) on the TI calculator.
Step 4 Make the decision to reject or not reject the null hypothesis.
The null hypothesis is NOT rejected. We can base this decision on either of the two facts:
The p-value is larger than α = 0.05.
The test value (1.388) is smaller than the critical value (1.895). That is, our test value is within the non-rejection region.
[Figure: t distribution curve for the right-tailed test, with the test value 1.388 falling in the non-rejection region to the left of the rejection region.]
Step 5 Summarize the results.
There is not sufficient evidence to support the physical education director's claim that by taking a special vitamin, a weight lifter can increase his strength.
Example: A sample of 10 college students in a class were asked how many hours per week they watch TV and how many hours a week they used a computer. Is there a difference in the mean number of hours a college student spends on a computer versus watching TV at α = 0.01? Assume the population of differences is approximately normally distributed.
The data:
Student   Comp   TV
1         30     2
2         20     1.5
3         10     14
4         10     2
5         10     6
6         0      20
7         35     14
8         20     1
9         2      14
10        5      10
Step 1 State the hypotheses and identify the claim.
Step 2 Find the critical value(s) from the appropriate table.
Step 3 Compute the test value and determine the P-value.
Step 4 Make the decision to reject or not reject the null hypothesis.
Step 5 Summarize the results.
Confidence Interval for the Mean Difference
\[ \bar{D} \pm t_{\alpha/2}\frac{s_D}{\sqrt{n}} \]
where d.f. = \(n - 1\)
Let’s find the 99% confidence interval for the mean difference of the last example of TV watching vs. Computer Usage.
Section 9-4: Testing the Difference Between Proportions
Let p1 be the proportion of a population having some characteristic of interest.
Similarly, let p2 be the proportion of a different population having that characteristic. We estimate these parameters by taking samples from each population and using the sample proportions as estimates.
Let \(x_1\) be the number of observations in sample 1 having the characteristic of interest and \(x_2\) be the number of observations in sample 2 having that characteristic.
The sample proportion for the first sample is \(\hat{p}_1 = \dfrac{x_1}{n_1}\) and the sample proportion for the second sample is \(\hat{p}_2 = \dfrac{x_2}{n_2}\).
We will learn how to perform a hypothesis test for the difference in population proportions.
Hypotheses:
Right-tailed test: H0: \(p_1 = p_2\) or H0: \(p_1 - p_2 = k\);  H1: \(p_1 > p_2\) or H1: \(p_1 - p_2 > k\)
Left-tailed test: H0: \(p_1 = p_2\) or H0: \(p_1 - p_2 = k\);  H1: \(p_1 < p_2\) or H1: \(p_1 - p_2 < k\)
Two-tailed test: H0: \(p_1 = p_2\) or H0: \(p_1 - p_2 = k\);  H1: \(p_1 \ne p_2\) or H1: \(p_1 - p_2 \ne k\)
Formula for the z Test for Comparing Two Proportions
H0: \(p_1 - p_2 = 0\) (or more generally \(p_1 - p_2 = k\))
Test value:
\[ z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\bar{p}\,\bar{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\bar{p}\,\bar{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \]
where
\[ \bar{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2} \]
and \(\bar{q} = 1 - \bar{p}\).
What is \(\bar{p}\)? We are assuming in the null hypothesis that \(p_1 = p_2 = p\), where p is the value of the common proportion. Under this assumption, we should combine the information from both samples to estimate the common population proportion p.
\(\bar{p}\) is an estimate of p combining the information from both samples.
P-values: (computed as before, depending on the alternative hypothesis)
Right-tailed test: \(P(Z > z^*)\)
Left-tailed test: \(P(Z < z^*)\)
Two-tailed test: \(2P(Z > |z^*|)\)
Since we are performing a z test we will use Table E for p-values and Table F (d.f. = ∞) for critical values.
Assumptions:
1. The samples are independent random samples.
2. All counts must be at least 5:
a. \(n_1\hat{p}_1 = x_1\)
b. \(n_1\hat{q}_1 = n_1 - x_1\)
c. \(n_2\hat{p}_2 = x_2\)
d. \(n_2\hat{q}_2 = n_2 - x_2\)
Example: It is believed that a sweetener called xylitol helps prevent ear infections. In a randomized experiment, \(n_1 = 165\) children took a placebo and 68 of them got ear infections. Another sample of \(n_2 = 159\) children took xylitol and 46 of them got ear infections. We believe that the proportion of ear infections in the placebo group will be greater than in the xylitol group. Test this hypothesis at α = 0.025.
Step 1 State the hypotheses and identify the claim.
H0: \(p_1 = p_2\)
H1: \(p_1 > p_2\) (CLAIM)
Step 2 Find the critical value(s) from the appropriate table.
Test of two proportions, right-tailed, so a positive z value. Based on α = 0.025, we get z = 1.96. (I pulled my critical value from the bottom of Table F)
Step 3 Compute the test value and determine the P-value.
Let’s find \(\bar{p}\) first:
\[ \bar{p} = \frac{68 + 46}{165 + 159} = 0.352 \]
Using this:
\[ z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\bar{p}\,\bar{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{\left(\dfrac{68}{165} - \dfrac{46}{159}\right) - 0}{\sqrt{0.352(1 - 0.352)\left(\dfrac{1}{165} + \dfrac{1}{159}\right)}} = \frac{0.122813}{0.0530751} = 2.31 \]
p-value = P(Z > 2.31) = 0.0104 (I found my p-value using the Z table - Table E).
Step 4 Make the decision to reject or not reject the null hypothesis.
Given that our p-value = 0.0104, which is smaller than our α = 0.025, the null hypothesis is rejected. We could also reach this conclusion by noting that our test value = 2.31 is greater than our critical value = 1.96.
Step 5 Summarize the results.
There is significant evidence to support the claim that xylitol helps prevent ear infections. Specifically, children who took xylitol had a lower proportion of ear infections than children who were given a placebo.
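The pooled test value from this example can be reproduced with the Python sketch below. The function name `two_proportion_z` is illustrative, the counts are those given for the placebo and xylitol groups, and the P-value is computed with `statistics.NormalDist` instead of Table E.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Pooled z test value for H0: p1 = p2."""
    p_hat1, p_hat2 = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)  # pooled estimate of the common p
    q_bar = 1 - p_bar
    std_error = sqrt(p_bar * q_bar * (1 / n1 + 1 / n2))
    return (p_hat1 - p_hat2) / std_error


# Group 1 = placebo (68 of 165), group 2 = xylitol (46 of 159).
z = two_proportion_z(68, 165, 46, 159)
p_right = 1 - NormalDist().cdf(z)  # right-tailed P-value
print(round(z, 2), round(p_right, 3))  # 2.31 0.01
```

The software P-value (about 0.0103) differs slightly from the table value 0.0104 because the test value is not rounded to 2.31 before the lookup.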
Example: In a sample of 200 surgeons, 15% thought the government should control health care. In a sample of 200 general practitioners, 21% felt the same way. At α = 0.01, is there a difference in the proportions?
Step 1 State the hypotheses and identify the claim.
Step 2 Find the critical value(s) from the appropriate table.
Step 3 Compute the test value and determine the P-value.
Step 4 Make the decision to reject or not reject the null hypothesis.
Step 5 Summarize the results.
Confidence Interval for the Difference Between Two Proportions
\[ (\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} \]
Example: Let’s find the 95% confidence interval for the xylitol problem.
First, note that \(\hat{p}_1 = \dfrac{68}{165} = 0.412\), \(\hat{p}_2 = \dfrac{46}{159} = 0.289\) and \(z_{\alpha/2} = 1.96\).
\[ (\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} = (0.412 - 0.289) \pm 1.96\sqrt{\frac{0.412(1 - 0.412)}{165} + \frac{0.289(1 - 0.289)}{159}} = 0.123 \pm 0.103 = (0.020, 0.226) \]
We can say with 95% confidence that the proportion of ear infections among children receiving xylitol is between 0.020 and 0.226 lower (that is, 2.0 to 22.6 percentage points lower) than among those receiving placebo.
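The interval can be checked with the Python sketch below. Note that, following the formula above, the interval uses the separate sample proportions rather than the pooled \(\bar{p}\) from the hypothesis test. The helper name `proportion_diff_interval` is an illustrative assumption.

```python
from math import sqrt
from statistics import NormalDist


def proportion_diff_interval(x1, n1, x2, n2, confidence=0.95):
    """Confidence interval for p1 - p2 using unpooled sample proportions."""
    p1, p2 = x1 / n1, x2 / n2
    z_mult = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_mult * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) - margin, (p1 - p2) + margin


# Group 1 = placebo (68 of 165), group 2 = xylitol (46 of 159).
low, high = proportion_diff_interval(68, 165, 46, 159)
print(round(low, 3), round(high, 3))  # 0.02 0.226
```

The endpoints are differences of proportions, so multiplying by 100 gives percentage points (about 2.0 to 22.6), not 0.2% to 2.26%.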
Example: Let’s find the 90% confidence interval for the health care problem.
Decision Tree for Deciding Which Hypothesis Test to Use: