stats chapter 13

Chapter 13

Comparing Two Population Parameters

13.1 COMPARING TWO MEANS

Two-Sample Problems

The goal of this type of inference
• compare the responses of two treatments
-or-
• compare the characteristics of two populations

• Separate samples from each population
• Responses of each group are independent of those in the other group

Before We Begin

• This is another set of PHANTOMS procedures

• It is important to note that “two populations” means that there is no overlap in the samples

• The sample sizes do not need to be equal

Hypotheses

There are two styles of writing hypotheses

Style 1
H0: μ1 = μ2

Ha: μ1 ≠ μ2, or

Ha: μ1 > μ2, or

Ha: μ1 < μ2

Hypotheses


Style 2
H0: μ1 - μ2 = 0

Ha: μ1 - μ2 ≠ 0, or

Ha: μ1 - μ2 > 0 (this implies μ1 > μ2), or

Ha: μ1 - μ2 < 0 (this implies μ1 < μ2)

This style is more versatile since it allows you to use a difference other than zero

Assumptions

Simple Random Sample
Each sample must be from an SRS

Independence
Samples may not influence each other
No paired data!
N1 > 10n1 and N2 > 10n2 (if sampling w/o replacement)

Assumptions

Normality (of sampling distribution)
large samples (n1 > 30 and n2 > 30)
this is the Central Limit Theorem

-OR-
medium samples (15 < n1 < 30 and 15 < n2 < 30)
- Histogram symmetric or slight skew and single peak
- Normal probability plots for both samples are linear
- No outliers

-OR-

Assumptions

Normality (of sampling distribution)
small samples (n1 < 15 and n2 < 15)
- Histogram symmetric and single peak
- Normal probability plots for both samples are linear
- No outliers

2-sample test statistic

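z-tests

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

t-tests

$$t_{df} = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

df = smaller of n1 - 1 or n2 - 1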
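A minimal Python sketch of the t statistic above, working from summary statistics. The function name `two_sample_t` and the numbers passed to it are hypothetical, chosen only so the arithmetic is easy to follow; the df uses the conservative "smaller of n1 - 1 or n2 - 1" rule from this slide.

```python
import math

def two_sample_t(xbar1, s1, n1, xbar2, s2, n2, null_diff=0.0):
    """Return (t, df) for the unpooled two-sample t statistic.

    null_diff is the hypothesized value of mu1 - mu2 (0 by default).
    """
    # Standard error of xbar1 - xbar2
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    t = ((xbar1 - xbar2) - null_diff) / se
    # Conservative degrees of freedom: smaller of n1 - 1 and n2 - 1
    df = min(n1 - 1, n2 - 1)
    return t, df

# Hypothetical summary statistics (not from the slides)
t, df = two_sample_t(xbar1=10, s1=3, n1=9, xbar2=8, s2=4, n2=16)
print(round(t, 4), df)
```

Passing a nonzero `null_diff` corresponds to the "difference other than zero" that Style 2 hypotheses allow.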

Example 13.2

Researchers designed a randomized comparative experiment to establish the relationship between calcium intake and blood pressure in black men. Group 1 (n1 = 10) took a calcium supplement; Group 2 (n2 = 11) took a placebo. The response is the decrease in systolic blood pressure.

Group 1: 7, -4, 18, 17, -3, -5, 1, 10, 11, -2
Group 2: -1, 12, -1, -3, 3, -5, 5, 2, -11, -1, -3

Example 13.2

Parameter
μ1 - μ2 = difference in the average decrease in systolic blood pressure in healthy black men between the calcium regimen and the placebo regimen

Statistic
xbar1 - xbar2 = difference in the average decrease in systolic blood pressure in the two samples between the calcium regimen and the placebo regimen

Example 13.2

Hypotheses
H0: μ1 - μ2 = 0

Ha: μ1 - μ2 > 0

Example 13.2

Assumptions
Simple Random Sample
We are told that both samples come from a randomized design

Independence
Both samples are independent, and
N1 > 10(n1) = 10(10) = 100, N2 > 10(n2) = 10(11) = 110
The population of black men is greater than 110

Example 13.2

Assumptions (cont)
[Figure: plots of Sample 1 and Sample 2]

Example 13.2

Assumptions (cont)
Normality
Both samples are single peaked with moderate skewness and approximately normal with no outliers.
Although sample 1 shows some skewness, the t-procedures are robust enough to handle this skew.

Example 13.2

Name of Test
We will conduct a 2-sample t-test for population means

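Test Statistic

n1 = 10, xbar1 = 5, s1 = 8.7433
n2 = 11, xbar2 = -0.2727, s2 = 5.9007
df = 9

$$t_{9} = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{(5 - (-0.2727)) - 0}{\sqrt{\dfrac{8.7433^2}{10} + \dfrac{5.9007^2}{11}}} = 1.604$$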
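The test statistic for Example 13.2 can be checked directly from the raw data. This is a sketch using only Python's standard library; the variable names are illustrative. Note that the Group 2 mean works out to a negative value, -0.2727, so the observed difference is 5 - (-0.2727) = 5.2727.

```python
import math
import statistics

# Decrease in systolic blood pressure, from Example 13.2
group1 = [7, -4, 18, 17, -3, -5, 1, 10, 11, -2]        # calcium
group2 = [-1, 12, -1, -3, 3, -5, 5, 2, -11, -1, -3]    # placebo

n1, n2 = len(group1), len(group2)
xbar1, xbar2 = statistics.mean(group1), statistics.mean(group2)
s1, s2 = statistics.stdev(group1), statistics.stdev(group2)  # sample SDs

# Unpooled standard error and t statistic under H0: mu1 - mu2 = 0
se = math.sqrt(s1**2 / n1 + s2**2 / n2)
t_stat = ((xbar1 - xbar2) - 0) / se
df = min(n1 - 1, n2 - 1)  # conservative df

print(f"xbar1={xbar1:.4f} s1={s1:.4f} xbar2={xbar2:.4f} s2={s2:.4f}")
print(f"t={t_stat:.3f} df={df}")
```

The printed values should agree with the summary statistics and t = 1.604 on the slide, up to rounding.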

Example 13.2

P Value
P-value = P(t9 > 1.604) = 0.0716

Decision
Fail to reject H0 at the 5% significance level
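The slides get this P-value from a calculator. As a rough cross-check without any statistics library, the upper-tail area of the t distribution can be approximated by numerically integrating its density. The helper names `t_pdf` and `t_upper_tail` are hypothetical, and Simpson's rule is just one convenient choice of integration method.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule on [0, t]."""
    if t < 0:
        raise ValueError("this sketch only handles t >= 0")
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    # By symmetry, P(T > 0) = 0.5
    return 0.5 - area_0_to_t

p_value = t_upper_tail(1.604, 9)
print(round(p_value, 4))
```

The result should land close to the 0.0716 reported above, which is larger than 0.05 and so leads to the same "fail to reject" decision.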

Example 13.2

Summary
If H0 were true, approximately 7% of the time samples of size 10 and 11 would produce a difference at least as extreme as 5.2727.
Since this p-value is not less than the presumed α = 0.05, we will fail to reject H0.

We do not have enough evidence to conclude that calcium intake reduces the average blood pressure in healthy black men.

Confidence Intervals

Confidence interval for the difference of two means

$$C = (\bar{x}_1 - \bar{x}_2) \pm t^{*} \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$$
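As an illustration of the interval formula, here is a short Python sketch applied to the summary statistics of Example 13.2. The slides do not compute this interval, so treat it as a hypothetical extension: the function name is made up, and t* = 2.262 is the standard table value for 95% confidence with the conservative df = 9.

```python
import math

def two_sample_t_interval(xbar1, s1, n1, xbar2, s2, n2, t_star):
    """Return (low, high) for (xbar1 - xbar2) +/- t* * SE."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    diff = xbar1 - xbar2
    margin = t_star * se
    return diff - margin, diff + margin

# Example 13.2 summary statistics; xbar2 = -0.2727 is the Group 2 mean.
# t_star = 2.262 assumes 95% confidence and df = 9 (from a t table).
low, high = two_sample_t_interval(5, 8.7433, 10, -0.2727, 5.9007, 11, t_star=2.262)
print(round(low, 3), round(high, 3))
```

The interval contains 0, which is consistent with failing to reject H0 in the example, although a 95% two-sided interval and a one-sided test at the 5% level are not exactly equivalent.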

Robustness

2-sample t-procedures are more robust than one sample procedures. They can be used for sample sizes as small as n1 = n2 = 5 when the samples have similar shapes.

Guidelines for using t-procedures
• n1 + n2 < 15: data must be approx normal, no outliers
• n1 + n2 > 15: data can have slight skew, no outliers
• n1 + n2 > 30: data can have skew

Degrees of Freedom

• We have been using the smaller of n1 - 1 or n2 - 1 to determine the df

• This ensures that our p-value is at least as large as the one from the more exact calculation, and that our confidence intervals are at least as wide.

• These are “worst case scenario” calculations
• There is a more exact df formula on p792
• Your calculator also uses a df formula for two samples
• You do not need to memorize these other formulas!
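The formula on the referenced page is not reproduced in these slides. A commonly used "more exact" df for unpooled two-sample t procedures is the Welch-Satterthwaite approximation, and the sketch below assumes that is the formula meant. The function name `welch_df` is hypothetical; the inputs are the summary statistics from Example 13.2.

```python
def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite approximate degrees of freedom (assumed formula)."""
    a = s1**2 / n1
    b = s2**2 / n2
    return (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))

# Example 13.2 summary statistics
df_exact = welch_df(8.7433, 10, 5.9007, 11)
df_conservative = min(10 - 1, 11 - 1)

print(round(df_exact, 2), df_conservative)
```

The approximate df comes out larger than the conservative df of 9, which is why the conservative rule gives a somewhat larger p-value and a wider interval.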

Calculators

• The tests we are using are located in the [STAT] -> “TESTS” menu

• 2-SampZTest = two sample z-test for means
• 2-SampTTest = two sample t-test for means
• 2-SampZInt = two sample z Confidence Interval for difference of means
• 2-SampTInt = two sample t Confidence Interval for difference of means

Calculators

• Freq1 and Freq2 should be set to “1”
• Pooled should be set to “NO”

13.2 COMPARING TWO PROPORTIONS

2-Sample Inference for Proportions

• We are testing to see if
– Two populations have the same proportion
OR
– A treatment affects the proportion

• Remember: this is not a procedure for paired data (matched pair design/pre- and post-test)

Combined Proportion

• One of the underlying assumptions of the test (under H0) is that the two sample proportions estimate the same population proportion.

• The test makes use of the “combined proportion” as below:

$$\hat{p}_c = \frac{X_1 + X_2}{n_1 + n_2} = \frac{\text{combined successes}}{\text{combined individuals}}$$

Hypotheses


Style 1
H0: p1 = p2

Ha: p1 ≠ p2, or

Ha: p1 > p2, or

Ha: p1 < p2

Hypotheses


Style 2
H0: p1 - p2 = 0

Ha: p1 - p2 ≠ 0, or

Ha: p1 - p2 > 0 (this implies p1 > p2), or

Ha: p1 - p2 < 0 (this implies p1 < p2)

This style is more versatile since it allows you to use a difference other than zero

Assumptions

• Simple Random Sample
Both samples must be viewed as an SRS from their respective population or two groups from a randomized experiment

• Independence
N1 > 10n1 and N2 > 10n2

• Normality
n1(pchat) > 5, n1(qchat) > 5 and n2(pchat) > 5, n2(qchat) > 5

Test Statistic

• The test statistic for proportions is always from the Normal distribution

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\hat{p}_c \hat{q}_c \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

Example 13.9

A study was conducted to find the effects of preschool programs on poor children. Group 1 (n1 = 61) had no preschool, and Group 2 (n2 = 62) had similar backgrounds and attended preschool. The study measured the need for social services when the children became adults. After investigation it was found that p1hat = 49/61 and p2hat = 38/62.
Do the data support the claim that preschool reduced the social services claimed?

Example 13.9

Parameters
• p1 = proportion of adults who did not receive preschool and filed for social services
• p2 = proportion of adults who received preschool and filed for social services
• p1hat = proportion of adults in group 1, who did not receive preschool, that filed for social services
• p2hat = proportion of adults in group 2, who received preschool, that filed for social services

Example 13.9

Hypotheses
• H0: p1 – p2 = 0
• Ha: p1 – p2 > 0
• The proportion for the non-preschool group is greater than that for the preschool group

$$\hat{p}_c = \frac{49 + 38}{61 + 62} = \frac{87}{123} = 0.7073$$

Example 13.9

Assumptions
Simple Random Sample
Since the measurements are from a randomized experiment, we can assume that they are from an SRS

Independence
N1 > 10(61) = 610: more than 610 do not attend preschool
N2 > 10(62) = 620: more than 620 attend preschool

Normality
61(.70) = 42.7 > 5, 61(.30) = 18.3 > 5
62(.70) = 43.4 > 5, 62(.30) = 18.6 > 5
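The Normality check above can be sketched in a few lines of Python. The function name `two_prop_conditions` is hypothetical, and the code uses the unrounded combined proportion 87/123 rather than the rounded .70 and .30 on the slide, so the expected counts differ slightly from those shown.

```python
def two_prop_conditions(x1, n1, x2, n2, threshold=5):
    """Check n*pchat > threshold and n*qchat > threshold for both samples."""
    p_c = (x1 + x2) / (n1 + n2)   # combined proportion
    q_c = 1 - p_c
    counts = [n1 * p_c, n1 * q_c, n2 * p_c, n2 * q_c]
    return p_c, counts, all(c > threshold for c in counts)

# Counts from Example 13.9: 49 of 61 and 38 of 62
p_c, counts, ok = two_prop_conditions(49, 61, 38, 62)
print(round(p_c, 4), [round(c, 1) for c in counts], ok)
```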

Example 13.9

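Name of Test
2-Sample Z-test for proportions

Test Statistic

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\hat{p}_c \hat{q}_c \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{(0.803 - 0.613) - 0}{\sqrt{(0.7073)(0.2927)\left(\dfrac{1}{61} + \dfrac{1}{62}\right)}} = 2.316$$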
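The same calculation in Python, as a sketch with illustrative variable names. The slide's 2.316 comes from the rounded proportions 0.803 and 0.613; carrying full precision gives a value closer to 2.32, so both versions are computed here.

```python
import math

# Counts from Example 13.9
x1, n1 = 49, 61   # no preschool
x2, n2 = 38, 62   # preschool

p1_hat, p2_hat = x1 / n1, x2 / n2
p_c = (x1 + x2) / (n1 + n2)          # combined proportion
q_c = 1 - p_c

# Standard error under H0: p1 - p2 = 0, using the combined proportion
se = math.sqrt(p_c * q_c * (1 / n1 + 1 / n2))

z = ((p1_hat - p2_hat) - 0) / se          # full precision
z_rounded = ((0.803 - 0.613) - 0) / se    # proportions rounded as on the slide

print(round(z, 3), round(z_rounded, 3))
```

The small gap between the two values is rounding only and does not change the decision.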

Example 13.9

P-value
P-value = P(z > 2.316) = 0.0103

Make Decision
Reject H0
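For readers without the calculator, the upper-tail Normal probability can be reproduced with `statistics.NormalDist` from Python's standard library (available in Python 3.8 and later). This is only a cross-check of the value above.

```python
from statistics import NormalDist

z = 2.316
# Upper-tail area under the standard Normal curve: P(Z > z)
p_value = 1 - NormalDist().cdf(z)

alpha = 0.05
print(round(p_value, 4), "reject H0" if p_value < alpha else "fail to reject H0")
```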

Example 13.9

Summary
If H0 were true, approximately 1% of the time two samples of size 61 and 62 would produce a difference of at least 0.190.
Since our p-value is less than an α of 0.05, we will reject our H0.

Our evidence supports the claim that enrollment in preschool reduces the proportion of adults who file social services claims.


The confidence interval for the difference between two proportions is given as:

$$CI = (\hat{p}_1 - \hat{p}_2) \pm z^{*} \sqrt{\dfrac{\hat{p}_1 \hat{q}_1}{n_1} + \dfrac{\hat{p}_2 \hat{q}_2}{n_2}}$$

• Notice that the Confidence Interval does not use pchat and qchat.


Assumptions
• Simple Random Sample
Both samples must be viewed as an SRS from their respective population or two groups from a randomized experiment

• Independence
N1 > 10n1 and N2 > 10n2

• Normality
n1(p1hat) > 5, n1(q1hat) > 5 and n2(p2hat) > 5, n2(q2hat) > 5 (again, not pchat or qchat)
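A hedged Python sketch of the two-proportion interval, using each sample's own proportion in the standard error rather than the combined one. The slides do not compute an interval for Example 13.9, so applying it to those counts is an illustration only; the function name is hypothetical and z* = 1.96 assumes 95% confidence.

```python
import math

def two_prop_z_interval(x1, n1, x2, n2, z_star=1.96):
    """Return (low, high) for (p1hat - p2hat) +/- z* * SE, unpooled SE."""
    p1, p2 = x1 / n1, x2 / n2
    q1, q2 = 1 - p1, 1 - p2
    se = math.sqrt(p1 * q1 / n1 + p2 * q2 / n2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se

# Counts from Example 13.9, assuming a 95% confidence level
low, high = two_prop_z_interval(49, 61, 38, 62)
print(round(low, 3), round(high, 3))
```

Under these assumptions the whole interval lies above 0, which is consistent with rejecting H0 in the example.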

Calculators

The tests we are using are located in the [STAT] -> “TESTS” menu

• 2-PropZTest = 2 proportion z-test
• 2-PropZInt = 2 proportion confidence interval
