Agresti/Franklin Statistics, 1 of 114 Section 8.2 Significance Tests About Proportions

Upload: whitney-ford

Post on 17-Dec-2015


TRANSCRIPT

Page 1: Agresti/Franklin Statistics, 1 of 114  Section 8.2 Significance Tests About Proportions

Agresti/Franklin Statistics, 1 of 114

Section 8.2

Significance Tests About Proportions


Example: Are Astrologers’ Predictions Better Than Guessing?

Scientific “test of astrology” experiment:

• For each of 116 adult volunteers, an astrologer prepared a horoscope based on the positions of the planets and the moon at the moment of the person’s birth

• Each adult subject also filled out a California Personality Index Survey


Example: Are Astrologers’ Predictions Better Than Guessing?

For a given adult, his or her birth data and horoscope were shown to an astrologer together with the results of the personality survey for that adult and for two other adults randomly selected from the group

The astrologer was asked which personality chart of the 3 subjects was the correct one for that adult, based on his or her horoscope


Example: Are Astrologers’ Predictions Better Than Guessing?

28 astrologers were randomly chosen to take part in the experiment

The National Council for Geocosmic Research claimed that the probability of a correct guess on any given trial in the experiment was larger than 1/3, the value for random guessing


Example: Are Astrologers’ Predictions Better Than Guessing?

Put this investigation in the context of a significance test by stating null and alternative hypotheses


Example: Are Astrologers’ Predictions Better Than Guessing?

With random guessing, p = 1/3

The astrologers’ claim: p > 1/3

The hypotheses for this test:

• H0: p = 1/3

• Ha: p > 1/3


What Are the Steps of a Significance Test about a Population Proportion?

Step 1: Assumptions

• The variable is categorical

• The data are obtained using randomization

• The sample size is sufficiently large that the sampling distribution of the sample proportion is approximately normal:

• np0 ≥ 15 and n(1 - p0) ≥ 15, where p0 is the null hypothesis value of the proportion


What Are the Steps of a Significance Test about a Population Proportion?

Step 2: Hypotheses

The null hypothesis has the form:

• H0: p = p0

The alternative hypothesis has the form:

• Ha: p > p0 (one-sided test) or

• Ha: p < p0 (one-sided test) or

• Ha: p ≠ p0 (two-sided test)


What Are the Steps of a Significance Test about a Population Proportion?

Step 3: Test Statistic

The test statistic measures how far the sample proportion falls from the null hypothesis value, p0, relative to what we’d expect if H0 were true

The test statistic is:

$$z = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$$
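As a minimal sketch, the test statistic can be written as a small Python function. The function name `one_proportion_z` and the numbers in the example call are made up for illustration and do not come from the slides.

```python
import math

def one_proportion_z(p_hat, p0, n):
    """z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)."""
    # The standard error uses the null value p0, not the sample proportion.
    se0 = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / se0

# Hypothetical data: sample proportion 0.55 from n = 400, testing H0: p = 0.50
z = one_proportion_z(0.55, 0.50, 400)
print(round(z, 2))
```

The standard error is computed under H0 because the test asks how unusual the sample proportion would be if the null value were the true one.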


What Are the Steps of a Significance Test about a Population Proportion?

Step 4: P-value

The P-value summarizes the evidence

It describes how unusual the data would be if H0 were true


What Are the Steps of a Significance Test about a Population Proportion?

Step 5: Conclusion

We summarize the test by reporting and interpreting the P-value


Example: Are Astrologers’ Predictions Better Than Guessing?

Step 1: Assumptions

• The data are categorical – each prediction falls in the category “correct” or “incorrect” prediction

• Each subject was identified by a random number. Subjects were randomly selected for each experiment.

• np0 = 116(1/3) = 38.7 ≥ 15

• n(1-p0) = 116(2/3) = 77.3 ≥ 15


Example: Are Astrologers’ Predictions Better Than Guessing?

Step 2: Hypotheses

• H0: p = 1/3

• Ha: p > 1/3


Example: Are Astrologers’ Predictions Better Than Guessing?

Step 3: Test Statistic

• In the actual experiment, the astrologers were correct with 40 of their 116 predictions (a success rate of 0.345)

$$z = \frac{0.345 - 1/3}{\sqrt{(1/3)(2/3)/116}} = 0.26$$


Example: Are Astrologers’ Predictions Better Than Guessing?

Step 4: P-value

The P-value is 0.40, the right-tail probability above z = 0.26 under the standard normal curve
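The arithmetic for the astrology example can be checked with a short Python sketch. It assumes the normal approximation and uses the standard library's complementary error function for the tail area; the helper name `normal_right_tail` is illustrative.

```python
import math

def normal_right_tail(z):
    """P(Z >= z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

n, successes, p0 = 116, 40, 1 / 3      # values from the astrology experiment
p_hat = successes / n
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
p_value = normal_right_tail(z)         # Ha: p > 1/3, so use the right tail
print(round(z, 2), round(p_value, 2))
```

The results should agree, after rounding, with the z of 0.26 and the P-value of 0.40 reported in the example.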


Example: Are Astrologers’ Predictions Better Than Guessing?

Step 5: Conclusion

The P-value of 0.40 is not especially small

It does not provide strong evidence against H0: p = 1/3

There is not strong evidence that astrologers have special predictive powers


How Do We Interpret the P-value?

A significance test analyzes the strength of the evidence against the null hypothesis

We start by presuming that H0 is true

The burden of proof is on Ha


How Do We Interpret the P-value?

The approach used in hypothesis testing is called proof by contradiction

To convince ourselves that Ha is true, we must show that data contradict H0

If the P-value is small, the data contradict H0 and support Ha


Two-Sided Significance Tests

A two-sided alternative hypothesis has the form Ha: p ≠ p0

The P-value is the two-tail probability under the standard normal curve

We calculate this by finding the tail probability in a single tail and then doubling it


Example: Dr Dog: Can Dogs Detect Cancer by Smell?

Study: investigate whether dogs can be trained to distinguish a patient with bladder cancer by smelling compounds released in the patient’s urine


Example: Dr Dog: Can Dogs Detect Cancer by Smell?

• Experiment:

•Each of 6 dogs was tested with 9 trials

• In each trial, one urine sample from a bladder cancer patient was randomly placed among 6 control urine samples


Example: Dr Dog: Can Dogs Detect Cancer by Smell?

Results:

In a total of 54 trials with the six dogs, the dogs made the correct selection 22 times (a success rate of 0.407)


Example: Dr Dog: Can Dogs Detect Cancer by Smell?

Does this study provide strong evidence that the dogs’ predictions were better or worse than with random guessing?


Example: Dr Dog: Can Dogs Detect Cancer by Smell?

Step 1: Check the sample size requirement

Is the sample size sufficiently large to use the hypothesis test for a population proportion?

• Is np0 ≥ 15 and n(1-p0) ≥ 15?

• 54(1/7) = 7.7 and 54(6/7) = 46.3. The first, np0, is not large enough

• We will see that the two-sided test is robust when this assumption is not satisfied
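The sample size check is simple arithmetic, so it is easy to script. The sketch below applies it to both examples in this section; the function name `large_sample_ok` and the `threshold` parameter are illustrative.

```python
def large_sample_ok(n, p0, threshold=15):
    """True when expected successes and failures under H0 are both at least `threshold`."""
    expected_successes = n * p0
    expected_failures = n * (1 - p0)
    return expected_successes >= threshold and expected_failures >= threshold

astrology_ok = large_sample_ok(116, 1 / 3)   # expected counts 38.7 and 77.3
dogs_ok = large_sample_ok(54, 1 / 7)         # expected counts 7.7 and 46.3
print(astrology_ok, dogs_ok)
```

The astrology experiment passes the check, while the dog experiment fails it because only about 7.7 correct selections are expected under H0.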


Example: Dr Dog: Can Dogs Detect Cancer by Smell?

Step 2: Hypotheses

• H0: p = 1/7

• Ha: p ≠ 1/7


Example: Dr Dog: Can Dogs Detect Cancer by Smell?

Step 3: Test Statistic

$$z = \frac{0.407 - 1/7}{\sqrt{(1/7)(6/7)/54}} = 5.6$$


Example: Dr Dog: Can Dogs Detect Cancer by Smell?

Step 4: P-value
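The transcript does not preserve the P-value shown on this slide. As a rough sketch, the two-sided P-value can be computed from the normal approximation by doubling the single-tail probability, as described for two-sided tests earlier in the section. Variable names are illustrative.

```python
import math

n, successes, p0 = 54, 22, 1 / 7       # values from the dog experiment
p_hat = successes / n
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# Ha: p != 1/7, so take one tail beyond |z| and double it.
one_tail = 0.5 * math.erfc(abs(z) / math.sqrt(2))
p_value = 2 * one_tail
print(round(z, 1), p_value)
```

With z near 5.6 the two-tail probability is far below 0.001. Because np0 is below 15 here, the normal approximation is only a guide.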


Example: Dr Dog: Can Dogs Detect Cancer by Smell?

Step 5: Conclusion

Since the P-value is very small and the sample proportion is greater than 1/7, the evidence strongly suggests that the dogs’ selections are better than random guessing


Summary of P-values for Different Alternative Hypotheses

Alternative Hypothesis    P-value
Ha: p > p0                Right-tail probability
Ha: p < p0                Left-tail probability
Ha: p ≠ p0                Two-tail probability
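The three rows of the summary map directly onto a small helper. This is a sketch under the normal approximation; the function names and the labels "greater", "less", and "two-sided" are illustrative choices, not notation from the slides.

```python
import math

def normal_cdf(z):
    """P(Z <= z) for a standard normal Z."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

def p_value_for(z, alternative):
    """P-value for a z statistic under the given alternative hypothesis."""
    if alternative == "greater":       # Ha: p > p0, right-tail probability
        return 1 - normal_cdf(z)
    if alternative == "less":          # Ha: p < p0, left-tail probability
        return normal_cdf(z)
    if alternative == "two-sided":     # Ha: p != p0, two-tail probability
        return 2 * (1 - normal_cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")

for alt in ("greater", "less", "two-sided"):
    print(alt, round(p_value_for(1.96, alt), 3))
```

The same z statistic gives three different P-values, so the alternative hypothesis has to be fixed before the P-value is computed.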


The Significance Level Tells Us How Strong the Evidence Must Be

Sometimes we need to make a decision about whether the data provide sufficient evidence to reject H0

Before seeing the data, we decide how small the P-value would need to be to reject H0

This cutoff point is called the significance level



Significance Level

The significance level is a number such that we reject H0 if the P-value is less than or equal to that number

In practice, the most common significance level is 0.05

When we reject H0 we say the results are statistically significant


Possible Decisions in a Test with Significance Level = 0.05

P-value    Decision about H0
≤ 0.05     Reject H0
> 0.05     Fail to reject H0
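The decision rule can be sketched as a single comparison. The function name `decide` and the default `alpha=0.05` are illustrative; the boundary case follows the definition above, which rejects H0 when the P-value is less than or equal to the significance level.

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule for a test with significance level `alpha`."""
    return "Reject H0" if p_value <= alpha else "Fail to reject H0"

print(decide(0.40))   # P-value from the astrology example
print(decide(0.05))   # exactly at the cutoff
```

The significance level should be chosen before the data are seen; the function only applies the cutoff.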


Report the P-value

Learning the actual P-value is more informative than learning only whether the test is “statistically significant at the 0.05 level”

The P-values of 0.01 and 0.049 are both statistically significant in this sense, but the first P-value provides much stronger evidence against H0 than the second


“Do Not Reject H0” Is Not the Same as Saying “Accept H0”

Analogy: Legal trial

• Null Hypothesis: Defendant is Innocent

• Alternative Hypothesis: Defendant is Guilty

• If the jury acquits the defendant, this does not mean that it accepts the defendant’s claim of innocence

• Innocence is plausible, because guilt has not been established beyond a reasonable doubt


One-Sided vs Two-Sided Tests

Things to consider in deciding on the alternative hypothesis:

• The context of the real problem

• In most research articles, significance tests use two-sided P-values

• Confidence intervals are two-sided


The Binomial Test for Small Samples

The test about a proportion assumes a normal sampling distribution for the sample proportion p̂ and for the z test statistic.

• It is a large-sample test that requires the expected numbers of successes and failures to be at least 15. In practice, the large-sample z test still performs quite well for two-sided alternatives, even with small samples.

• Warning: For one-sided tests, when p0 differs from 0.50, the large-sample test does not work well for small samples
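The slide names the binomial test without showing a calculation. As a rough sketch, assuming the exact test works with the binomial distribution of the number of successes under H0, a one-sided right-tail P-value is the probability of at least the observed count. The function name `binomial_right_tail` is illustrative, and the example reuses the dog data with a right-sided alternative purely for illustration.

```python
import math

def binomial_right_tail(k, n, p0):
    """Exact P(X >= k) when X ~ Binomial(n, p0)."""
    return sum(
        math.comb(n, x) * p0 ** x * (1 - p0) ** (n - x)
        for x in range(k, n + 1)
    )

# Dog experiment: 22 correct selections in 54 trials, H0: p = 1/7
exact_p = binomial_right_tail(22, 54, 1 / 7)
print(exact_p)
```

No normal approximation is involved, so the result does not depend on the expected counts being at least 15. For these data the exact right-tail probability is still well under 0.001.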


Section 8.3

Significance Tests about Means


What Are the Steps of a Significance Test about a Population Mean?

Step 1: Assumptions

• The variable is quantitative

• The data are obtained using randomization

• The population distribution is approximately normal. This is most crucial when n is small and Ha is one-sided.


What Are the Steps of a Significance Test about a Population Mean?

Step 2: Hypotheses

The null hypothesis has the form:

• H0: µ = µ0

The alternative hypothesis has the form:

• Ha: µ > µ0 (one-sided test) or

• Ha: µ < µ0 (one-sided test) or

• Ha: µ ≠ µ0 (two-sided test)


What Are the Steps of a Significance Test about a Population Mean?

Step 3: Test Statistic

• The test statistic measures how far the sample mean falls from the null hypothesis value µ0 relative to what we’d expect if H0 were true

• The test statistic is:

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
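A minimal sketch of the t statistic as a Python function follows. The function name `one_sample_t` and the numbers in the example call are made up for illustration and do not come from the slides.

```python
import math

def one_sample_t(x_bar, mu0, s, n):
    """t = (x_bar - mu0) / (s / sqrt(n)), to be compared with a t distribution on n - 1 df."""
    standard_error = s / math.sqrt(n)
    return (x_bar - mu0) / standard_error

# Hypothetical data: sample mean 52, sample standard deviation 8, n = 64, H0: mu = 50
t_stat = one_sample_t(52, 50, 8, 64)
print(t_stat)
```

Unlike the z statistic for a proportion, the standard error here is estimated from the sample standard deviation s, which is why the reference distribution is t rather than standard normal.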


What Are the Steps of a Significance Test about a Population Mean?

Step 4: P-value

• The P-value summarizes the evidence

• It describes how unusual the data would be if H0 were true


What Are the Steps of a Significance Test about a Population Mean?

Step 5: Conclusion

• We summarize the test by reporting and interpreting the P-value


Summary of P-values for Different Alternative Hypotheses

Alternative Hypothesis

P-value

Ha: µ > µ0 Right-tail probability

Ha: µ < µ0 Left-tail probability

Ha: µ ≠ µ0 Two-tail probability


Example: Mean Weight Change in Anorexic Girls

A study compared different psychological therapies for teenage girls suffering from anorexia

The variable of interest was each girl’s weight change: ‘weight at the end of the study’ – ‘weight at the beginning of the study’


Example: Mean Weight Change in Anorexic Girls

One of the therapies was cognitive therapy

In this study, 29 girls received the therapeutic treatment

The weight changes for the 29 girls had a sample mean of 3.00 pounds and standard deviation of 7.32 pounds



Example: Mean Weight Change in Anorexic Girls

How can we frame this investigation in the context of a significance test that can detect a positive or negative effect of the therapy?

Null hypothesis: “no effect”

Alternative hypothesis: therapy has “some effect”


Example: Mean Weight Change in Anorexic Girls

Step 1: Assumptions

• The variable (weight change) is quantitative

• The subjects were a convenience sample, rather than a random sample. The question is whether these girls are a good representation of all girls with anorexia.

• The population distribution is approximately normal


Example: Mean Weight Change in Anorexic Girls

Step 2: Hypotheses

• H0: µ = 0

• Ha: µ ≠ 0


Example: Mean Weight Change in Anorexic Girls

Step 3: Test Statistic

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{3.00 - 0}{7.32/\sqrt{29}} = 2.21$$


Example: Mean Weight Change in Anorexic Girls

Step 4: P-value

• Minitab Output

Test of mu = 0 vs not = 0

Variable   N   Mean    StDev    SE Mean   95% CI               T      P
wt_chg     29  3.000   7.3204   1.3594    (0.21546, 5.78454)   2.21   0.036
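The T and P columns can be approximated without statistical software. The sketch below uses only the Python standard library and integrates the t density numerically with Simpson's rule; that is a choice made here for self-containment, not how Minitab computes it. The function names are illustrative.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_two_sided_p(t, df, steps=2000):
    """Two-tail probability beyond |t|; `steps` must be even for Simpson's rule."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3     # P(0 < T < |t|)
    return 1 - 2 * area_zero_to_t

n, x_bar, s, mu0 = 29, 3.00, 7.32, 0.0   # summary statistics from the example
se = s / math.sqrt(n)
t = (x_bar - mu0) / se
p = t_two_sided_p(t, n - 1)
print(round(se, 4), round(t, 2), round(p, 3))
```

The rounded results should be close to the SE Mean, T, and P values in the output above. Small differences in the last digit can arise because the slides round the standard deviation to 7.32.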


Example: Mean Weight Change in Anorexic Girls

Step 5: Conclusion

• The small P-value of 0.036 provides considerable evidence against the null hypothesis (the hypothesis that the therapy had no effect)


Example: Mean Weight Change in Anorexic Girls

“The diet had a statistically significant positive effect on weight (mean change = 3 pounds, n = 29, t = 2.21, P-value = 0.04)”

The effect, however, may be small in practical terms

• 95% CI for µ: (0.2, 5.8) pounds


Results of Two-Sided Tests and Results of Confidence Intervals Agree

Conclusions about means using two-sided significance tests are consistent with conclusions using confidence intervals

• If P-value ≤ 0.05 in a two-sided test, a 95% confidence interval does not contain the H0 value

• If P-value > 0.05 in a two-sided test, a 95% confidence interval does contain the H0 value
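The agreement can be illustrated with the weight-change data. This sketch assumes the standard t-table critical value of 2.048 for 95% confidence with 28 degrees of freedom, hard-coded here rather than computed; variable names are illustrative.

```python
import math

n, x_bar, s, mu0 = 29, 3.00, 7.32, 0.0
t_crit = 2.048                     # t-table value: 95% confidence, df = 28 (assumed)

se = s / math.sqrt(n)
lower = x_bar - t_crit * se
upper = x_bar + t_crit * se
ci_excludes_null = not (lower <= mu0 <= upper)

t = (x_bar - mu0) / se
test_rejects = abs(t) > t_crit     # same event as a two-sided P-value below 0.05

print(round(lower, 1), round(upper, 1), ci_excludes_null, test_rejects)
```

Both checks compare the distance between the sample mean and µ0 with the same margin, t_crit × se, so they must reach the same conclusion. Here the interval of about (0.2, 5.8) excludes 0 and the test rejects H0 at the 0.05 level.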


What If the Population Does Not Satisfy the Normality Assumption?

For large samples (roughly about 30 or more) this assumption is usually not important

• The sampling distribution of x̄ is approximately normal regardless of the population distribution


What If the Population Does Not Satisfy the Normality Assumption?

In the case of small samples, we cannot assume that the sampling distribution of x̄ is approximately normal

• Two-sided inferences using the t distribution are robust against violations of the normal population assumption

• They still usually work well if the actual population distribution is not normal


Regardless of Robustness, Look at the Data

Whether n is small or large, you should look at the data to check for severe skew or for severe outliers

• In these cases, the sample mean could be a misleading measure