Goodness of Fit Tests
Section 8.5

Cathy Poliak, Ph.D.
cathy@math.uh.edu

Office hours: T Th 2:30 pm - 5:15 pm 620 PGH

Department of Mathematics
University of Houston

April 21, 2016

Cathy Poliak, Ph.D. cathy@math.uh.edu Office hours: T Th 2:30 pm - 5:15 pm 620 PGH (Department of Mathematics University of Houston )Section8.5 April 21, 2016 1 / 20

Outline

1 Beginning Questions

2 Beginning Example

3 Chi-Square

4 Examples

Popper Set Up

Fill in all of the proper bubbles.

Use a #2 pencil.

This is popper number 22.

Steps of a Significance Test

When performing a significance test, we follow these steps:

1. Check assumptions.

2. State the null and alternative hypotheses.

3. Graph the rejection region, labeling the critical values.

4. Calculate the test statistic.

5. Find the p-value. If this answer is less than the significance level, α, we can reject the null hypothesis in favor of the alternative hypothesis.

6. Give your conclusion using the context of the problem. When stating the conclusion, give results with a confidence of (1 − α)(100)%.

What if we are not given α?

If the P-value for testing H0 is less than:

0.1, we have some evidence that H0 is false.

0.05, we have strong evidence that H0 is false.

0.01, we have very strong evidence that H0 is false.

0.001, we have extremely strong evidence that H0 is false.

If the P-value is greater than 0.1, we do not have any evidence that H0 is false.
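The scale above can be written as a small lookup in R. This is only a sketch: the function name `evidence_against_h0` is made up for illustration, and treating a P-value of exactly 0.1 as "no evidence" is an assumption, since the slide does not say which side that boundary falls on.

```r
# Hypothetical helper: translate a P-value into the wording used on this slide.
evidence_against_h0 <- function(p_value) {
  stopifnot(is.numeric(p_value), length(p_value) == 1, p_value >= 0, p_value <= 1)
  if (p_value < 0.001) {
    "extremely strong evidence"
  } else if (p_value < 0.01) {
    "very strong evidence"
  } else if (p_value < 0.05) {
    "strong evidence"
  } else if (p_value < 0.1) {
    "some evidence"
  } else {
    # Assumption: a P-value of exactly 0.1 is grouped with "greater than 0.1"
    "no evidence"
  }
}

evidence_against_h0(0.35)
evidence_against_h0(0.03)
```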

Popper #22 Questions

1. In a hypothesis test, if the computed P-value is 0.35, our decision is to
a) retest with a different sample.
b) fail to reject the null hypothesis.
c) reject the null hypothesis.
d) accept the null hypothesis.

2. Consumer Reports (January 1993) stated that the mean retail cost of an AT&T model 3730 cellular phone was $600. A random sample of 10 stores in Los Angeles had a mean cost of $586.5 with standard deviation of $26.77. Does this indicate that the mean cost in Los Angeles is less than $600? To answer this question, which test should be used?
a) One Sample T Test for Means
b) χ2 Goodness of Fit Test
c) Two Sample T Test for Means
d) One Sample Z Test for Means

Candy

Mars Inc. claims that they produce M&Ms with the following distribution:

Brown 30%    Red 20%     Yellow 20%
Orange 10%   Green 10%   Blue 10%

A bag of M&Ms was randomly selected from the grocery store shelf, and the color counts were:

Brown 14    Red 14     Yellow 5
Orange 7    Green 6    Blue 10

We want to know if the distribution of colors is the same as the manufacturer’s claim.

Goodness-of-fit Test

This is a test to see how well the sample proportions of categories "match up" with the known population proportions.

The Chi-square goodness-of-fit test extends inference on proportions to more than two proportions by enabling us to determine if a particular population distribution has changed from a specified form.

Hypotheses:
H0: The proportions are the same as what is claimed.
Ha: At least one proportion is different from what is claimed.

This would be better in the context of the problem. For example, in our M&Ms test:

H0: The distribution of candy colors is as the manufacturer claims.
Ha: The distribution of candy colors is not what the manufacturer claims.

Chi-Square Test

Test Statistic: The chi-square statistic is a measure of how much the observed cell counts diverge from the expected cell counts. To calculate it for each problem, you will make a table with the following headings:

Observed Counts (O) | Expected Counts (E) | (O − E)^2 / E

The sum of the third column is called the Chi-square test statistic, χ2.

$$\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$$

Where expected counts = total count × proportion of each category.

Chi-square of M&Ms

Color  | Observed Counts (O) | Proportions | Expected Counts (E) | (O − E)^2 / E
Brown  | 14                  | 0.3         |                     |
Red    | 14                  | 0.2         |                     |
Yellow | 5                   | 0.2         |                     |
Orange | 7                   | 0.1         |                     |
Green  | 6                   | 0.1         |                     |
Blue   | 10                  | 0.1         |                     |
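One way to fill in the two empty columns is to let R do the arithmetic. The sketch below uses the counts and claimed proportions from this slide; the variable names (`observed`, `claimed`, `gof_table`, and so on) are chosen only for illustration.

```r
# Observed M&M color counts from the sampled bag
observed <- c(Brown = 14, Red = 14, Yellow = 5, Orange = 7, Green = 6, Blue = 10)
# Proportions claimed by the manufacturer
claimed <- c(Brown = 0.3, Red = 0.2, Yellow = 0.2, Orange = 0.1, Green = 0.1, Blue = 0.1)

# Expected count = total count x claimed proportion
expected <- sum(observed) * claimed

# Contribution of each color to the statistic: (O - E)^2 / E
contribution <- (observed - expected)^2 / expected

# Assemble the table from the slide, one row per color
gof_table <- data.frame(O = observed, p = claimed, E = expected,
                        contribution = contribution)
print(round(gof_table, 4))

# The chi-square statistic is the sum of the last column
chi_sq <- sum(contribution)
chi_sq
```

The total should agree with the `X-squared = 8.4345` reported by `chisq.test` for the same data.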

Chi-square

Chi-square distributions have only positive values and are skewed right.

For a goodness-of-fit test, the degrees of freedom are n − 1, where n is the number of categories.

As the degrees of freedom increase, the distribution becomes more like a Normal distribution.

The total area under the χ2 curve is 1.

To find the area under the curve to the right of x:
Table provided
In R: 1 - pchisq(x, df)
In TI-83(84): χ2cdf(x, 1e99, df)
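As a quick illustration of the R route, the sketch below finds the upper-tail area for the M&Ms statistic (8.4345 with 5 degrees of freedom, taken from the R output elsewhere in these slides). The 0.05 significance level used for the critical value is an assumption made for the example, not something the slide specifies.

```r
x  <- 8.4345   # chi-square test statistic for the M&Ms bag
df <- 5        # 6 color categories - 1

# Area to the right of x, as written on the slide
p_upper <- 1 - pchisq(x, df)

# Equivalent call that asks for the upper tail directly
p_upper_alt <- pchisq(x, df, lower.tail = FALSE)

# Critical value that cuts off an upper-tail area of 0.05 (assumed alpha)
crit <- qchisq(0.95, df)

p_upper
p_upper_alt
crit
```

Both tail calculations give the same area; the `lower.tail = FALSE` form tends to be more accurate when the P-value is very small.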

Assumptions for a Chi-Square Goodness-of-fit Test

1. The sample must be a simple random sample (SRS) from the population of interest.

2. The population size is at least 10 times the size of the sample.

3. All expected cell counts must be at least 5.
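The third assumption is easy to check numerically. A minimal sketch for the M&Ms bag, using the counts and claimed proportions from the earlier slide (variable names are illustrative):

```r
observed <- c(14, 14, 5, 7, 6, 10)          # Brown, Red, Yellow, Orange, Green, Blue
claimed  <- c(0.3, 0.2, 0.2, 0.1, 0.1, 0.1) # manufacturer's proportions

# Expected count for each color = total count x claimed proportion
expected <- sum(observed) * claimed
expected

# TRUE only if every expected cell count is at least 5
all_large_enough <- all(expected >= 5)
all_large_enough
```

Note that the check applies to the expected counts, not the observed ones: Yellow has an observed count of exactly 5, but its expected count is 11.2.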

Is the manufacturer's claim correct?

Using R

chisq.test(c(list of observed values), correct = FALSE, p = c(list of proportions))

If we are not given a list of proportions, then each p = 1/n, where n is the number of categories. That is the default in R, so we do not need to give that information.

> chisq.test(c(14,14,5,7,6,10),correct=FALSE,p=c(.3,.2,.2,.1,.1,.1))

Chi-squared test for given probabilities

data: c(14, 14, 5, 7, 6, 10)
X-squared = 8.4345, df = 5, p-value = 0.1339
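The printed output can also be read off the returned object, which helps when the decision step should be scripted. The sketch below stores the result and pulls out its `statistic`, `parameter` (degrees of freedom), `p.value`, and `expected` components; the significance level of 0.05 is an assumption for illustration.

```r
# Same call as on the slide, with the result saved instead of printed
result <- chisq.test(c(14, 14, 5, 7, 6, 10), correct = FALSE,
                     p = c(.3, .2, .2, .1, .1, .1))

result$statistic   # chi-square test statistic
result$parameter   # degrees of freedom
result$p.value     # P-value
result$expected    # expected counts under the claimed proportions

# Decision at an assumed significance level of 0.05
alpha  <- 0.05
reject <- result$p.value < alpha
reject
```

Here `reject` is FALSE, so at that level we fail to reject the manufacturer's claimed distribution.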

Zodiac Signs

Does your zodiac sign determine how successful you will be in later life? Fortune magazine collected the zodiac signs of 256 heads of the largest 400 companies. The following are the number of births for each sign:

Sign        | Births
Aries       | 23
Taurus      | 20
Gemini      | 18
Cancer      | 23
Leo         | 20
Virgo       | 19
Libra       | 18
Scorpio     | 21
Sagittarius | 19
Capricorn   | 22
Aquarius    | 24
Pisces      | 29

From: Intro Stats, De Veaux, Velleman, Bock. 2nd Edition, Pearson, pg 604.

2. Hypotheses

H0: The number of births is the same across the zodiac signs.

Ha: The number of births is not the same across the zodiac signs.

3 & 4. Chi-square Test statistic and P-value
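A sketch of how the statistic and P-value for the zodiac data could be obtained in R. No proportions are supplied, so `chisq.test` falls back on equal proportions of 1/12, which matches the null hypothesis of equal births across signs. The vector name `births` is chosen for illustration.

```r
# Number of executives born under each sign, from the table in this example
births <- c(Aries = 23, Taurus = 20, Gemini = 18, Cancer = 23,
            Leo = 20, Virgo = 19, Libra = 18, Scorpio = 21,
            Sagittarius = 19, Capricorn = 22, Aquarius = 24, Pisces = 29)

# With no p argument, every category gets probability 1/12
zodiac <- chisq.test(births)

zodiac$expected    # 256 / 12 for every sign
zodiac$statistic   # chi-square test statistic
zodiac$parameter   # degrees of freedom: 12 signs - 1
zodiac$p.value     # upper-tail area
```

The statistic comes out to about 5.09 on 11 degrees of freedom, and the P-value is far above 0.1, so by the evidence scale in these slides the data give no evidence against equal births across the signs.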

5 & 6. Decision and Conclusion

Popper #22 Questions

Mars Inc. claims that they produce M&Ms with the following distribution:

Brown 30%    Red 20%     Yellow 20%
Orange 10%   Green 10%   Blue 10%

A bag of M&Ms was randomly selected from the grocery store shelf, and the color counts were:

Brown 25     Red 23      Yellow 21
Orange 13    Green 15    Blue 14

3. Using the χ2 goodness of fit test to determine if the proportion of M&Ms is what is claimed, what is the test statistic?
a) χ2 = 9.231
b) χ2 = 2.716
c) χ2 = 4.616
d) χ2 = 1.960
