chapter 11 inference for tables: chi-square procedures 11.1 target goal:i can compute expected...


Upload: amber-miller

Post on 02-Jan-2016


TRANSCRIPT

Page 1: Chapter 11 Inference for Tables: Chi-Square Procedures 11.1 Target Goal:I can compute expected counts, conditional distributions, and contributions to

Chapter 11
Inference for Tables: Chi-Square Procedures

11.1

Target Goal: I can compute expected counts, conditional distributions, and contributions to the chi-square statistic.

h.w: pg. 621: 1, 3, 5, 9, 11

Page 2:

Test for Goodness of Fit

To analyze categorical data, we construct two-way tables and examine the counts or percents of the explanatory and response variables.

Count and record M&M colors per bag.

Expected count:

Page 3:

M&Ms Color Distribution (%) according to their website

                       Brown  Yellow  Red  Blue  Orange  Green
Plain                     13      14   13    24      20     16
Peanut                    12      15   12    23      23     15
Peanut Butter/Almond      10      20   10    20      20     20

Page 4:

We want to compare the observed counts to the expected counts.

The null hypothesis is that there is no difference between the observed and expected counts.

The alternative hypothesis is that there is a difference between the observed and expected counts.

Page 5:

Simulate count of M&M's bag or use own M&M's bag

Label:
1-13 Brown
14-27 Yellow
28-40 Red
41-64 Blue
65-84 Orange
85-00 Green

Math: Prb: randInt(0,99,50), sto in L1. Sort in ascending order and tally.
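The calculator steps above can also be sketched in Python. This is only an illustration of the same idea, not part of the original lesson: the function names and the seed are made up here, and the label 00 is assumed to correspond to a draw of 0 from randInt(0,99).

```python
import random
from collections import Counter

# Label ranges from the slide on a 1-100 scale; the label "00" is treated as 100.
LABELS = [
    (1, 13, "Brown"),
    (14, 27, "Yellow"),
    (28, 40, "Red"),
    (41, 64, "Blue"),
    (65, 84, "Orange"),
    (85, 100, "Green"),
]


def color_for(number):
    """Map a random integer 0-99 to a color; a draw of 0 stands for the label 00."""
    value = 100 if number == 0 else number
    for low, high, color in LABELS:
        if low <= value <= high:
            return color
    raise ValueError(f"number out of range: {number}")


def simulate_bag(size=50, seed=None):
    """Draw `size` integers from 0-99, sort them, and tally the colors."""
    rng = random.Random(seed)
    draws = sorted(rng.randint(0, 99) for _ in range(size))
    return Counter(color_for(n) for n in draws)


bag = simulate_bag(size=50, seed=1)  # the seed is arbitrary, chosen for repeatability
for _, _, color in LABELS:
    print(color, bag[color])
```

Each run with a different seed gives a different simulated bag, which is the point: the observed counts vary around the expected counts even when the stated percentages are correct.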

Page 6:

Chi-square statistic

It measures how well the observed counts fit the expected counts, assuming that the null hypothesis is true.

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

Go to Blank student notes.
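As a rough sketch of how the statistic is assembled, the Python below computes each category's contribution (O - E)^2 / E and adds them up. The expected counts use the plain M&Ms percentages from the website table for a bag of 50; the observed tallies are hypothetical numbers invented for illustration, and the function names are not from the lesson.

```python
def chi_square_contributions(observed, expected):
    """Return the per-category terms (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return [(o - e) ** 2 / e for o, e in zip(observed, expected)]


def chi_square_statistic(observed, expected):
    """Sum the per-category contributions to get the chi-square statistic."""
    return sum(chi_square_contributions(observed, expected))


# Plain M&Ms percentages from the website table.
percents = {"Brown": 13, "Yellow": 14, "Red": 13, "Blue": 24, "Orange": 20, "Green": 16}
bag_size = 50
expected = [bag_size * pct / 100 for pct in percents.values()]

# Hypothetical tallies for one bag of 50 (made up for this example).
observed = [5, 8, 7, 12, 10, 8]

for color, term in zip(percents, chi_square_contributions(observed, expected)):
    print(f"{color}: contribution {term:.4f}")
print(f"chi-square = {chi_square_statistic(observed, expected):.4f}")
```

Keeping the contributions in a separate list makes it easy to see which color drives the statistic.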

Page 7:

The distribution of the chi-square statistic is called the chi-square distribution, χ².

This distribution is a density curve. The total area under the curve is 1. The curve begins at zero on the horizontal axis and is skewed right. As the degrees of freedom increase, the shape of the curve becomes more symmetric.
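One way to see the curve becoming more symmetric is through its skewness. The sketch below relies on a standard result that is not stated on the slides: a chi-square distribution with df degrees of freedom has skewness sqrt(8 / df). The function name is made up for this example.

```python
import math


def chi_square_skewness(df):
    """Skewness of a chi-square distribution with `df` degrees of freedom."""
    if df <= 0:
        raise ValueError("df must be positive")
    return math.sqrt(8 / df)


# Skewness shrinks toward 0 (a symmetric curve) as df grows.
for df in (1, 3, 5, 10, 30, 100):
    print(f"df = {df:3d}: skewness = {chi_square_skewness(df):.3f}")
```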

Page 8:

Pg. 703

Page 9:

Page 10:

"Goodness of Fit Test."

Using the M&M Minis® chi-square statistic, find the probability of obtaining a χ² value at least this extreme assuming the null hypothesis is true.

Use your chi-square statistic and df = 6 - 1 = 5.

P-value = χ²cdf(lower bound, upper bound, df)

Page 11:

CONDITIONS for Individual Expected Counts:

The Goodness of Fit Test may be used when all expected counts are at least 1 and no more than 20% of the expected counts are less than 5.

Following the Goodness of Fit Test, check to see which component made the greatest contribution to the chi-square statistic to see where the biggest changes occurred.
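The two checks on this slide can be sketched in Python as follows. The function names are invented for this example, the expected counts assume a bag of 50 plain M&Ms with the website percentages, and the observed tallies are hypothetical.

```python
def check_expected_counts(expected):
    """Check the rule: all expected counts >= 1 and at most 20% of them below 5."""
    all_at_least_one = all(e >= 1 for e in expected)
    share_below_five = sum(1 for e in expected if e < 5) / len(expected)
    return {
        "all_at_least_1": all_at_least_one,
        "share_below_5": share_below_five,
        "ok": all_at_least_one and share_below_five <= 0.20,
    }


def largest_contribution(labels, observed, expected):
    """Return the label with the biggest (O - E)^2 / E term, plus all the terms."""
    terms = {
        label: (o - e) ** 2 / e
        for label, o, e in zip(labels, observed, expected)
    }
    return max(terms, key=terms.get), terms


labels = ["Brown", "Yellow", "Red", "Blue", "Orange", "Green"]
expected = [6.5, 7.0, 6.5, 12.0, 10.0, 8.0]  # 50 candies times the plain percentages
observed = [5, 8, 7, 12, 10, 8]              # hypothetical tallies, made up here

print(check_expected_counts(expected))
top_label, terms = largest_contribution(labels, observed, expected)
print("largest contribution:", top_label, round(terms[top_label], 4))
```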

Page 12:

Conditions for Chi-Square Test

Random: The data come from a random sample or a randomized experiment.

Large sample size: All expected counts are at least 5.

Independent: Individual observations are independent. When sampling without replacement, check the 10% condition.

Page 13:

Ex: The Graying of America

It is believed that with better medicine and healthier lifestyles, people are living longer and consequently a larger percentage of the population is of retirement age. Compare distribution of 1980 population to 1996 population.

Page 14:

Step 1: State - Identify the population of interest and the parameter you want to draw a conclusion about.

State the hypothesis in words and symbols.

We want to determine if the distribution of age groups in the United States in 1996 has changed significantly from the 1980 distribution.

H0: the age group dist. in 1996 is the same as the 1980 dist.

Ha: the age group dist. in 1996 is different from the 1980 dist.

Page 15:

Or, state the hypothesis as proportions.

H0: $p_{0-24} = 0.4139$, $p_{25-44} = 0.2768$, $p_{45-64} = 0.1964$, $p_{65+} = 0.1128$.

Ha: at least one of the proportions differs from the stated values.

Page 16:

Goal of "Goodness of Fit Tests"

The more the observed counts differ from the expected counts, the more evidence we have to reject H0 and thus conclude that the population dist. in 1996 is significantly different from 1980.

Page 17:

Always a good idea to plot the data.

Page 18:

Step 2: Plan - Choose the appropriate inference procedure. Verify the conditions for using the selected procedure.

If the conditions are met, conduct a chi-square goodness of fit test.

Random: We must assume the two distributions of age groups come from a randomized experiment.

Page 19:

Calculate expected counts in each age category and verify that they are large enough (see conditions). Yes, all > 5; proceed with chi-square calculations.
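A minimal sketch of the expected-count calculation, assuming the 1996 sample has 500 people (inferred from the "10(500)" in the 10% check, not stated directly) and using the 1980 proportions from the hypotheses. Each expected count is the sample size times the hypothesized proportion; the variable names are made up for this example.

```python
# 1980 proportions from the null hypothesis.
proportions_1980 = {
    "0-24": 0.4139,
    "25-44": 0.2768,
    "45-64": 0.1964,
    "65+": 0.1128,
}

# Assumption: sample size inferred from the 10% condition check, "10(500)".
sample_size_1996 = 500

# Expected count = n * p for each age group, assuming H0 is true.
expected = {group: sample_size_1996 * p for group, p in proportions_1980.items()}

for group, count in expected.items():
    print(f"{group}: {count:.2f}")

# Large sample size condition: every expected count is at least 5.
all_large_enough = all(count >= 5 for count in expected.values())
print("all expected counts at least 5:", all_large_enough)
```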

Page 20:

Independent: We clearly have two independent age groups, one from 1980 and one from 1996. We must check the 10% condition.

There are at least 10(286,598) U.S. citizens in 1980 and at least 10(500) U.S. citizens in 1996.

Page 21:

Step 3: Do - If the conditions are met, carry out the inference procedure.

Calculate the χ² statistic to measure how much the observed counts (O) differ from the expected counts (E) under H0.

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

Page 22:

A large value of χ² shows more evidence against H0 and also results in a small P-value.

Page 23:

Calculate P-value

df: use n - 1 degrees of freedom, where n is the number of categories. This is because the χ² family of curves is used to assess evidence against H0.

Since we are using percentages, 3 of the 4 percentages are allowed to vary; the 4th is not.

df = 4 - 1 = 3

Page 24:

Table C: for a tail probability of 0.05 with df = 3, the critical value is 7.81.

Calc: 2nd VARS: χ²cdf(8.2275, E99, 3) = .0415
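For readers without the calculator, the same upper-tail area can be sketched with the Python standard library. This is an illustrative stand-in for χ²cdf(lower, E99, df), not the calculator's own algorithm: it uses the closed-form tail for df = 1 or 2 and a standard recurrence to step up by 2 degrees of freedom, and it handles whole-number df only. The function name is made up here.

```python
import math


def chi_square_upper_tail(x, df):
    """P(chi-square >= x) for a whole-number df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive whole number")
    if x <= 0:
        return 1.0
    half = x / 2
    # Base cases: df = 1 uses the complementary error function, df = 2 is exponential.
    if df % 2 == 1:
        tail, k = math.erfc(math.sqrt(half)), 1
    else:
        tail, k = math.exp(-half), 2
    # Recurrence: moving from k to k + 2 degrees of freedom adds one term.
    while k < df:
        tail += half ** (k / 2) * math.exp(-half) / math.gamma(k / 2 + 1)
        k += 2
    return tail


p_value = chi_square_upper_tail(8.2275, 3)
print(f"P-value = {p_value:.4f}")  # should agree with the calculator value on the slide
print(f"tail area beyond 7.81 = {chi_square_upper_tail(7.81, 3):.4f}")
```

The second line of output checks the Table C critical value: the area to the right of 7.81 with df = 3 should be close to 0.05.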

Page 25:

Step 4: Conclude - Interpret the results in the context of the problem.

Since our value of 8.2275 is more extreme than 7.81, we reject H0 and conclude that the population dist. in 1996 is significantly different from the 1980 dist. at the 5% level.

Page 26:

To be cont.