M&Ms Two-way Tables Ellen Gundlach STAT 301 Course Coordinator Purdue University

Upload: adriel-goulding

Post on 14-Dec-2015

Page 1:

M&Ms Two-way Tables

Ellen Gundlach

STAT 301 Course Coordinator

Purdue University

Page 2:

M&Ms Color Distribution (%) according to their website

                        Brown  Yellow  Red  Blue  Orange  Green
Plain                      13      14   13    24      20     16
Peanut                     12      15   12    23      23     15
Peanut Butter/Almond       10      20   10    20      20     20

Page 3:

Skittles Color Distribution (%) according to their hotline

            Red  Orange  Yellow  Green  Purple
Skittles     20      20      20     20      20

Page 4:

My M&Ms data in counts

          Brown  Yellow  Red  Blue  Orange  Green  Total
Plain        14      10   10     8       4      8     54
Peanut        2       3    5     0       8      4     22
Total        16      13   15     8      12     12     76

Page 5:

My M&Ms data: joint % (divide counts by total = 76)

          Brown  Yellow   Red  Blue  Orange  Green
Plain      18.4    13.2  13.2  10.5     5.3   10.5
Peanut      2.6     3.9   6.6     0    10.5    5.3
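The joint percentages can be reproduced with a few lines of Python. This is a minimal sketch: the counts come from the slide, while the variable names (`counts`, `joint_pct`) are illustrative choices and not part of the original deck.

```python
# Counts from the "My M&Ms data in counts" slide.
colors = ["Brown", "Yellow", "Red", "Blue", "Orange", "Green"]
counts = {
    "Plain":  [14, 10, 10, 8, 4, 8],
    "Peanut": [2, 3, 5, 0, 8, 4],
}

# Grand total of all candies in the sample.
total = sum(sum(row) for row in counts.values())

# Joint % = cell count / grand total * 100, rounded to one decimal.
joint_pct = {
    flavor: [round(100 * n / total, 1) for n in row]
    for flavor, row in counts.items()
}

print("total =", total)
for flavor, row in joint_pct.items():
    print(flavor, dict(zip(colors, row)))
```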

Page 6:

My M&Ms data: marginal %s for color (add down the columns)

                   Brown  Yellow   Red  Blue  Orange  Green  Total
Plain               18.4    13.2  13.2  10.5     5.3   10.5
Peanut               2.6     3.9   6.6     0    10.5    5.3
Marg. for color     21.0    17.1  19.8  10.5    15.8   15.8    100

Page 7:

My M&Ms data: marginal %s for flavor (add across the rows)

          Brown  Yellow   Red  Blue  Orange  Green  Marg. for flavor
Plain      18.4    13.2  13.2  10.5     5.3   10.5              71.1
Peanut      2.6     3.9   6.6     0    10.5    5.3              28.9
Total                                                            100

Page 8:

My M&Ms data: joint and marginal %s

                   Brown  Yellow   Red  Blue  Orange  Green  Marg. for flavor
Plain               18.4    13.2  13.2  10.5     5.3   10.5              71.1
Peanut               2.6     3.9   6.6     0    10.5    5.3              28.9
Marg. for color     21.0    17.1  19.8  10.5    15.8   15.8               100
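The marginal percentages follow from the joint percentages by simple addition. The sketch below mirrors the slides' approach of adding the already-rounded joint percentages; the names `marg_color` and `marg_flavor` are illustrative.

```python
# Joint percentages as shown on the slides (already rounded to one decimal).
colors = ["Brown", "Yellow", "Red", "Blue", "Orange", "Green"]
joint_pct = {
    "Plain":  [18.4, 13.2, 13.2, 10.5, 5.3, 10.5],
    "Peanut": [2.6, 3.9, 6.6, 0.0, 10.5, 5.3],
}

# Marginal % for color: add down each column.
marg_color = [round(sum(col), 1) for col in zip(*joint_pct.values())]

# Marginal % for flavor: add across each row.
marg_flavor = {flavor: round(sum(row), 1) for flavor, row in joint_pct.items()}

print(dict(zip(colors, marg_color)))
print(marg_flavor)
```

Computing the color marginals directly from the counts instead (for example 15/76 for red) gives 19.7 rather than 19.8, because the slide's figure is a sum of percentages that were already rounded.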

Page 9:

Conditional distribution of flavor for color

• We know the color of our M&M already, but now how is flavor distributed for this color?

$$\frac{\text{joint \% of color and flavor}}{\text{marginal \% of color}}$$

Page 10:

Conditional distribution example

• We know we have a red M&M, so what is the probability it is a plain M&M?

$$\frac{\text{joint \% of red and plain}}{\text{marginal \% of red}} = \frac{13.2}{19.8} = 66.7\%$$

Page 11:

Conditional distribution of color for flavor

• We know the flavor of our M&M already, but now how is color distributed for this flavor?

$$\frac{\text{joint \% of color and flavor}}{\text{marginal \% of flavor}}$$

Page 12:

Conditional distribution example

• We know we have a peanut M&M, so what is the probability it is green?

$$\frac{\text{joint \% of peanut and green}}{\text{marginal \% of peanut}} = \frac{5.3}{28.9} = 18.3\%$$

Page 13:

Conditional distributions in general

Conditional distribution of X for Y (we know Y for sure already, but we want to know the probability or % of having X be true as well):

$$\frac{\text{joint \% of X and Y}}{\text{marginal \% of Y (what we know for sure)}}$$
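The general rule is a single division, so it can be captured in a small helper. The function name `conditional_pct` is a hypothetical one chosen for this sketch; the inputs are the rounded percentages from the two worked examples.

```python
def conditional_pct(joint_pct, marginal_pct):
    """Conditional % of X given Y = joint % of X and Y / marginal % of Y."""
    if marginal_pct == 0:
        raise ValueError("cannot condition on a category whose marginal % is 0")
    return 100 * joint_pct / marginal_pct

# Red M&M known: how likely is it to be plain?
plain_given_red = conditional_pct(13.2, 19.8)

# Peanut M&M known: how likely is it to be green?
green_given_peanut = conditional_pct(5.3, 28.9)

print(f"P(plain | red)    = {plain_given_red:.1f}%")
print(f"P(green | peanut) = {green_given_peanut:.1f}%")
```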

Page 14:

Bar graphs for conditional distribution of color for both flavors

[Figure: bar graph "Conditional distribution of color for Milk Chocolate M&Ms". x-axis: color for milk chocolate M&Ms (blue, brown, green, orange, red, yellow); y-axis: Percent (0 to 30). Cases weighted by percentages for plain M&Ms.]

[Figure: bar graph "Conditional distribution of color for Peanut M&Ms". x-axis: color for peanut M&Ms (brown, green, orange, red, yellow); y-axis: Percent (0 to 40). Cases weighted by percentages for peanut M&Ms.]
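The percentages behind such bar graphs can be computed by dividing each count by its flavor total. The sketch below assumes the graphs were drawn from the 76-candy sample (the absence of a blue bar for peanut M&Ms is consistent with that, but the slides do not say so explicitly); the text bars are only a rough stand-in for the original graphics.

```python
colors = ["Brown", "Yellow", "Red", "Blue", "Orange", "Green"]
counts = {
    "Plain":  [14, 10, 10, 8, 4, 8],
    "Peanut": [2, 3, 5, 0, 8, 4],
}

# Conditional % of color given flavor = cell count / flavor (row) total.
cond_color_given_flavor = {
    flavor: {c: round(100 * n / sum(row), 1) for c, n in zip(colors, row)}
    for flavor, row in counts.items()
}

# Print a crude horizontal bar chart, one '#' per percentage point.
for flavor, dist in cond_color_given_flavor.items():
    print(f"Conditional distribution of color for {flavor} M&Ms")
    for color, pct in dist.items():
        print(f"  {color:<7}{pct:5.1f} " + "#" * round(pct))
```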

Page 15:

Chi-squared hypothesis test

H0: There is no association between color distribution and flavor for M&Ms.

Ha: There is an association between color distribution and flavor for M&Ms.

Use α = 0.01 for this story.

Page 16:

Full-class M&Ms data in counts (large sample size necessary for test)

          Brown  Yellow  Red  Blue  Orange  Green
Plain       147     302  264   407     330    373
Peanut       69     110   70   162     148    123

Page 17:

Chi-squared test SPSS results

Chi-Square Tests

                      Value        df   Asymp. Sig. (2-sided)
Pearson Chi-Square    14.396 (a)    5   .013
Likelihood Ratio      14.623        5   .012
N of Valid Cases      2505

a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 58.81.
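The Pearson chi-squared figures in the SPSS output can be checked by hand from the full-class counts. The following is a sketch using only the Python standard library, not a reproduction of how SPSS computes it; the helper names `chi2_sf` and `chi_square_test` are hypothetical, and `chi2_sf` relies on the closed-form upper-tail formulas that exist for integer degrees of freedom.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for chi-squared with integer df."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2) / k
            total += term
        return math.exp(-x / 2) * total
    # Odd df: normal tail plus a finite correction sum.
    total = math.erfc(math.sqrt(x / 2))
    term = math.sqrt(2 * x / math.pi) * math.exp(-x / 2)
    for k in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * k + 1)
    return total

def chi_square_test(table):
    """Pearson chi-squared test of association for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count = row total * column total / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, chi2_sf(stat, df), expected

# Rows: Plain, Peanut. Columns: Brown, Yellow, Red, Blue, Orange, Green.
table = [
    [147, 302, 264, 407, 330, 373],
    [69, 110, 70, 162, 148, 123],
]
stat, df, p_value, expected = chi_square_test(table)
min_expected = min(min(row) for row in expected)

print(f"chi-squared = {stat:.3f}, df = {df}, P-value = {p_value:.3f}")
print(f"minimum expected count = {min_expected:.2f}")
```

The minimum expected count matters because the chi-squared approximation is usually considered reliable only when expected counts are not too small, which is what the SPSS footnote is checking.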

Page 18:

Chi-squared test conclusions

• Test statistic = 14.396 and P-value = 0.013

• Since the P-value (0.013) is > our α of 0.01, we do not reject H0.

• We do not have enough evidence to say there is an association between color distribution and flavor for M&Ms.

Page 19:

Skittles vs. M&Ms

• Now we will compare the proportion of yellow candies for Skittles and for M&Ms.

• The previous two-way table with plain and peanut M&Ms was of size 2 x 6.

• This table will be of size 2x2 because we only care about whether a candy is yellow or non-yellow.

Page 20:

Full-class M&Ms and Skittles data in counts (large sample size necessary for test)

              Yellow  Non-Yellow  Total
Plain M&Ms       302        1521   1823
Skittles         361        1351   1712
Total            663        2872   3535

Page 21:

Chi-squared hypothesis test

H0: There is no association between color distribution and flavor for these candies.

Ha: There is an association between color distribution and flavor for these candies.

Use α = 0.01 for this story.

Page 22:

Chi-squared test SPSS results

Chi-Square Tests

                            Value        df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                              (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square          11.839 (b)    1   .001
Continuity Correction (a)   11.544        1   .001
Likelihood Ratio            11.840        1   .001
Fisher's Exact Test                                         .001         .000
N of Valid Cases            3535

a. Computed only for a 2x2 table

b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 321.09.
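For a 2x2 table the Pearson statistic and the continuity-corrected statistic differ only in whether 0.5 is subtracted from each |observed - expected| before squaring. The sketch below recomputes both from the counts; it is an illustration with made-up variable names, and the P-value helper `p_value_df1` uses the fact that a chi-squared variable with 1 degree of freedom is a squared standard normal.

```python
import math

# Rows: Plain M&Ms, Skittles. Columns: Yellow, Non-Yellow.
table = [
    [302, 1521],
    [361, 1351],
]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

# Expected count = row total * column total / grand total.
expected = [[r * c / n for c in col_totals] for r in row_totals]
cells = [
    (o, e)
    for obs_row, exp_row in zip(table, expected)
    for o, e in zip(obs_row, exp_row)
]

# Pearson chi-squared.
pearson = sum((o - e) ** 2 / e for o, e in cells)

# Continuity (Yates) correction: shrink each |o - e| by 0.5 before squaring.
corrected = sum((abs(o - e) - 0.5) ** 2 / e for o, e in cells)

def p_value_df1(stat):
    """Upper-tail chi-squared probability for df = 1."""
    return math.erfc(math.sqrt(stat / 2))

min_expected = min(e for _, e in cells)

print(f"Pearson chi-squared   = {pearson:.3f}, P = {p_value_df1(pearson):.4f}")
print(f"Continuity correction = {corrected:.3f}, P = {p_value_df1(corrected):.4f}")
print(f"minimum expected count = {min_expected:.2f}")
```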

Page 23:

Chi-squared test conclusions

• Test statistic = 11.839 and P-value = 0.001

• Since the P-value (0.001) is < our α of 0.01, we reject H0.

• We have evidence that there is an association between color distribution and flavor for these candies.

Page 24:

Another way to do this test

Since this is a 2x2 table, and we are only interested in a 2-sided (≠) hypothesis test, we can use the 2-sample proportions test here.

Page 25:

2-sample proportion test hypotheses

H0: $p_{\text{M\&Ms}} = p_{\text{Skittles}}$

Ha: $p_{\text{M\&Ms}} \neq p_{\text{Skittles}}$

Page 26:

Defining the proportions

$$\hat{p}_{\text{M\&Ms}} = \frac{\text{\# yellow M\&Ms}}{\text{total \# M\&Ms}}$$

$$\hat{p}_{\text{Skittles}} = \frac{\text{\# yellow Skittles}}{\text{total \# Skittles}}$$

Page 27:

Test statistic

$$Z = \frac{\hat{p}_{\text{M\&Ms}} - \hat{p}_{\text{Skittles}}}{\sqrt{\hat{p}\,(1 - \hat{p})\left(\frac{1}{n_{\text{M\&Ms}}} + \frac{1}{n_{\text{Skittles}}}\right)}}$$

Page 28:

Results from the proportion test

• Sample proportions: $\hat{p}_{\text{M\&Ms}} = 0.166$ and $\hat{p}_{\text{Skittles}} = 0.211$

• Test statistic Z = -3.44

• P-value = 2(0.0003) = 0.0006

• Since the P-value < our α of 0.01, we reject H0.
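These results can be recomputed from the yellow and total counts in the full-class table. The sketch below assumes the unsubscripted p-hat in the test statistic is the pooled proportion (all yellow candies divided by all candies), which is the standard choice for this test; the variable names are illustrative.

```python
import math

# Counts from the full-class M&Ms and Skittles table.
yellow_mm, n_mm = 302, 1823   # Plain M&Ms
yellow_sk, n_sk = 361, 1712   # Skittles

# Sample proportions of yellow candies.
p_mm = yellow_mm / n_mm
p_sk = yellow_sk / n_sk

# Pooled proportion (assumed meaning of the unsubscripted p-hat).
p_pool = (yellow_mm + yellow_sk) / (n_mm + n_sk)

# Standard error under H0 and the Z test statistic.
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_mm + 1 / n_sk))
z = (p_mm - p_sk) / se

# Two-sided P-value: 2 * P(Z > |z|) for a standard normal Z.
p_value = math.erfc(abs(z) / math.sqrt(2))

print(f"p_mm = {p_mm:.3f}, p_sk = {p_sk:.3f}")
print(f"Z = {z:.2f}, two-sided P-value = {p_value:.4f}")
```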

Page 29:

Conclusion to the proportion test

• We have evidence the proportion of yellow M&Ms is not the same as the proportion of yellow Skittles.

• In other words, the type of candy makes a difference to the color distribution.

Page 30:

How do our results from the 2 tests compare?

• The χ² test statistic = 11.839 is the square of the Z test statistic: (-3.44)² ≈ 11.83, which matches up to the rounding of Z.

• If you take into account the rounding, the P-values for both tests are 0.001.

• We rejected H0 in both tests.
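The agreement between the two tests is not a coincidence of this data set: for any 2x2 table the pooled two-proportion Z statistic squared equals the Pearson chi-squared statistic. A quick numerical check, using the shortcut 2x2 formula n(ad - bc)² / (row and column totals multiplied together) and illustrative variable names:

```python
import math

# 2x2 counts: a, b = Plain M&Ms (yellow, non-yellow); c, d = Skittles.
a, b = 302, 1521
c, d = 361, 1351
n = a + b + c + d

# Pearson chi-squared via the shortcut formula for 2x2 tables.
chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Pooled two-proportion Z statistic for the same table.
p1, p2 = a / (a + b), c / (c + d)
p_pool = (a + c) / n
z = (p1 - p2) / math.sqrt(p_pool * (1 - p_pool) * (1 / (a + b) + 1 / (c + d)))

# Both tests give the same two-sided P-value when it is not rounded.
p_from_z = math.erfc(abs(z) / math.sqrt(2))
p_from_chi2 = math.erfc(math.sqrt(chi2 / 2))

print(f"Z^2 = {z ** 2:.3f}, chi-squared = {chi2:.3f}")
print(f"P from Z = {p_from_z:.4f}, P from chi-squared = {p_from_chi2:.4f}")
```

The 0.0006 from the Z test and the .001 in the SPSS output are therefore the same quantity reported at different precision.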

Page 31:

When do you use which test?

• Chi-squared tests are best for:

    • two-sided hypothesis tests only

    • 2x2 or bigger tables

• Proportion (Z) tests are best for:

    • one- or two-sided hypothesis tests

    • only 2x2 tables