M&Ms Two-way Tables
Ellen Gundlach
STAT 301 Course Coordinator
Purdue University
M&Ms Color Distribution % according to their website
Brown Yellow Red Blue Orange Green
Plain 13 14 13 24 20 16
Peanut 12 15 12 23 23 15
Peanut Butter/Almond 10 20 10 20 20 20
Skittles Color Distribution % according to their hotline
Red Orange Yellow Green Purple
Skittles 20 20 20 20 20
My M&Ms data in counts
Brown Yellow Red Blue Orange Green Total
Plain 14 10 10 8 4 8 54
Peanut 2 3 5 0 8 4 22
Total 16 13 15 8 12 12 76
My M&Ms data: joint % (divide counts by total = 76)
Brown Yellow Red Blue Orange Green
Plain 18.4 13.2 13.2 10.5 5.3 10.5
Peanut 2.6 3.9 6.6 0 10.5 5.3
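The joint percentages above can be reproduced with a short sketch. It assumes the counts from the "My M&Ms data in counts" slide; the variable names (`counts`, `joint_pct`) are illustrative and not from the slides.

```python
# Counts from the "My M&Ms data in counts" slide.
colors = ["Brown", "Yellow", "Red", "Blue", "Orange", "Green"]
counts = {
    "Plain":  [14, 10, 10, 8, 4, 8],
    "Peanut": [2, 3, 5, 0, 8, 4],
}

# Grand total of all candies in the two-way table.
grand_total = sum(sum(row) for row in counts.values())

# Joint % = cell count / grand total, as a percentage rounded to 1 decimal.
joint_pct = {
    flavor: [round(100 * n / grand_total, 1) for n in row]
    for flavor, row in counts.items()
}

for flavor, row in joint_pct.items():
    print(flavor, dict(zip(colors, row)))
```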
My M&Ms data: marginal %s for color
(add down the columns)
Brown Yellow Red Blue Orange Green Total
Plain 18.4 13.2 13.2 10.5 5.3 10.5
Peanut 2.6 3.9 6.6 0 10.5 5.3
Marg. for color 21.0 17.1 19.8 10.5 15.8 15.8 100
My M&Ms data: marginal %s for flavor
(add across the rows)
Brown Yellow Red Blue Orange Green Marg. for flavor
Plain 18.4 13.2 13.2 10.5 5.3 10.5 71.1
Peanut 2.6 3.9 6.6 0 10.5 5.3 28.9
Total 100
My M&Ms data: joint and marginal %s
Brown Yellow Red Blue Orange Green Marg. for flavor
Plain 18.4 13.2 13.2 10.5 5.3 10.5 71.1
Peanut 2.6 3.9 6.6 0 10.5 5.3 28.9
Marg. for color 21.0 17.1 19.8 10.5 15.8 15.8 100
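As a rough illustration, both sets of marginal percentages can be obtained by summing the joint percentages. This sketch assumes the rounded joint percentages from the slides as input, which is why red comes out as 19.8; computing directly from the counts (15/76) would give 19.7. The names `marg_color` and `marg_flavor` are illustrative.

```python
colors = ["Brown", "Yellow", "Red", "Blue", "Orange", "Green"]

# Rounded joint percentages, copied from the slides.
joint_pct = {
    "Plain":  [18.4, 13.2, 13.2, 10.5, 5.3, 10.5],
    "Peanut": [2.6, 3.9, 6.6, 0.0, 10.5, 5.3],
}

# Marginal % for color: add down each column.
marg_color = [
    round(sum(row[i] for row in joint_pct.values()), 1)
    for i in range(len(colors))
]

# Marginal % for flavor: add across each row.
marg_flavor = {flavor: round(sum(row), 1) for flavor, row in joint_pct.items()}

print(dict(zip(colors, marg_color)))
print(marg_flavor)
```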
Conditional distribution of flavor for color
• We know the color of our M&M already, but now how is flavor distributed for this color?
(joint % of color and flavor) / (marginal % of color)
Conditional distribution example
• We know we have a red M&M, so what is the probability it is a plain M&M?
(joint % of red and plain) / (marginal % of red) = 13.2 / 19.8 = 66.7%
Conditional distribution of color for flavor
• We know the flavor of our M&M already, but now how is color distributed for this flavor?
(joint % of color and flavor) / (marginal % of flavor)
Conditional distribution example
• We know we have a peanut M&M, so what is the probability it is green?
(joint % of peanut and green) / (marginal % of peanut) = 5.3 / 28.9 = 18.3%
Conditional distributions in general
Conditional distribution of X for Y (we know Y for sure already, but we want to know the probability or % of having X be true as well):
(joint % of X and Y) / (marginal % of Y), where Y is what we know for sure
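The general rule can be sketched as a one-line function and checked against the two worked examples. The helper name `conditional_pct` is hypothetical, and the inputs are the rounded percentages from the slides.

```python
def conditional_pct(joint, marginal):
    """Conditional % of X given Y = (joint % of X and Y) / (marginal % of Y)."""
    return round(100 * joint / marginal, 1)

# Red M&M: what is the chance it is plain?
plain_given_red = conditional_pct(joint=13.2, marginal=19.8)

# Peanut M&M: what is the chance it is green?
green_given_peanut = conditional_pct(joint=5.3, marginal=28.9)

print(plain_given_red, green_given_peanut)
```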
Bar graphs for conditional distribution of color for both flavors
[Bar graph: Conditional distribution of color for Milk Chocolate M&Ms. Horizontal axis: color for milk chocolate M&Ms (blue, brown, green, orange, red, yellow). Vertical axis: Percent, 0 to 30. Cases weighted by percentages for plain M&Ms.]
[Bar graph: Conditional distribution of color for Peanut M&Ms. Horizontal axis: color for peanut M&Ms (brown, green, orange, red, yellow). Vertical axis: Percent, 0 to 40. Cases weighted by percentages for peanut M&Ms.]
Chi-squared hypothesis test
H0: There is no association between color distribution and flavor for M&Ms.
Ha: There is an association between color distribution and flavor for M&Ms.
Use α = 0.01 for this story.
Full-class M&Ms data in counts (large sample size necessary for test)
Brown Yellow Red Blue Orange Green
Plain 147 302 264 407 330 373
Peanut 69 110 70 162 148 123
Chi-squared test SPSS results
Chi-Square Tests
                     Value       df   Asymp. Sig. (2-sided)
Pearson Chi-Square   14.396(a)   5    .013
Likelihood Ratio     14.623      5    .012
N of Valid Cases     2505
a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 58.81.
Chi-squared test conclusions
• Test statistic = 14.396 and P-value = 0.013
• Since the P-value is > our α of 0.01, we do not reject H0.
• We do not have enough evidence to say there is an association between color distribution and flavor for M&Ms.
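The Pearson chi-squared statistic reported by SPSS can be checked by hand from the full-class counts. The following is a minimal sketch using only the Python standard library, not the SPSS procedure itself. The helper `chi2_sf_df5` is hypothetical: it uses the closed-form upper-tail probability that holds for 5 degrees of freedom only, so it would need replacing for tables of a different size.

```python
import math

# Full-class counts: Brown, Yellow, Red, Blue, Orange, Green.
observed = [
    [147, 302, 264, 407, 330, 373],  # Plain
    [69, 110, 70, 162, 148, 123],    # Peanut
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under H0 = (row total * column total) / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-squared statistic: sum of (observed - expected)^2 / expected.
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)

def chi2_sf_df5(x):
    """Upper-tail probability for a chi-squared variable with exactly 5 df."""
    root = math.sqrt(x)
    normal_pdf = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    return math.erfc(root / math.sqrt(2)) + 2 * normal_pdf * (root + root ** 3 / 3)

p_value = chi2_sf_df5(chi2)
alpha = 0.01
reject_h0 = p_value < alpha

print(round(chi2, 3), df, round(p_value, 3), reject_h0)
```

The smallest expected count (`min(min(row) for row in expected)`) corresponds to the "minimum expected count" footnote in the SPSS output.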
Skittles vs. M&Ms
• Now we will compare the proportion of yellow candies for Skittles and for M&Ms.
• The previous two-way table with plain and peanut M&Ms was of size 2 x 6.
• This table will be of size 2x2 because we only care about whether a candy is yellow or non-yellow.
Full-class M&Ms and Skittles data in counts
(large sample size necessary for test)
Yellow Non-Yellow Total
Plain M&Ms 302 1521 1823
Skittles 361 1351 1712
Total 663 2872 3535
Chi-squared hypothesis test
H0: There is no association between color distribution and flavor for these candies.
Ha: There is an association between color distribution and flavor for these candies.
Use α = 0.01 for this story.
Chi-squared test SPSS results
Chi-Square Tests
                           Value       df   Asymp. Sig. (2-sided)   Exact Sig. (2-sided)   Exact Sig. (1-sided)
Pearson Chi-Square         11.839(b)   1    .001
Continuity Correction(a)   11.544      1    .001
Likelihood Ratio           11.840      1    .001
Fisher's Exact Test                                                 .001                   .000
N of Valid Cases           3535
a. Computed only for a 2x2 table.
b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 321.09.
Chi-squared test conclusions
• Test statistic = 11.839 and P-value = 0.001
• Since the P-value is < our α of 0.01, we reject H0.
• We have evidence that there is an association between color distribution and flavor for these candies.
Another way to do this test
Since this is a 2x2 table, and if we are only interested in a 2-sided (≠) hypothesis test, we can use the 2-sample proportions test here.
2-sample proportion test hypotheses
H0: pM&Ms = pSkittles
Ha: pM&Ms ≠ pSkittles
Defining the proportions
pM&Ms = (# yellow M&Ms) / (total # M&Ms)
pSkittles = (# yellow Skittles) / (total # Skittles)
Test statistic
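\[
Z = \frac{\hat{p}_{\text{M\&Ms}} - \hat{p}_{\text{Skittles}}}{\sqrt{\hat{p}\,(1 - \hat{p})\left(\dfrac{1}{n_{\text{M\&Ms}}} + \dfrac{1}{n_{\text{Skittles}}}\right)}}
\]
where \hat{p} (no subscript) is the pooled sample proportion: all yellow candies divided by all candies in both samples.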
Results from the proportion test
• Sample proportions: p̂M&Ms = 0.166 and p̂Skittles = 0.211
• Test statistic Z = -3.44
• P-value = 2(0.0003) = 0.0006
• Since the P-value is < our α of 0.01, we reject H0.
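These results can be reproduced from the 2x2 counts with a short sketch. It assumes the pooled-proportion form of the two-sample Z statistic and gets the two-sided P-value from the standard normal distribution via `math.erfc`; the variable names are illustrative.

```python
import math

# Yellow counts and sample sizes from the full-class 2x2 table.
yellow_mms, n_mms = 302, 1823
yellow_skittles, n_skittles = 361, 1712

# Sample proportions of yellow candies.
p_mms = yellow_mms / n_mms
p_skittles = yellow_skittles / n_skittles

# Pooled proportion: all yellow candies over all candies.
p_pooled = (yellow_mms + yellow_skittles) / (n_mms + n_skittles)

# Standard error under H0 and the Z test statistic.
se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n_mms + 1 / n_skittles))
z = (p_mms - p_skittles) / se

# Two-sided P-value: 2 * P(Z > |z|) = erfc(|z| / sqrt(2)).
p_value = math.erfc(abs(z) / math.sqrt(2))

print(round(p_mms, 3), round(p_skittles, 3), round(z, 2), round(p_value, 4))
```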
Conclusion to the proportion test
• We have evidence the proportion of yellow M&Ms is not the same as the proportion of yellow Skittles.
• In other words, the type of candy makes a difference to the color distribution.
How do our results from the 2 tests compare?
• The χ² test statistic = 11.839, which, up to rounding, is the square of the Z test statistic (Z = -3.44).
• If you take into account the rounding, the P-values for both tests are 0.001.
• We rejected H0 in both tests.
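The agreement between the two tests can be checked numerically. This sketch assumes the uncorrected Pearson statistic (no continuity correction) and the pooled Z statistic, and uses the fact that a chi-squared variable with 1 degree of freedom is the square of a standard normal variable.

```python
import math

observed = [
    [302, 1521],  # Plain M&Ms: yellow, non-yellow
    [361, 1351],  # Skittles:   yellow, non-yellow
]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Pearson chi-squared statistic for the 2x2 table.
chi2 = 0.0
for i in range(2):
    for j in range(2):
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (observed[i][j] - expected) ** 2 / expected

# Pooled two-sample Z statistic for the proportion of yellow candies.
p_pooled = col_totals[0] / n
se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / row_totals[0] + 1 / row_totals[1]))
z = (observed[0][0] / row_totals[0] - observed[1][0] / row_totals[1]) / se

# P-values: chi-squared with 1 df, and two-sided normal.
p_chi2 = math.erfc(math.sqrt(chi2 / 2))
p_z = math.erfc(abs(z) / math.sqrt(2))

print(round(chi2, 3), round(z ** 2, 3), round(p_chi2, 3), round(p_z, 3))
```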
When do you use which test?
• Chi-squared tests are best for:
  - two-sided hypothesis tests only
  - 2x2 or bigger tables
• Proportion (Z) tests are best for:
  - one- or two-sided hypothesis tests
  - only 2x2 tables