Chi-Square Part II
Fenster
Let us see how this works in another example.
                                     Attitudes towards Research
Attitudes Towards Statistics         Favorable   Neither favorable    Unfavorable   Row Totals
                                                 nor unfavorable
Favorable                            9           26                   13            48
Neither favorable nor unfavorable    19          75                   83            177
Unfavorable                          16          56                   110           182
Col. Totals                          44          157                  206           407
It has been argued that people with favorable attitudes towards research tend to have favorable attitudes towards statistics.
Question: If we knew the attitudes towards research of a respondent, can we predict the attitude towards statistics?
Step 1 H0: Knowledge of attitudes towards research does not help us predict attitudes towards statistics.
Step 2 H1: Knowledge of attitudes towards research does help us predict attitudes towards statistics.
Step 3: Selecting a significance level. Let's use α = .05. This gives us a χ² critical of 9.488. Your book gives the χ² critical as 9.5.
Step 4: Collect and summarize sample data.
We will use the chi-square test with 4 degrees of freedom.
Why four? df = (r-1) X (c-1). We have 3 rows and 3 columns, so we get df = (3-1) X (3-1) = 2 X 2 = 4.
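The degrees-of-freedom rule and the tabled critical value can be checked with a short sketch. It uses only the Python standard library and relies on the closed-form upper-tail probability of a chi-square variable with 4 degrees of freedom, exp(-x/2) X (1 + x/2). The function names are illustrative and do not come from the slides.

```python
import math

def degrees_of_freedom(n_rows, n_cols):
    """df = (r - 1) X (c - 1) for an r-by-c contingency table."""
    return (n_rows - 1) * (n_cols - 1)

def upper_tail_df4(x):
    """P(chi-square with 4 df >= x). This closed form holds only for df = 4."""
    return math.exp(-x / 2) * (1 + x / 2)

df = degrees_of_freedom(3, 3)
alpha_at_critical = upper_tail_df4(9.488)

print(df)                           # 4
print(round(alpha_at_critical, 3))  # 0.05
```

The tail probability at 9.488 comes out at about .05, which is why 9.488 is the critical value for α = .05 with 4 degrees of freedom.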
If we find a χ² greater than or equal to 9.5, we reject the null hypothesis and conclude that attitudes towards research can predict attitudes towards statistics.
If we find a χ² less than 9.5, we fail to reject the null hypothesis and conclude that attitudes towards research cannot predict attitudes towards statistics.
Calculation of Expected Frequencies
Expected frequency = (Row total) X (Column total) / Grand Total
Cell a: Favorable attitudes towards both research and statistics.
(44) X (48) / 407 = 5.18
Cell b: Neither favorable nor unfavorable attitudes towards research, favorable attitudes towards statistics.
(157) X (48) / 407 = 18.51
Cell c: Unfavorable attitudes towards research, favorable attitudes towards statistics.
(206) X (48) / 407 = 24.29
Cell d: Favorable attitudes towards research, neither favorable nor unfavorable attitudes towards statistics.
(44) X (177) / 407 = 19.13
Cell e: Neither favorable nor unfavorable attitudes towards both statistics and research.
(157) X (177) / 407 = 68.27
Cell f: Unfavorable attitudes towards research, neither favorable nor unfavorable attitudes towards statistics.
(206) X (177) / 407 = 89.58
Cell g: Favorable attitudes towards research, unfavorable attitudes towards statistics.
(44) X (182) / 407 = 19.67
Cell h: Neither favorable nor unfavorable attitudes towards research, unfavorable attitudes towards statistics.
(157) X (182) / 407 = 70.20
Cell i: Unfavorable attitudes towards both research and statistics.
(206) X (182) / 407 = 92.11
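The nine cell calculations above all apply the same rule, so they can be sketched as one small function. The layout of the `observed` list (rows for attitudes towards statistics, columns for attitudes towards research) and the function name are choices made for this illustration.

```python
# Observed counts: rows = attitudes towards statistics,
# columns = attitudes towards research
# (both ordered favorable, neither, unfavorable).
observed = [
    [9, 26, 13],
    [19, 75, 83],
    [16, 56, 110],
]

def expected_frequencies(table):
    """Expected count per cell = (row total) X (column total) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

expected = expected_frequencies(observed)

# Cells are labeled a through i, reading the table row by row.
flat = [value for row in expected for value in row]
for label, value in zip("abcdefghi", flat):
    print(f"cell {label}: {value:.2f}")
```

Because the code rounds rather than truncates, some printed values can differ from the hand-computed figures by 0.01.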
So we set up our chi-square table:

Cell    f observed   f expected   f observed -    (f observed -     (f observed -
                                  f expected      f expected)²      f expected)² /
                                  (i.e.,                            f expected
                                  RESIDUALS)
a       9            5.18         3.82            14.59             2.81
b       26           18.51        7.49            56.1              3.03
c       13           24.29        -11.29          127.46            5.25
d       19           19.13        -0.13           0.017             0.0009
e       75           68.27        6.73            45.29             0.6
f       83           89.58        -6.58           43.29             0.5
g       16           19.67        -3.67           13.46             0.68
h       56           70.20        -14.2           201.64            2.87
i       110          92.11        17.89           320.05            3.5
Total   407          407.00       0.00                              19.2

Hypothesis Testing with Chi-Square
Step 5: Making a decision
χ² observed = 19.2 (the sum of the last column)
χ² critical = 9.488
Decision: REJECT H0, and conclude that attitudes towards research allow us to predict attitudes towards statistics.
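As a cross-check on the hand calculation, the whole test can be sketched in a few lines. The function name `chi_square` is illustrative. Carrying unrounded expected frequencies through the calculation gives a statistic of about 19.25, so small differences from a hand-computed total are to be expected; the decision against the critical value of 9.488 is the same.

```python
observed = [
    [9, 26, 13],
    [19, 75, 83],
    [16, 56, 110],
]

def chi_square(table):
    """Sum of (f observed - f expected)^2 / f expected over all cells."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, f_obs in enumerate(row):
            f_exp = row_totals[i] * col_totals[j] / n
            total += (f_obs - f_exp) ** 2 / f_exp
    return total

chi2_observed = chi_square(observed)
chi2_critical = 9.488  # alpha = .05 with 4 degrees of freedom
reject_null = chi2_observed >= chi2_critical

print(f"chi-square observed = {chi2_observed:.2f}")
print("reject H0" if reject_null else "fail to reject H0")
```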
Notes about chi-square:
(1) Σ (f observed - f expected) = 0. The RESIDUALS ALWAYS SUM TO ZERO. If Σ (f observed - f expected) does not equal zero (within rounding error), you have made a calculation error. Recheck your work.
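The zero-sum property of the residuals is easy to verify numerically. The sketch below recomputes the residuals for the attitudes table and checks their sum against a small floating-point tolerance; the tolerance of 1e-9 is an arbitrary choice for this illustration.

```python
observed = [
    [9, 26, 13],
    [19, 75, 83],
    [16, 56, 110],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Residual for each cell: f observed - f expected.
residuals = [
    [f_obs - row_totals[i] * col_totals[j] / n for j, f_obs in enumerate(row)]
    for i, row in enumerate(observed)
]

residual_sum = sum(sum(row) for row in residuals)
print(abs(residual_sum) < 1e-9)  # True
```

The residuals also sum to zero within every row and every column, which gives a finer check when hunting for a calculation error.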
(2) The chi-square test itself cannot tell us anything about directionality. One way to get directionality in the chi-square is to look at the (f observed - f expected) column. We see that certain cells occur much less frequently than we would expect.
For example, cell c (unfavorable attitudes towards research but favorable attitudes towards statistics) occurs much less frequently than we would expect on the basis of chance.
We can also see that the three cells that capture consistency of attitudes between research and statistics (cell a, favorable attitudes for both; cell e, neither favorable nor unfavorable attitudes towards both; cell i, unfavorable attitudes for both) all have positive values for (f observed - f expected).
Those three cells are consistent with the (unstated and untested) hypothesis that individuals tend to have similar attitudes for both research and statistics.
Only by examining the (f observed - f expected) can we give any statement on the directionality of the relationship. [We could also analyze the column percentages as we move across categories of the independent variable to give us insight on directionality.]
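The column-percentage approach mentioned in the bracketed note can be sketched as follows, treating attitudes towards research (the columns) as the independent variable. The variable names are illustrative.

```python
observed = [
    [9, 26, 13],
    [19, 75, 83],
    [16, 56, 110],
]
row_labels = ["Favorable", "Neither", "Unfavorable"]  # attitudes towards statistics

col_totals = [sum(col) for col in zip(*observed)]

# Percentage of each research-attitude column falling in each statistics row.
column_percentages = [
    [100 * f_obs / col_totals[j] for j, f_obs in enumerate(row)]
    for row in observed
]

for label, row in zip(row_labels, column_percentages):
    print(label, [round(pct, 1) for pct in row])
```

Reading across the first row, the share with favorable attitudes towards statistics falls from about 20% among respondents favorable towards research to about 6% among those unfavorable towards research, which points in the same direction as the residuals.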
(3) In this example, why do we get statistical significance? We can say that the cells d, e, f and g do not contribute to the statistical significance of the overall relationship. The individual chi-square values for these four cells are all very small. The overall relationship is significant because of the other cells.
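To see how much each cell contributes, the per-cell chi-square values can be ranked and the share coming from cells d, e, f and g computed. This is a sketch using unrounded expected frequencies; the names `contributions` and `small_share` are illustrative.

```python
observed = [
    [9, 26, 13],
    [19, 75, 83],
    [16, 56, 110],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Per-cell contribution (f observed - f expected)^2 / f expected,
# with cells labeled a through i row by row.
contributions = {}
labels = iter("abcdefghi")
for i, row in enumerate(observed):
    for j, f_obs in enumerate(row):
        f_exp = row_totals[i] * col_totals[j] / n
        contributions[next(labels)] = (f_obs - f_exp) ** 2 / f_exp

chi2_total = sum(contributions.values())
small_share = sum(contributions[cell] for cell in "defg") / chi2_total

for label, value in sorted(contributions.items(), key=lambda kv: kv[1], reverse=True):
    print(f"cell {label}: {value:.2f}")
print(f"share from cells d, e, f, g: {small_share:.1%}")
```

Under these assumptions the four cells together account for less than a tenth of the statistic, and cell c is the largest single contributor.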
Chi-square allows us to decompose the overall relationship into its component parts. This decomposition allows us to assess whether all categories contribute to the significance of the overall relationship.
Limitations for χ²
So far we have stressed the virtues of χ², such as weak assumptions and a statistical significance test appropriate for nominal level data. This is why chi-square is so popular.
There are two limitations for χ², one minor and one major.
Minor Limitation
When the expected cell frequency is less than 5, χ² rejects the null hypothesis too easily. (Note: this means the EXPECTED frequency and NOT the OBSERVED frequency.)
Solution: Use Yates' correction.
Yates' correction: Take | f observed - f expected | - 0.5, that is, subtract 0.5 from the absolute value of each residual before squaring it.
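A minimal sketch of the correction, assuming it is applied to a 2 X 2 table (its usual setting). The counts in `small_table` are hypothetical and chosen only to produce expected frequencies below 5; they are not data from the slides.

```python
def chi_square(table, yates=False):
    """Chi-square for a contingency table, optionally with Yates' correction.

    With yates=True, 0.5 is subtracted from |f observed - f expected|
    before squaring, as described above.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, f_obs in enumerate(row):
            f_exp = row_totals[i] * col_totals[j] / n
            diff = abs(f_obs - f_exp)
            if yates:
                diff -= 0.5
            total += diff ** 2 / f_exp
    return total

# Hypothetical 2 X 2 table; two of its expected frequencies are 4.5.
small_table = [[3, 7], [6, 4]]

uncorrected = chi_square(small_table)
corrected = chi_square(small_table, yates=True)
print(f"uncorrected = {uncorrected:.2f}, corrected = {corrected:.2f}")
```

The corrected statistic is smaller than the uncorrected one, which makes it harder to reject the null hypothesis and so offsets the tendency described above.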
Major Limitation
We have set up a null hypothesis that there is no relationship between two variables and have tried to reject this hypothesis.
We refer to a relationship as being statistically significant when we have established, subject to the risk of type I error, that there is a relationship between two variables.
But does rejecting the null hypothesis mean the relationship is significant in the sense of being a strong or an important one?
Not necessarily.
Remember, significance levels are dependent upon sample size.
Let us say that you wanted to investigate the relationship between gender and level of tolerance. You had no money to investigate this relationship, so you handed out questionnaires around UML and found the following:
Attitudes towards     Gender
racial tolerance      Males   Females   Row Totals
High                  24      26        50
Low                   26      24        50
Column Totals         50      50        100
Is there a significant relationship between gender and attitudes towards racial tolerance?
Let us use α = .05.
We have one degree of freedom.
χ² critical = 3.8. χ² observed = 0.16.
Since χ² observed (0.16) < χ² critical (3.8), we FAIL to reject the null hypothesis and conclude that gender does not help us predict attitudes towards racial tolerance.
Now let us say you had an extremely ambitious study and you found the following relationship:

Attitudes towards     Gender
racial tolerance      Males   Females   Row Totals
High                  2400    2600      5000
Low                   2600    2400      5000
Column Totals         5000    5000      10000
Is there a significant relationship between gender and attitudes towards racial tolerance?
Let us use α = .05.
We have one degree of freedom.
χ² critical = 3.8. χ² observed = 16.0.
Since χ² observed (16.0) > χ² critical (3.8), we easily reject the null hypothesis and conclude that gender does help us predict attitudes towards racial tolerance.
χ² is sensitive to the number of cases in the sample. Even though the proportions in the cells remain unchanged, the new χ² is 100 times the old chi-square because we have 100 times the number of cases.
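The scaling effect can be reproduced directly from the two gender tables. The sketch below computes the uncorrected chi-square for each; the function and variable names are illustrative.

```python
def chi_square(table):
    """Sum of (f observed - f expected)^2 / f expected over all cells."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, f_obs in enumerate(row):
            f_exp = row_totals[i] * col_totals[j] / n
            total += (f_obs - f_exp) ** 2 / f_exp
    return total

small_study = [[24, 26], [26, 24]]          # N = 100
large_study = [[2400, 2600], [2600, 2400]]  # N = 10000, same proportions

chi2_small = chi_square(small_study)
chi2_large = chi_square(large_study)

chi2_critical = 3.8  # alpha = .05 with 1 degree of freedom, as used above
print(f"small: {chi2_small:.2f}, reject H0: {chi2_small >= chi2_critical}")
print(f"large: {chi2_large:.2f}, reject H0: {chi2_large >= chi2_critical}")
```

Multiplying every cell by 100 multiplies each (f observed - f expected)² by 10,000 and each f expected by 100, so the statistic grows by a factor of 100 while the cell proportions stay the same.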
Corrections for the sample size problem
Pearson's contingency coefficient (You can ask for the Contingency Coefficient in SPSS CROSSTABS output.)
C = √( χ² / (χ² + N) ), where N = total number of cases in sample
Problem with C: Cannot attain 1.0 in a perfect relationship.
As the syllabus says, there is no ideal solution to the sample size problem with chi-square.
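A short sketch of the coefficient, assuming the usual square-root form of Pearson's contingency coefficient and using the chi-square values from the two gender studies above. The function name is illustrative.

```python
import math

def contingency_coefficient(chi2, n):
    """Pearson's contingency coefficient: sqrt(chi2 / (chi2 + N))."""
    return math.sqrt(chi2 / (chi2 + n))

c_small = contingency_coefficient(0.16, 100)     # small study
c_large = contingency_coefficient(16.0, 10000)   # large study

print(round(c_small, 3), round(c_large, 3))
```

Both studies give the same C of about 0.04, so the coefficient reflects the weak relationship in the cell proportions rather than the number of cases.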