CHI-SQUARE
χ²
Chi Square
• Symbolized by the Greek letter chi, squared: χ²
• Pronounced “ki square” (“ki” rhymes with “sky”)
• A test of STATISTICAL SIGNIFICANCE for TABLE data
• “What are the ODDS that the relationship in a TABLE using SAMPLE data is also found in the POPULATION?”
CHI SQUARE or χ²
A TEST of STATISTICAL SIGNIFICANCE
What do tests of statistical significance tell us?
• Are OBSERVED RESULTS SIGNIFICANTLY DIFFERENT than would be expected BY CHANCE?
• Criterion: p < .05
The Logic of Statistical Significance
• Use probability theory to determine whether OBSERVED results are SIGNIFICANTLY different than expected BY CHANCE in RANDOM sample data
• By convention, the criterion is p < .05 (1-in-20)
• p < .01 is better
• p > .05: OBSERVED results could be due to chance
HYPOTHESIS TESTING
Test of Significance:
Logic of chi-square
χ² and the Logic of Formal Hypothesis Testing
• Assert RESEARCH HYPOTHESIS (H1) PRIOR to data collection:
• H1: There is a relationship between X & Y
• TEST the NULL HYPOTHESIS of NO RELATIONSHIP (H0)
• H0: There is NO RELATIONSHIP between X & Y
• GOAL: Reject the NULL using χ² (p < .05)
ASSESSING STATISTICAL SIGNIFICANCE
• Chi-square is one standard test of a relationship between 2 variables
• With it we ask the data distribution to tell us about the null hypothesis
• IS THE NULL TRUE?
CONCLUSIONS using χ² to test “significance” of hypothesis
• Reject NULL HYPOTHESIS (H0) if p < .05
• ...> conclude OBSERVED RESULTS are STATISTICALLY SIGNIFICANT
• Cannot reject NULL (H0) if p > .05
• ...> conclude results are NOT SIGNIFICANT
• ...> could be due to chance
Chi Square (χ²) tells us
• Whether the OBSERVED results we see in a TABLE are SIGNIFICANTLY different than would be due to chance
• Values Expected by CHANCE
              Party
              Dem      Rep
    Vote      50%      50%
    Not       50%      50%
              100%     100%
Is relationship “significant”???
• Is pattern or relationship OBSERVED in sample also found in POPULATION??
Chi Square χ²
• Compares OBSERVED values in a TABLE and asks: Are these significantly different than would be expected by chance?
• VOTE by PARTY *
              Party
              Dem       Rep
    Vote      5 (5)     5 (5)
    Not       5 (5)     5 (5)
              10        10
    N = 20
• * Values in ( ) are Expected by chance
• The answer is reported as a p-value
Basic Formula for Chi Square χ²
• χ² = Σ (O - E)² / E, summed over every cell in the table
• O = Observed cell value
• E = Expected by chance cell value
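As a minimal sketch of the formula above, the snippet below sums (O - E)² / E over the four cells of the VOTE by PARTY table, where every cell has O = 5 and E = 5. The function name `chi_square` is an illustrative choice, not something from the slides.

```python
def chi_square(observed, expected):
    """Sum (O - E)^2 / E over all cells of a table."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# VOTE by PARTY table from the slide: each cell is 5 observed, 5 expected.
observed = [5, 5, 5, 5]
expected = [5, 5, 5, 5]

chi2_vote_by_party = chi_square(observed, expected)
print(chi2_vote_by_party)  # 0.0
```

Because the observed values match the chance expectation exactly, every (O - E) term is zero and so is χ².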
Value for χ² depends upon THREE factors:
• 1. (O - E)² / E
#1 Size of the observed relationship
                Age
                Young     Old
    $ lots      10%       90%
    little      90%       10%
                100%      100%
    %d = 80%; N = 50

                Age
                Young     Old
    $ lots      50%       50%
    little      50%       50%
                100%      100%
    %d = 0%; N = 50
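To illustrate how the size of the observed relationship drives χ², the two Age tables can be converted to counts and compared. The slides give only percentages and N = 50, so the split of 20 Young and 30 Old respondents below is an assumption, chosen purely so the percentages yield whole numbers. The helper name `chi_square_from_table` is also illustrative.

```python
def chi_square_from_table(table):
    """Chi-square for observed counts; E = row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            total += (observed - expected) ** 2 / expected
    return total

# ASSUMED split (not in the slides): 20 Young, 30 Old, N = 50.
strong = [[2, 27],             # $ lots: 10% of Young, 90% of Old
          [18, 3]]             # little: 90% of Young, 10% of Old
no_relationship = [[10, 15],   # $ lots: 50% of Young, 50% of Old
                   [10, 15]]   # little: 50% of Young, 50% of Old

chi2_strong = chi_square_from_table(strong)
chi2_none = chi_square_from_table(no_relationship)
print(f"80-point difference: chi-square = {chi2_strong:.2f}")
print(f"no difference:       chi-square = {chi2_none:.2f}")
```

With the same N, the table with the large percentage difference produces a large χ², while the 50/50 table produces zero.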
#2 THE DEGREES of FREEDOM associated with the TABLE
• d.f. (degrees of freedom) = (c-1)(r-1)
• (c-1) = # columns - 1
• (r-1) = # rows - 1
• Must calculate d.f., or (c-1)(r-1), to obtain the probability from a χ² table
• Example: Party by Vote
              Dem      Rep
    Vote
    NOT
• d.f. = (c-1)(r-1) = (2-1)(2-1)
• d.f. = ?
Degrees of Freedom
• d.f. = (r-1)(c-1)
• How many pieces of independent information can go into a table before the rest of the cell values are FIXED
• 2x2 table = 1 d.f.
• Example 2x2 Table (* = free to vary, x = fixed)
              Dem      Rep
    Vote      *        x
    NOT       x        x
• How many cells can be filled in before the rest of the values are fixed? * = 1
Degrees of Freedom cont.
• d.f. = (c-1)(r-1)
• 2x2 table = 1
• 2x3 table = ?
• How many cell values are free to vary before the rest of the table is fixed?
• Party and Vote
                  Dem      Rep
    Vote          *        x
    No            *        x
    Undecided     x        x
Degrees of Freedom cont.
• d.f. = (c-1)(r-1)
• 2x2 table = 1 d.f.
• 2x3 table = 2 d.f.
• 4x4 table = ???
• 10x10 table = ???
• d.f. is the information used to calculate p-values, or ODDS
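The d.f. rule is simple enough to check mechanically. The sketch below applies (r-1)(c-1) to the table sizes listed on the slide; the function name `degrees_of_freedom` is illustrative.

```python
def degrees_of_freedom(rows, cols):
    """d.f. for a table with the given number of rows and columns."""
    return (rows - 1) * (cols - 1)

for rows, cols in [(2, 2), (2, 3), (4, 4), (10, 10)]:
    print(f"{rows}x{cols} table: d.f. = {degrees_of_freedom(rows, cols)}")
```

The order of rows and columns does not matter: a 2x3 table and a 3x2 table have the same d.f.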
#3 Factor used to calculate χ²
• SAMPLE SIZE
• LARGER samples: higher probability that χ² will be STATISTICALLY SIGNIFICANT (p < .05)
• N=5   N=50   N=500
“Significance” depends partly on SAMPLE SIZE
                     Quacks Like a Duck
    Walks Like
    a Duck           Yes       No
    Yes              60%       40%
    No               40%       60%
    (N)              (25)      (25)
• χ² = 2.0; n.s. (p > .05)

                     Quacks Like a Duck
    Walks Like
    a Duck           Yes       No
    Yes              60%       40%
    No               40%       60%
                     100%      100%
    (N)              (50)      (50)
• χ² = 4.00; p < .05
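The two duck tables can be checked directly by converting the column percentages to counts (60% of 25 is 15, 60% of 50 is 30). This is a sketch only: the function names are illustrative, and the 3.841 cutoff is the standard χ² critical value for 1 d.f. at p = .05, as found in any χ² table.

```python
def expected_counts(table):
    """Expected cell counts under no relationship: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square(table):
    """Sum of (O - E)^2 / E over every cell."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

CRITICAL_05_DF1 = 3.841  # chi-square table value for d.f. = 1, p = .05

small = [[15, 10], [10, 15]]  # 25 per column
large = [[30, 20], [20, 30]]  # 50 per column

chi2_small = chi_square(small)
chi2_large = chi_square(large)
for label, value in [("N = 25 per column", chi2_small), ("N = 50 per column", chi2_large)]:
    verdict = "p < .05" if value > CRITICAL_05_DF1 else "n.s."
    print(f"{label}: chi-square = {value:.2f} ({verdict})")
```

The percentages are identical in both tables; only the sample size changes, and that alone moves χ² from 2.0 to 4.0, across the .05 threshold.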
REMEMBER
• STATISTICAL SIGNIFICANCE does NOT = SUBSTANTIVE SIGNIFICANCE
• Need SAMPLE DATA collected via random sampling (N = 30+)
• For χ² need AT LEAST 5 expected observations per cell
• p < or = .05
• Used as a tool in a Formal Hypothesis test, NOT “barefoot empiricism,” i.e., running 5,000 tests
Conclusions will be tentative. Why?
• Using probability as standard of proof
• p < .05 means results like these would occur BY CHANCE fewer than 5 times in 100 if there were NO real relationship
• BUT there is always some chance the conclusion is wrong
• How much using p < .05?
Probabilistic generalizations
• Will always be tentative
Type I and Type II Errors
    Decision about H0     H0 is true     H0 is false
    Reject H0             Type I         No Error
    Do not reject H0      No Error       Type II
Solution:
• Use “good” theory
• Use lowest possible p-value
• Replicate results
• Null results can be meaningful: what do these results mean?
Null results are important
• Must try to understand what they mean for your theory
Nothing replaces a good theory
• Can be guide to interpret results
• Sometimes need to be revised
TEST NULL
• REMEMBER:
• Assert H1 to be TRUE
• TEST the NULL
• H0 No relationship
• If the p-value for the test of H0 is < .05
• What do we conclude?
Correct Conclusion
• YES, “data support the hypothesis” or “statistically significant”
• NEVER “results PROVE” H1
Example: TESTING the NULL using χ²
• 2 Election variables:
• Nature of Primary ...> divisive or not
• Outcome of General Election ...> win or lose
• What would be your Research hypothesis?
• H1: ?????
TWO VARIABLES
• Type of Primary -- we’ll classify them as divisive or nondivisive
• Election Outcome -- we’ll classify as won or lost
Example cont.
• H1: If a candidate experiences a divisive primary race, s/he is more likely to lose the general election
• H0: There is no relationship between type of primary race and election outcome
• Test: calculate χ² and p-value using d.f. and a χ² table (back of any stat book)
χ² = Σ (O - E)² / E
    O       E       O - E         (O - E)²      (O - E)² / E
    ___     ___     ___           ___           ___
                    Sum = zero                  Sum = chi-square value
• χ² = ?; d.f. = ?; p < or = ?
TYPE of Primary
                Divisive     Nondivisive
    Won         8 ( )        22 ( )           30
    Lost        8 ( )        6 ( )            14
                16           28               44
• χ² = ??; d.f. = ??; p < ??
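One way to work through the table above is to fill in the O, E, O - E, (O - E)², (O - E)² / E worksheet in code. This is a sketch, not the slides' own procedure: it uses only the Python standard library, and the p-value line relies on the identity p = erfc(√(χ²/2)), which holds only for d.f. = 1. For other d.f., a χ² table or a statistics library would be needed.

```python
import math

observed = [[8, 22],   # Won:  divisive, nondivisive
            [8, 6]]    # Lost: divisive, nondivisive

row_totals = [sum(row) for row in observed]       # 30, 14
col_totals = [sum(col) for col in zip(*observed)] # 16, 28
n = sum(row_totals)                               # 44

chi2 = 0.0
sum_diff = 0.0
print(f"{'O':>4} {'E':>8} {'O-E':>8} {'(O-E)^2':>9} {'(O-E)^2/E':>10}")
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n     # expected by chance
        diff = o - e
        sum_diff += diff                          # should total zero
        chi2 += diff ** 2 / e
        print(f"{o:>4} {e:>8.3f} {diff:>8.3f} {diff ** 2:>9.3f} {diff ** 2 / e:>10.3f}")

df = (len(observed) - 1) * (len(observed[0]) - 1)
# Valid for d.f. = 1 only: upper-tail chi-square probability via erfc.
p_value = math.erfc(math.sqrt(chi2 / 2))
print(f"chi-square = {chi2:.2f}; d.f. = {df}; p = {p_value:.3f}")
```

Run this way, χ² comes out at about 3.83 with 1 d.f., a hair below the 3.84 needed for p < .05, so the result is borderline and the conclusion depends on holding strictly to the .05 criterion.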
TYPE of PRIMARY
• Calculate Measure of Association
• Cramer’s V = √(χ² / (m·n)) = .295084
• where n = sample size and m = (r-1) or (c-1), whichever is smaller
• Other measures of association could use?
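As a sketch of the Cramer's V calculation for the primary-by-outcome table (Won: 8 and 22; Lost: 8 and 6), the function below computes χ² from the observed counts and then applies V = √(χ² / (m·n)). The function name `cramers_v` is illustrative.

```python
import math

def cramers_v(table):
    """Cramer's V = sqrt(chi-square / (n * m)), m = smaller of (r-1), (c-1)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = sum(
        (o - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i, row in enumerate(table)
        for j, o in enumerate(row)
    )
    m = min(len(table) - 1, len(table[0]) - 1)
    return math.sqrt(chi2 / (n * m))

v = cramers_v([[8, 22], [8, 6]])
print(round(v, 3))  # 0.295
```

V runs from 0 (no association) to 1 (perfect association), so it describes the strength of the relationship separately from whether χ² is statistically significant.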
TYPE of PRIMARY: Conclusion
• Strength of relationship using Cramer’s V = .295
• Is this a significant relationship?
• χ² = ?; d.f. = ?; p < ?
• Overall conclusion????
• Next step: consider OTHER VARIABLES