contingency table analysis mary whiteside, ph.d

Upload: lena-smerdon

Post on 31-Mar-2015

TRANSCRIPT

Page 1: Contingency Table Analysis Mary Whiteside, Ph.D

Contingency Table Analysis

Mary Whiteside, Ph.D.

Page 2: Contingency Table Analysis Mary Whiteside, Ph.D

Overview

Hypotheses of equal proportions
Hypotheses of independence
Exact distributions and Fisher’s test
The Chi squared approximation
Median test
Measures of dependence
The Chi squared goodness-of-fit test
Cochran’s test

Page 3: Contingency Table Analysis Mary Whiteside, Ph.D

Contingency Table Examples

Countries - religion by government
States - dominant political party by geographic region
Mutual funds - style by family
Companies - industry by location of headquarters

Page 4: Contingency Table Analysis Mary Whiteside, Ph.D

More examples

Countries - government by GDP categories
States - divorce laws by divorce rate categories
Mutual funds - family by Morningstar rankings
Companies - industry by price-earnings ratio category

Page 5: Contingency Table Analysis Mary Whiteside, Ph.D

Statistical Inference: hypothesis of equal proportions

H0: all probabilities (estimated by proportions, i.e. relative frequencies) in the same column are equal,

H1: at least two of the probabilities in the same column are not equal

Here, for an r x c contingency table, r populations are sampled with fixed row totals n1, n2, …, nr.

Page 6: Contingency Table Analysis Mary Whiteside, Ph.D

Hypothesis of independence

H0: no association,

i.e. the row and column variables are independent

H1: an association,

i.e. the row and column variables are not independent

Here, one population is sampled with sample size N. Row totals are random variables.

Page 7: Contingency Table Analysis Mary Whiteside, Ph.D

Exact distribution for 2 x 2 tables: hypothesis of equal proportions; n1 = n2 = 2

2 0    2 0    0 2    0 2    2 0    0 2
2 0    0 2    2 0    0 2    1 1    1 1
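The tables on this slide are possible outcomes when each of two populations contributes a sample of size 2. A minimal sketch of the full enumeration follows. It assumes a common success probability p = 0.5 purely for illustration (under H0 the common p is unknown), and the function name is hypothetical. With two rows of size 2 the enumeration yields nine tables in total.

```python
from itertools import product
from math import comb

def exact_table_distribution(n1=2, n2=2, p=0.5):
    """Probability of every 2 x 2 table with fixed row totals n1 and n2.

    Assumes both rows share the same success probability p (the null
    hypothesis of equal proportions); p = 0.5 is an illustrative choice.
    """
    dist = {}
    for a, b in product(range(n1 + 1), range(n2 + 1)):
        # Each row is an independent binomial sample.
        prob_row1 = comb(n1, a) * p**a * (1 - p) ** (n1 - a)
        prob_row2 = comb(n2, b) * p**b * (1 - p) ** (n2 - b)
        dist[((a, n1 - a), (b, n2 - b))] = prob_row1 * prob_row2
    return dist

dist = exact_table_distribution()
for table, prob in dist.items():
    print(table, prob)
```

The probabilities sum to 1, and the table with both rows equal to (1, 1) is the most likely one under this assumption.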

Page 8: Contingency Table Analysis Mary Whiteside, Ph.D

Fisher’s Exact Test

For 2 x 2 tables assuming fixed row and column totals r, N-r, c, N-c:

Test statistic = x, the frequency of cell (1,1)

Probability = hypergeometric probability of x successes in a sample of size r from a population of size N with c successes
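The hypergeometric probability described above can be sketched with the standard library alone. The function names are hypothetical, and the one-sided (upper-tail) p-value is only one of several conventions for the test.

```python
from math import comb

def hypergeom_pmf(x, N, r, c):
    """P(cell (1,1) = x) with row total r, column total c, grand total N fixed."""
    if x < max(0, r + c - N) or x > min(r, c):
        return 0.0  # x is outside the range the margins allow
    return comb(c, x) * comb(N - c, r - x) / comb(N, r)

def fisher_upper_tail(x, N, r, c):
    """One-sided p-value: probability of a cell (1,1) count of x or more."""
    return sum(hypergeom_pmf(k, N, r, c) for k in range(x, min(r, c) + 1))

# Illustrative margins: N = 4, r = 2, c = 2.
for x in range(3):
    print(x, hypergeom_pmf(x, 4, 2, 2))
```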

Page 9: Contingency Table Analysis Mary Whiteside, Ph.D

Large sample approximation for either test

Chi squared = Σ [Observed - Expected]^2 / Expected, summed over all cells ij

Observed frequency for cell ij comes from cross-tabulation of data

Expected frequency for cell ij = Probability Cell ij * N

Degrees of freedom (r-1)*(c-1)
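As a rough illustration of the statistic and its degrees of freedom, the sketch below takes the observed and expected frequencies as given. The counts are hypothetical and the function name is an assumption, not part of the original slides.

```python
def chi_squared(observed, expected):
    """Return (statistic, degrees of freedom) for an r x c table.

    The statistic is the sum over all cells of (O - E)^2 / E.
    """
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    r, c = len(observed), len(observed[0])
    return stat, (r - 1) * (c - 1)

observed = [[55, 45], [45, 55]]  # hypothetical cross-tabulated counts
expected = [[50, 50], [50, 50]]  # expected frequencies for those counts
stat, df = chi_squared(observed, expected)
print(stat, df)
```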

Page 10: Contingency Table Analysis Mary Whiteside, Ph.D

Computing Cell Probabilities

Assumes independence or equal probabilities (the null hypothesis)

Probability Cell ij = Probability Row i * Probability Column j = (R i/N) * (C j/N)

Expected frequency ij = (R i/N) * (C j/N) * N = R i * C j / N.
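The row-total times column-total rule can be sketched as follows. The table is made up for illustration and the function name is hypothetical.

```python
def expected_frequencies(observed):
    """Expected count for each cell: row total * column total / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

observed = [[30, 10], [20, 40]]  # hypothetical counts, N = 100
expected = expected_frequencies(observed)
print(expected)
```

Here the row totals are 40 and 60 and both column totals are 50, so the first-row cells each expect 40 * 50 / 100 = 20 and the second-row cells each expect 30.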

Page 11: Contingency Table Analysis Mary Whiteside, Ph.D

Distribution of the Sum

Chi Square with (r-1)*(c-1) degrees of freedom

Assumes each [Observed - Expected]^2 / Expected is a standard normal squared

Page 12: Contingency Table Analysis Mary Whiteside, Ph.D

Implies [Observed - Expected] / Square root[Expected] is standard normal

Implies mean = variance = Expected, and Observed is a Poisson RV

Poisson is approximately normal if its mean (the expected frequency) > 5, traditional guideline

Conover’s relaxed guideline, page 201
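With one degree of freedom (a 2 x 2 table), the chi squared distribution is that of a single squared standard normal, so a p-value can be sketched with the standard library alone. The function name is hypothetical, and the sketch covers only the 1-degree-of-freedom case.

```python
from math import erfc, sqrt

def chi2_sf_df1(x):
    """P(chi squared with 1 df > x), i.e. P(|Z| > sqrt(x)) for standard normal Z."""
    return erfc(sqrt(x / 2.0))

# 3.841 is the familiar 5% critical value for 1 degree of freedom.
print(chi2_sf_df1(3.841))
```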

Page 13: Contingency Table Analysis Mary Whiteside, Ph.D

Measures of Strength: Categorical Variables

Phi (2 x 2)
Cramer's V (r x c)
Pearson's Contingency Coefficient
Tschuprow's T
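The slide lists these measures without formulas. A minimal sketch using their standard textbook definitions, all derived from the chi squared statistic, is given below. The function name and the input values are hypothetical.

```python
from math import sqrt

def nominal_measures(chi2, n, r, c):
    """Chi-squared-based strength measures for an r x c table of N = n cases."""
    return {
        "phi": sqrt(chi2 / n),
        "cramers_v": sqrt(chi2 / (n * min(r - 1, c - 1))),
        "contingency_c": sqrt(chi2 / (chi2 + n)),
        "tschuprow_t": sqrt(chi2 / (n * sqrt((r - 1) * (c - 1)))),
    }

# Hypothetical 2 x 2 result: chi squared = 2.0 on N = 200 observations.
measures = nominal_measures(chi2=2.0, n=200, r=2, c=2)
for name, value in measures.items():
    print(name, round(value, 4))
```

For a 2 x 2 table, Phi, Cramer's V and Tschuprow's T coincide, and the contingency coefficient is slightly smaller.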

Page 14: Contingency Table Analysis Mary Whiteside, Ph.D

Measures of Strength: Ordinal Variables

Lambda A .. Rows dependent
Lambda B .. Columns dependent
Symmetric Lambda
Kendall's tau-B
Kendall's tau-C
Gamma
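As one example from this list, Gamma can be sketched from counts of concordant and discordant pairs. The sketch assumes that both the rows and the columns of the table are listed in their natural order, and the function name is hypothetical.

```python
def goodman_kruskal_gamma(table):
    """Gamma = (C - D) / (C + D) for a table with ordered rows and columns.

    C counts pairs ordered the same way on both variables and D counts
    pairs ordered in opposite ways. Tied pairs are ignored.
    Raises ZeroDivisionError if there are no untied pairs.
    """
    n_rows, n_cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):
                for m in range(n_cols):
                    pairs = table[i][j] * table[k][m]
                    if m > j:
                        concordant += pairs
                    elif m < j:
                        discordant += pairs
    return (concordant - discordant) / (concordant + discordant)

print(goodman_kruskal_gamma([[2, 0], [0, 2]]))
print(goodman_kruskal_gamma([[1, 1], [1, 1]]))
```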

Page 15: Contingency Table Analysis Mary Whiteside, Ph.D

Steps of Statistical Analysis

Significance - Strength

1 - Test for significance of the observed association

2 - If significant, measure the strength of the association

Page 16: Contingency Table Analysis Mary Whiteside, Ph.D

Consider the correlation coefficient, a measure of association (linear relationship between two quantitative variables)

significant but not strong
significant and strong
not significant but “strong”
not significant and not strong

Page 17: Contingency Table Analysis Mary Whiteside, Ph.D

r and Prob (p-value)

r = .20, p-value < .05
r = .90, p-value < .05
r = .90, p-value > .05
r = .20, p-value > .05
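One way to see how each of the four combinations can arise is the usual t statistic for testing a zero correlation, t = r * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom. The sample sizes below are hypothetical, and the two-sided 5% critical values are hard-coded from standard t tables rather than computed.

```python
from math import sqrt

def correlation_t(r, n):
    """t statistic for H0: no linear association, with n - 2 degrees of freedom."""
    return r * sqrt((n - 2) / (1 - r * r))

# Two-sided 5% critical values from standard t tables (assumed, not computed).
CRITICAL_T = {100: 1.984, 4: 4.303}  # keyed by sample size n (df = n - 2)

for r, n in [(0.20, 100), (0.90, 100), (0.90, 4), (0.20, 4)]:
    t = correlation_t(r, n)
    verdict = "p < .05" if abs(t) > CRITICAL_T[n] else "p > .05"
    print(f"r = {r:.2f}, n = {n}: t = {t:.2f}, {verdict}")
```

Under these assumed sample sizes, r = .20 is significant with 100 observations, while r = .90 is not significant with only 4.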

Page 18: Contingency Table Analysis Mary Whiteside, Ph.D

Concepts

Predictive associations must be both significant and strong

In a particular application, an association may be important even if it is not predictive (i.e. strong)

Page 19: Contingency Table Analysis Mary Whiteside, Ph.D

More concepts

Highly significant, weak associations result from large samples

Insignificant “strong” associations result from small samples - they may prove to be either predictive or weak with larger samples
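The effect of sample size can be illustrated by scaling a table while keeping its proportions fixed: the chi squared statistic grows with N, but Phi does not change. The counts below are invented for illustration and the function name is hypothetical.

```python
from math import sqrt

def chi2_and_phi(table):
    """Chi squared statistic and Phi = sqrt(chi squared / N) for a 2 x 2 table."""
    rows = [sum(row) for row in table]
    cols = [sum(col) for col in zip(*table)]
    n = sum(rows)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = rows[i] * cols[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, sqrt(chi2 / n)

small = [[11, 9], [9, 11]]                            # hypothetical, N = 40
large = [[100 * count for count in row] for row in small]  # same proportions, N = 4000

chi2_small, phi_small = chi2_and_phi(small)
chi2_large, phi_large = chi2_and_phi(large)
print(chi2_small, phi_small)
print(chi2_large, phi_large)
```

With 1 degree of freedom the 5% critical value is about 3.84, so the small table is far from significant and the large one is highly significant, although the strength of association is the same weak value in both.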

Page 20: Contingency Table Analysis Mary Whiteside, Ph.D

Examples

Heart attack Outcomes by Anticoagulant Treatment

Admission Decisions by Gender

Page 21: Contingency Table Analysis Mary Whiteside, Ph.D

Summary

Is there an association?
– Investigate with Chi square p-value

If so, how strong is it?
– Select the appropriate measure of strength of association

Where does it occur?
– Examine cell contributions
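For the last question, a short sketch of per-cell contributions to the chi squared statistic is given below. The table is hypothetical, and the function name is an assumption rather than anything named in the slides.

```python
def cell_contributions(observed):
    """(O - E)^2 / E for each cell, with E = row total * column total / N."""
    rows = [sum(row) for row in observed]
    cols = [sum(col) for col in zip(*observed)]
    n = sum(rows)
    return [
        [(o - rows[i] * cols[j] / n) ** 2 / (rows[i] * cols[j] / n)
         for j, o in enumerate(row)]
        for i, row in enumerate(observed)
    ]

observed = [[30, 10], [20, 40]]  # hypothetical counts
contributions = cell_contributions(observed)
for row in contributions:
    print([round(value, 3) for value in row])
```

Cells with the largest contributions are the ones where the observed counts depart most from what independence would predict.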