fighting for fame, scrambling for fortune, where is the end?
1
Fighting for fame, scrambling for fortune, where is the end?
Great wealth and glorious honor, no more than a night dream.
Lasting pleasure, worry-free forever, who can attain?
2
Categorical Data Analysis
Chapter 4: Introduction to Generalized Linear Models
3
3-Way Tables
                     Response
Clinic   Treatment   Success   Failure
1        A           18        12
         B           12        8
2        A           2         8
         B           8         32
Total    A           20        20
         B           20        40
4
Marginal vs. Conditional
• Marginal independence: marginal odds ratio = 1
• Conditional independence: conditional odds ratios = 1
• The observed effect of X on Y might simply reflect effects of other covariates on both X and Y
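The distinction can be checked directly on the clinic table from the previous slide. The sketch below is only an illustration: the counts are copied from that table, while the function and variable names are made up for this example.

```python
# Counts (success, failure) copied from the 3-way table on the previous slide.
tables = {
    "Clinic 1": {"A": (18, 12), "B": (12, 8)},
    "Clinic 2": {"A": (2, 8), "B": (8, 32)},
}


def odds_ratio(a, b):
    """Odds of success under treatment A divided by the odds under treatment B."""
    (s_a, f_a), (s_b, f_b) = a, b
    return (s_a * f_b) / (f_a * s_b)


# Conditional odds ratios: one per clinic (clinic held fixed).
conditional = {clinic: odds_ratio(t["A"], t["B"]) for clinic, t in tables.items()}

# Marginal table: add the two clinics cell by cell, ignoring clinic.
marginal_a = tuple(sum(t["A"][i] for t in tables.values()) for i in range(2))
marginal_b = tuple(sum(t["B"][i] for t in tables.values()) for i in range(2))
marginal = odds_ratio(marginal_a, marginal_b)

print(conditional)  # {'Clinic 1': 1.0, 'Clinic 2': 1.0}
print(marginal)     # 2.0
```

Both conditional odds ratios equal 1 while the marginal odds ratio equals 2, so treatment and response are conditionally independent given clinic but not marginally independent.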
5
Generalized Linear Models (GLM)
3 components:
• Random component (Y1, Y2, … , Yn): n independent observations from a distribution in the exponential family (they need not be identically distributed): for i = 1, 2, …, n,
$f(y_i \mid \theta_i) = a(\theta_i)\, b(y_i)\, \exp[\, y_i\, c(\theta_i)\,]$
where a and b are positive functions and c(θ_i) is the natural parameter.
e.g. Poisson, binomial, exponential, normal
6
• Systematic component: the right-hand side of the model equation; often a linear combination of explanatory variables:
$\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k$
or, in matrix format, X’B (equivalently B^T X)
7
• Link component: the link between μ (= E(Y)) and X’B
In a model equation g(μ) = X’B, g(.) is called the link function, a monotonic differentiable function
– Most common link: canonical link, $g(\mu) = c(\theta)$
e.g. normal, binomial, Poisson
Canonical Link
• Normal data: identity link
• Binary data: logit link
• Count data: log link
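As a rough sketch of what these three links do, the snippet below pairs each link g with its inverse, which maps a linear predictor back to a mean. The dictionary and helper names are assumptions for illustration and do not come from the slides or from any particular library.

```python
import math

# Each entry pairs a link g with its inverse g^{-1} (the mean function).
LINKS = {
    "identity": (lambda mu: mu, lambda eta: eta),
    "logit": (
        lambda mu: math.log(mu / (1 - mu)),
        lambda eta: 1 / (1 + math.exp(-eta)),
    ),
    "log": (lambda mu: math.log(mu), lambda eta: math.exp(eta)),
}


def linear_predictor(mu, link):
    """Apply the link: map a mean mu to the linear-predictor scale."""
    g, _ = LINKS[link]
    return g(mu)


def mean_from_predictor(eta, link):
    """Apply the inverse link: map a linear predictor eta back to a mean."""
    _, g_inv = LINKS[link]
    return g_inv(eta)


print(linear_predictor(0.5, "logit"))   # 0.0
print(mean_from_predictor(0.0, "log"))  # 1.0
```

The inverse logit always returns a value strictly between 0 and 1 and the inverse log always returns a positive value, which is one practical reason these links suit binary and count responses.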
8
Deviance
• Deviance measures the loss of information from data reduction
• Saturated model: the most general model, which fits each observation
– Could be each subject’s response or
– Could be each cell frequency
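In symbols, one common way to write the deviance is shown below. The notation is an assumption, not taken from the slides: L_M is the maximized log-likelihood of the model of interest, L_S that of the saturated model, and the second line is the resulting form for Poisson counts y_i with fitted means \hat\mu_i.

```latex
D = -2\,(L_M - L_S) = 2\,(L_S - L_M) \ge 0

D_{\text{Poisson}} = 2 \sum_{i=1}^{n} \left[ y_i \log\frac{y_i}{\hat\mu_i} - (y_i - \hat\mu_i) \right]
```

A deviance of 0 means the model reproduces the data as well as the saturated model does; larger values indicate more information lost.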
9
10
GLMs for Binary Data
• Linear probability model (identity link)
• Logistic regression model (logit link)
• Probit model (probit link)
11
Example: Snoring vs. Heart Disease
Snoring (score)    Observed P(H.D.)   Linear fit   Logit fit   Probit fit
Never (0)          0.017              0.017        0.021       0.020
Occasionally (2)   0.055              0.057        0.044       0.046
Almost daily (4)   0.099              0.096        0.093       0.095
Daily (5)          0.118              0.116        0.132       0.131
Note: The fits will change if the relative spacings between scores change.
Example: 2x2 Tables
12
• Binary covariate X and response Y
• Logit link GLM:
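A standard way to write the logit-link GLM for a 2x2 table is sketched below. The coding x = 0, 1 and the symbols α and β are assumptions for illustration, not taken from the slide.

```latex
\operatorname{logit}\,[P(Y = 1 \mid X = x)]
  = \log\frac{P(Y = 1 \mid X = x)}{1 - P(Y = 1 \mid X = x)}
  = \alpha + \beta x, \qquad x \in \{0, 1\}

\beta = \operatorname{logit}\,[P(Y = 1 \mid X = 1)] - \operatorname{logit}\,[P(Y = 1 \mid X = 0)]
      = \log(\text{odds ratio})
```

Under this coding, e^β is the odds ratio of the 2x2 table, so β = 0 corresponds to independence of X and Y.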
13
GLMs for Count Data
• Poisson loglinear model
• Count data: the number of times certain events occur over time, space or the like, e.g. the # of car accidents in 2005 for each driver in a random sample of 100 drivers
• Rate data: count/(time, space or the like), e.g. the car accident rates in 2005 for a random sample of 100 drivers (Sec. 9.7.1, p. 385)
Example: Horseshoe Crabs
Data: Table 4.3 (p. 127)
• Y = # of satellites
• X = carapace width
1. Poisson loglinear model
2. Poisson GLM with identity link
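The two candidate models differ in how width changes the expected number of satellites: multiplicatively under the log link, additively under the identity link. The sketch below illustrates this with hypothetical coefficients. The values of alpha and beta are assumptions chosen for readability, not estimates from Table 4.3.

```python
import math


def loglinear_mean(alpha, beta, x):
    """Poisson loglinear model: log(mu) = alpha + beta * x."""
    return math.exp(alpha + beta * x)


def identity_mean(alpha, beta, x):
    """Poisson GLM with identity link: mu = alpha + beta * x (must stay positive)."""
    mu = alpha + beta * x
    if mu <= 0:
        raise ValueError("identity link produced a non-positive Poisson mean")
    return mu


# Hypothetical coefficients for illustration only; not fitted to Table 4.3.
ratio = loglinear_mean(-3.0, 0.15, 26.0) / loglinear_mean(-3.0, 0.15, 25.0)
diff = identity_mean(-10.0, 0.5, 26.0) - identity_mean(-10.0, 0.5, 25.0)

print(ratio)  # equals exp(0.15): each extra unit of width multiplies the mean
print(diff)   # equals 0.5: each extra unit of width adds beta to the mean
```

The explicit check in `identity_mean` reflects a practical drawback of the identity link: nothing in the model itself keeps the fitted Poisson mean positive.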
14
15
Inference for GLMs
• Goodness of fit:
– Measure: deviance
– Test: likelihood-ratio (LR) tests
• Model comparison: LR tests
• Residuals: Pearson and standardized
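For nested models, the LR statistic is the difference between the two deviances. A minimal sketch follows, assuming the models differ by exactly one parameter so that the reference distribution is chi-square with 1 degree of freedom. The function names and the deviance values are hypothetical.

```python
import math


def lr_test_1df(deviance_reduced, deviance_full):
    """Likelihood-ratio test for nested GLMs that differ by one parameter.

    Returns the statistic (difference in deviances) and its chi-square(1) p-value.
    """
    stat = deviance_reduced - deviance_full
    if stat < 0:
        raise ValueError("the reduced model cannot have the smaller deviance")
    # For 1 df, P(chi2 > stat) = P(|Z| > sqrt(stat)) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def pearson_residual_poisson(y, mu_hat):
    """Pearson residual for a Poisson observation: (y - mu_hat) / sqrt(mu_hat)."""
    return (y - mu_hat) / math.sqrt(mu_hat)


# Hypothetical deviances, for illustration only.
stat, p = lr_test_1df(deviance_reduced=25.0, deviance_full=21.158541)
print(stat, p)
print(pearson_residual_poisson(6, 4.0))
```

The `erfc` shortcut only works for 1 degree of freedom; comparing models that differ by more parameters needs a general chi-square tail probability.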
16
Example: Insecticide vs Beetles
Dosage   # of beetles exposed   # of dead beetles
1.691    59                     6
…        …                      …
1.884    60                     60
17
Types of Models for Statistical Analysis
Random component   Link                Systematic component   Model                  Chapter
Normal             Identity            Continuous             Regression
Normal             Identity            Categorical            ANOVA
Normal             Identity            Mixed                  ANCOVA
Binomial           Logit               Mixed                  Logistic regression    5, 6
Poisson            Log                 Mixed                  Loglinear              8
Multinomial        Generalized logit   Mixed                  Multinomial response   7