Lecture 5: Linear Mixed Effects Models
Outline
Explore options available when assumptions
of classical linear models are untenable.
In this lecture:
What can we do when observations
(and thus residuals) are not strictly independent?
Classical Linear Models
Defined by three assumptions:
(1) the response variable is continuous.
(2) the residuals (ε) are normally distributed and ...
(3) ... independently and identically distributed.
Today, we will consider a range of options available
when we either know or suspect that our data
are not strictly independent from each other.
(Departures from the other assumptions will be dealt with later.)
Non-independent Residuals
In previous lectures:
We merely checked the independence of our residuals by inspecting the plot of residuals vs. fitted values.
Example from lecture 2:
A non-linear trend, which suggested that our linear model was probably misspecified.
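The residuals vs. fitted check can be sketched on simulated data. This is only an illustration: the data, the variable names (x, y, fit_lin, fit_quad) and the quadratic truth are all assumptions, not the data set from lecture 2.

```r
## Simulated example (assumed data): the true relationship is quadratic,
## but we first fit a straight line.
set.seed(1)
x <- seq(0, 10, length.out = 50)
y <- 2 + 0.5 * x^2 + rnorm(50, sd = 2)

fit_lin <- lm(y ~ x)

## Residuals vs fitted values: a curved band instead of a shapeless cloud
## hints that the linear model is misspecified.
plot(fitted(fit_lin), resid(fit_lin), pch = 16,
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

## Adding the quadratic term removes the trend; AIC agrees.
fit_quad <- lm(y ~ x + I(x^2))
AIC(fit_lin, fit_quad)
```

A trend in this plot points to a misspecified mean structure; it does not by itself reveal grouping or repeated measures in the data.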
Non-independent Observations
Data collection can often lead to non-independence among your observations.
A few examples:
Repeated (longitudinal) observations on the same "individuals"
(on different days, weeks, months, years)
Collecting data from a few locations (spatial structure)
(surveys conducted in schools/streets, fields/sites/islands)
Collecting data on related individuals
(fathers & sons, twins, species within the same genus/tribe/family).
If we treat all these observations as fully independent, we are likely to overestimate the number of degrees of freedom, which may lead us to wrongly reject a null hypothesis (type I error).
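A small simulation can make the type I error inflation concrete. The sketch below is built entirely on assumed settings (6 groups of 10 observations, a treatment applied to whole groups, no true treatment effect, and made-up standard deviations); it compares a naive analysis of all 60 observations with an analysis of the 6 group means.

```r
## Assumed simulation settings (hypothetical, for illustration only)
set.seed(42)
n_sims      <- 1000
n_groups    <- 6
n_per_group <- 10

sim_once <- function() {
  group     <- factor(rep(seq_len(n_groups), each = n_per_group))
  ## Treatment is applied to whole groups: groups 1-3 get A, groups 4-6 get B.
  treatment <- rep(c("A", "B"), each = n_groups / 2 * n_per_group)
  ## Shared group effect makes observations within a group non-independent.
  y <- rnorm(n_groups, sd = 2)[group] + rnorm(n_groups * n_per_group, sd = 1)

  ## Naive analysis: 60 "independent" observations
  p_naive <- summary(lm(y ~ treatment))$coefficients[2, 4]
  ## Conservative analysis: one mean per group, 6 independent values
  m <- tapply(y, group, mean)
  p_means <- t.test(m[1:3], m[4:6], var.equal = TRUE)$p.value
  c(naive = p_naive, means = p_means)
}

p_values       <- replicate(n_sims, sim_once())
rejection_rate <- rowMeans(p_values < 0.05)
rejection_rate   # proportion of false positives at the 5% level
```

With these settings the naive analysis rejects the true null hypothesis far more often than 5% of the time, while the analysis on group means stays close to the nominal level.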
Non-independent Observations
Two ways to cope with non-independent observations:
When the design is balanced ("equal sample size")
We can use factors to partition our observations into different "groups" and analyse them as an ANOVA or ANCOVA.
We already know how to do that (when factors are "crossed").
We just need to figure out how to cope with nested factors.
When the design is unbalanced ("uneven sample size")
Mixed effects models are then called for.
Nested Anova
Example:
A designed field experiment on crop yield with three treatments:
irrigation (control, irrigated)
sowing density (low, medium, high)
fertilizer (N, P, NP)
Split plot design
[Diagram: one block, split into irrigation plots (control, irrigated), density subplots (high, low, medium) and fertilizer sub-subplots (N, P, NP)]
each block has 18 different subplots
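The layout can be reproduced with expand.grid() to check the counts. This is a sketch of the design only (no yields); the data frame name `design` is an assumption, and the block labels A to D are the four block levels of the data set used in this example.

```r
## Every combination of the three treatments within each of the 4 blocks
design <- expand.grid(
  fertilizer = c("N", "P", "NP"),
  density    = c("low", "medium", "high"),
  irrigation = c("control", "irrigated"),
  block      = c("A", "B", "C", "D")
)

nrow(design)           # total number of subplots in the experiment
table(design$block)    # subplots per block: 2 x 3 x 3
```

Each treatment combination occurs exactly once per block, which is what makes the design balanced.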
Nested Anova
Example:
A designed field experiment on crop yield with three treatments
> yields <- read.table("splityield.txt", header=T)
> attach(yields)
> names(yields)
[1] "yield"      "block"      "irrigation" "density"    "fertilizer"
> str(yields)
'data.frame':   72 obs. of  5 variables:
 $ yield     : int  90 95 107 92 89 92 81 92 93 80 ...
 $ block     : Factor w/ 4 levels "A","B","C","D": 1 1 1 1 1...
 $ irrigation: Factor w/ 2 levels "control","irrigated": 1 1...
 $ density   : Factor w/ 3 levels "high","low","medium": 2 2...
 $ fertilizer: Factor w/ 3 levels "N","NP","P": 1 3 2 1 3 2...
Nested Anova
Example:
A designed field experiment on crop yield with three treatments
> model0 <- aov(yield ~ irrigation*density*fertilizer)
## non-nested version (incorrect !!)
> summary(model0)
                              Df Sum Sq Mean Sq F value    Pr(>F)
irrigation                     1 8277.6  8277.6 59.5746 2.813e-10 ***
density                        2 1758.4   879.2  6.3276 0.0033972 **
fertilizer                     2 1977.4   988.7  7.1160 0.0018070 **
irrigation:density             2 2747.0  1373.5  9.8853 0.0002197 ***
irrigation:fertilizer          2  953.4   476.7  3.4310 0.0395615 *
density:fertilizer             4  304.9    76.2  0.5486 0.7008151
irrigation:density:fertilizer  4  234.7    58.7  0.4223 0.7918283
Residuals                     54 7503.0   138.9
Sum                           71 23756.44
Nested Anova
> model1 <- aov(yield ~ irrigation*density*fertilizer +
                Error(block/irrigation/density) )
## Correct nested version, nesting from large to small
> summary(model1)

Error: block
          Df Sum Sq Mean Sq F value Pr(>F)
Residuals  3 194.44  64.815

Error: block:irrigation
           Df Sum Sq Mean Sq F value Pr(>F)
irrigation  1 8277.6  8277.6   17.59 0.0247 *
Residuals   3 1411.8   470.6

Error: block:irrigation:density
                   Df Sum Sq Mean Sq F value  Pr(>F)
density             2 1758.4  879.18   3.784 0.05318 .
irrigation:density  2 2747.0 1373.51   5.912 0.01633 *
Residuals          12 2787.9  232.33

Error: Within
                              Df Sum Sq Mean Sq F value   Pr(>F)
fertilizer                     2 1977.4  988.72  11.449 0.000142 ***
irrigation:fertilizer          2  953.4  476.72   5.520 0.008108 **
density:fertilizer             4  304.9   76.22   0.883 0.484053
irrigation:density:fertilizer  4  234.7   58.68   0.679 0.610667
Residuals                     36 3108.8   86.36

Res Sum                       54 7503.00
Gd Sum                        71 23756.44
Nested Anova
Comparison between nested and non-nested results
                      Non-nested             Nested
                  Df   F value    Pr(>F)    F value    Pr(>F)
irrigation         1   59.5746  2.81e-10    17.5896  0.024725
density            2    6.3276  0.003397     3.7842  0.053181
fertilizer         2    7.1160  0.001807    11.4493  0.000142
irrig:dens         2    9.8853  0.000220     5.9119  0.016331
irrig:ferti        2    3.4310  0.039562     5.5204  0.008108
dens:ferti         4    0.5486  0.700815     0.8826  0.484053
irrig:dens:ferti   4    0.4223  0.791828     0.6795  0.610667
[Diagram repeated: split plot layout of one block]
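The difference between the two analyses comes down to which error term each effect is tested against. As a worked check for irrigation, the sketch below recomputes both F ratios from the mean squares printed in the aov summaries; the variable names are assumptions, and the inputs are rounded values copied from the output, so the last digits can differ slightly from R's.

```r
## Mean squares copied from the aov() summaries (rounded as printed)
ms_irrigation   <- 8277.6
ms_error_nested <- 470.6   # Residuals in the block:irrigation stratum, 3 df
ms_error_naive  <- 138.9   # pooled Residuals of the non-nested model, 54 df

## Nested analysis: irrigation is tested against variation among whole plots
f_nested <- ms_irrigation / ms_error_nested
p_nested <- pf(f_nested, df1 = 1, df2 = 3, lower.tail = FALSE)

## Non-nested analysis: irrigation is tested against all 54 residual df
f_naive <- ms_irrigation / ms_error_naive
p_naive <- pf(f_naive, df1 = 1, df2 = 54, lower.tail = FALSE)

round(c(f_nested = f_nested, p_nested = p_nested), 4)
c(f_naive = f_naive, p_naive = p_naive)
```

The treatment mean square is identical in both analyses; only the denominator and its degrees of freedom change, which is why the p-value for irrigation moves from about 3e-10 to about 0.025.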
Recognizing Nestedness is key!
Being able to distinguish crossed factors (independent from each other) from nested factors is essential.
Nestedness occurs most often from spatial structure:
Student surveys in different classes from different schools.
Samples from individual branches on sets of trees within a number of forest patches.
But it can also occur from temporal structure:
Samples taken from the same individuals every fortnight for 2 months in two successive years.
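One practical symptom of nesting is that the labels of the inner factor are reused across levels of the outer factor. The sketch below uses a made-up survey table (the data frame `surveys` and its labels are assumptions) to show how class "a" in school S1 and class "a" in school S2 are different classes, and how interaction() gives each one a unique label.

```r
## Hypothetical survey layout: 2 schools, 2 classes per school, 2 pupils per class
surveys <- data.frame(
  school = rep(c("S1", "S2"), each = 4),
  class  = rep(c("a", "b"), times = 2, each = 2)
)

## Cross-tabulating the raw labels makes the factors look crossed...
xtabs(~ school + class, data = surveys)

## ...but class labels only have meaning within a school.
surveys$class_id <- interaction(surveys$school, surveys$class, drop = TRUE)
nlevels(surveys$class_id)   # number of distinct classes actually surveyed
levels(surveys$class_id)
```

In R model formulas the same idea is written with the `/` operator: `school/class` expands to `school + school:class`.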
When the design is not balanced
We need a different modelling framework: Mixed Effects Models.
So called because they mix together fixed effects and random effects.
Until now, we have only used fixed effects in our models,
each effect having an estimated parameter
(intercept, slope, mean, ...).
But in certain circumstances, these parameters may not
be very informative and one would be better off trying to
"estimate" the underlying distribution they come from.
An example will help clarify the difference between these 2 approaches.
Mixed Effects Modelling
Example: railway rails tested for longitudinal stress.
6 rails chosen at random and tested three times with ultrasound.
> library(nlme) ## package dedicated to mixed effects models
> data(Rail)
> names(Rail)
[1] "Rail"   "travel"
> stripchart(Rail$travel ~ Rail$Rail, pch=16, ylab="Ultrasonic Travel Time (nanosecs)", xlab="Rail number", vertical=T, col=rainbow(6) )
> abline(h=mean(Rail$travel), col="Gray85", lty=2, lwd=2)
Classically in a linear model, we would be able to tell whether the rails differ significantly from each other.
But it doesn't help us make predictions about other rails.
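The two views of the rail data can be written side by side. The notation below is an assumption introduced for illustration (i indexes the 6 rails, j the 3 tests per rail); it is the usual way a one-way fixed effects model and a random intercept model are written.

```latex
% Fixed effects view: one mean per rail, six parameters to estimate
y_{ij} = \beta_i + \varepsilon_{ij},
\qquad \varepsilon_{ij} \sim N(0, \sigma^2)

% Random effects view: one grand mean plus a rail-specific deviation
% drawn from a common distribution
y_{ij} = \beta + b_i + \varepsilon_{ij},
\qquad b_i \sim N(0, \sigma_b^2),
\qquad \varepsilon_{ij} \sim N(0, \sigma^2)
```

In the second form only three quantities are estimated (the grand mean, the between-rail standard deviation and the residual standard deviation), and the between-rail standard deviation describes the population of rails rather than the six that happened to be sampled.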
Mixed Effects Modelling
Random effects: interested in explaining the variance of the response.
Fixed effects: interested in explaining the response itself.
Fixed effects
Male & Female
Control & Treatment
Wet vs. Dry
Light vs. Shade
Random effects
Blocks
Individuals with repeated measures
Genotypes
Sites
Mixed Effects Modelling
Back to our example:
> Rail2 <- data.frame(travel=Rail$travel, Rail=factor(as.character(Rail$Rail)) )
## Makes Rail a simple factor (not an ordered one)
> Rail.lm <- lm(travel ~ Rail, data=Rail2) ## LINEAR MODEL
> summary(Rail.lm)
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   54.000      2.321  23.262 2.37e-11 ***
Rail2        -22.333      3.283  -6.803 1.90e-05 ***
Rail3         30.667      3.283   9.341 7.44e-07 ***
Rail4         42.000      3.283  12.793 2.36e-08 ***
Rail5         -4.000      3.283  -1.218    0.246
Rail6         28.667      3.283   8.732 1.52e-06 ***
---
Residual standard error: 4.021 on 12 degrees of freedom
Multiple R-squared: 0.9796, Adjusted R-squared: 0.9711
F-statistic: 115.2 on 5 and 12 DF, p-value: 1.033e-09
Mixed Effects Modelling
> anova(Rail.lm)
Analysis of Variance Table

Response: travel
          Df Sum Sq Mean Sq F value    Pr(>F)
Rail       5 9310.5  1862.1  115.18 1.033e-09 ***
Residuals 12  194.0    16.2

## now as a MIXED EFFECT MODEL
> Rail.lme <- lme(travel ~ 1, data=Rail, random= ~1|Rail)
> summary(Rail.lme)
Linear mixed-effects model fit by REML
 Data: Rail
      AIC      BIC   logLik
  128.177 130.6766 -61.0885

Random effects:
 Formula: ~1 | Rail
        (Intercept) Residual
StdDev:    24.80547 4.020779

Fixed effects: travel ~ 1
            Value Std.Error DF  t-value p-value
(Intercept)  66.5  10.17104 12 6.538173       0

StdDev (Intercept) = 24.80547: Standard Deviation associated with rails
StdDev Residual = 4.020779: Standard Deviation of residuals
Fixed effect (Intercept) = 66.5: Grand Average
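The two standard deviations can be turned into a share of variance. The sketch below computes the intraclass correlation, a standard summary that the lecture itself does not name, from the values printed in the summary; the variable names are assumptions.

```r
## Standard deviations copied from the lme summary
sd_rail  <- 24.80547   # between-rail (random intercept) standard deviation
sd_resid <- 4.020779   # residual (within-rail) standard deviation

var_rail  <- sd_rail^2
var_resid <- sd_resid^2

## Proportion of the total variance that is due to differences between rails
icc <- var_rail / (var_rail + var_resid)
round(icc, 3)
```

A value this close to 1 says that almost all of the variation in travel time lies between rails, and that repeated tests on the same rail are very similar to each other.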
Mixed Effects Modelling
Is the Random effect significant?
> Rail.lme$call ## random effect model
lme.formula(fixed = travel ~ 1, data = Rail, random = ~1 | Rail)
> AIC(Rail.lme) ## exact AIC version
[1] 128.177
> Rail.lm0 <- lm(travel ~ 1, data=Rail2) ## NULL linear model
> AIC(Rail.lm0)
[1] 167.9265
> Rail.lm$call ## model with Rail as a fixed effect factor
lm(formula = travel ~ Rail, data = Rail2)
> AIC(Rail.lm)
[1] 107.8765
Comparing models which only differ in their random effects is easy (with AIC).
Comparing models which differ in their fixed effects is a little harder. It can only be done using "maximum likelihood" (not the default method in lme).
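A likelihood ratio test is another way to ask whether the random effect is needed. The sketch below is one cautious variant, not the lecture's own procedure: it refits the mixed model by maximum likelihood so that its log-likelihood is on the same footing as that of lm(), and it halves the p-value, a commonly used adjustment when the null hypothesis puts a variance on the boundary of its range (zero). The object names are assumptions.

```r
library(nlme)   # provides lme() and the Rail data set

## Mixed model refitted by ML (the lme default is REML)
rail_ml   <- lme(travel ~ 1, data = Rail, random = ~ 1 | Rail, method = "ML")
## Null model without any rail effect
rail_null <- lm(travel ~ 1, data = Rail)

## Likelihood ratio statistic: the models differ by one variance parameter
lr_stat <- as.numeric(2 * (logLik(rail_ml) - logLik(rail_null)))

p_naive    <- pchisq(lr_stat, df = 1, lower.tail = FALSE)
p_boundary <- p_naive / 2   # assumed 50:50 mixture adjustment for the boundary

c(LR = lr_stat, p_naive = p_naive, p_boundary = p_boundary)
```

Whatever adjustment is used, the conclusion here is not in doubt: the rail effect is overwhelmingly supported.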
Mixed Effects Modelling
Applied to the split plot study of crop yield
> yields <- read.table("splityield.txt", header=T)
> yield.lme <- lme(yield ~ irrigation*density*fertilizer, data=yields, random= ~1|block/irrigation/density)
> summary(yield.lme)
Linear mixed-effects model fit by REML
 Data: yields
       AIC      BIC    logLik
  481.6212 525.3789 -218.8106

Random effects:
 Formula: ~1 | block
        (Intercept)
StdDev: 0.0006600339

 Formula: ~1 | irrigation %in% block
        (Intercept)
StdDev:    1.982461

 Formula: ~1 | density %in% irrigation %in% block
        (Intercept) Residual
StdDev:    6.975554 9.292805
...
Mixed Effects Modelling
Fixed effects: yield ~ irrigation * density * fertilizer
                           Value Std.Error DF   t-value p-value
(Intercept)                80.50  5.893741 36 13.658558  0.0000
irrigirrig                 31.75  8.335008  3  3.809234  0.0318
dnslow                      5.50  8.216282 12  0.669403  0.5159
dnsmed                     14.75  8.216282 12  1.795216  0.0978
fertiNP                     5.50  6.571005 36  0.837010  0.4081
fertiP                      4.50  6.571005 36  0.684827  0.4978
irrigirrig:dnslow         -39.00 11.619577 12 -3.356404  0.0057
irrigirrig:dnsmed         -22.25 11.619577 12 -1.914872  0.0796
irrigirrig:fertiNP         13.00  9.292805 36  1.398932  0.1704
irrigirrig:fertiP           5.50  9.292805 36  0.591856  0.5576
dnslow:fertiNP              3.25  9.292805 36  0.349733  0.7286
dnsmed:fertiNP             -6.75  9.292805 36 -0.726368  0.4723
dnslow:fertiP              -5.25  9.292805 36 -0.564953  0.5756
dnsmed:fertiP              -5.50  9.292805 36 -0.591856  0.5576
irrigirrig:dnslow:fertiNP   7.75 13.142011 36  0.589712  0.5591
irrigirrig:dnsmed:fertiNP   3.75 13.142011 36  0.285344  0.7770
irrigirrig:dnslow:fertiP   20.00 13.142011 36  1.521837  0.1368
irrigirrig:dnsmed:fertiP    4.00 13.142011 36  0.304367  0.7626
Mixed Effects Modelling
> anova(yield.lme)
                              numDF denDF   F-value p-value
(Intercept)                       1    36 2674.6301  <.0001
irrigation                        1     3   30.9207  0.0115
density                           2    12    3.7842  0.0532
fertilizer                        2    36   11.4493  0.0001
irrigation:density                2    12    5.9119  0.0163
irrigation:fertilizer             2    36    5.5204  0.0081
density:fertilizer                4    36    0.8826  0.4841
irrigation:density:fertilizer     4    36    0.6795  0.6107
## We should probably remove the three-way interaction
## But if we are fiddling with the fixed effects, we ought
## to fit the model through Maximum Likelihood and base our
## decisions on its AIC values and Likelihood Ratio Tests
> yield.lme.ml <- update(yield.lme, ~. , method="ML")
> AIC(yield.lme.ml)
[1] 573.5108
Mixed Effects Modelling
> yield.lme.ml2 <- update(yield.lme.ml, ~. - irrigation:density:fertilizer)
> yield.lme.ml2$method
[1] "ML"
## just checking that update() kept using "ML"
> AIC(yield.lme.ml2)
[1] 569.0046
## an improvement
> anova(yield.lme.ml2)
                      numDF denDF   F-value p-value
(Intercept)               1    40 2872.7394  <.0001
irrigation                1     3   33.2110  0.0104
density                   2    12    4.0645  0.0449
fertilizer                2    40   11.4341  0.0001
irrigation:density        2    12    6.3499  0.0132
irrigation:fertilizer     2    40    5.5131  0.0077
density:fertilizer        4    40    0.8815  0.4837
> yield.lme.ml3 <- update(yield.lme.ml2, ~. - density:fertilizer)
> AIC(yield.lme.ml3)
[1] 565.1933
Mixed Effects Modelling
> anova(yield.lme.ml, yield.lme.ml2)
              Model df      AIC      BIC    logLik   Test L.Ratio p-value
yield.lme.ml      1 22 573.5108 623.5974 -264.7554
yield.lme.ml2     2 18 569.0046 609.9845 -266.5023 1 vs 2 3.49379  0.4788
> anova(yield.lme.ml2, yield.lme.ml3)
              Model df      AIC      BIC    logLik   Test L.Ratio p-value
yield.lme.ml2     1 18 569.0046 609.9845 -266.5023
yield.lme.ml3     2 14 565.1933 597.0667 -268.5967 1 vs 2 4.18877  0.3811
> anova(yield.lme.ml3)
                      numDF denDF   F-value p-value
(Intercept)               1    44 3070.8771  <.0001
irrigation                1     3   35.5016  0.0095
density                   2    12    4.3448  0.0381
fertilizer                2    44   11.2013  0.0001
irrigation:density        2    12    6.7878  0.0107
irrigation:fertilizer     2    44    5.4008  0.0080
> yield.lme.ml4 <- update(yield.lme.ml3, ~. - irrigation:density)
> AIC(yield.lme.ml4)
[1] 572.9022
Mixed Effects Modelling
> anova(yield.lme.ml3, yield.lme.ml4)
              Model df      AIC      BIC    logLik   Test L.Ratio p-value
yield.lme.ml3     1 14 565.1933 597.0667 -268.5967
yield.lme.ml4     2 12 572.9022 600.2221 -274.4511 1 vs 2 11.7088  0.0029
## ml4 is one simplification too far
> anova(yield.lme.ml4)
                      numDF denDF   F-value p-value
(Intercept)               1    44 2138.9678  <.0001
irrigation                1     3   24.7281  0.0156
density                   2    14    2.6264  0.1075
fertilizer                2    44   11.5626  0.0001
irrigation:fertilizer     2    44    5.5750  0.0069
## here comes Model Checking
> shapiro.test(yield.lme.ml3$residuals[,"fixed"]) # Best Model
## [,"fixed"] selects a matrix column
        Shapiro-Wilk normality test
data:  yield.lme.ml3$residuals[, "fixed"]
W = 0.9797, p-value = 0.2958
Mixed Effects Modelling
## excluding all random effects
> res <- yield.lme.ml3$resid[,"fixed"]
> st.res <- res/sd(res)
> qqnorm(st.res, pch=16, main="")
> qqline(st.res, col="red", lwd=2)
## including all random effects
> res <- yield.lme.ml3$resid[,4]
> st.res <- res/sd(res)
> qqnorm(st.res, pch=16, main="")
> qqline(st.res, col="red", lwd=2)
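The same two kinds of residuals can also be obtained through the resid() accessor and its `level` argument instead of indexing the residual matrix by column. Since splityield.txt may not be at hand, the sketch below uses the Rail data shipped with nlme; the object names are assumptions.

```r
library(nlme)   # provides lme() and the Rail data set

rail_fit <- lme(travel ~ 1, data = Rail, random = ~ 1 | Rail)

## level = 0: residuals from the fixed effects only (random effects excluded)
res_pop  <- resid(rail_fit, level = 0)
## level = 1: residuals after also subtracting the rail-level random effect
res_rail <- resid(rail_fit, level = 1)

## level = 0 matches the "fixed" column of the residual matrix
all.equal(as.numeric(res_pop), as.numeric(rail_fit$residuals[, "fixed"]))

## Residual spread shrinks once the rail effects are accounted for
c(sd_population = sd(res_pop), sd_within_rail = sd(res_rail))
```

For a model with three nested grouping levels, such as the crop yield model, the innermost level would be `level = 3`, which corresponds to the fourth column of the residual matrix.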
Mixed Effects Modelling
> plot(yield.lme.ml3) ## by default Residuals vs Fitted values
> plot(yield.lme.ml3, yield ~ fitted(.) ) ## Observed vs Fitted values
Mixed Effects Modelling
> qqnorm(yield.lme.ml3, ~resid(.) | block) ## qqplot but broken down by block