Los Angeles R Users Group - July 12, 2011 - Part 1
TRANSCRIPT
Using R for multilevel modeling of salmon habitat
Yasmin Lucero, Statistical Consultant
Kelly Burnett, PNW Research Station, USFS
Kelly Christiansen, PNW Research Station, USFS
E. Ashley Steel, PNW Research Station, USFS
Eli Holmes, NW Fisheries Science Center, NOAA
Acknowledgements:
NRC-RAP, National Academy of Sciences
ISEMP Monitoring Program, NOAA
Outline
• Background on fish ecology and the data
• Background on multilevel modeling
• Demo of lme4 package in R
Schooling Juvenile Coho Salmon
The big goal: measure effect of stream habitat quality on fish survival
Photo by David Wolman
Land Area Affected by Endangered Species Act Listings of Salmon & Steelhead
* 28 distinct population segments:
6 endangered, 22 threatened
* 176,000 sq. miles in Washington,
Oregon, Idaho & California
* 61% of Washington’s land area,
55% of Oregon’s, 26% of Idaho’s, &
32% of California’s
February 2008
study area
The Data
~266 study sites
Oregon coastal region
juvenile coho salmon habitat
sparsely sampled, longitudinal study design
12 year time series
35 data layers
~100 landscape level variates
~22 habitat level variates
Oregon
Abundance increases over time due to variation in ocean conditions (i.e., external to our analysis)
[Figure: coho.obs. Panel 1: fs.coho.obs by fs.year, 1998-2009. Panel 2: year coefficient for coho.obs by year, 1998-2008.]
Sparsely sampled longitudinal data
[Figure: fs.coho.obs (0.0 to 3.0) by year (1998-2008), one panel per site. Panel titles: 17100201010102, 17100203040402, 17100204050303, 17100206010603, 17100202030201, 17100203040602, 17100205040105, 17100303080202, 17100203020501, 17100203070101, 17100205070202, 17100304010604, 17100203020902, 17100203090101, 17100206010504, 17100305060202.]
• Only fish data has a time component
• Year effects are exogenous
• Landscape data everywhere
• Habitat data some places
• Fish data some places
• Not always the same places
Figure Legend. Mean density of coho at 16 frequently visited sites for 1998–2009
How the landscape data is acquired
GIS map layers: summarize across the area surrounding the study site
shallow, highly channelized
high structure: rocks and woody debris
Habitat level data is collected by survey visits: it is labor intensive to collect and therefore less abundant.
Habitat variates: gradient, pool density, debris, flow rates, drainage area, channel width, etc.
Multilevel structure for two reasons
landscape
habitat
fish
(1) longitudinal sampling design
(2) varying scales of predictors
Generalized linear mixed models (aka hierarchical, multilevel, or random effects models)
[Diagram: one state containing three schools, each school containing three classes]
student_score ~ class_average + school_average + state_average
canonical example: school test scores
state: state level predictors, $\mathrm{Norm}(0, \sigma^2_{state})$
school 1, school 2, school 3, school 4: school level predictors, $\mathrm{Norm}(\mu_{state1}, \sigma^2_{school})$
class 1, class 2, class 3, class 4: class level predictors, $\mathrm{Norm}(\mu_{school1}, \sigma^2_{class})$
student 1, student 2, student 3, student 4: student level predictors, $\mathrm{Norm}(\mu_{class3}, \sigma^2_{student})$
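The nesting in the school example can be made concrete by simulating it. This is a minimal base R sketch, not code from the talk: the group sizes, the standard deviations, and the names `mu_state`, `mu_school`, `mu_class`, and `students` are all assumptions chosen for illustration.

```r
# Simulate nested random effects for the school example.
# All sizes and standard deviations are made-up illustration values.
set.seed(1)

n_school  <- 4   # schools in one state
n_class   <- 3   # classes per school
n_student <- 20  # students per class

sd_state <- 1; sd_school <- 2; sd_class <- 1.5; sd_student <- 5

# Each level's mean is drawn around the mean of the level above it.
mu_state  <- rnorm(1, mean = 0, sd = sd_state)
mu_school <- rnorm(n_school, mean = mu_state, sd = sd_school)
mu_class  <- rnorm(n_school * n_class,
                   mean = rep(mu_school, each = n_class), sd = sd_class)
score     <- rnorm(n_school * n_class * n_student,
                   mean = rep(mu_class, each = n_student), sd = sd_student)

students <- data.frame(
  school = factor(rep(seq_len(n_school), each = n_class * n_student)),
  class  = factor(rep(seq_len(n_school * n_class), each = n_student)),
  score  = score
)

# Average score per school; these scatter around the state mean.
aggregate(score ~ school, data = students, FUN = mean)
```

Each draw uses the mean from the level above, which is what the chain of Norm(...) terms in the diagram expresses.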
Our model structure is not so complicated
global
site 1, site 2, site 3, site 4: landscape level predictors
obs 1, obs 2, obs 3, obs 4: habitat level predictors & year effects
Modeling presence/absence of fish: logistic mixed model with site and year effects
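$$\mathrm{logit}(\Pr\{y_i = 1\}) = \gamma_{year} + \beta_{h1} x_{h1} + \beta_{h2} x_{h2} + \dots + \alpha_{site}$$
$$\alpha_{site} \sim \mathrm{Norm}(\beta_{l1} x_{l1} + \beta_{l2} x_{l2} + \dots,\ \sigma^2_{site})$$
$\gamma_{year}$: year effects
$x_{h1}, x_{h2}, \dots$: habitat level predictors
$\alpha_{site}$: site effects
$x_{l1}, x_{l2}, \dots$: landscape level predictors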
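A model of this shape can be fit with `glmer()` from lme4. The sketch below uses simulated data, so the data frame `obs`, the predictors `gradient` and `elevation`, and every coefficient value are assumptions rather than the study's data. Substituting the site-level equation into the observation-level one gives a fixed slope for each landscape predictor plus a zero-mean random intercept per site, which is how the formula is written here.

```r
library(lme4)

set.seed(42)
n_site <- 60
years  <- 1998:2009

# Hypothetical site table: one landscape and one habitat predictor.
sites <- data.frame(
  site      = factor(seq_len(n_site)),
  elevation = rnorm(n_site),   # landscape level, standardized
  gradient  = rnorm(n_site)    # habitat level, standardized
)
# Site effect: mean depends on the landscape predictor, plus site noise.
sites$alpha <- rnorm(n_site, mean = 0.8 * sites$elevation, sd = 1)

# Every site observed in every year, for simplicity (the real data are sparse).
obs <- merge(expand.grid(site = sites$site, year = factor(years)), sites)
year_effect <- setNames(seq(-1, 1, length.out = length(years)), years)
eta <- year_effect[as.character(obs$year)] - 1.0 * obs$gradient + obs$alpha
obs$present <- rbinom(nrow(obs), size = 1, prob = plogis(eta))

# Year as fixed effects (no global intercept), predictors as fixed slopes,
# and a random intercept for each site.
fit <- glmer(present ~ year + gradient + elevation + (1 | site) - 1,
             data = obs, family = binomial)
fixef(fit)    # one coefficient per year, plus the two slopes
VarCorr(fit)  # estimated between-site standard deviation
```

Dropping the intercept with `- 1` makes each year coefficient an absolute level on the logit scale rather than a contrast against a baseline year.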
Fit a lot of models, some predictors rose to the top
[Figure: AIC (about 1100 to 1300) plotted against logLik (about −620 to −520) for candidate models m1 through m34.]
Best predictors: gradient, debris level, drainage area, mean elevation
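Comparing many candidate models by AIC and log-likelihood can be sketched as below. This is an illustration on simulated data, not the talk's code: the data frame `d`, the predictors `x1` to `x3`, and the four formulas are hypothetical. Only the name `models.ls` is borrowed from the slides, which refer to `models.ls$m24`.

```r
library(lme4)

set.seed(7)
# Hypothetical data: 40 sites x 8 visits, three standardized predictors.
d <- data.frame(site = factor(rep(1:40, each = 8)),
                x1 = rnorm(320), x2 = rnorm(320), x3 = rnorm(320))
site_eff  <- rnorm(40)[d$site]   # one random intercept per site
d$present <- rbinom(320, 1, plogis(1.2 * d$x1 + site_eff))

# Candidate formulas kept in a named list (illustrative only).
forms <- list(
  m1 = present ~ 1 + (1 | site),
  m2 = present ~ x1 + (1 | site),
  m3 = present ~ x1 + x2 + (1 | site),
  m4 = present ~ x1 + x2 + x3 + (1 | site)
)
models.ls <- lapply(forms, function(f) glmer(f, data = d, family = binomial))

# One row per model, sorted so the lowest AIC comes first.
cmp <- data.frame(
  model  = names(models.ls),
  logLik = sapply(models.ls, function(m) as.numeric(logLik(m))),
  AIC    = sapply(models.ls, AIC)
)
cmp[order(cmp$AIC), ]
```

Keeping the fits in a named list makes it easy to pull out a single model later, for example `fitted(models.ls$m2)`.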
Overall model performance is strong at some things, weak at others
[Figure: histogram of fitted probabilities, fitted(models.ls$m24), x-axis 0.0 to 1.0, y-axis count (0 to 800). Second panel: fitted probabilities (0.0 to 1.0) grouped by observed absence (0) and presence (1).]
Another look at model fit: some heavy outliers
[Figure: p/a of coho obs (data) plotted against fitted probability (0.0 to 1.0), by year, 1998-2009.]
Model: pa.obs ~ fs.year + (fs.grad.rs + fs.cfs.down.rs + fs.vol.len.rs + el.mean.rs | catchment) - 1
conclusions
• site matters
• we can explain about half of the site-to-site variation with 4-5 predictors
• habitat data are more valuable than landscape data
• a small number of predictions are very wrong, and we can’t seem to improve them
Thanks. [email protected]
Model predicted probabilities given presence/absence with and without site effects
[Figure: two panels, m0 and m1. Each shows Pr{coho present} (0.0 to 1.0) grouped by observed coho presence (FALSE, TRUE).]