running title: predicting plant performance with bayesian...
TRANSCRIPT
1
Running Title: Predicting plant performance with Bayesian models
Combining mesocosm and field experiments to predict invasive plant
performance: A hierarchical Bayesian approach
CHRIS WILSON1,5
T. TREVOR CAUGHLIN2
DAVID J. CIVITELLO3
S. LUKE FLORY4
Statement of authorship: SLF and DJC designed experiments and collected data, CW and
TTC performed modeling work, CW wrote first draft of manuscript, all contributed to
writing and revising.
1School of Natural Resources and Environment, University of Florida, Gainesville, FL
32611
2Department of Biology, University of Florida, Gainesville, FL, 32611
3Department of Integrative Biology, University of South Florida, 4202 E Fowler Ave,
Tampa, FL 33620
4Agronomy Department, University of Florida, Gainesville, FL 32611
5 Corresponding author
CORRESPONDENCE: Chris Wilson, 352-519-7653 (cell), [email protected]
2
Abstract 1
Invasive plant fecundity underlies propagule pressure and ultimately range expansion. Predicting 2
fecundity across large spatial scales, from regions to landscapes, is critical for understanding 3
invasion dynamics and optimizing management. However, to accurately predict fecundity and 4
other demographic processes, improved models that scale individual plant responses to abiotic 5
drivers across heterogeneous environments are needed. Here we combine two experimental 6
datasets to predict fecundity of a widespread and problematic invasive grass over large spatial 7
scales. First, we analyzed seed production as a function of plant biomass in a small-scale 8
mesocosm experiment with manipulated light levels. Then, in a field introduction experiment, 9
we tracked plant performance across 21 common garden sites that differed widely in available 10
light and other factors. We jointly analyzed these data using a Bayesian Hierarchical Model 11
(BHM) framework to predict fecundity as a function of light in the field. Our analysis reveals 12
that the invasive species is likely to produce sufficient seed to overwhelm establishment 13
resistance, even in deeply shaded environments, and is likely seed-limited across much of its 14
range. Finally, we extend this framework to address the general problem of how to scale-up plant 15
demographic processes and analyze the factors that control plant distribution and abundance at 16
large scales. 17
18
Key words: Species distribution models, fecundity, seed production, ecological niche, 19
hierarchical Bayes, invasive species, mechanistic models, scaling-up, Microstegium vimineum, 20
Stiltgrass, plant demography 21
22
23
3
24
Introduction 25
Fecundity is a key demographic parameter underlying propagule pressure (Holle and 26
Simberloff 2005) and, along with dispersal, is critical for establishment success and range 27
expansion. Given the interactive threats to biodiversity from climate change, habitat loss, and 28
invasive species (Brook et al. 2008), there is growing urgency to predict the demographic 29
processes underlying species distributions, including fecundity. One current approach to 30
modeling plant range expansion is to estimate the relationship between plant size and 31
demographic parameters, such as fecundity, and then link plant size to environmental covariates 32
via regression (Diez et al. 2014, Merow et al. 2014). On its own, this correlative approach does 33
not explicitly account for the mechanisms linking abiotic factors to plant biomass production, 34
and therefore may be vulnerable to cause-effect errors when attempting to predict fecundity 35
across diverse habitats that encompass significant variability in abiotic resources. Moreover, 36
directly measuring fecundity in the field is often logistically difficult and ethically questionable, 37
particularly for non-native invasive plant species (Flory et al. 2011). Thus, in order to model 38
invasive plant range expansion or success across heterogeneous environments, we propose an 39
approach to modeling seed production based on an abiotic factor that is manipulated in 40
controlled experiments and readily measured in the field. 41
Many abiotic and biotic factors interact to determine demographic processes such as 42
seed production, making it difficult to disentangle the relationship between any given abiotic 43
factor and an observed species’ distribution pattern. Additionally, predictive models must 44
account for environmental heterogeneity in order to scale across habitats. Previous approaches to 45
predicting species distributions as a function of abiotic factors have included controlled 46
4
manipulative experiments to delineate the “fundamental niche” of a species (Chase and Leibold 47
2003). However, such studies are logistically limited to few abiotic factors (Clancy and King 48
1993, Hooper et al. 2008) and, maybe more importantly, they ignore (by definition) biotic 49
interactions and processes. Thus, their predictive power in the field is limited (Schurr et al. 50
2012). On the other hand, correlative predictive models such as “bioclimate envelopes” (Kearney 51
and Porter 2009) tacitly incorporate both biotic and abiotic interactions but do not explicitly 52
model demographic processes such as seed production (Gallien et al. 2010, Merow et al. 2011). 53
Thus, such models cannot identify which mechanisms drive patterns in distribution or abundance 54
(Guisan and Thuiller 2005). More recent niche modeling work has used an Integral Projection 55
Model (IPM) framework to link demographic processes and environmental covariates, while also 56
accounting for population variability (Diez et al. 2014). Since observational correlations of 57
abundance with environmental variables have been shown to generate unreliable predictions 58
when extrapolated from one geographic range to another (Bahn and McGill 2013), it may be 59
preferable to link demographic responses to abiotic factors through experimental introductions 60
that eliminate confounding effects of dispersal, and other biotic interactions. Thus, to accurately 61
predict the performance of invasive species at particular habitats or sites, and their potential to 62
spread at the landscape scale, a hybrid approach that combines the causal insight of manipulative 63
approaches with the ecological realism of performance data under natural conditions is needed 64
(Kearney and Porter 2009). 65
Although the need to address spatial scale has long been discussed in ecology (Levin 66
1992), analytical methods for combining datasets from different scales and contexts into 67
quantitative, predictive models have been lacking. Bayesian Hierarchical Models (BHM) can 68
facilitate the integration of ecological datasets by offering significant flexibility in model 69
5
specification (Gelman and Hill 2007). Moreover, by explicitly modeling variance at all 70
hierarchical levels, BHM predictions incorporate uncertainty from all parts of the model, in 71
contrast to predictive models built on point estimates (Cressie et al. 2009, Gelman et al. 2013, 72
Ogle et al. 2014). BHM have been used to infer seed production as a function of fire frequency 73
(Evans et al. 2010) and forest type (Caughlin et al. 2014) using observational data and linear 74
correlations. However, this approach could be greatly strengthened by incorporating 75
manipulative experiments and nonlinear relationships that reflect key mechanisms driving 76
demographic performance. BHMs provide an intuitive and useful context for this integration by: 77
a) allowing specification of informative priors based on accurate manipulative experiments, and 78
b) fully modeling variance at all hierarchical levels of the model (Gelman et al. 2013), thus 79
quantifying the distribution of site-to-site variability. 80
Here we demonstrate how field and mesocosm datasets can be linked with BHM to generate 81
predictions for the field performance of an aggressive and widespread invasive plant species. 82
Using Microstegium vimineum (Trin.) A. Camus (stiltgrass), a non-native grass distributed 83
throughout forests of the eastern US (Fairbrothers and Gray 1972) as a model system, we 84
combine data from a mesocosm experiment where we manipulated light availability and 85
measured M. vimineum biomass and seed production with data from experimental introductions 86
of M. vimineum across heterogeneous field sites, where light availability was the strongest 87
covariate of biomass production (Flory et al. 2011). Because we measured available light at each 88
of the field introduction sites, these two datasets are, in combination, well suited to scaling 89
predictions of M. vimineum performance as a function of light. Therefore, we combined these 90
data using BHMs to mechanistically link biomass production to available light in the field, while 91
simultaneously using the estimated size-fecundity relationship to predict seed production as a 92
6
function of biomass. We then used our models to predict relative performance across habitats 93
within our model species’ geographic range to determine first, how much M. vimineum seed 94
production varies with available light in the field, and second, if M. vimineum seed production is 95
likely to provide adequate propagule pressure to colonize and sustain invasions in forest interior 96
habitats given realistic establishment resistance. 97
98
METHODS 99
Mesocosm experiment 100
To determine how light availability affects M. vimineum performance and fecundity, we 101
established an outdoor mesocosm experiment in June 2010. We sowed twenty seeds of M. 102
vimineum (source: Big Oaks National Wildlife Refuge, Madison, IN) into 2 gal plastic containers 103
filled with composted greenhouse soil. These three containers were then placed into each of eight 104
different light treatments that ranged from 1.59-100% full sun (AccuPAR Linear PAR/LAI 105
ceptometer, Decagon Devices, Inc., Pullman, Washington). Light treatments were replicated 106
three times (24 total tents, 72 total containers). We constructed each shade tent out of poly-vinyl 107
chloride (PVC) piping, forming 1m x 1m cubes that we covered with 1-4 layers of various types 108
of shade cloth to achieve a range of light conditions. Uncovered tent frames were defined as the 109
full sun treatment and we calculated relative light availability as the ratio of light intensity in a 110
treatment to the full sun treatment. We observed no herbivory on the plants during the course of 111
the experiment. Seed germination was tabulated after three weeks and seedlings thinned to three 112
per container. We watered containers to capacity twice per day using automatic drip irrigation 113
and supplied plants with a 20-20-20 (NPK) fertilizer solution once per month over the course of 114
the experiment. 115
7
M. vimineum seed production begins in early October in southern Indiana and continues 116
until temperatures are below freezing. Because seeds immediately drop from plants once they are 117
mature, we lined each shade treatment structure with water permeable landscape cloth to capture 118
seed as it dropped. Since M. vimineum has a mixed mating system (produces both 119
chasmogomous (CH) and cleistogamous (CL) seed), we counted all seed that dropped into the 120
landscape fabric or remained attached to exposed seed heads as CH and dissected all M. 121
vimineum stems to quantify CL seed production (Cheplick 2008). In late November, after all 122
seed had fallen from the plants, aboveground biomass was harvested, dried at 60 °C to constant 123
mass, and weighed. 124
125
Field introduction experiment 126
To evaluate the performance of M. vimineum under natural environmental conditions, we 127
established a field introduction experiment at the Griffy Woods (39°11′17″ N, 86°30′18″ W) and 128
Bayles Road (39°13′9″ N, 86°32′29″ W) properties of the Indiana University Research and 129
Teaching Preserve in May 2009 (Flory et al. 2011). We selected 21 introduction sites across a 130
wide range of habitats including shaded bottomland forest, mowed open fields, dry forested ridge 131
tops, forested stream banks, and lowland and upland forest edge habitat (Fig. 1). Environmental 132
conditions at the sites, including light availability, soil characteristics and plant communities 133
varied widely (Flory et al. 2011). No introduction sites had invasive M. vimineum in the 134
immediate area but populations of M. vimineum generally occurred within one km. 135
In April 2009, we germinated seed and grew seedlings in the Indiana University 136
greenhouses from 10 M. vimineum populations collected from 10 different states in the invasive 137
U.S. range (Flory et al. 2011). We grew seedlings for four weeks in individual cell packs. Then, 138
8
we transplanted two seedlings from each of the 10 populations into each of the 21 experimental 139
introduction sites on a random 5 x 8 grid with 0.5 m spacing. At planting time, the transplanted 140
seedlings were 5-10 cm tall and were approximately uniform in height across populations. We 141
assumed that seedlings that died within one week (~15/440; < 4%) had died from transplant 142
shock and we therefore replaced them. Although seed germination and seedling establishment is 143
often an important key life history transition (Moles and Westoby 2004), we planted seedlings to 144
prevent accidental release of novel genotypes into natural areas. 145
In late August 2009, three months after planting, all plants were harvested at ground 146
level, dried at 60 °C to constant mass, and weighed. Belowground biomass was not harvested 147
because at most common gardens it was not possible to separate roots of M. vimineum from co-148
occurring resident species. Plants were deliberately harvested before flowering and seed set to 149
prevent genetic mixing and escape into surrounding habitats. 150
151
Modeling 152
Overall, the aim of our modeling was to combine the size-dependent fecundity 153
relationship that we quantified in the mesocosm experiment, with plant growth data across 154
realistic environmental gradients from the field introduction. We use BHMs to reconcile these 155
data and to generate predictions for seed production of individual plants in unobserved sites that 156
span the range of 1-100% available light. These predictions account for uncertainty in all parts of 157
the model and are useful for answering ecological questions about which factors control M. 158
vimineum distribution across its introduced range. 159
9
Mesocosm experiment: To quantify the effects of light on M. vimineum growth in the 160
mesocosm experiment we assumed that light intensity, L, had a saturating effect on biomass 161
production, B, and used these data to parameterize a Michaelis-Menton (MM) function: 162
(1) B = Bmax*L/(h + L) 163
This function has two parameters, an asymptotic biomass (Bmax) and a half-saturation constant 164
(h), which represents the light intensity that yields half of Bmax. We treated biomass as a gamma-165
distributed random variable because biomass data are positive-only, often right-skewed, and 166
have variance proportional to the mean (Bolker 2008, Gelman et al. 2013). Thus, for the ith 167
observation, Yi, we parameterized the gamma distribution in terms of a mean (u) and a dispersion 168
constant (k): 169
(2) Yi ~ Gamma(ui, k) 170
(3) ui = (Bmax*Li)/(h + Li) 171
where Li is the light level associated with the ith biomass observation. We sampled a 95% 172
credible interval from the posterior predictive distribution for biomass and compared it to the 173
field introduction data to assess whether the results of the mesocosm could predict biomass in the 174
field introductions. 175
176
Next, to understand the size-dependent reproduction of M. vimineum, we modeled total 177
(CH + CL) seed production as a simple power law function of biomass with a negative binomial 178
error distribution: 179
(4) Si ~ Neg.Bin.(ui, k) 180
(5) ui = c*Xid 181
10
where Si is the estimated number of seeds produced per individual, and Xi is the biomass of the ith 182
individual, and the parameters to be estimated are the constant c, the power-law exponent d, and 183
the over-dispersion parameter k. 184
Field introduction experiment: Next, we built a multilevel model to predict biomass 185
production as a function of light while accounting for variance within and across sites for the 186
field introduction experiment. Because we measured environmental covariates at the site level, 187
we fit the relationship to light at the site level and modeled individual biomass observations as 188
gamma distributed within a site. The basic model is: 189
(6) Yij ~ Gamma(uj1, k1) 190
(7) uj1 ~ Gamma(uj2, k2) 191
(8) uj2 = Bmax*Lj/(h + Lj) 192
where i = number of individual biomass observations (n = 381), and j = number of sites (n = 21), 193
Yij is the ith biomass observation in site j, and Lj represents the jth value of available light. The 194
parameters to be estimated are: Bmax, h, k1 and k2. In essence, this model estimates the parameters 195
of the MM function globally, with individuals nested within sites treated as random effects. 196
Because we had already parameterized the MM function with the mesocosm data, we used the 197
posterior sampling distribution of the MM parameters from the mesocosm model to set 198
informative priors for the multi-level biomass model. We then compared the posterior sampling 199
distribution of MM parameters from the multi-level model to their priors in order to assess the 200
information contributed by the field introduction data. In order to analyze the relative importance 201
of among-site variability compared to within-site variability, we examined the 95% credible 202
intervals of the site-level dispersion constant k2 and individual-level dispersion constant k1. 203
Specifically, we created a test statistic (ktest = k2/k1) in order to evaluate the probability that the 204
11
site-level dispersion term was higher than the individual level dispersion term. Given the 205
estimates of all parameters from our data, we predicted the biomass of unobserved individuals 206
nested within 100 unobserved sites, ranging from 1% to 100% available light. We then sampled 207
the posterior distribution for each of these predictions in order to define the probability contour 208
graphs in Fig. 3a. 209
In order to verify that the fit of our hierarchical biomass model to data (Fig. 3a) was not 210
overly sensitive to the inclusion of data from any particular site, we generated a cross-validated 211
posterior predictive distribution (see Appendix D, Fig. D1). Since we modeled biomass 212
observations as exchangeable within a site (given a known light level), we chose to hold-out 213
entire sites and then predict their mean biomass with a model absent their data, a form of leave-214
one-out cross-validation modified for a hierarchical model such as ours (see also Marshall and 215
Spiegelhalter 2003). We compared the mean squared error (MSE) for the cross-validated 216
predictive distribution to the regular posterior predictive distribution in order to assess sensitivity 217
to inclusion of the site being predicted. 218
Finally, we integrated the hierarchical biomass and size-fecundity seed models to predict 219
seed production as a function of light in the field. The full model takes the biomass predictions 220
(from equation (6)) for an individual nested within a site of a given light level (Yij ) (Fig. 3a) and 221
propagates them together with the seed production model, equations (4) and (5) (Fig. 3b), in 222
order to generate posterior predictions for seed production as a function of available light (Fig. 223
3c). Because BHMs estimate all parameters conditional upon one another, these predictions 224
correctly propagate the uncertainty in both parts of the model (Cressie et al. 2009). Therefore, the 225
posterior predictive intervals represent the probabilities that the seed production of a specific 226
individual plant growing within a site of a certain light level will fall within a given range. In 227
12
particular, we analyzed the posterior seed production predictions at sites of very low light (2%, 228
5%, and 10% available light) compared to a site of high available light (80%). We calculated the 229
probability than an individual plant’s seed production would exceed 10 or 40 seeds, a range in 230
which prior work has shown that propagule pressure overwhelms forest litter-layer resistance to 231
seedling recruitment (Warren et al. 2012). 232
We analyzed all models in R (R version 3.0.2, R Core Development Team) and JAGS 233
(version 3.4.0) using the “R2JAGS” package. We fitted each model via Gibbs Sampling Markov-234
Chain Monte Carlo (MCMC) methods. In each case, we ran 5 independent Markov chains for 235
12,000 iterations, with a 2,000 iteration burn-in period and then thinned by 50 iterations, used the 236
r-hat < 1.1. criterion to assess MCMC convergence, and visually checked for well-blended 237
chains (Gelman et al. 2013). 238
239
RESULTS 240
In both experiments, biomass increased nonlinearly with light availability such that 241
above ~50% light availability further increases in light are not associated consistently with 242
greater biomass production. In the mesocosm biomass model, the posterior distribution of the 243
asymptote parameter is 18.2 ± 2.3 g (mean ± sd dry weight) and the half-saturation constant 244
posterior is 40.4 ± 10.5% available light (Table 1). As expected, seed production is highly 245
correlated with biomass (Fig. 3b). At peak biomass observed in the mesocosm experiment (~ 30 246
g), seed production per plant is between 6000-7000. 247
On its own, the results of the mesocosm experiment do not adequately predict biomass in 248
the field introductions. Several sites fall outside the 95% posterior predictive interval altogether 249
and many more have at least several points outside the interval (Fig. 2b). Thus, among-site 250
13
variation, which the mesocosm experiment cannot assess, clearly influences M. vimineum 251
performance. 252
The hierarchical biomass model, which explicitly accounts for among-site variation, fit 253
the field introduction data much better (Fig. 3a). In addition, the posterior sampling distributions 254
of the MM parameters have less uncertainty than in the mesocosm model (Table 1). Thus, the 255
field introduction data provide additional information for understanding the underlying biomass-256
light availability relationship, in addition to accounting for among-site variability. Moreover, the 257
cross-validated MSE (29.21) was very close to the standard posterior predictive distribution 258
MSE (28.61), confirming that our model is not sensitive to the inclusion/exclusion of the 259
particular site being predicted (see Appendix D, Fig. D1) 260
Our analysis suggests that among-site variability in M. vimineum biomass production is at 261
least as large as within-site variability. The 95% credible interval of the site-level dispersion 262
constant k2 is [1.472; 5.989] and the individual-level dispersion constant k1 is [1.61; 2.23]. Our 263
test statistic result was that (p[ktest>1]) = 0.878 (appendix A, Fig. A1). Thus, averaging over the 264
uncertainty in both estimates (Gelman et al. 2013) there is an 87.8% probability that the site-265
level dispersion exceeds the individual-level dispersion. This result is consistent with our 266
observation that incorporating both levels of variance into our predictions (via the hierarchical 267
model structure) significantly enhanced model fit to real data (Fig. 3a, compared to Fig. 2b). 268
The hierarchical fecundity model predicts that, in the field, seed production will have a 269
similar saturating response to light availability as does biomass (Fig. 3c). In full sun, mean 270
fecundity exceeds 2000 seeds/plant. At all levels, the distribution of predictions has a positive 271
skew. Therefore, even at low light levels there is a small probability of large seed production that 272
significantly exceeds the mean. 273
14
Our predictions of seed production are qualitatively distinct in sites of very low versus 274
high light (Fig. 4 and Table 2). At low light levels, there is a significant probability that an 275
individual plant will not produce seed. However, even at 2% light, there is a 2% probability that 276
seed production will exceed 2000. Although there is a decent probability that an individual plant 277
in a site with 2% light availability will not produce seeds (0.673), this probability is reduced by 278
50% when light availability is at 5% (0.366) (Table 2). 279
280
DISCUSSION 281
To predict invasive plant range expansion and performance across variable habitats it is 282
necessary to understand how plant demographic processes respond to environmental factors at 283
various spatial scales (Diez and Pulliam 2007). Here we linked light availability across field sites 284
to plant productivity in order to predict a specific demographic process (seed production) that is 285
otherwise difficult to measure in the field, particularly for invasive species. First we used a 286
manipulative experiment to understand the fundamental plant response to an abiotic driver 287
(Chase and Leibold 2003). Then, to generate predictions that scale across habitats and account 288
for environmental heterogeneity, we extended this mechanistic insight to large spatial scales by 289
combining our mesocosm experiment with field introduction data using a BHM framework. 290
Our results indicate that M. vimineum seed production in the field is strongly linked to light 291
availability, despite including uncertainty from both individual and site-level variability. 292
However, even in low light habitats, our model suggests that seed production is likely high 293
enough to overwhelm establishment resistance (Warren et al. 2012). This finding suggests that 294
the distribution of our model species within its geographic range is not likely to be limited by 295
light availability alone, although its abundance will certainly be greater in higher light habitats 296
15
(e.g. open fields) than in deeply shaded forest interiors (Cole and Weltzin 2005, Flory 2010). For 297
sites with intermediate or high light availability (30 – 100% light), our model predicts that 298
individuals can produce approximately 2000 seeds on average. Germination of M. vimineum 299
seeds is consistently high at all light levels, and seed viability from plants grown in various light 300
environments is similarly high (as seen in Appendix C, Fig.C1 and Fig. C2) and we expect seed 301
production to translate relatively consistently into germinating seedlings (Warren et al. 2011). 302
Therefore, we infer that range expansion of M. vimineum is limited by seed arrival, because 303
newly established stands are unlikely to fail due to inadequate local seed production and 304
propagule pressure. 305
Our model suggests that the variability among experimental introduction sites was at least as 306
important, and likely more so, than variability within sites (prob = 0.878). Thus, when scaling 307
across sites, the relationship between focal abiotic factors and demographic processes is 308
complicated by numerous environmental factors. Accounting for the resulting variability is 309
central to the task of predicting distribution in the field, not merely a nuisance factor. For 310
instance, ignoring the stochasticity introduced by these additional factors and their interactions, 311
equivalent to fixing the among-site variability in the model at zero, would lead to artificially 312
precise prediction intervals. Therefore, by including among-site variability in our hierarchical 313
model, estimated from our heterogeneous field data, we can scale-up demographic predictions 314
from mesocosm experiments to the field. 315
Another approach would have been to include additional environmental covariates, alongside 316
available light, to try and account for some of the among-site variance. For instance, soil 317
resource supply is a major determinant of plant biomass production. Therefore, we recommend 318
that future mesocosm experiments include a cross-factorial manipulation of an integrative soil 319
16
variable (e.g., average soil moisture supply) alongside available light in order to improve 320
predictive accuracy and biological realism (Tilman 1982). Using a BHM framework, there are 321
several ways that a soil variable could be integrated into the mechanistic function for predicting 322
biomass as a function of light. For instance, the asymptotic biomass (Bmax) could be allowed to 323
vary for each site based on a regression with soil moisture supply, using informative priors from 324
the cross-factorial manipulation. Such an expansion of the experimental design and model 325
framework would account for more of the among-site variance than our present single-factor 326
model, but at the expense of increased logistical and model complexity. 327
Validation of our model predictions is complicated by the main consideration that motivated 328
model development in the first place: it is logistically and ethically challenging to accurately 329
quantify seed production in the field. Nonetheless, sub-components of the model can be 330
validated with future data collection, or by approximation with procedures such as cross-331
validation. Although leave-one-out cross-validation is often computationally expensive, in 332
hierarchical models one option is to iteratively hold out and predict groups rather than individual 333
data-points (Marshall and Spiegelhalter 2003). Our cross-validated predictive distribution for site 334
biomass means (as seen in Appendix D, Fig. D1) is basically the same as our regular posterior 335
predictive distribution (Fig. 3b): half of our observed site means fall outside the cross-validated 336
50% predictive confidence interval, and one observed mean fell outside the cross-validated 95% 337
interval, which are about the expected numbers for each interval. 338
One limitation of our model is the need to extrapolate the performance of plants transplanted 339
individually to natural stand dynamics in which density-dependent effects operate. For instance, 340
previous work has established that as M. vimineum stand density increases, individual biomass 341
decreases, although stand-level biomass remains stable or even increases slightly at higher 342
17
densities (Cheplick and Fox 2011). However, we argue that our individual-level predictions are 343
ideal for addressing the spread and establishment phases of the invasion process, rather than for 344
providing insight into dense, long-established stands. Moreover, a distinct advantage of 345
generating clear-cut individual-level predictions is that we can infer a role for micro-site 346
variability, such as canopy variability in forest interiors, in maintaining local populations. 347
Although our model indicates that small populations may not persist in very low light forest 348
interiors (e.g. < 2-5% light), there is always a small probability of high seed producing 349
individuals within an established stand. Generalizing to forest interiors, we speculate that M. 350
vimineum is likely exploiting light gaps and microsite variability in light availability (Horton and 351
Neufeld 1998). Interestingly, our results suggest that micro-site variability in light may influence 352
whether M. vimineum is limited by seed arrival (seed limitation) or growth and survival 353
(establishment limitation) after seeds have arrived (Clark et al. 2007). Given the differences in 354
predictions between 2%, 5% and 10% light, it seems likely that a few plants may be capable of 355
subsidizing seed production for an entire stand where many individuals produce little if any seed. 356
Conversely, given the dramatic increases in seed production with increasing light, our models 357
suggest that disturbances that increase light availability in forest understories (e.g. timber 358
harvests or natural blowdowns) are likely to strongly promote M. vimineum abundance. 359
Our modeling framework has direct applications to analyzing invasive and native plant 360
species range expansion and performance. For instance, Harris et al. (2011) fit empirical 361
relationships between fecundity and habitat type (varying in type and cover of canopy) for 362
Rhododendron ponticum, and used these estimates of fecundity in an individual-based model to 363
predict speed of range expansion. BHMs help generalize this approach because they facilitate 364
robust predictions for species whose seed production, unlike R. ponticum, is difficult to measure 365
18
in the field. Moreover, as Kearney and Porter (2009) point out, correlative approaches linking 366
habitat variables and demographic processes are inappropriate for invasive species undergoing 367
range expansion because the species’ distribution is far from equilibrium. For invasive species, it 368
may be especially important to experimentally manipulate abiotic factors in order to understand 369
the fundamental plant responses to these drivers. Our modeling framework provides a way to 370
scale responses to experimental manipulations to the field. 371
The approach we implemented here could be applied more broadly to analyzing the 372
ecological niche of plant species. For instance, in analyzing the large-scale distributional limits 373
of a species based on temperature, one could parameterize a thermal response curve under 374
controlled conditions, and then use those estimates as informative priors to fit field data with a 375
hierarchical Bayesian model for field observations. We strongly advocate the use of mechanistic 376
functions, that describe process-based relationships between abiotic factors and biotic processes 377
(Kearney et al. 2008). This should help minimize predictive errors that arise when predicting 378
species’ distribution with correlative linear models that often include arbitrary polynomial terms 379
(Bahn and McGill 2013). In general, a hierarchical model structure will facilitate a flexible 380
analysis of variability across many levels of biological organization including individuals, sites, 381
and regions. Thus, such models are better suited to predict across a geographic range, or when 382
extrapolated into new ranges. Moreover, the posterior predictive distributions from BHMs 383
automatically and efficiently incorporate estimation uncertainty from all parts of the model 384
(Cressie et al. 2009, Gelman et al. 2013), eliminating the need to devise potentially complicated 385
bootstrapping algorithms to quantify predictive uncertainty. 386
In conclusion, we scaled up predictions of a key plant demographic process (seed 387
production) by combining two datasets—a mesocosm experiment with manipulated light 388
19
availability where we could accurately measure seed production, and a field introduction 389
experiment that captured the complex interactions of abiotic variables in the field, thereby adding 390
ecological realism. Our use of experimental data to parameterize non-linear functions that reflect 391
biological mechanisms should minimize the cause-effect errors which plague correlative 392
distribution approaches (Kearney and Porter 2009, Bahn and McGill 2013). Thus, we do not 393
expect our model predictions to disintegrate when extrapolated to new geographic ranges (Bahn 394
and McGill 2013), but there is the possibility that the geographic location of our experiments 395
(Indiana, USA) has introduced subtle limitations to our predictive power. Overall, we suggest 396
that the influence of abiotic factors on demographic parameters be studied both in logistically 397
manageable manipulative experiments that clarify underlying relationships, and in field studies 398
that realistically account for natural variability and interactions of abiotic variables. BHM 399
models offer wide latitude for combining these sources of information to make intuitive, 400
quantitative predictions. 401
402
ACKNOWLEDGEMENTS 403
We thank Jeremy Lichstein for consultation and critical feedback on our hierarchical model and 404
Keith Clay and Furong Long for assistance with the field experiment. This project was funded in 405
part by the University of Florida School of Natural Resources and the Environment. 406
407
408
LITERATURE CITED 409
Bahn, V., and B. J. McGill. 2013. Testing the predictive performance of distribution models. 410
Oikos 122:321–331. 411
20
Bolker, B. M. 2008. Ecological Models and Data in R. Princeton University Press. 412
Brook, B. W., N. S. Sodhi, and C. J. A. Bradshaw. 2008. Synergies among extinction drivers 413
under global change. Trends in Ecology & Evolution 23:453–460. 414
Caughlin, T. T., J. M. Ferguson, J. W. Lichstein, S. Bunyavejchewin, and D. J. Levey. 2014. The 415
importance of long-distance seed dispersal for the demography and distribution of a 416
canopy tree species. Ecology 95:952–962. 417
Chase, J. M., and M. A. Leibold. 2003. Ecological Niches: Linking Classical and Contemporary 418
Approaches. University of Chicago Press, Chicago, Illinois. 419
Cheplick, G. P., and J. Fox. 2011. Density-dependent growth and reproduction of Microstegium 420
vimineum in contrasting light environments1. The Journal of the Torrey Botanical 421
Society 138:62–72. 422
Clancy, K. M., and R. M. King. 1993. Defining the Western Spruce Budworm’s nutritional niche 423
with response surface methodology. Ecology 74:442–454. 424
Clark, C. J., J. R. Poulsen, D. J. Levey, and C. W. Osenberg. 2007. Are plant populations seed 425
limited? A critique and meta‐analysis of seed addition experiments. The American 426
Naturalist 170:128–142. 427
Cole, P. G., and J. F. Weltzin. 2005. Light limitation creates patchy distribution of an invasive 428
grass in eastern deciduous forests. Biological Invasions 7:477–488. 429
Cressie, N., C. A. Calder, J. S. Clark, J. M. V. Hoef, and C. K. Wikle. 2009. Accounting for 430
uncertainty in ecological analysis: the strengths and limitations of hierarchical statistical 431
modeling. Ecological Applications 19:553–570. 432
Diez, J. M., I. Giladi, R. Warren, and H. R. Pulliam. 2014. Probabilistic and spatially variable 433
niches inferred from demography. Journal of Ecology 102:544–554. 434
21
Diez, J. M., and H. R. Pulliam. 2007. Hierarchical analysis of species distributions and 435
abundance across environmental gradients. Ecology 88:3144–3152. 436
Evans, M. E. K., K. E. Holsinger, and E. S. Menges. 2010. Fire, vital rates, and population 437
viability: a hierarchical Bayesian analysis of the endangered Florida scrub mint. 438
Ecological Monographs 80:627–649. 439
Fairbrothers, D. E., and J. R. Gray. 1972. Microstegium vimineum (Trin.) A. Camus 440
(Gramineae) in the United States. Bulletin of the Torrey Botanical Club 99:97–100. 441
Flory, S. L., F. Long, and K. Clay. 2011. Greater performance of introduced vs. native range 442
populations of Microstegium vimineum across different light environments. Basic and 443
Applied Ecology 12:350–359. 444
Gallien, L., T. Münkemüller, C. H. Albert, I. Boulangeat, and W. Thuiller. 2010. Predicting 445
potential distributions of invasive species: where to go from here? Diversity and 446
Distributions 16:331–342. 447
Gelman, A., J. B. Carlin, H. S. Stern, and D. B. Rubin. 2013. Bayesian Data Analysis, Third 448
Edition. CRC Press, Boca Raton, Florida. 449
Gelman, A., and J. Hill. 2007. Data Analysis Using Regression and Multilevel/Hierarchical 450
Models. Cambridge University Press, Cambridge, England. 451
Guisan, A., and W. Thuiller. 2005. Predicting species distribution: offering more than simple 452
habitat models. Ecology Letters 8:993–1009. 453
Harris, C. M., H. L. Stanford, C. Edwards, J. M. J. Travis, and K. J. Park. 2011. Integrating 454
demographic data and a mechanistic dispersal model to predict invasion spread of 455
Rhododendron ponticum in different habitats. Ecological Informatics 6:187–195. 456
22
Holle, B. V., and D. Simberloff. 2005. Ecological resistance to biological invasion overwhelmed 457
by propagule pressure. Ecology 86:3212–3218. 458
Hooper, H. L., R. Connon, A. Callaghan, G. Fryer, S. Yarwood-Buchanan, J. Biggs, S. J. Maund, 459
T. H. Hutchinson, and R. M. Sibly. 2008. The ecological niche of Daphnia magna 460
characterized using population growth rate. Ecology 89:1015–1022. 461
Horton, J. L., and H. S. Neufeld. 1998. Photosynthetic responses of Microstegium vimineum 462
(Trin.) A. Camus, a shade-tolerant, C4 grass, to variable light environments. Oecologia 463
114:11–19. 464
Kearney, M., B. L. Phillips, C. R. Tracy, K. A. Christian, G. Betts, and W. P. Porter. 2008. 465
Modelling species distributions without using species distributions: the cane toad in 466
Australia under current and future climates. Ecography 31:423–434. 467
Kearney, M., and W. Porter. 2009. Mechanistic niche modelling: combining physiological and 468
spatial data to predict species’ ranges. Ecology Letters 12:334–350. 469
Levin, S. A. 1992. The problem of pattern and scale in Ecology: the Robert H. MacArthur award 470
lecture. Ecology 73:1943–1967. 471
Marshall, E. C., and D. J. Spiegelhalter. 2003. Approximate cross-validatory predictive checks in 472
disease mapping models. Statistics in Medicine 22:1649–1660. 473
Merow, C., J. P. Dahlgren, C. J. E. Metcalf, D. Z. Childs, M. E. K. Evans, E. Jongejans, S. 474
Record, M. Rees, R. Salguero-Gómez, and S. M. McMahon. 2014. Advancing population 475
ecology with integral projection models: a practical guide. Methods in Ecology and 476
Evolution 5:99–110. 477
23
Merow, C., N. LaFleur, J. A. S. Jr., A. M. Wilson, and M. Rubega. 2011. Developing dynamic 478
mechanistic species distribution models: predicting bird-mediated spread of invasive 479
plants across northeastern North America. The American Naturalist 178:30–43. 480
Moles, A. T., and M. Westoby. 2004. What do seedlings die from and what are the implications 481
for evolution of seed size? Oikos 106:193–199. 482
Ogle, K., S. Pathikonda, K. Sartor, J. W. Lichstein, J. L. D. Osnas, and S. W. Pacala. 2014. A 483
model-based meta-analysis for estimating species-specific wood density and identifying 484
potential sources of variation. Journal of Ecology 102:194–208. 485
Schurr, F. M., J. Pagel, J. S. Cabral, J. Groeneveld, O. Bykova, R. B. O’Hara, F. Hartig, W. D. 486
Kissling, H. P. Linder, G. F. Midgley, B. Schröder, A. Singer, and N. E. Zimmermann. 487
2012. How to understand species’ niches and range dynamics: a demographic research 488
agenda for biogeography. Journal of Biogeography 39:2146–2162. 489
Tilman, D. 1982. Resource Competition and Community Structure. Princeton University Press, 490
Princeton, New Jersey. 491
Warren, R. J., V. Bahn, and M. A. Bradford. 2012. The interaction between propagule pressure, 492
habitat suitability and density-dependent reproduction in species invasion. Oikos 493
121:874–881. 494
Warren, R. J., V. Bahn, T. D. Kramer, Y. Tang, and M. A. Bradford. 2011. Performance and 495
reproduction of an exotic invader across temperate forest gradients. Ecosphere 2:art14. 496
497 498
Supplemental Material 499
Appendix A 500
24
Posterior sampling distribution of the ratio of individual and site-level dispersion constants for 501
hierarchical biomass model (Ecological Archives XXXXX) 502
503
Appendix B 504
Simulation test of the gamma-gamma hierarchical biomass model (Ecological Archives XXXX) 505
506
Appendix C 507
Data on germination of seeds at different light levels and collected from plants grown at different 508
light levels (Ecological Archives XXXXX) 509
510
Appendix D 511
Cross-validated posterior predictive distribution for the gamma-gamma hierarchical biomass 512
model (Ecological Archives XXXXX) 513
514
Appendix E 515
Details on model development, particularly working with the gamma and negative binomial 516
parameterizations in JAGS, plus pseudo-code for generating cross-validated posterior predictions 517
(Ecological Archives XXXXX) 518
Supplement 519
R script files and data from both the field introduction and the mesocosm experiment to execute 520
the models described in the manuscript and in Appendix E. 521
522
523
25
Table 1. MM parameter estimates from sampling distributions of mesocosm model (used as 524
informative priors in the hierarchical model) and field introduction model (the posterior sampling 525
distribution). 526
527
Parameter Prior
Posterior
Mean Precision Mean Precision
Asymptote 18.2 1/(2.32) 17.4 1/(1.92)
Half Saturation 40.4 1/(10.52) 41.6 1/(8.72)
528
529
530
531
532
533
534
535
536
537
538
539
540
541
26
Table 2. Probability of seed production of a single plant equaling zero seeds, greater than 10 542
seeds, or greater than 40 seeds. These are calculated by sampling the posterior predictive 543
distribution of the hierarchical fecundity model (Fig 3). 544
545
546
% Light P[seeds = 0] P[seeds>10] P[seeds>40]
2% 0.683 0.283 0.248
5% 0.403 0.542 0.478
10% 0.203 0.745 0.691
80% 0.006 0.990 0.985
547
548
549
550
551
552
553
554
555 556 557 558
27
Figure Legends 559
560
Figure 1. Examples of sites used in the field introduction experiment, including full sun forest 561
opening (a) and partially (b) and densely (c) shaded forest understories. 562
563
Figure 2. Posterior predictive checking of mesocosm model to mesocosm data (a) and field 564
introduction data (b). Blue dashed lines are probability contours with the top and bottom lines 565
enclosing a 96% credible set. Solid red lines represent mean predictions for individuals at a given 566
light level. 567
568
Figure 3: Posterior predictive checking of hierarchical fecundity model including predictions 569
from the biomass part of the model against the raw field introduction data (a) and the seed 570
production part of the model against the seed production versus biomass data from the mesocosm 571
experiment (b). The distribution of seed production predictions for unobserved sites with a given 572
light level is plotted in (c). 573
574
Figure 4: Probability density histograms of predicted seed production from the hierarchical 575
fecundity model at four specific light levels: 2%, 5%, 10%, and 80% based on sampling the 576
posterior predictive distribution for seed production (i.e. Fig. 3c). 577
578
579
580
581 582
583 584
585
Figure 1
28
29
Figure 2 586
587
30
Figure 3 588
589
31
Figure 4 590
591