Running Title: Predicting plant performance with Bayesian models

Combining mesocosm and field experiments to predict invasive plant

performance: A hierarchical Bayesian approach

CHRIS WILSON1,5

T. TREVOR CAUGHLIN2

DAVID J. CIVITELLO3

S. LUKE FLORY4

Statement of authorship: SLF and DJC designed experiments and collected data, CW and

TTC performed modeling work, CW wrote first draft of manuscript, all contributed to

writing and revising.

1School of Natural Resources and Environment, University of Florida, Gainesville, FL

32611

2Department of Biology, University of Florida, Gainesville, FL, 32611

3Department of Integrative Biology, University of South Florida, 4202 E Fowler Ave,

Tampa, FL 33620

4Agronomy Department, University of Florida, Gainesville, FL 32611

5 Corresponding author

CORRESPONDENCE: Chris Wilson, 352-519-7653 (cell), [email protected]

Abstract

Invasive plant fecundity underlies propagule pressure and ultimately range expansion. Predicting fecundity across large spatial scales, from regions to landscapes, is critical for understanding invasion dynamics and optimizing management. However, to accurately predict fecundity and other demographic processes, improved models that scale individual plant responses to abiotic drivers across heterogeneous environments are needed. Here we combine two experimental datasets to predict fecundity of a widespread and problematic invasive grass over large spatial scales. First, we analyzed seed production as a function of plant biomass in a small-scale mesocosm experiment with manipulated light levels. Then, in a field introduction experiment, we tracked plant performance across 21 common garden sites that differed widely in available light and other factors. We jointly analyzed these data using a Bayesian Hierarchical Model (BHM) framework to predict fecundity as a function of light in the field. Our analysis reveals that the invasive species is likely to produce sufficient seed to overwhelm establishment resistance, even in deeply shaded environments, and is likely seed-limited across much of its range. Finally, we extend this framework to address the general problem of how to scale up plant demographic processes and analyze the factors that control plant distribution and abundance at large scales.

Key words: Species distribution models, fecundity, seed production, ecological niche, hierarchical Bayes, invasive species, mechanistic models, scaling-up, Microstegium vimineum, Stiltgrass, plant demography

Introduction

Fecundity is a key demographic parameter underlying propagule pressure (Holle and Simberloff 2005) and, along with dispersal, is critical for establishment success and range expansion. Given the interactive threats to biodiversity from climate change, habitat loss, and invasive species (Brook et al. 2008), there is growing urgency to predict the demographic processes underlying species distributions, including fecundity. One current approach to modeling plant range expansion is to estimate the relationship between plant size and demographic parameters, such as fecundity, and then link plant size to environmental covariates via regression (Diez et al. 2014, Merow et al. 2014). On its own, this correlative approach does not explicitly account for the mechanisms linking abiotic factors to plant biomass production, and therefore may be vulnerable to cause-effect errors when attempting to predict fecundity across diverse habitats that encompass significant variability in abiotic resources. Moreover, directly measuring fecundity in the field is often logistically difficult and ethically questionable, particularly for non-native invasive plant species (Flory et al. 2011). Thus, in order to model invasive plant range expansion or success across heterogeneous environments, we propose an approach to modeling seed production based on an abiotic factor that is manipulated in controlled experiments and readily measured in the field.

Many abiotic and biotic factors interact to determine demographic processes such as seed production, making it difficult to disentangle the relationship between any given abiotic factor and an observed species’ distribution pattern. Additionally, predictive models must account for environmental heterogeneity in order to scale across habitats. Previous approaches to predicting species distributions as a function of abiotic factors have included controlled manipulative experiments to delineate the “fundamental niche” of a species (Chase and Leibold 2003). However, such studies are logistically limited to few abiotic factors (Clancy and King 1993, Hooper et al. 2008) and, perhaps more importantly, they ignore (by definition) biotic interactions and processes. Thus, their predictive power in the field is limited (Schurr et al. 2012). On the other hand, correlative predictive models such as “bioclimate envelopes” (Kearney and Porter 2009) tacitly incorporate both biotic and abiotic interactions but do not explicitly model demographic processes such as seed production (Gallien et al. 2010, Merow et al. 2011). Thus, such models cannot identify which mechanisms drive patterns in distribution or abundance (Guisan and Thuiller 2005). More recent niche modeling work has used an Integral Projection Model (IPM) framework to link demographic processes and environmental covariates, while also accounting for population variability (Diez et al. 2014). Since observational correlations of abundance with environmental variables have been shown to generate unreliable predictions when extrapolated from one geographic range to another (Bahn and McGill 2013), it may be preferable to link demographic responses to abiotic factors through experimental introductions that eliminate confounding effects of dispersal and other biotic interactions. Thus, to accurately predict the performance of invasive species at particular habitats or sites, and their potential to spread at the landscape scale, a hybrid approach that combines the causal insight of manipulative approaches with the ecological realism of performance data under natural conditions is needed (Kearney and Porter 2009).

Although the need to address spatial scale has long been discussed in ecology (Levin 1992), analytical methods for combining datasets from different scales and contexts into quantitative, predictive models have been lacking. Bayesian Hierarchical Models (BHMs) can facilitate the integration of ecological datasets by offering significant flexibility in model specification (Gelman and Hill 2007). Moreover, by explicitly modeling variance at all hierarchical levels, BHM predictions incorporate uncertainty from all parts of the model, in contrast to predictive models built on point estimates (Cressie et al. 2009, Gelman et al. 2013, Ogle et al. 2014). BHMs have been used to infer seed production as a function of fire frequency (Evans et al. 2010) and forest type (Caughlin et al. 2014) using observational data and linear correlations. However, this approach could be greatly strengthened by incorporating manipulative experiments and nonlinear relationships that reflect key mechanisms driving demographic performance. BHMs provide an intuitive and useful context for this integration by: a) allowing specification of informative priors based on accurate manipulative experiments, and b) fully modeling variance at all hierarchical levels of the model (Gelman et al. 2013), thus quantifying the distribution of site-to-site variability.

Here we demonstrate how field and mesocosm datasets can be linked with BHMs to generate predictions for the field performance of an aggressive and widespread invasive plant species. Using Microstegium vimineum (Trin.) A. Camus (stiltgrass), a non-native grass distributed throughout forests of the eastern US (Fairbrothers and Gray 1972), as a model system, we combine data from a mesocosm experiment where we manipulated light availability and measured M. vimineum biomass and seed production with data from experimental introductions of M. vimineum across heterogeneous field sites, where light availability was the strongest covariate of biomass production (Flory et al. 2011). Because we measured available light at each of the field introduction sites, these two datasets are, in combination, well suited to scaling predictions of M. vimineum performance as a function of light. Therefore, we combined these data using BHMs to mechanistically link biomass production to available light in the field, while simultaneously using the estimated size-fecundity relationship to predict seed production as a function of biomass. We then used our models to predict relative performance across habitats within our model species’ geographic range to determine first, how much M. vimineum seed production varies with available light in the field, and second, whether M. vimineum seed production is likely to provide adequate propagule pressure to colonize and sustain invasions in forest interior habitats given realistic establishment resistance.

METHODS

Mesocosm experiment

To determine how light availability affects M. vimineum performance and fecundity, we established an outdoor mesocosm experiment in June 2010. We sowed twenty seeds of M. vimineum (source: Big Oaks National Wildlife Refuge, Madison, IN) into each 2 gal plastic container filled with composted greenhouse soil. Three containers were placed in each shade tent, and tents were assigned to one of eight light treatments that ranged from 1.59% to 100% of full sun (AccuPAR Linear PAR/LAI ceptometer, Decagon Devices, Inc., Pullman, Washington). Light treatments were replicated three times (24 total tents, 72 total containers). We constructed each shade tent out of polyvinyl chloride (PVC) piping, forming 1 m x 1 m cubes that we covered with 1-4 layers of various types of shade cloth to achieve a range of light conditions. Uncovered tent frames were defined as the full sun treatment and we calculated relative light availability as the ratio of light intensity in a treatment to that in the full sun treatment. We observed no herbivory on the plants during the course of the experiment. Seed germination was tabulated after three weeks and seedlings were thinned to three per container. We watered containers to capacity twice per day using automatic drip irrigation and supplied plants with a 20-20-20 (NPK) fertilizer solution once per month over the course of the experiment.

M. vimineum seed production begins in early October in southern Indiana and continues until temperatures are below freezing. Because seeds immediately drop from plants once they are mature, we lined each shade treatment structure with water-permeable landscape cloth to capture seed as it dropped. Since M. vimineum has a mixed mating system (it produces both chasmogamous (CH) and cleistogamous (CL) seed), we counted all seed that dropped into the landscape fabric or remained attached to exposed seed heads as CH and dissected all M. vimineum stems to quantify CL seed production (Cheplick 2008). In late November, after all seed had fallen from the plants, aboveground biomass was harvested, dried at 60 °C to constant mass, and weighed.

Field introduction experiment

To evaluate the performance of M. vimineum under natural environmental conditions, we established a field introduction experiment at the Griffy Woods (39°11′17″ N, 86°30′18″ W) and Bayles Road (39°13′9″ N, 86°32′29″ W) properties of the Indiana University Research and Teaching Preserve in May 2009 (Flory et al. 2011). We selected 21 introduction sites across a wide range of habitats including shaded bottomland forest, mowed open fields, dry forested ridge tops, forested stream banks, and lowland and upland forest edge habitat (Fig. 1). Environmental conditions at the sites, including light availability, soil characteristics, and plant communities, varied widely (Flory et al. 2011). No introduction sites had invasive M. vimineum in the immediate area but populations of M. vimineum generally occurred within one km.

In April 2009, we germinated seed and grew seedlings in the Indiana University greenhouses from 10 M. vimineum populations collected from 10 different states in the invasive U.S. range (Flory et al. 2011). We grew seedlings for four weeks in individual cell packs. Then, we transplanted two seedlings from each of the 10 populations into each of the 21 experimental introduction sites on a random 5 x 8 grid with 0.5 m spacing. At planting time, the transplanted seedlings were 5-10 cm tall and were approximately uniform in height across populations. We assumed that seedlings that died within one week (~15/440; < 4%) had died from transplant shock and we therefore replaced them. Although seed germination and seedling establishment are often key life history transitions (Moles and Westoby 2004), we planted seedlings rather than sowing seed to prevent accidental release of novel genotypes into natural areas.

In late August 2009, three months after planting, all plants were harvested at ground level, dried at 60 °C to constant mass, and weighed. Belowground biomass was not harvested because at most common gardens it was not possible to separate roots of M. vimineum from co-occurring resident species. Plants were deliberately harvested before flowering and seed set to prevent genetic mixing and escape into surrounding habitats.

Modeling

Overall, the aim of our modeling was to combine the size-dependent fecundity relationship that we quantified in the mesocosm experiment with plant growth data across realistic environmental gradients from the field introduction. We used BHMs to reconcile these data and to generate predictions for seed production of individual plants in unobserved sites that span the range of 1-100% available light. These predictions account for uncertainty in all parts of the model and are useful for answering ecological questions about which factors control M. vimineum distribution across its introduced range.

Mesocosm experiment: To quantify the effects of light on M. vimineum growth in the mesocosm experiment we assumed that light intensity, L, had a saturating effect on biomass production, B, and used these data to parameterize a Michaelis-Menten (MM) function:

(1) $B = B_{max} L / (h + L)$

This function has two parameters, an asymptotic biomass (Bmax) and a half-saturation constant (h), which represents the light intensity that yields half of Bmax. We treated biomass as a gamma-distributed random variable because biomass data are positive-only, often right-skewed, and have variance that increases with the mean (Bolker 2008, Gelman et al. 2013). Thus, for the ith observation, Yi, we parameterized the gamma distribution in terms of a mean (u) and a dispersion constant (k):

(2) $Y_i \sim \mathrm{Gamma}(u_i, k)$

(3) $u_i = B_{max} L_i / (h + L_i)$

where Li is the light level associated with the ith biomass observation. We sampled a 95% credible interval from the posterior predictive distribution for biomass and compared it to the field introduction data to assess whether the results of the mesocosm could predict biomass in the field introductions.
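As a rough illustration of equations (1)-(3), the following sketch evaluates the Michaelis-Menten mean and draws gamma-distributed biomass around it. It assumes that k acts as the gamma shape parameter (so the scale is u/k and the variance is u^2/k); this is one common mean-dispersion convention and may not match the parameterization used in the fitted models. The function names and the value k = 2.0 are illustrative only.

```python
import random


def michaelis_menten(light, b_max, h):
    """Mean biomass from equation (1): B = Bmax * L / (h + L)."""
    return b_max * light / (h + light)


def gamma_shape_scale(mean, k):
    """Convert (mean, dispersion k) to (shape, scale).

    Assumption: k is the gamma shape, so scale = mean / k and
    variance = mean**2 / k.
    """
    return k, mean / k


def simulate_biomass(light, b_max, h, k, rng):
    """Draw one biomass value Y_i ~ Gamma(u_i, k), equations (2)-(3)."""
    shape, scale = gamma_shape_scale(michaelis_menten(light, b_max, h), k)
    return rng.gammavariate(shape, scale)


# Illustrative inputs: Bmax = 18.2 g and h = 40.4% are the mesocosm
# posterior means reported in Table 1; k = 2.0 is a made-up value.
rng = random.Random(1)
draws = [simulate_biomass(40.4, 18.2, 40.4, 2.0, rng) for _ in range(20000)]
mean_draw = sum(draws) / len(draws)
```

At a light level equal to the half-saturation constant, the mean function returns half of Bmax, which is a quick sanity check on any implementation of equation (1).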

Next, to understand the size-dependent reproduction of M. vimineum, we modeled total (CH + CL) seed production as a simple power law function of biomass with a negative binomial error distribution:

(4) $S_i \sim \mathrm{NegBin}(u_i, k)$

(5) $u_i = c X_i^{d}$

where Si is the estimated number of seeds produced per individual, Xi is the biomass of the ith individual, and the parameters to be estimated are the constant c, the power-law exponent d, and the over-dispersion parameter k.
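Equations (4)-(5) can be sketched as follows. The code assumes the mean-dispersion form of the negative binomial that is common in ecology (variance = u + u^2/k) and computes probabilities by recursion from P(S = 0). The parameter values are hypothetical and chosen only to exercise the functions; they are not estimates from the mesocosm data.

```python
def seed_mean(biomass, c, d):
    """Expected seed number from equation (5): u = c * X**d."""
    return c * biomass ** d


def negbin_pmf(s, mean, k):
    """P(S = s) for a negative binomial with mean `mean` and dispersion k.

    Assumption: variance = mean + mean**2 / k. The probability is built
    up by recursion starting from P(S = 0) = (k / (k + mean))**k.
    """
    if mean <= 0:
        return 1.0 if s == 0 else 0.0
    p = k / (k + mean)
    prob = p ** k
    for j in range(s):
        prob *= (j + k) / (j + 1) * (1.0 - p)
    return prob


# Hypothetical parameter values (not fitted estimates).
c, d, k = 50.0, 1.5, 1.2
u = seed_mean(4.0, c, d)        # expected seeds for a 4 g plant
p_zero = negbin_pmf(0, u, k)    # chance that such a plant sets no seed
```

Because k controls over-dispersion, small values of k leave appreciable probability at zero seeds even when the expected seed number is large.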

Field introduction experiment: Next, we built a multilevel model to predict biomass production as a function of light while accounting for variance within and across sites for the field introduction experiment. Because we measured environmental covariates at the site level, we fit the relationship to light at the site level and modeled individual biomass observations as gamma distributed within a site. The basic model is:

(6) $Y_{ij} \sim \mathrm{Gamma}(u_{j1}, k_1)$

(7) $u_{j1} \sim \mathrm{Gamma}(u_{j2}, k_2)$

(8) $u_{j2} = B_{max} L_j / (h + L_j)$

where i indexes individual biomass observations (n = 381), j indexes sites (n = 21), Yij is the ith biomass observation in site j, and Lj represents the jth value of available light. The parameters to be estimated are: Bmax, h, k1 and k2. In essence, this model estimates the parameters of the MM function globally, with individuals nested within sites treated as random effects. Because we had already parameterized the MM function with the mesocosm data, we used the posterior sampling distribution of the MM parameters from the mesocosm model to set informative priors for the multi-level biomass model. We then compared the posterior sampling distribution of MM parameters from the multi-level model to their priors in order to assess the information contributed by the field introduction data. In order to analyze the relative importance of among-site variability compared to within-site variability, we examined the 95% credible intervals of the site-level dispersion constant k2 and individual-level dispersion constant k1. Specifically, we created a test statistic (ktest = k2/k1) in order to evaluate the probability that the site-level dispersion term was higher than the individual-level dispersion term. Given the estimates of all parameters from our data, we predicted the biomass of unobserved individuals nested within 100 unobserved sites, ranging from 1% to 100% available light. We then sampled the posterior distribution for each of these predictions in order to define the probability contour graphs in Fig. 3a.
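The two-level structure of equations (6)-(8) and the ktest statistic can be sketched as below. This is a forward simulation under fixed, made-up parameter values, not the fitted model: it assumes k1 and k2 act as gamma shape parameters, and the short lists of "posterior draws" are invented placeholders that stand in for MCMC output.

```python
import random


def mm_mean(light, b_max, h):
    """Equation (8): site-level expected biomass as a function of light."""
    return b_max * light / (h + light)


def gamma_draw(mean, k, rng):
    """Gamma draw with the given mean; assumes k is the shape parameter."""
    return rng.gammavariate(k, mean / k)


def simulate_site(light, n_plants, b_max, h, k1, k2, rng):
    """Simulate individual biomass within one site, equations (6)-(8)."""
    u_j2 = mm_mean(light, b_max, h)      # (8) global light response
    u_j1 = gamma_draw(u_j2, k2, rng)     # (7) realized site mean
    return [gamma_draw(u_j1, k1, rng) for _ in range(n_plants)]  # (6)


def prob_ratio_exceeds_one(k1_draws, k2_draws):
    """Share of paired draws for which ktest = k2 / k1 exceeds 1."""
    pairs = list(zip(k1_draws, k2_draws))
    return sum(1 for k1, k2 in pairs if k2 / k1 > 1.0) / len(pairs)


rng = random.Random(7)
plants = simulate_site(light=25.0, n_plants=20, b_max=18.0, h=40.0,
                       k1=1.9, k2=3.0, rng=rng)

# Invented placeholder draws; real use would pass paired MCMC samples.
k1_draws = [1.7, 1.9, 2.1, 2.0]
k2_draws = [1.5, 3.0, 4.2, 2.5]
p_ktest = prob_ratio_exceeds_one(k1_draws, k2_draws)
```

Pairing the draws from the same MCMC iteration, rather than comparing marginal summaries, is what lets the ratio average over the joint uncertainty in both dispersion terms.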

In order to verify that the fit of our hierarchical biomass model to data (Fig. 3a) was not overly sensitive to the inclusion of data from any particular site, we generated a cross-validated posterior predictive distribution (see Appendix D, Fig. D1). Since we modeled biomass observations as exchangeable within a site (given a known light level), we chose to hold out entire sites and then predict their mean biomass with a model absent their data, a form of leave-one-out cross-validation modified for a hierarchical model such as ours (see also Marshall and Spiegelhalter 2003). We compared the mean squared error (MSE) for the cross-validated predictive distribution to the regular posterior predictive distribution in order to assess sensitivity to inclusion of the site being predicted.
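The hold-out-a-site logic can be sketched as follows. To keep the example short and self-contained, the Bayesian refit is replaced by a coarse least-squares grid search over the MM parameters; that substitution is purely for illustration and is not the procedure used in this study. The site light levels and means are synthetic.

```python
def mm_mean(light, b_max, h):
    """Michaelis-Menten mean biomass at a given light level."""
    return b_max * light / (h + light)


def fit_mm_grid(lights, means, b_grid, h_grid):
    """Least-squares MM fit over a parameter grid (stand-in for the BHM refit)."""
    best = None
    for b_max in b_grid:
        for h in h_grid:
            sse = sum((m - mm_mean(l, b_max, h)) ** 2
                      for l, m in zip(lights, means))
            if best is None or sse < best[0]:
                best = (sse, b_max, h)
    return best[1], best[2]


def leave_one_site_out_mse(lights, means, b_grid, h_grid):
    """Hold out each site in turn, refit, and score the held-out site mean."""
    errors = []
    for j in range(len(lights)):
        train_l = lights[:j] + lights[j + 1:]
        train_m = means[:j] + means[j + 1:]
        b_max, h = fit_mm_grid(train_l, train_m, b_grid, h_grid)
        errors.append((means[j] - mm_mean(lights[j], b_max, h)) ** 2)
    return sum(errors) / len(errors)


# Synthetic site means lying exactly on an MM curve (not real data).
lights = [2.0, 5.0, 10.0, 25.0, 50.0, 80.0, 100.0]
means = [mm_mean(l, 18.0, 40.0) for l in lights]
b_grid = [float(b) for b in range(10, 31, 2)]
h_grid = [float(h) for h in range(10, 81, 5)]
cv_mse = leave_one_site_out_mse(lights, means, b_grid, h_grid)
```

Comparing this cross-validated MSE with the MSE from a fit to all sites gives the same kind of sensitivity check described above: a large gap would indicate that predictions depend heavily on the held-out site's own data.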

Finally, we integrated the hierarchical biomass and size-fecundity seed models to predict seed production as a function of light in the field. The full model takes the biomass predictions (from equation (6)) for an individual nested within a site of a given light level (Yij) (Fig. 3a) and propagates them together with the seed production model, equations (4) and (5) (Fig. 3b), in order to generate posterior predictions for seed production as a function of available light (Fig. 3c). Because BHMs estimate all parameters conditional upon one another, these predictions correctly propagate the uncertainty in both parts of the model (Cressie et al. 2009). Therefore, the posterior predictive intervals represent the probabilities that the seed production of a specific individual plant growing within a site of a certain light level will fall within a given range. In particular, we analyzed the posterior seed production predictions at sites of very low light (2%, 5%, and 10% available light) compared to a site of high available light (80%). We calculated the probability that an individual plant’s seed production would exceed 10 or 40 seeds, a range in which prior work has shown that propagule pressure overwhelms forest litter-layer resistance to seedling recruitment (Warren et al. 2012).
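A minimal sketch of this propagation step is given below. For each parameter draw it simulates a site mean and an individual biomass, converts biomass to an expected seed number, and then averages the exact negative binomial exceedance probability rather than simulating seed counts. The parameter dictionary is hypothetical and repeated to mimic a set of posterior draws; the gamma shape and negative binomial mean-dispersion conventions are assumptions.

```python
import random


def mm_mean(light, b_max, h):
    """Michaelis-Menten mean biomass at a given light level."""
    return b_max * light / (h + light)


def gamma_draw(mean, k, rng):
    """Gamma draw with the given mean; assumes k is the shape parameter."""
    return rng.gammavariate(k, mean / k)


def negbin_prob_exceeds(threshold, mean, k):
    """P(S > threshold) for a mean-dispersion negative binomial."""
    if mean <= 0:
        return 0.0
    p = k / (k + mean)
    pmf = p ** k          # P(S = 0)
    cdf = pmf
    for s in range(threshold):
        pmf *= (s + k) / (s + 1) * (1.0 - p)   # P(S = s + 1)
        cdf += pmf
    return max(0.0, 1.0 - cdf)


def prob_seeds_exceed(light, threshold, draws, rng):
    """Average exceedance probability over parameter draws."""
    total = 0.0
    for par in draws:
        site_mean = gamma_draw(mm_mean(light, par["b_max"], par["h"]),
                               par["k2"], rng)
        biomass = gamma_draw(site_mean, par["k1"], rng)
        seed_mean = par["c"] * biomass ** par["d"]
        total += negbin_prob_exceeds(threshold, seed_mean, par["k_seed"])
    return total / len(draws)


# Hypothetical parameter set, repeated to stand in for posterior draws.
par = {"b_max": 18.0, "h": 40.0, "k1": 1.9, "k2": 3.0,
       "c": 50.0, "d": 1.5, "k_seed": 1.2}
draws = [par] * 5000
rng = random.Random(11)
p_low = prob_seeds_exceed(2.0, 10, draws, rng)    # deeply shaded site
p_high = prob_seeds_exceed(80.0, 10, draws, rng)  # high-light site
```

In a real analysis each element of `draws` would be a distinct joint posterior sample, so that parameter uncertainty is carried through alongside site-level and individual-level variability.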

We analyzed all models in R (version 3.0.2, R Core Team) and JAGS (version 3.4.0) using the “R2jags” package. We fitted each model via Gibbs sampling Markov chain Monte Carlo (MCMC) methods. In each case, we ran 5 independent Markov chains for 12,000 iterations, with a 2,000-iteration burn-in period, and thinned by 50 iterations. We used the r-hat < 1.1 criterion to assess MCMC convergence and visually checked for well-mixed chains (Gelman et al. 2013).
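The arithmetic behind these settings, and the convergence diagnostic, can be sketched as follows. The draw count assumes that the 12,000 iterations include the burn-in. The r-hat function is the basic (non-split) Gelman-Rubin statistic written out for illustration; software packages may compute a refined variant.

```python
import math


def retained_draws(n_chains, n_iter, n_burnin, n_thin):
    """Posterior draws kept, assuming n_iter includes the burn-in."""
    return n_chains * ((n_iter - n_burnin) // n_thin)


def r_hat(chains):
    """Basic Gelman-Rubin potential scale reduction factor.

    `chains` is a list of equal-length lists of draws for one parameter.
    """
    m = len(chains)
    n = len(chains[0])
    chain_means = [sum(c) / n for c in chains]
    grand_mean = sum(chain_means) / m
    # Between-chain variance B and mean within-chain variance W.
    b = n / (m - 1) * sum((cm - grand_mean) ** 2 for cm in chain_means)
    w = sum(sum((x - cm) ** 2 for x in c) / (n - 1)
            for c, cm in zip(chains, chain_means)) / m
    var_hat = (n - 1) / n * w + b / n
    return math.sqrt(var_hat / w)


kept = retained_draws(5, 12000, 2000, 50)   # 5 chains x 200 draws each
```

Chains that have settled on the same distribution give values near 1, whereas chains stuck in different regions inflate the between-chain term and push the statistic above the 1.1 threshold.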

RESULTS

In both experiments, biomass increased nonlinearly with light availability such that above ~50% light availability further increases in light are not associated consistently with greater biomass production. In the mesocosm biomass model, the posterior distribution of the asymptote parameter is 18.2 ± 2.3 g (mean ± sd dry weight) and the half-saturation constant posterior is 40.4 ± 10.5% available light (Table 1). As expected, seed production is highly correlated with biomass (Fig. 3b). At peak biomass observed in the mesocosm experiment (~30 g), seed production per plant is between 6000 and 7000 seeds.

On their own, the results of the mesocosm experiment do not adequately predict biomass in the field introductions. Several sites fall outside the 95% posterior predictive interval altogether and many more have at least several points outside the interval (Fig. 2b). Thus, among-site variation, which the mesocosm experiment cannot assess, clearly influences M. vimineum performance.

The hierarchical biomass model, which explicitly accounts for among-site variation, fit the field introduction data much better (Fig. 3a). In addition, the posterior sampling distributions of the MM parameters have less uncertainty than in the mesocosm model (Table 1). Thus, the field introduction data provide additional information for understanding the underlying relationship between biomass and light availability, in addition to accounting for among-site variability. Moreover, the cross-validated MSE (29.21) was very close to the standard posterior predictive distribution MSE (28.61), confirming that our model is not sensitive to the inclusion/exclusion of the particular site being predicted (see Appendix D, Fig. D1).

Our analysis suggests that among-site variability in M. vimineum biomass production is at least as large as within-site variability. The 95% credible interval of the site-level dispersion constant k2 is [1.472; 5.989] and that of the individual-level dispersion constant k1 is [1.61; 2.23]. The test statistic gave p(ktest > 1) = 0.878 (Appendix A, Fig. A1). Thus, averaging over the uncertainty in both estimates (Gelman et al. 2013), there is an 87.8% probability that the site-level dispersion exceeds the individual-level dispersion. This result is consistent with our observation that incorporating both levels of variance into our predictions (via the hierarchical model structure) substantially improved model fit to real data (Fig. 3a, compared to Fig. 2b).

The hierarchical fecundity model predicts that, in the field, seed production will have a saturating response to light availability similar to that of biomass (Fig. 3c). In full sun, mean fecundity exceeds 2000 seeds/plant. At all light levels, the distribution of predictions has a positive skew. Therefore, even at low light levels there is a small probability of seed production far above the mean.

Our predictions of seed production are qualitatively distinct in sites of very low versus high light (Fig. 4 and Table 2). At low light levels, there is a substantial probability that an individual plant will not produce seed. However, even at 2% light, there is a 2% probability that seed production will exceed 2000. Although there is a substantial probability that an individual plant in a site with 2% light availability will not produce seeds (0.673), this probability is nearly halved when light availability is 5% (0.366) (Table 2).

DISCUSSION

To predict invasive plant range expansion and performance across variable habitats it is necessary to understand how plant demographic processes respond to environmental factors at various spatial scales (Diez and Pulliam 2007). Here we linked light availability across field sites to plant productivity in order to predict a specific demographic process (seed production) that is otherwise difficult to measure in the field, particularly for invasive species. First we used a manipulative experiment to understand the fundamental plant response to an abiotic driver (Chase and Leibold 2003). Then, to generate predictions that scale across habitats and account for environmental heterogeneity, we extended this mechanistic insight to large spatial scales by combining our mesocosm experiment with field introduction data using a BHM framework.

Our results indicate that M. vimineum seed production in the field is strongly linked to light availability, even after including uncertainty from both individual and site-level variability. However, even in low light habitats, our model suggests that seed production is likely high enough to overwhelm establishment resistance (Warren et al. 2012). This finding suggests that the distribution of our model species within its geographic range is not likely to be limited by light availability alone, although its abundance will certainly be greater in higher light habitats (e.g. open fields) than in deeply shaded forest interiors (Cole and Weltzin 2005, Flory 2010). For sites with intermediate or high light availability (30-100% light), our model predicts that individuals can produce approximately 2000 seeds on average. Germination of M. vimineum seeds is consistently high at all light levels, and seed viability from plants grown in various light environments is similarly high (Appendix C, Fig. C1 and Fig. C2); we therefore expect seed production to translate relatively consistently into germinating seedlings (Warren et al. 2011). Therefore, we infer that range expansion of M. vimineum is limited by seed arrival, because newly established stands are unlikely to fail due to inadequate local seed production and propagule pressure.

Our model suggests that the variability among experimental introduction sites was at least as important as, and likely more important than, variability within sites (prob = 0.878). Thus, when scaling across sites, the relationship between focal abiotic factors and demographic processes is complicated by numerous environmental factors. Accounting for the resulting variability is central to the task of predicting distribution in the field, not merely a nuisance factor. For instance, ignoring the stochasticity introduced by these additional factors and their interactions, equivalent to fixing the among-site variability in the model at zero, would lead to artificially precise prediction intervals. Therefore, by including among-site variability in our hierarchical model, estimated from our heterogeneous field data, we can scale up demographic predictions from mesocosm experiments to the field.

Another approach would have been to include additional environmental covariates, alongside available light, to try to account for some of the among-site variance. For instance, soil resource supply is a major determinant of plant biomass production. Therefore, we recommend that future mesocosm experiments include a cross-factorial manipulation of an integrative soil variable (e.g., average soil moisture supply) alongside available light in order to improve predictive accuracy and biological realism (Tilman 1982). Using a BHM framework, there are several ways that a soil variable could be integrated into the mechanistic function for predicting biomass as a function of light. For instance, the asymptotic biomass (Bmax) could be allowed to vary for each site based on a regression with soil moisture supply, using informative priors from the cross-factorial manipulation. Such an expansion of the experimental design and model framework would account for more of the among-site variance than our present single-factor model, but at the expense of increased logistical and model complexity.
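One way such an extension might look is sketched below: the asymptote becomes a site-specific quantity driven by a soil moisture covariate. The log-linear link and the coefficient names a0 and a1 are assumptions introduced here for illustration (the log link simply keeps Bmax positive); they are not part of the fitted models.

```python
import math


def site_b_max(moisture, a0, a1):
    """Site-specific asymptote from a log-linear regression on soil moisture.

    Assumption: log(Bmax_j) = a0 + a1 * moisture_j.
    """
    return math.exp(a0 + a1 * moisture)


def mm_mean_with_soil(light, moisture, a0, a1, h):
    """Michaelis-Menten mean with a soil-dependent asymptote."""
    return site_b_max(moisture, a0, a1) * light / (h + light)


# Hypothetical coefficients: a0 gives Bmax = 18 g at zero (centered) moisture.
a0, a1, h = math.log(18.0), 0.3, 40.0
dry = mm_mean_with_soil(50.0, -1.0, a0, a1, h)
wet = mm_mean_with_soil(50.0, 1.0, a0, a1, h)
```

In a hierarchical fit, a0 and a1 would receive priors informed by the cross-factorial mesocosm manipulation, in the same way that the mesocosm posteriors informed Bmax and h here.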

Validation of our model predictions is complicated by the main consideration that motivated model development in the first place: it is logistically and ethically challenging to accurately quantify seed production in the field. Nonetheless, sub-components of the model can be validated with future data collection, or by approximation with procedures such as cross-validation. Although leave-one-out cross-validation is often computationally expensive, in hierarchical models one option is to iteratively hold out and predict groups rather than individual data points (Marshall and Spiegelhalter 2003). Our cross-validated predictive distribution for site biomass means (Appendix D, Fig. D1) is nearly identical to our regular posterior predictive distribution (Fig. 3b): half of our observed site means fall outside the cross-validated 50% predictive interval, and one observed mean fell outside the cross-validated 95% interval, which are about the expected numbers for each interval.

One limitation of our model is the need to extrapolate from the performance of individually transplanted plants to natural stand dynamics in which density-dependent effects operate. For instance, previous work has established that as M. vimineum stand density increases, individual biomass decreases, although stand-level biomass remains stable or even increases slightly at higher densities (Cheplick and Fox 2011). However, we argue that our individual-level predictions are ideal for addressing the spread and establishment phases of the invasion process, rather than for providing insight into dense, long-established stands. Moreover, a distinct advantage of generating clear-cut individual-level predictions is that we can infer a role for micro-site variability, such as canopy variability in forest interiors, in maintaining local populations. Although our model indicates that small populations may not persist in very low light forest interiors (e.g. < 2-5% light), there is always a small probability of high seed-producing individuals within an established stand. Generalizing to forest interiors, we speculate that M. vimineum is likely exploiting light gaps and micro-site variability in light availability (Horton and Neufeld 1998). Interestingly, our results suggest that micro-site variability in light may influence whether M. vimineum is limited by seed arrival (seed limitation) or by growth and survival after seeds have arrived (establishment limitation) (Clark et al. 2007). Given the differences in predictions between 2%, 5% and 10% light, it seems likely that a few plants may be capable of subsidizing seed production for an entire stand in which many individuals produce little if any seed. Conversely, given the dramatic increases in seed production with increasing light, our models suggest that disturbances that increase light availability in forest understories (e.g. timber harvests or natural blowdowns) are likely to strongly promote M. vimineum abundance.
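The idea that a few plants can subsidize a stand can be illustrated with a back-of-the-envelope calculation from the single-plant probabilities in Table 2. This is a sketch, not part of our fitted model: it assumes that plants produce seed independently of one another at a common light level, which ignores shared site effects and density dependence. The function name is hypothetical.

```python
# Table 2: probability that a single plant produces more than 40 seeds,
# keyed by percent light.
P_OVER_40 = {2: 0.248, 5: 0.478, 10: 0.691, 80: 0.985}


def prob_any_high_producer(p_single, n_plants):
    """Probability that at least one of n_plants exceeds the seed threshold.

    Assumption: plants are independent and share the same single-plant
    probability; real stands would violate this to some degree.
    """
    return 1.0 - (1.0 - p_single) ** n_plants


for n_plants in (1, 5, 10):
    p = prob_any_high_producer(P_OVER_40[2], n_plants)
    print(f"2% light, {n_plants} plants: P(at least one > 40 seeds) = {p:.3f}")
```

Even at 2% light, where most individuals are predicted to produce no seed, the chance that a small group contains at least one productive plant rises quickly with group size under these assumptions.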

Our modeling framework has direct applications to analyzing the range expansion and performance of invasive and native plant species. For instance, Harris et al. (2011) fit empirical relationships between fecundity and habitat type (varying in type and cover of canopy) for Rhododendron ponticum, and used these estimates of fecundity in an individual-based model to predict speed of range expansion. BHMs help generalize this approach because they facilitate robust predictions for species whose seed production, unlike that of R. ponticum, is difficult to measure in the field. Moreover, as Kearney and Porter (2009) point out, correlative approaches linking habitat variables and demographic processes are inappropriate for invasive species undergoing range expansion because the species’ distribution is far from equilibrium. For invasive species, it may be especially important to experimentally manipulate abiotic factors in order to understand the fundamental plant responses to these drivers. Our modeling framework provides a way to scale responses to experimental manipulations up to the field.

The approach we implemented here could be applied more broadly to analyzing the ecological niche of plant species. For instance, in analyzing the large-scale distributional limits of a species based on temperature, one could parameterize a thermal response curve under controlled conditions, and then use those estimates as informative priors in a hierarchical Bayesian model fit to field observations. We strongly advocate the use of mechanistic functions that describe process-based relationships between abiotic factors and biotic processes (Kearney et al. 2008). This should help minimize the predictive errors that arise when predicting species’ distributions with correlative linear models, which often include arbitrary polynomial terms (Bahn and McGill 2013). In general, a hierarchical model structure will facilitate a flexible analysis of variability across many levels of biological organization, including individuals, sites, and regions. Thus, such models are better suited to predicting across a geographic range or extrapolating into new ranges. Moreover, the posterior predictive distributions from BHMs automatically and efficiently incorporate estimation uncertainty from all parts of the model (Cressie et al. 2009, Gelman et al. 2013), eliminating the need to devise potentially complicated bootstrapping algorithms to quantify predictive uncertainty.
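How parameter uncertainty carries through to predictions can be sketched with a small Monte Carlo exercise. The sketch below is a simplified illustration rather than our fitted model: it assumes that MM in Table 1 denotes the standard Michaelis-Menten form, reads the Table 1 precisions as 1/SD^2, treats light as percent light, and draws the two parameters as independent normals. It propagates uncertainty in the mean response only, omitting the site-level and individual-level variation that a full posterior predictive distribution would include. In practice, joint MCMC draws would replace the normal approximation. All function names are hypothetical.

```python
import random

# Posterior means and SDs read from Table 1 (precision taken as 1 / SD^2).
ASYMPTOTE = (17.4, 1.9)
HALF_SAT = (41.6, 8.7)


def mm(light, asymptote, half_sat):
    """Michaelis-Menten mean response at a given light level (% light)."""
    return asymptote * light / (half_sat + light)


def mean_response_draws(light, n_draws=5000, seed=1):
    """Propagate parameter uncertainty to the mean response by Monte Carlo.

    Assumption: parameters are independent normals, which ignores any
    posterior correlation between them.
    """
    rng = random.Random(seed)
    draws = []
    while len(draws) < n_draws:
        a = rng.gauss(*ASYMPTOTE)
        h = rng.gauss(*HALF_SAT)
        if a > 0 and h > 0:  # keep only biologically meaningful values
            draws.append(mm(light, a, h))
    return draws


def interval(draws, level=0.95):
    """Equal-tailed interval taken from the sorted Monte Carlo draws."""
    ordered = sorted(draws)
    lo = ordered[int((1 - level) / 2 * len(ordered))]
    hi = ordered[int((1 + level) / 2 * len(ordered)) - 1]
    return lo, hi


for light in (2, 5, 10, 80):
    d = mean_response_draws(light)
    lo, hi = interval(d)
    print(f"{light}% light: mean {sum(d) / len(d):.2f}, "
          f"95% interval ({lo:.2f}, {hi:.2f})")
```

The same pattern (compute the quantity of interest for every draw, then summarize the draws) is what makes interval estimates a by-product of Bayesian fitting rather than a separate resampling step.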

In conclusion, we scaled up predictions of a key plant demographic process (seed production) by combining two datasets: a mesocosm experiment with manipulated light availability, in which we could accurately measure seed production, and a field introduction experiment that captured the complex interactions of abiotic variables in the field, thereby adding ecological realism. Our use of experimental data to parameterize non-linear functions that reflect biological mechanisms should minimize the cause-effect errors that plague correlative distribution approaches (Kearney and Porter 2009, Bahn and McGill 2013). Thus, we do not expect our model predictions to break down when extrapolated to new geographic ranges (Bahn and McGill 2013), although the geographic location of our experiments (Indiana, USA) may have introduced subtle limitations to our predictive power. Overall, we suggest that the influence of abiotic factors on demographic parameters be studied both in logistically manageable manipulative experiments that clarify underlying relationships, and in field studies that realistically account for natural variability and interactions of abiotic variables. BHMs offer wide latitude for combining these sources of information to make intuitive, quantitative predictions.

ACKNOWLEDGEMENTS

We thank Jeremy Lichstein for consultation and critical feedback on our hierarchical model and Keith Clay and Furong Long for assistance with the field experiment. This project was funded in part by the University of Florida School of Natural Resources and Environment.

LITERATURE CITED

Bahn, V., and B. J. McGill. 2013. Testing the predictive performance of distribution models. Oikos 122:321–331.

Bolker, B. M. 2008. Ecological Models and Data in R. Princeton University Press.

Brook, B. W., N. S. Sodhi, and C. J. A. Bradshaw. 2008. Synergies among extinction drivers under global change. Trends in Ecology & Evolution 23:453–460.

Caughlin, T. T., J. M. Ferguson, J. W. Lichstein, S. Bunyavejchewin, and D. J. Levey. 2014. The importance of long-distance seed dispersal for the demography and distribution of a canopy tree species. Ecology 95:952–962.

Chase, J. M., and M. A. Leibold. 2003. Ecological Niches: Linking Classical and Contemporary Approaches. University of Chicago Press, Chicago, Illinois.

Cheplick, G. P., and J. Fox. 2011. Density-dependent growth and reproduction of Microstegium vimineum in contrasting light environments. The Journal of the Torrey Botanical Society 138:62–72.

Clancy, K. M., and R. M. King. 1993. Defining the Western Spruce Budworm’s nutritional niche with response surface methodology. Ecology 74:442–454.

Clark, C. J., J. R. Poulsen, D. J. Levey, and C. W. Osenberg. 2007. Are plant populations seed limited? A critique and meta-analysis of seed addition experiments. The American Naturalist 170:128–142.

Cole, P. G., and J. F. Weltzin. 2005. Light limitation creates patchy distribution of an invasive grass in eastern deciduous forests. Biological Invasions 7:477–488.

Cressie, N., C. A. Calder, J. S. Clark, J. M. Ver Hoef, and C. K. Wikle. 2009. Accounting for uncertainty in ecological analysis: the strengths and limitations of hierarchical statistical modeling. Ecological Applications 19:553–570.

Diez, J. M., I. Giladi, R. Warren, and H. R. Pulliam. 2014. Probabilistic and spatially variable niches inferred from demography. Journal of Ecology 102:544–554.

Diez, J. M., and H. R. Pulliam. 2007. Hierarchical analysis of species distributions and abundance across environmental gradients. Ecology 88:3144–3152.

Evans, M. E. K., K. E. Holsinger, and E. S. Menges. 2010. Fire, vital rates, and population viability: a hierarchical Bayesian analysis of the endangered Florida scrub mint. Ecological Monographs 80:627–649.

Fairbrothers, D. E., and J. R. Gray. 1972. Microstegium vimineum (Trin.) A. Camus (Gramineae) in the United States. Bulletin of the Torrey Botanical Club 99:97–100.

Flory, S. L., F. Long, and K. Clay. 2011. Greater performance of introduced vs. native range populations of Microstegium vimineum across different light environments. Basic and Applied Ecology 12:350–359.

Gallien, L., T. Münkemüller, C. H. Albert, I. Boulangeat, and W. Thuiller. 2010. Predicting potential distributions of invasive species: where to go from here? Diversity and Distributions 16:331–342.

Gelman, A., J. B. Carlin, H. S. Stern, and D. B. Rubin. 2013. Bayesian Data Analysis, Third Edition. CRC Press, Boca Raton, Florida.

Gelman, A., and J. Hill. 2007. Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press, Cambridge, England.

Guisan, A., and W. Thuiller. 2005. Predicting species distribution: offering more than simple habitat models. Ecology Letters 8:993–1009.

Harris, C. M., H. L. Stanford, C. Edwards, J. M. J. Travis, and K. J. Park. 2011. Integrating demographic data and a mechanistic dispersal model to predict invasion spread of Rhododendron ponticum in different habitats. Ecological Informatics 6:187–195.

Holle, B. V., and D. Simberloff. 2005. Ecological resistance to biological invasion overwhelmed by propagule pressure. Ecology 86:3212–3218.

Hooper, H. L., R. Connon, A. Callaghan, G. Fryer, S. Yarwood-Buchanan, J. Biggs, S. J. Maund, T. H. Hutchinson, and R. M. Sibly. 2008. The ecological niche of Daphnia magna characterized using population growth rate. Ecology 89:1015–1022.

Horton, J. L., and H. S. Neufeld. 1998. Photosynthetic responses of Microstegium vimineum (Trin.) A. Camus, a shade-tolerant, C4 grass, to variable light environments. Oecologia 114:11–19.

Kearney, M., B. L. Phillips, C. R. Tracy, K. A. Christian, G. Betts, and W. P. Porter. 2008. Modelling species distributions without using species distributions: the cane toad in Australia under current and future climates. Ecography 31:423–434.

Kearney, M., and W. Porter. 2009. Mechanistic niche modelling: combining physiological and spatial data to predict species’ ranges. Ecology Letters 12:334–350.

Levin, S. A. 1992. The problem of pattern and scale in Ecology: the Robert H. MacArthur award lecture. Ecology 73:1943–1967.

Marshall, E. C., and D. J. Spiegelhalter. 2003. Approximate cross-validatory predictive checks in disease mapping models. Statistics in Medicine 22:1649–1660.

Merow, C., J. P. Dahlgren, C. J. E. Metcalf, D. Z. Childs, M. E. K. Evans, E. Jongejans, S. Record, M. Rees, R. Salguero-Gómez, and S. M. McMahon. 2014. Advancing population ecology with integral projection models: a practical guide. Methods in Ecology and Evolution 5:99–110.

Merow, C., N. LaFleur, J. A. Silander Jr., A. M. Wilson, and M. Rubega. 2011. Developing dynamic mechanistic species distribution models: predicting bird-mediated spread of invasive plants across northeastern North America. The American Naturalist 178:30–43.

Moles, A. T., and M. Westoby. 2004. What do seedlings die from and what are the implications for evolution of seed size? Oikos 106:193–199.

Ogle, K., S. Pathikonda, K. Sartor, J. W. Lichstein, J. L. D. Osnas, and S. W. Pacala. 2014. A model-based meta-analysis for estimating species-specific wood density and identifying potential sources of variation. Journal of Ecology 102:194–208.

Schurr, F. M., J. Pagel, J. S. Cabral, J. Groeneveld, O. Bykova, R. B. O’Hara, F. Hartig, W. D. Kissling, H. P. Linder, G. F. Midgley, B. Schröder, A. Singer, and N. E. Zimmermann. 2012. How to understand species’ niches and range dynamics: a demographic research agenda for biogeography. Journal of Biogeography 39:2146–2162.

Tilman, D. 1982. Resource Competition and Community Structure. Princeton University Press, Princeton, New Jersey.

Warren, R. J., V. Bahn, and M. A. Bradford. 2012. The interaction between propagule pressure, habitat suitability and density-dependent reproduction in species invasion. Oikos 121:874–881.

Warren, R. J., V. Bahn, T. D. Kramer, Y. Tang, and M. A. Bradford. 2011. Performance and reproduction of an exotic invader across temperate forest gradients. Ecosphere 2:art14.

Supplemental Material

Appendix A

Posterior sampling distribution of the ratio of individual and site-level dispersion constants for hierarchical biomass model (Ecological Archives XXXXX)

Appendix B

Simulation test of the gamma-gamma hierarchical biomass model (Ecological Archives XXXX)

Appendix C

Data on germination of seeds at different light levels and collected from plants grown at different light levels (Ecological Archives XXXXX)

Appendix D

Cross-validated posterior predictive distribution for the gamma-gamma hierarchical biomass model (Ecological Archives XXXXX)

Appendix E

Details on model development, particularly working with the gamma and negative binomial parameterizations in JAGS, plus pseudo-code for generating cross-validated posterior predictions (Ecological Archives XXXXX)

Supplement

R script files and data from both the field introduction and the mesocosm experiment to execute the models described in the manuscript and in Appendix E.

Table 1. MM parameter estimates from sampling distributions of mesocosm model (used as informative priors in the hierarchical model) and field introduction model (the posterior sampling distribution).

Parameter          Prior                     Posterior
                   Mean    Precision         Mean    Precision
Asymptote          18.2    1/(2.3^2)         17.4    1/(1.9^2)
Half Saturation    40.4    1/(10.5^2)        41.6    1/(8.7^2)
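Because JAGS parameterizes the normal distribution by precision rather than standard deviation, the Precision column may be easier to interpret after conversion. The following sketch assumes that each precision entry in Table 1 is written as 1/SD^2, and the function name is hypothetical.

```python
import math

# Values read from Table 1, assuming each precision entry is 1 / SD^2.
TABLE_1 = {
    "Asymptote": {"prior": (18.2, 1 / 2.3**2), "posterior": (17.4, 1 / 1.9**2)},
    "Half Saturation": {"prior": (40.4, 1 / 10.5**2), "posterior": (41.6, 1 / 8.7**2)},
}


def sd_from_precision(tau):
    """Convert a precision (1 / variance) to a standard deviation."""
    return 1.0 / math.sqrt(tau)


for name, row in TABLE_1.items():
    prior_sd = sd_from_precision(row["prior"][1])
    post_sd = sd_from_precision(row["posterior"][1])
    print(f"{name}: prior SD {prior_sd:.1f}, posterior SD {post_sd:.1f}")
```

Read this way, the posterior standard deviations are somewhat smaller than the prior ones for both parameters, indicating that the field data modestly sharpened the mesocosm-based estimates.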


Table 2. Probability that seed production of a single plant equals zero seeds, exceeds 10 seeds, or exceeds 40 seeds. Probabilities were calculated by sampling the posterior predictive distribution of the hierarchical fecundity model (Fig. 3).

% Light    P[seeds = 0]    P[seeds > 10]    P[seeds > 40]
2%         0.683           0.283            0.248
5%         0.403           0.542            0.478
10%        0.203           0.745            0.691
80%        0.006           0.990            0.985
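Probabilities of this kind are proportions of posterior predictive draws that satisfy each condition. The sketch below shows the calculation on a made-up list of draws; the numbers are toy values for illustration only and are not output from our model, and the function name is hypothetical.

```python
def seed_probabilities(draws):
    """Estimate event probabilities as proportions of predictive draws."""
    n = len(draws)
    return {
        "P[seeds = 0]": sum(d == 0 for d in draws) / n,
        "P[seeds > 10]": sum(d > 10 for d in draws) / n,
        "P[seeds > 40]": sum(d > 40 for d in draws) / n,
    }


# Toy draws only: ten hypothetical seed counts, not model output.
toy_draws = [0, 0, 0, 0, 0, 0, 3, 12, 55, 140]

for label, value in seed_probabilities(toy_draws).items():
    print(f"{label}: {value:.1f}")
```

With real output, `draws` would hold the simulated seed counts for a new plant at a given light level, one value per retained MCMC iteration.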


Figure Legends

Figure 1. Examples of sites used in the field introduction experiment, including full sun forest opening (a) and partially (b) and densely (c) shaded forest understories.

Figure 2. Posterior predictive checking of mesocosm model to mesocosm data (a) and field introduction data (b). Blue dashed lines are probability contours with the top and bottom lines enclosing a 96% credible set. Solid red lines represent mean predictions for individuals at a given light level.

Figure 3. Posterior predictive checking of hierarchical fecundity model including predictions from the biomass part of the model against the raw field introduction data (a) and the seed production part of the model against the seed production versus biomass data from the mesocosm experiment (b). The distribution of seed production predictions for unobserved sites with a given light level is plotted in (c).

Figure 4. Probability density histograms of predicted seed production from the hierarchical fecundity model at four specific light levels: 2%, 5%, 10%, and 80% based on sampling the posterior predictive distribution for seed production (i.e. Fig. 3c).

Figure 1

Figure 2

Figure 3

Figure 4
