package ‘spind’ - r · package ‘spind’ july 20, 2018 title spatial methods and indices ......
TRANSCRIPT
Package ‘spind’July 20, 2018
Title Spatial Methods and Indices
Version 2.2.0
Author Gudrun Carl [cre, aut], Ingolf Kuehn [aut], Sam Levin [aut]
Maintainer Sam Levin <[email protected]>
URL https://github.com/levisc8/spind
BugReports https://github.com/levisc8/spind/issues
Description Functions for spatial methods based on generalized estimating equations (GEE) andwavelet-revised methods (WRM), functions for scaling by wavelet multiresolution regression (WMRR),conducting multi-model inference, and stepwise model selection. Further, contains functionsfor spatially corrected model accuracy measures.
Depends R (>= 3.0.0)
Imports gee (>= 4.13.19), geepack (>= 1.2.1), MASS (>= 7.3.49),splancs (>= 2.1.40), lattice (>= 0.20.35), waveslim (>= 1.7.5),rje (>= 1.9), stringr (>= 1.3.1), ggplot2 (>= 3.0.0),RColorBrewer (>= 1.1.2), rlang(>= 0.2.1)
License GPL-3
Encoding UTF-8
LazyData true
RoxygenNote 6.0.1
Suggests knitr (>= 1.15.1), rmarkdown (>= 1.9), gridExtra (>= 2.3),testthat (>= 2.0.0), PresenceAbsence (>= 1.1.9), sp (>= 1.2.7),covr (>= 3.1.0)
VignetteBuilder knitr
NeedsCompilation no
Repository CRAN
Date/Publication 2018-07-20 15:40:03 UTC
1
2 acfft
R topics documented:acfft . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2adjusted.actuals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3aic.calc . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4carlinadata . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5covar.plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5GEE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7hook . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10mmiGEE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11mmiWMRR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12musdata . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14qic.calc . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14rvi.plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15scaleWMRR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17spind . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19step.spind . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21th.dep . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23th.indep . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24upscale . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26wavecovar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27wavevar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28WRM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
Index 33
acfft Spatial autocorrelation diagnostics
Description
A function for calculating spatial autocorrelation using Moran’s I.
Usage
acfft(coord, f, lim1 = 1, lim2 = 2, dmax = 10)
Arguments
coord A matrix of two columns with corresponding cartesian coordinates. Currentlyonly supports integer coordinates.
f A vector which is the same length as x and y
lim1 Lower bound for first bin. Default is 1
lim2 Upper bound for first bin. Default is 2
dmax Number of distance bins to examine. Bins are formed by annuli of graduallyincreasing radii. Default is 10.
adjusted.actuals 3
Value
A vector of Moran’s I values for each distance bin.
Author(s)
Gudrun Carl
Examples
data(musdata)coords <- musdata[ ,4:5]mglm <- glm(musculus ~ pollution + exposure, "poisson", musdata)
ac <- acfft(coords, resid(mglm, type = "pearson"), lim1 = 0, lim2 = 1)ac
adjusted.actuals Adjusted actual values
Description
Adjusts actual presence/absence data based on the autocorrelation in the predictions of a model. Thefunction will optionally plot results of model predictions, un-modified actual presence/absence, andadjusted values.
Usage
adjusted.actuals(data, coord, plot.maps = FALSE, color.maps = FALSE)
Arguments
data a dataframe or matrix containing actual presence/absence (binary, 0 or 1) valuesin 1st column and predicted values (numeric between 0 and 1) in 2nd column.
coord a matrix of two columns of the same length providing integer, consecutivelynumbered coordinates for each occurrence and prediction in data.
plot.maps A logical indicating whether maps should be plotted. Default is FALSE.
color.maps A logical value. If TRUE, produces colorful maps. If FALSE, produces grayscalemaps. Default is grayscale. NOW DEPRECATED, color maps will not be pro-duced in future versions.
Value
A vector of adjusted actual values.
Author(s)
Gudrun Carl
4 aic.calc
Examples
data(hook)data <- hook[ ,1:2]coord <- hook[ ,3:4]aa <- adjusted.actuals(data, coord, plot.maps = TRUE)
aic.calc Akaike Information Criterion with correction for sample size
Description
Calculates AIC and AICc
Usage
aic.calc(formula, family, data, mu, n.eff = NULL)
Arguments
formula A model formulafamily Family used to fit the model. gaussian, binomial, or poisson are supporteddata A data framemu Fitted values from a modeln.eff Effective number of observations. Default is NULL
Value
A list with the following components
loglik Log likelihood of the modeldf Degrees of freedomAIC AIC score for the specified modelAICc AIC score corrected for small sample sizes
Author(s)
Gudrun Carl, Sam Levin
Examples
data(musdata)coords <- musdata[ ,4:5]mglm <- glm(musculus ~ pollution + exposure, "poisson", musdata)
aic <- aic.calc(musculus ~ pollution + exposure, "poisson", musdata,mglm$fitted)
aic$AIC
carlinadata 5
carlinadata Carlina data set
Description
A data frame containing simulated count data for the thistle, Carlina horrida.
Usage
carlinadata
Format
A data frame with 961 rows and 5 columns
carlina.horrida integer - Simulated count dataaridity numeric - Simulated aridity index values. This variable has high spatial autocorrelation
values.land.use numeric - Simulated land use intensity. This variable has no spatial autocorrelation.x integer - x-coordinates for each grid celly integer - y-coordinates for each grid cell
covar.plot Plot wavelet variance/covariance
Description
Plots the wavelet variance or covariance for the specified formula. The scale-dependent results aregraphically displayed.
Usage
covar.plot(formula, data, coord, wavelet = "haar", wtrafo = "dwt",plot = "covar", customize_plot = NULL)
Arguments
formula With specified notation according to names in data frame.data Data frame.coord A matrix of 2 columns with corresponding x,y-coordinates which have to be
integer.wavelet Type of wavelet: haar, d4, or la8.wtrafo Type of wavelet transform: dwt or modwt.plot Either var for wavelet variance analysis or covar for wavelet covariance analy-
sis.customize_plot Additional plotting parameters passed to ggplot. NOW DEPRECATED
6 covar.plot
Details
Each variable or pair of variables in formula is passed to wavevar or wavecovar internally, and theresult is plotted as a function of level.
Value
A list containing
1. result = a vector of results.
2. plot = a ggplot object
Author(s)
Gudrun Carl
See Also
wavevar, wavecovar
Examples
data(carlinadata)coords <- carlinadata[,4:5]
covariance <- covar.plot(carlina.horrida ~ aridity + land.use - 1,data = carlinadata,coord = coords,wavelet = "d4",wtrafo = 'modwt',plot = 'covar')
covariance$plotcovariance$result
variance <- covar.plot(carlina.horrida ~ aridity + land.use - 1,data = carlinadata,coord = coords,wavelet = "d4",wtrafo = 'modwt',plot = 'var')
variance$plotvariance$result
GEE 7
GEE GEE (Generalized Estimating Equations)
Description
GEE provides GEE-based methods from the packages gee and geepack to account for spatial auto-correlation in multiple linear regressions
Usage
GEE(formula, family, data, coord, corstr = "fixed", cluster = 3,moran.params = list(), plot = FALSE, scale.fix = FALSE,customize_plot = NULL)
## S3 method for class 'GEE'plot(x, ...)
## S3 method for class 'GEE'predict(object, newdata, ...)
## S3 method for class 'GEE'summary(object, ..., printAutoCorPars = TRUE)
Arguments
formula Model formula. Variable names must match variables in data.
family gaussian, binomial, or poisson are supported. Called using a quoted charac-ter string (i.e. family = "gaussian").
data A data frame with variable names that match the variables specified in formula.
coord A matrix of two columns with corresponding cartesian coordinates. Currentlyonly supports integer coordinates.
corstr Expected autocorrelation structure: independence, fixed, exchangeable, andquadratic are possible.
• independence - This is the same as a GLM, i.e. correlation matrix = iden-tity matrix.
• fixed - Uses an adapted isotropic power function specifying all correlationcoefficients.
• exchangeable and quadratic for clustering, i.e. the correlation matrixhas a block diagonal form:
– exchangeable - All intra-block correlation coefficients are equal.– quadratic - Intra-block correlation coefficients for different distances
can be different.
cluster Cluster size for cluster models exchangeable and quadratic. Values of 2, 3,and 4 are allowed.
8 GEE
• 2 - a 2*2 cluster• 3 - a 3*3 cluster• 4 - a 4*4 cluster
moran.params A list of parameters for calculating Moran’s I.
• lim1 Lower limit for first bin. Default is 0.• increment Step size for calculating I. Default is 1.
plot A logical value indicating whether autocorrelation of residuals should be plotted.NOW DEPRECATED in favor of plot.GEE method.
scale.fix A logical indicating whether or not the scale parameter should be fixed. Thedefault is FALSE. Use TRUE when planning to use stepwise model selection pro-cedures in step.spind.
customize_plot Additional plotting parameters passed to ggplot. NOW DEPRECATED in fa-vor plot.GEE method.
x An object of class GEE or WRM
... Not used.
object An object of class GEE.
newdata A data frame containing variables to base the predictions on.printAutoCorPars
A logical indicating whether to print the working autocorrelation parameters
Details
GEE can be used to fit linear models for response variables with different distributions: gaussian,binomial, or poisson. As a spatial model, it is a generalized linear model in which the residu-als may be autocorrelated. It accounts for spatial (2-dimensional) autocorrelation of the residualsin cases of regular gridded datasets and returns corrected parameter estimates. The grid cells areassumed to be square. Furthermore, this function requires that all predictor variables be continu-ous.
Value
An object of class GEE. This consists of a list with the following elements:
call Call
formula Model formula
family Family
coord Coordinates used for the model
corstr User-selected correlation structure
b Estimate of regression parameters
s.e. Standard errors of the estimates
z Depending on the family, either a z or t value
p p-values for each parameter estimate
scale Scale parameter (dispersion parameter) of the distribution’s variance
GEE 9
scale.fix Logical indicating whether scale has fixed value
cluster User-specified cluster size for clustered models
fitted Fitted values from the model
resid Normalized Pearson residuals
w.ac Working autocorrelation parameters
Mat.ac Working autocorrelation matrix
QIC Quasi Information Criterion. See qic.calc for further details
QLik Quasi-likelihood. See qic.calc for further details
plot Logical value indicating whether autocorrelation should be plotted
moran.params Parameters for calculating Moran’s I
v2 Parameter variance of the GEE model
var.naive Parameter variance of the independence model
ac.glm Autocorrelation of GLM residuals
ac.gee Autocorrelation of GEE residuals
plot An object of class ggplot containing information on the autocorrelation of residuals from thefitted GEE and a GLM
Elements can be viewed using the summary.GEE methods included in the package.
Note
When using corstr = "fixed" on large data sets, the function may return an error, as the resultingvariance-covariance matrix is too large for R to handle. If this happens, one will have to use one ofthe cluster models (i.e quadratic, exchangeable).
Author(s)
Gudrun Carl, Sam Levin
References
Carl G & Kuehn I, 2007. Analyzing Spatial Autocorrelation in Species Distributions using Gaussianand Logit Models, Ecol. Model. 207, 159 - 170
Carey, V. J., 2006. Ported to R by Thomas Lumley (versions 3.13, 4.4, version 4.13)., B. R. gee:Generalized Estimation Equation solver. R package version 4.13-11.
Yan, J., 2004. geepack: Generalized Estimating Equation Package. R package version 0.2.10.
See Also
qic.calc, summary.GEE, gee
10 hook
Examples
data(musdata)coords<- musdata[,4:5]
## Not run:mgee <- GEE(musculus ~ pollution + exposure,
family = "poisson",data = musdata,coord = coords,corstr = "fixed",scale.fix = FALSE)
summary(mgee, printAutoCorPars = TRUE)
pred <- predict(mgee, newdata = musdata)
library(ggplot2)
plot(mgee)
my_gee_plot <- mgee$plot
# move the legend to a new positionprint(my_gee_plot + ggplot2::theme(legend.position = 'top'))
## End(Not run)
hook Hook data set
Description
A data frame containing actual presence absence data and predicted probability of occurrence val-ues.
Usage
hook
Format
A data frame with 100 rows and 4 columns
actuals integer - Presence/absence records
predictions numeric - predicted probabilities of occurrence
x integer - x-coordinates for each grid cell
y integer - y-coordinates for each grid cell
mmiGEE 11
mmiGEE Multi-model inference for GEE models
Description
mmiGEE is a multimodel inference approach evaluating the relative importance of predictors usedin GEE.
@details It performs automatically generated model selection and creates a model selection tableaccording to the approach of multi-model inference (Burnham & Anderson, 2002). QIC is usedto obtain model selection weights and to rank the models. Moreover, mmiGEE calculates relativevariable importance of a given model. Finally, this function requires that all predictor variablesbe continuous.
Usage
mmiGEE(object, data, trace = FALSE)
Arguments
object A model of class GEE.
data A data frame or set of vectors of equal length.
trace A logical indicating whether or not to print results to console.
Details
Calculates the relative importance of each variable using multi-model inference methods in a Gen-eralized Estimating Equations framework implemented in GEE.
Value
mmiGEE returns a list containing the following elements
result A matrix containing slopes, degrees of freedom, quasilikelihood, QIC, delta, and weightvalues for the set of candidate models. The models are ranked by QIC.
rvi A vector containing the relative importance of each variable in the regression.
Author(s)
Gudrun Carl, Sam Levin
References
Burnham, K.P. & Anderson, D.R. (2002) Model selection and multimodel inference. Springer, NewYork.
Carl G & Kuehn I, 2007. Analyzing Spatial Autocorrelation in Species Distributions using Gaussianand Logit Models, Ecol. Model. 207, 159 - 170
12 mmiWMRR
See Also
GEE, qic.calc, MuMIn
Examples
# data (for demonstration only)library(MASS)data(birthwt)
# impose an artificial (not fully appropriate) grid structure
x <- rep(1:14, 14)y <- as.integer(gl(14, 14))coords <- cbind(x[-(190:196)], y[-(190:196)])
## Not run:
formula <- formula(low ~ race + smoke + bwt)
mgee <- GEE(formula,family = "gaussian",data = birthwt,coord = coords,corstr = "fixed",scale.fix = TRUE)
mmi <- mmiGEE(mgee, birthwt)
## End(Not run)
mmiWMRR Multi-model inference for wavelet multiresolution regression
Description
mmiWMRR is a multimodel inference approach evaluating the relative importance of predictorsused in scaleWMRR.
Usage
mmiWMRR(object, data, scale, detail = TRUE, trace = FALSE)
Arguments
object A model of class WRM.
data Data frame.
scale 0 or higher integers possible (limit depends on sample size). scale=1 is equiv-alent to WRM with level=1.
mmiWMRR 13
detail Remove smooth wavelets? If TRUE, only detail components are analyzed. If setto FALSE, smooth and detail components are analyzed. Default is TRUE.
trace Logical value indicating whether to print results to console.
Details
It performs automatically generated model selection and creates a model selection table accordingto the approach of multi-model inference (Burnham & Anderson, 2002). The analysis is carriedout for scale-specific regressions (i.e. where scaleWMRR can be used). AIC is used to obtain modelselection weights and to rank the models. Furthermore, this function requires that all predictorvariables be continuous.
Value
mmiWMRR returns a list containing the following elements
result A matrix containing slopes, degrees of freedom, likelihood, AIC, delta, and weight valuesfor the set of candidate models. The models are ranked by AIC.
level An integer corresponding to scale
Author(s)
Gudrun Carl
References
Burnham, K.P. & Anderson, D.R. (2002) Model selection and multimodel inference. Springer, NewYork.
Carl G, Doktor D, Schweiger O, Kuehn I (2016) Assessing relative variable importance acrossdifferent spatial scales: a two-dimensional wavelet analysis. Journal of Biogeography 43: 2502-2512.
See Also
aic.calc, rvi.plot, MuMIn, WRM
Examples
data(carlinadata)coords <- carlinadata[ ,4:5]
## Not run:
wrm <- WRM(carlina.horrida ~ aridity + land.use,family = "poisson",data = carlinadata,coord = coords,level = 1,
14 qic.calc
wavelet = "d4")
mmi <- mmiWMRR(wrm,data = carlinadata,scale = 3,detail = TRUE,trace = FALSE)
## End(Not run)
musdata Mus musculus data set
Description
A data frame containing simulated count data of a house mouse.
Usage
musdata
Format
A data frame with 400 rows and 5 columns
musculus integer - Simulated count data for Mus musculus
pollution numeric - Simulated variable that describes degree of pollution in corresponding gridcell
exposure numeric - Simulated variable that describes degree of exposure for each grid cell
x integer - x-coordinates for each grid cell
y integer - y-coordinates for each grid cell
qic.calc Quasi-Information Criterion for Generalized Estimating Equations
Description
A function for calculating quasi-likelihood and Quasi-Information Criterion values based on themethod of Hardin & Hilbe (2003).
Usage
qic.calc(formula, family, data, mu, var.robust, var.indep.naive)
rvi.plot 15
Arguments
formula a model formula
family gaussian, binomial, or poisson
data a data frame
mu fitted values from a model
var.robust variance of model parametersvar.indep.naive
naive variance of model parameters under the independence model
Value
A list with the following components:
QIC quasi-information criterion
loglik quasi-likelihood
References
Hardin, J.W. & Hilbe, J.M. (2003) Generalized Estimating Equations. Chapman and Hall, NewYork.
Barnett et al. Methods in Ecology & Evolution 2010, 1, 15-24.
rvi.plot Relative Variable Importance
Description
Creates model selection tables, calculates and plots relative variable importance based on the scalelevel of a given model.
Usage
rvi.plot(formula, family, data, coord, maxlevel, detail = TRUE,wavelet = "haar", wtrafo = "dwt", n.eff = NULL, trace = FALSE,customize_plot = NULL)
Arguments
formula A model formula
family gaussian, binomial, and poisson are supported.
data A data frame or set of vectors of equal length.
coord X,Y coordinates for each observation. Coordinates should be consecutive inte-gers.
maxlevel An integer for maximum scale level
16 rvi.plot
detail Remove smooth wavelets? If TRUE, only detail components are analyzed. If setto FALSE, smooth and detail components are analyzed. Default is TRUE.
wavelet Type of wavelet: haar, d4, or la8
wtrafo Type of wavelet transform: dwt or modwt
n.eff A numeric value of effective sample size
trace Should R print progress updates to the console? Default is FALSE
customize_plot Additional plotting parameters passed to ggplot. NOW DEPRECATED.
Details
Calculates the relative importance of each variable using multi-model inference methods in a waveletmulti-resolution regression framework implemented in mmiWMRR. The scale level dependent resultsare then graphically displayed.
Value
A list containing
1. A matrix containing the relative importance of each variable in the regression at each value ofthe scale level.
2. A ggplot object containing a plot of the relative variable importance
Examples
data(carlinadata)coords<- carlinadata[,4:5]
## Not run:
wrm <- WRM(carlina.horrida ~ aridity + land.use,family = "poisson",data = carlinadata,coord = coords,level = 1,wavelet = "d4")
mmi <- mmiWMRR(wrm, data = carlinadata, scale = 3, detail = TRUE)
# Plot scale-dependent relative variable importancervi <- rvi.plot(carlina.horrida ~ aridity + land.use,
family = "poisson",data = carlinadata,coord = coords,maxlevel = 4,detail = TRUE,wavelet = "d4")
rvi$plotrvi$rvi
scaleWMRR 17
## End(Not run)
scaleWMRR Scaling by wavelet multiresolution regression (WMRR)
Description
scaleWMRR performs a scale-specific regression based on a wavelet multiresolution analysis.
Usage
scaleWMRR(formula, family, data, coord, scale = 1, detail = TRUE,wavelet = "haar", wtrafo = "dwt", b.ini = NULL, pad = list(),control = list(), moran.params = list(), trace = FALSE)
Arguments
formula With specified notation according to names in data frame.
family gaussian, binomial, or poisson.
data Data frame.
coord Corresponding coordinates which have to be integer.
scale 0 (which is equivalent to GLM) or higher integers possible (limit depends onsample size).
detail Remove smooth wavelets? If TRUE, only detail components are analyzed. If setto FALSE, smooth and detail components are analyzed. Default is TRUE.
wavelet Type of wavelet: haar or d4 or la8
wtrafo Type of wavelet transform: dwt or modwt.
b.ini Initial parameter values. Default is NULL.
pad A list of parameters for padding wavelet coefficients.
• padform - 0, 1, and 2 are possible. padform is automatically set to 0when either level=0 or the formula includes an intercept and has a non-gaussian family.
– 0 - Padding with 0s.– 1 - Padding with mean values.– 2 - Padding with mirror values.
• padzone - Factor for expanding the padding zone
control A list of parameters for controlling the fitting process.
• eps - Positive convergence tolerance. Smaller values of eps provide bet-ter parameter estimates, but also reduce the probability of the iterationsconverging. In case of issues with convergence, test larger values of eps.Default is 10^-5.
18 scaleWMRR
• denom.eps - Default is 10^-20.• itmax - Integer giving the maximum number of iterations. Default is 200.
moran.params A list of parameters for calculating Moran’s I.
• lim1 - Lower limit for first bin. Default is 0.• increment - Step size for calculating Moran’s I. Default is 1.
trace A logical value indicating whether to print parameter estimates to the console
Details
This function fits generalized linear models while taking the two-dimensional grid structure ofdatasets into account. The following error distributions (in conjunction with appropriate link func-tions) are allowed: gaussian, binomial, or poisson. The model provides scale-specific resultsfor data sampled on a contiguous geographical area. The dataset is assumed to be regular griddedand the grid cells are assumed to be square. A function from the package ’waveslim’ is used for thewavelet transformations (Whitcher, 2005). Furthermore, this function requires that all predictorvariables be continuous.
Value
scaleWMRR returns a list containing the following elements
call Model call
b Estimates of regression parameters
s.e. Standard errors of the parameter estimates
z Z values (or corresponding values for statistics)
p p-values for each parameter estimate
df Degrees of freedom
fitted Fitted values
resid Pearson residuals
converged Logical value whether the procedure converged
trace Logical. If TRUE:
• ac.glm Autocorrelation of glm.residuals• ac Autocorrelation of wavelet.residuals
Author(s)
Gudrun Carl
References
Carl G, Doktor D, Schweiger O, Kuehn I (2016) Assessing relative variable importance acrossdifferent spatial scales: a two-dimensional wavelet analysis. Journal of Biogeography 43: 2502-2512.
Whitcher, B. (2005) Waveslim: basic wavelet routines for one-, two- and three-dimensional signalprocessing. R package version 1.5.
spind 19
See Also
waveslim,mra.2d
Examples
data(carlinadata)coords <- carlinadata[ ,4:5]
## Not run:
# scaleWMRR at scale = 0 is equivalent to GLM
ms0 <- scaleWMRR(carlina.horrida ~ aridity + land.use,family = "poisson",data = carlinadata,coord = coords,scale = 0,trace = TRUE)
# scale-specific regressions for detail componentsms1 <- scaleWMRR(carlina.horrida ~ aridity + land.use,
family = "poisson",data = carlinadata,coord = coords,scale = 1,trace = TRUE)
ms2 <- scaleWMRR(carlina.horrida ~ aridity + land.use,family = "poisson",data = carlinadata,coord = coords,scale = 2,trace = TRUE)
ms3<- scaleWMRR(carlina.horrida ~ aridity + land.use,family = "poisson",data = carlinadata,coord = coords,scale = 3,trace = TRUE)
## End(Not run)
spind spind: Spatial Methods and Indices
20 spind
Description
The spind package provides convenient implementation of Generalized estimating equations (GEEs)and Wavelet-revised models (WRMs) in the context of spatial models. It also provides tools formulti-model inference, stepwise model selection, and spatially corrected model diagnostics. Thishelp section provides brief descriptions of each function and is organized by the type of model theyapply to or the scenarios in which you might use them. Of course, these are recommendations - feelfree to use them as you see fit. For a more detailed description of the package and its functions,please see the vignette Intro to spind (browseVignettes('spind')).
GEEs
The GEE function fits spatial models using a generalized estimating equation and a set of griddeddata. The package also includes S3 methods for summary and predict so you can interact withthese models in the same way you might interact with a glm or lm.
WRMs
The WRM function fits spatial models using a wavelet-revised model and a set of gridded data. Thepackage also includes S3 methods for summary and predict so you can interact with these modelsin the same way you might interact with a glm or lm. There are also a number of helper functionsthat help you fine tune the fitting process that are specific to WRMs. Please see the documentationfor WRM for more details on those.
WRM also has a few other features specific to it. For example, if you are interested in viewing thevariance or covariance of your variables as a function of level, covar.plot is useful. upscalewill plot your matrices as a function of level so you can examine the effect of cluster resolution onyour results.
Multi-model inference and stepwise model selection
spind includes a couple of functions to help you find the best fit for your data. The first two aremultimodel inference tools specific to GEEs and WRMs and are called mmiGEE and mmiWMRR. Thesegenerate outputs very similar to those from the MuMIn package. If you would like to see how variableimportance changes as a function of the scale of the WMRR, you can call rvi.plot. This willgenerate a model selection table for each degree of level (from 1 to maxlevel) and then plot theweight of each variable as a function of level.
spind also includes a function for stepwise model selection that is loosely based on step andstepAIC. step.spind differs from these in that it is specific to classes WRM and GEE. It performsmodel selection using AIC or AICc for WRMs and QIC for GEEs.
Spatial indices of goodness of fit
Finally, spind has a number of functions that provide spatially corrected goodness of fit diagnosticsfor any type of model (i.e. they are not specific to classes WRM or GEE). These first appeared inspind v1.0 and have not been updated in this release. The first two are divided into whether or notthey are threshold dependent or not. Threshold dependent metrics can be calculated using th.depand threshold independent metrics can be calculated using th.indep.
step.spind 21
acfft calculates spatial autocorrelation of residuals from a model using Moran’s I. You can set thenumber of distance bins you’d like to examine using dmax argument and the size of those bins usinglim1 and lim2.
Conclusion
The vignette titled Intro to spind provides more information on these functions and some exampleworkflows that will demonstrate them in greater depth than this document. Of course, if you havesuggestions on how to improve this document or any of the other ones in here, please don’t hesitateto contact us.
step.spind Stepwise model selection for GEEs and WRMs
Description
Stepwise model selection by AIC or AICc for WRMS and QIC for GEEs
Usage
step.spind(object, data, steps = NULL, trace = TRUE, AICc = FALSE)
Arguments
object A model of class WRM or GEE.
data The data used to fit that model.
steps Number of iterations the procedure should go through before concluding. Thedefault is to use the number of variables as the number of iterations.
trace Should R print progress updates and the final, best model found to the console?Default is TRUE.
AICc Logical. In the case of model selection with WRMs, should AICc be used todetermine which model is best rather than AIC? This argument is ignored forGEEs. Default is FALSE.
Details
This function performs stepwise variable elimination for model comparison. Each iteration will tryto find the best combination of predictors for a given number of variables based on AIC, AICc, orQIC, and then use that as the base model for the next iteration until there are no more variables toeliminate. Alternatively, it will terminate when reducing the number of variables while respectingthe model hierarchy no longer produces lower information criterion values.
22 step.spind
Value
A list with components model and table. model is always formula for the best model found by theprocedure. table is always a data frame, but the content varies for each type of model. For WRM’s,the columns returned are
• Deleted.Vars Variables retained from the previous iteration which were tested in the currentiteration.
• LogLik Log-likelihood of the model.
• AIC AIC score for the model.
• AICc AICc score for the model.
For GEEs:
• Deleted.Vars Variables retained from the previous iteration which were tested in the currentiteration.
• QIC Quasi-information criterion of the model.
• Quasi.Lik Quasi-likelihood of the model.
Note
Currently, the function only supports backwards model selection (i.e. one must start with a fullmodel and subtract variables). Forward and both directions options may be added later.
Author(s)
Sam Levin
References
Hardin, J.W. & Hilbe, J.M. (2003) Generalized Estimating Equations. Chapman and Hall, NewYork.
See Also
qic.calc, aic.calc, add1, step, stepAIC
Examples
# For demonstration only. We are artificially imposing a grid structure# on data that is not actually spatial data
library(MASS)data(birthwt)
x <- rep(1:14, 14)y <- as.integer(gl(14, 14))coords <- cbind(x[-(190:196)], y[-(190:196)])
## Not run:formula <- formula(low ~ age + lwt + race + smoke + ftv + bwt)
th.dep 23
mgee <- GEE(formula,family = "gaussian",data = birthwt,coord = coords,corstr = "fixed",scale.fix = TRUE)
ss <- step.spind(mgee, birthwt)
best.mgee <- GEE(ss$model,family = "gaussian",data = birthwt,coord = coords,corstr = "fixed",scale.fix = TRUE)
summary(best.mgee, printAutoCorPars = FALSE)
## End(Not run)
th.dep Spatial threshold-dependent accuracy measures
Description
Calculates spatially corrected, threshold-dependent metrics for an observational data set and modelpredictions (Kappa and confusion matrix)
Usage
th.dep(data, coord, thresh = 0.5, spatial = TRUE)
Arguments
data A data frame or matrix with two columns. The first column should contain actualpresence/absence data (binary, 0 or 1) and the second column should containmodel predictions of probability of occurrence (numeric, between 0 and 1).
coord A data frame or matrix with two columns containing x,y coordinates for eachactual and predicted value. Coordinates must be integer and consecutively num-bered.
thresh A cutoff value for classifying predictions as modeled presences or modeled ab-sences. Default is 0.5.
spatial A logical indicating whether spatially corrected indices (rather than classicalindices) should be computed.
24 th.indep
Value
A list with the following components:
kappa Kappa statistic
cm Confusion matrix
sensitivity Sensitivity
specificity Specificity
actuals Actual occurrence data or adjusted actual occurrence data
splitlevel.pred Level splitting of predicted values
splitlevel.act Level splitting of actuals or adjusted actuals
splitposition.pred Position splitting of predicted values
splitposition.act Position splitting of actuals or adjusted actuals
Author(s)
Gudrun Carl
References
Carl G, Kuehn I (2017) Spind: a package for computing spatially corrected accuracy measures.Ecography 40: 675-682. DOI: 10.1111/ecog.02593
See Also
th.indep
Examples
data(hook)data <- hook[ ,1:2]coord <- hook[ ,3:4]si1 <- th.dep(data, coord, spatial = TRUE)si1$kappasi1$cm
th.indep Spatial threshold-independent accuracy measures
Description
Calculates spatially corrected, threshold-independent metrics for an observational data set andmodel predictions (AUC, ROC, max-TSS)
th.indep 25
Usage
th.indep(data, coord, spatial = TRUE, plot.ROC = FALSE,customize_plot = NULL)
Arguments
data A data frame or matrix with two columns. The first column should contain actualpresence/absence data (binary, 0 or 1) and the second column should containmodel predictions of probability of occurrence (numeric, between 0 and 1).
coord A data frame or matrix with two columns containing x,y coordinates for eachactual and predicted value. Coordinates must be integer and consecutively num-bered.
spatial A logical value indicating whether spatial corrected indices (rather than classicalindices) should be computed.
plot.ROC A logical indicating whether the ROC should be plotted. NOW DEPRECATED.
customize_plot Additional plotting parameters passed to ggplot. NOW DEPRECATED.
Value
A list with the following components:
AUC Area under curve
opt.thresh optimal threshold for maximum TSS value
TSS Maximum TSS value
sensitivity Sensitivity
Specificity Specificity
AUC.plot A ggplot object
Author(s)
Gudrun Carl
References
Carl G, Kuehn I (2017) Spind: a package for computing spatially corrected accuracy measures.Ecography 40: 675-682. DOI: 10.1111/ecog.02593
See Also
th.dep
26 upscale
Examples
data(hook)data <- hook[ ,1:2]coord <- hook[ ,3:4]si2 <- th.indep(data, coord, spatial = TRUE)si2$AUCsi2$TSSsi2$opt.threshsi2$plot
upscale Upscaling of smooth components
Description
The analysis is based a wavelet multiresolution analysis using only smooth wavelet components. Itis a 2D analysis taking the grid structure and provides scale-specific results for data sampled on acontiguous geographical area. The dataset is assumed to be regular gridded and the grid cells areassumed to be square. The scale-dependent results are graphically displayed.
Usage
upscale(f, coord, wavelet = "haar", wtrafo = "dwt", pad = mean(f),color.maps = FALSE)
Arguments
f A vector.
coord A matrix of two columns with corresponding cartesian coordinates. Currentlyonly supports integer coordinates.
wavelet Name of wavelet family. haar, d4, and la8. are possible. haar is the default.
wtrafo Type of wavelet transform. Either dwt or modwt. dwt is the default.
pad A numeric value for padding the matrix into a bigger square. Default is set tomean(f).
color.maps A logical value. If TRUE, produces colorful maps. If FALSE, produces grayscalemaps. Default is grayscale. NOW DEPRECATED, color maps will not be pro-duced in future versions.
Value
A set of plots showing the matrix image at each value for level.
Author(s)
Gudrun Carl
wavecovar 27
Examples
data(carlinadata)coords <- carlinadata[ ,4:5]
# Upscaling of smooth componentsupscale(carlinadata$land.use, coord = coords)
wavecovar Wavelet covariance analysis
Description
Calculates the wavelet covariance based on a wavelet multiresolution analysis.
Usage
wavecovar(f1, f2, coord, wavelet = "haar", wtrafo = "dwt")
Arguments
f1 A vector of length n.
f2 A vector of length n.
coord A matrix of two columns with corresponding cartesian coordinates. Currentlyonly supports integer coordinates.
wavelet Name of wavelet family. haar, d4, and la8. are possible. haar is the default.
wtrafo Type of wavelet transform. Either dwt or modwt. dwt is the default.
Value
Wavelet covariance for f1 and f2.
Author(s)
Gudrun Carl
See Also
waveslim, WRM, covar.plot, scaleWMRR
28 wavevar
Examples
data(carlinadata)
coords <- carlinadata[ ,4:5]pc <- covar.plot(carlina.horrida ~ aridity + land.use,
data = carlinadata,coord = coords,wavelet = 'd4',wtrafo = 'modwt',plot = 'covar')
pc$plot
wavevar Wavelet variance analysis
Description
Calculates the wavelet variance based on a wavelet multiresolution analysis.
Usage
wavevar(f, coord, wavelet = "haar", wtrafo = "dwt")
Arguments
f A vector
coord A matrix of two columns with corresponding cartesian coordinates. Currentlyonly supports integer coordinates.
wavelet Name of wavelet family. haar, d4, and la8. are possible. haar is the default.
wtrafo Type of wavelet transform. Either dwt or modwt. dwt is the default.
Value
Wavelet variance for f.
Author(s)
Gudrun Carl
See Also
waveslim, WRM, covar.plot, scaleWMRR
WRM 29
Examples
data(carlinadata)
coords <- carlinadata[ ,4:5]pv <- covar.plot(carlina.horrida ~ aridity + land.use,
data = carlinadata,coord = coords,wavelet = 'd4',wtrafo = 'modwt',plot = 'var')
pv$plot
WRM Wavelet-revised models (WRMs)
Description
A wavelet-based method to remove spatial autocorrelation in multiple linear regressions. Wavelettransforms are implemented using waveslim (Whitcher, 2005).
Usage
WRM(formula, family, data, coord, level = 1, wavelet = "haar",wtrafo = "dwt", b.ini = NULL, pad = list(), control = list(),moran.params = list(), plot = FALSE, customize_plot = NULL)
## S3 method for class 'WRM'plot(x, ...)
## S3 method for class 'WRM'summary(object, ...)
## S3 method for class 'WRM'predict(object, newdata, sm = FALSE, newcoord = NA, ...)
Arguments
formula Model formula. Variable names must match variables in data.
family gaussian, binomial, or poisson are supported.
data A data frame with variable names that match the variables specified in formula.
coord A matrix of two columns with corresponding cartesian coordinates. Currentlyonly supports integer coordinates.
level An integer specifying the degree of wavelet decomposition
• 0 - Without autocorrelation removal (equivalent to a GLM)
30 WRM
• 1 - For best autocorrelation removal• ... - Higher integers possible. The limit depends on sample size
wavelet Name of wavelet family. haar, d4, and la8. are possible. haar is the default.
wtrafo Type of wavelet transform. Either dwt or modwt. dwt is the default.
b.ini Initial parameter values. Default is NULL.
pad A list of parameters for padding wavelet coefficients.
• padform - 0, 1, and 2 are possible. padform is automatically set to 0 wheneither level=0 or a formula including an intercept and a non-gaussianfamily
– 0 - Padding with 0s.– 1 - Padding with mean values.– 2 - Padding with mirror values.
• padzone - Factor for expanding the padding zone
control a list of parameters for controlling the fitting process.
• eps - Positive convergence tolerance. Smaller values of eps provide bet-ter parameter estimates, but also reduce the probability of the iterationsconverging. In case of issues with convergence, test larger values of eps.Default is 10^-5.
• denom.eps - Default is 10^-20.• itmax - Integer giving the maximum number of iterations. Default is 200.
moran.params A list of parameters for calculating Moran’s I.
• lim1 - Lower limit for first bin. Default is 0.• increment - Step size for calculating Moran’s I. Default is 1.
plot A logical value indicating whether to plot autocorrelation of residuals by dis-tance bin. NOW DEPRECATED in favor of plot.WRM method.
customize_plot Additional plotting parameters passed to ggplot. NOW DEPRECATED in fa-vor of plot.WRM method.
x An object of class GEE or WRM
... Not used
object An object of class WRM
newdata A data frame containing variables used to make predictions.
sm Logical. Should part of smooth components be included?
newcoord New coordinates corresponding to observations in newdata.
Details
WRM can be used to fit linear models for response vectors of different distributions: gaussian,binomial, or poisson. As a spatial model, it is a generalized linear model in which the residualsmay be autocorrelated. It corrects for 2-dimensional residual autocorrelation for regular griddeddata sets using the wavelet decomposition technique. The grid cells are assumed to be square.Furthermore, this function requires that all predictor variables be continuous.
WRM 31
Value
An object of class WRM. This consists of a list with the following elements:
call Call
formula Model formula
family Family
coord Coordinates used in the model
b Estimate of regression parameters
s.e. Standard errors
z Depending on the family, either a z or t value
p p-values
fitted Fitted values from the model
resid Pearson residuals
b.sm Parameter estimates of neglected smooth part
fitted.sm Fitted values of neglected smooth part
level Selected level of wavelet decomposition
wavelet Selected wavelet
wtrafo Selected wavelet transformation
padzone Selected padding zone expansion factor
padform Selected matrix padding type
n.eff Effective number of observations
AIC Akaike information criterion
AICc AIC score corrected for small sample sizes
LogLik Log likelihood of the model
ac.glm Autocorrelation of GLM residuals
ac.wrm Autocorrelation of WRM residuals
b.ini Initial parameter values
control Control parameters for the fitting process
moran.params Parameters for calculating Moran’s I
pad List of parameters for padding wavelet coefficients
plot An object of class ggplot containing information on the autocorrelation of residuals from thefitted WRM and a GLM
Note
For those interested in multimodel inference approaches, WRM with level = 1 is identical tommiWMRR with scale = 1.
Author(s)
Gudrun Carl, Sam Levin
32 WRM
References
Carl, G., Kuehn, I. (2010): A wavelet-based extension of generalized linear models to remove theeffect of spatial autocorrelation. Geographical Analysis 42 (3), 323 - 337
Whitcher, B. (2005) Waveslim: basic wavelet routines for one-, two- and three-dimensional signalprocessing. R package version 1.5.
See Also
mmiWMRR, predict.WRM, summary.WRM, aic.calc
Examples
data(musdata)coords <- musdata[,4:5]
## Not run:mwrm <- WRM(musculus ~ pollution + exposure,
family = "poisson",data = musdata,coord = coords,level = 1)
pred <- predict(mwrm, newdata = musdata)
summary(mwrm)
plot(mwrm)
library(ggplot2)
my_wrm_plot <- mwrm$plot
# increase axis text sizeprint(my_wrm_plot + ggplot2::theme(axis.text = element_text(size = 15)))
## End(Not run)
Index
∗Topic datasetscarlinadata, 5hook, 10musdata, 14
acfft, 2add1, 22adjusted.actuals, 3aic.calc, 4, 13, 22, 32
carlinadata, 5covar.plot, 5, 27, 28
GEE, 7, 11, 12gee, 9
hook, 10
mmiGEE, 11mmiWMRR, 12, 32mra.2d, 19musdata, 14
plot.GEE (GEE), 7plot.WRM (WRM), 29predict.GEE (GEE), 7predict.WRM, 32predict.WRM (WRM), 29
qic.calc, 9, 12, 14, 22
rvi.plot, 13, 15
scaleWMRR, 12, 13, 17, 27, 28spind, 19spind-package (spind), 19step, 20, 22step.spind, 21stepAIC, 20, 22summary.GEE, 9summary.GEE (GEE), 7
summary.WRM, 32summary.WRM (WRM), 29
th.dep, 23, 25th.indep, 24, 24
upscale, 26
wavecovar, 6, 27wavevar, 6, 28WRM, 13, 27, 28, 29
33