Dealing with Spatial Autocorrelation

Spatial Analysis Seminar

Spring 2009

Spatial Autocorrelation Defined

• “…the property of random variables taking values, at pairs of locations a certain distance apart, that are more similar (positive autocorrelation) or less similar (negative autocorrelation) than expected for randomly associated pairs of observations.” – Legendre (1993)

Types of Spatial Autocorrelation

• Inherent autocorrelation: caused by “contagious biotic processes”

vs.

• Induced spatial dependence: biological variables of interest are functionally dependent on one or more autocorrelated exogenous variable(s)

Why Should We Care?

• “natural systems almost always have autocorrelation in the form of patchiness or gradients…over a wide range of spatial and temporal scales.” – Fortin & Dale (2005)

→ Autocorrelation is a “fact of life” for ecologists!

2 Views of Spatial Autocorrelation:

1. It’s a nuisance that complicates statistical hypothesis testing

2. It’s functionally important in many ecosystems, so we must revise our theories and models to incorporate spatial structure

• Either way, the first step involves describing the autocorrelation (i.e., the “spatial structure”)

Describing Spatial Autocorrelation

• Compute Moran’s I or Geary’s c coefficients over multiple distances

• Correlogram: plot distance on X-axis against correlation coefficient on Y-axis

• Mantel correlogram: multivariate response

• Semi-variogram/variogram
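The first two bullets can be sketched in base R. This is a minimal illustration, assuming binary neighbour weights within each distance class; the function names `morans_i` and `moran_correlogram` and the toy transect are hypothetical and are not part of the seminar material.

```r
# Moran's I for a numeric vector x and an n x n weight matrix w (zero diagonal)
morans_i <- function(x, w) {
  n <- length(x)
  z <- x - mean(x)
  (n / sum(w)) * sum(w * outer(z, z)) / sum(z^2)
}

# Correlogram: Moran's I within successive distance classes (lower, upper].
# 'coords' is a two-column matrix of locations; 'breaks' are class boundaries.
moran_correlogram <- function(x, coords, breaks) {
  d <- as.matrix(dist(coords))
  sapply(seq_len(length(breaks) - 1), function(k) {
    w <- (d > breaks[k] & d <= breaks[k + 1]) * 1  # binary neighbour weights
    if (sum(w) == 0) return(NA_real_)              # no pairs in this class
    morans_i(x, w)
  })
}

# Toy transect (hypothetical data): values alternate along a line, so
# neighbours at lag 1 are dissimilar and neighbours at lag 2 are identical
coords <- cbind(x = 1:10, y = 0)
vals <- rep(c(1, -1), times = 5)
I_by_lag <- moran_correlogram(vals, coords, breaks = c(0, 1, 2))
```

Plotting the upper class boundaries against `I_by_lag` (for example with `plot(..., type = "b")`) gives the correlogram described above.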

Example Data

• Wetland hardwood forest (5 x 5 m cells)

• Response variable: log of non-ground lidar points in 0-1 m vertical height bin

• n1 = 217 (“Recently Burned”), n2 = 68 (“Long Unburned”)

• Welch’s t-test (unequal variance, unequal sample sizes) results: t = 2.33, df = 181, p-value ≈ 0.021
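For reference, a sketch of how a Welch test of this kind is computed, by hand and with `t.test()`. The two samples are simulated stand-ins with the same sizes as the example groups, not the seminar data, and `welch_t` is a hypothetical helper. Note that the test treats every cell as an independent observation, which is exactly the assumption spatial autocorrelation calls into question.

```r
# Welch's t statistic, Welch-Satterthwaite degrees of freedom, two-sided p-value
welch_t <- function(a, b) {
  va <- var(a) / length(a)
  vb <- var(b) / length(b)
  t_stat <- (mean(a) - mean(b)) / sqrt(va + vb)
  df <- (va + vb)^2 / (va^2 / (length(a) - 1) + vb^2 / (length(b) - 1))
  p <- 2 * pt(-abs(t_stat), df)
  c(t = t_stat, df = df, p = p)
}

set.seed(1)
burned <- rnorm(217, mean = 2.0, sd = 1.0)    # simulated, NOT the seminar data
unburned <- rnorm(68, mean = 1.7, sd = 0.8)   # simulated, NOT the seminar data

by_hand <- welch_t(burned, unburned)
built_in <- t.test(burned, unburned, var.equal = FALSE)
```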

Moran’s I correlograms

Now what do I do???

• Adjusting the effective sample size

• Spatial statistical modeling methods

• Restricted randomization

• Other methods: canonical ordination, partial Mantel tests, etc.

Adjusting the Effective Sample Size

• Estimate of effective sample size (Fortin & Dale 2005, p. 223, Equation 5.15):

$$n' = \frac{n^2}{\sum_{i=1}^{n} \sum_{j=1}^{n} \mathrm{Cor}(x_i, x_j)}$$

• For first-order autocorrelation ρ and large n:

$$n' \approx n \, \frac{1 - \rho}{1 + \rho}$$

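Adjusting the Effective Sample Size

• For the “Recently Burned” example data:

$$n' \approx n \, \frac{1 - \rho}{1 + \rho} = 217 \cdot \frac{1 - 0.33}{1 + 0.33} \approx 110$$

• For the “Long Unburned” example data:

$$n' \approx n \, \frac{1 - \rho}{1 + \rho} = 68 \cdot \frac{1 - 0.22}{1 + 0.22} \approx 43$$

• Welch’s t-test results: t = 1.76, df = 123, p ≈ 0.080

• BUT, this is a very simplistic model!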
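The adjustment can be reproduced with a few lines of R. The sketch below implements both the general estimate (n squared divided by the sum of all pairwise correlations) and the large-n shortcut; the function names are hypothetical. With ρ rounded to 0.33 the shortcut gives roughly 109 rather than 110, so the slide value presumably came from an unrounded ρ (an assumption on my part).

```r
# General form: n' = n^2 / (sum of all pairwise correlations)
n_eff_matrix <- function(cor_mat) {
  n <- nrow(cor_mat)
  n^2 / sum(cor_mat)
}

# Large-n shortcut for first-order autocorrelation rho
n_eff_ar1 <- function(n, rho) {
  n * (1 - rho) / (1 + rho)
}

# Correlation matrix of a first-order autoregressive series:
# Cor(x_i, x_j) = rho^|i - j|
ar1_cor <- function(n, rho) {
  rho^abs(outer(seq_len(n), seq_len(n), "-"))
}

shortcut_burned <- n_eff_ar1(217, 0.33)    # about 109.3 with rho rounded to 0.33
shortcut_unburned <- n_eff_ar1(68, 0.22)   # about 43.5
general_burned <- n_eff_matrix(ar1_cor(217, 0.33))
```

For an AR(1) correlation structure the two estimates agree closely (`general_burned` is within one unit of `shortcut_burned`), which is why the shortcut is acceptable for large n.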

Detour: Autocorrelation Models

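• Model 1 (“spatial independence”):

$$x_i = \beta z_i + \varepsilon_i, \qquad z_i = \eta_i$$

• Model 2 (“first-order autoregressive”):

$$x_i = \rho \, x_{i-1} + \varepsilon_i$$

• Model 3 (“induced autoregressive”):

$$x_i = \beta z_i + \varepsilon_i, \qquad z_i = \rho \, z_{i-1} + \eta_i$$

• Model 4 (“doubly autoregressive”):

$$x_i = \beta z_i + \rho_x \, x_{i-1} + \varepsilon_i, \qquad z_i = \rho_z \, z_{i-1} + \eta_i$$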

SOURCE: Fortin & Dale (2005), pp. 213-216
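To get a feel for how the four models differ, they can be simulated along a one-dimensional transect. This is a sketch under assumptions: it takes the usual forms (x regressed on z with coefficient β, first-order autoregression with coefficient ρ, independent normal errors), and the values of `rho` and `beta` as well as the helper names are hypothetical.

```r
set.seed(42)
n <- 5000
rho <- 0.5    # hypothetical autocorrelation coefficient
beta <- 1     # hypothetical regression coefficient

# AR(1) helper: y[i] = rho * y[i - 1] + innov[i]
ar1 <- function(innov, rho) {
  as.numeric(stats::filter(innov, filter = rho, method = "recursive"))
}

# Model 1 (spatial independence): neither z nor x is autocorrelated
z1 <- rnorm(n)
x1 <- beta * z1 + rnorm(n)

# Model 2 (first-order autoregressive): x depends on its own neighbour
x2 <- ar1(rnorm(n), rho)

# Model 3 (induced autoregressive): x inherits structure from autocorrelated z
z3 <- ar1(rnorm(n), rho)
x3 <- beta * z3 + rnorm(n)

# Model 4 (doubly autoregressive): both z and x are autoregressive
z4 <- ar1(rnorm(n), rho)
x4 <- ar1(beta * z4 + rnorm(n), rho)

# Lag-1 autocorrelation of x under each model
lag1 <- function(x) acf(x, lag.max = 1, plot = FALSE)$acf[2]
lag1_by_model <- c(m1 = lag1(x1), m2 = lag1(x2), m3 = lag1(x3), m4 = lag1(x4))
```

In this setup the lag-1 autocorrelation of x is near zero for Model 1, near `rho` for Model 2, positive but weaker for Model 3 (diluted by the independent error), and strongest for Model 4.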

Detour: Autocorrelation Models

• The models on the previous slide were one-dimensional, but most spatial data is two-dimensional (Lat-Long, XY-coordinates, etc.)

• The two-dimensional spatial autocorrelation model incorporates W, a “proximity matrix” of neighbor weights, which in turn affects the variance-covariance matrix (C):

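$$\mathbf{x} = \mathbf{Z}\beta + \rho \mathbf{W} (\mathbf{x} - \mathbf{Z}\beta) + \varepsilon$$

$$\mathbf{C} = \sigma^2 \left[ (\mathbf{I} - \rho \mathbf{W})^{T} (\mathbf{I} - \rho \mathbf{W}) \right]^{-1}$$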
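A small base-R sketch of how a proximity matrix shapes the variance-covariance matrix. It assumes the standard simultaneous autoregressive form, C = σ²[(I − ρW)ᵀ(I − ρW)]⁻¹, with a row-standardised rook-neighbour W on a regular grid; the grid size, the value of ρ, and the function names are hypothetical.

```r
# Rook-neighbour proximity matrix W for an nr x nc grid, row-standardised
rook_weights <- function(nr, nc) {
  cells <- expand.grid(row = seq_len(nr), col = seq_len(nc))
  d <- as.matrix(dist(cells, method = "manhattan"))
  w <- (d == 1) * 1          # 1 for cells sharing an edge, else 0
  w / rowSums(w)             # each row sums to 1
}

# Variance-covariance matrix implied by the simultaneous autoregressive form
sar_cov <- function(w, rho, sigma2 = 1) {
  a <- diag(nrow(w)) - rho * w
  sigma2 * solve(t(a) %*% a)
}

W <- rook_weights(5, 5)
C_indep <- sar_cov(W, rho = 0)     # no autocorrelation: reduces to sigma^2 * I
C_sar <- sar_cov(W, rho = 0.6)     # hypothetical rho: neighbours now covary
```

With `rho = 0` the off-diagonal covariances vanish; with positive `rho` neighbouring cells acquire positive covariance, which is the structure the GLS and autoregressive models try to capture.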

Generalized Least Squares (GLS)

• Relatively easy way to introduce spatial autocorrelation structure to linear models

• Fits a parametric correlation function (exponential, Gaussian, spherical, etc.) directly to the variance-covariance matrix

• Assumes normally distributed errors, but errors are allowed to be correlated and/or have unequal variances

• Built-in R package: nlme
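The three correlation functions named above can be written out directly. The sketch below follows the forms documented for nlme's `corExp`, `corGaus`, and `corSpher` without a nugget effect, where r is the distance between two locations and d is the range parameter; the range of 20 m and the function names are hypothetical choices for illustration.

```r
# r = distance between two locations, d = range parameter (no nugget effect)
cor_exp   <- function(r, d) exp(-r / d)
cor_gaus  <- function(r, d) exp(-(r / d)^2)
cor_spher <- function(r, d) ifelse(r < d, 1 - 1.5 * (r / d) + 0.5 * (r / d)^3, 0)

r <- seq(0, 50, by = 5)   # distances in metres (5 m steps, like the example cells)
decay <- cbind(r,
               exponential = cor_exp(r, 20),
               gaussian    = cor_gaus(r, 20),
               spherical   = cor_spher(r, 20))
```

All three start at a correlation of 1 at distance zero and decline with distance; only the spherical function reaches exactly zero, at r = d.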

GLS Model – No Spatial Structure

library(nlme)
…
## Model A: spatial independence
ModelA <- gls(LN_COUNT ~ BURNED, data = SAC_data)
plot(Variogram(ModelA, form = ~ x + y))

GLS Models with Spatial Structure

> ModelB <- gls(LN_COUNT ~ BURNED, data = SAC_data, correlation = corAR1())
> ModelC <- gls(LN_COUNT ~ BURNED, data = SAC_data, correlation = corExp(form = ~ x + y))
> ModelD <- gls(LN_COUNT ~ BURNED, data = SAC_data, correlation = corGaus(form = ~ x + y))
> ModelE <- gls(LN_COUNT ~ BURNED, data = SAC_data, correlation = corSpher(form = ~ x + y))
> AIC(ModelA, ModelB, ModelC, ModelD, ModelE)

       df      AIC
ModelA  3 702.1288
ModelB  4 677.3121
ModelC  4 591.7996
ModelD  4 607.3873
ModelE  4 604.7950

> anova(ModelA,ModelC)

       Model df      AIC      BIC    logLik   Test  L.Ratio p-value
ModelA     1  3 702.1288 713.0652 -348.0644
ModelC     2  4 591.7996 606.3814 -291.8998 1 vs 2 112.3293  <.0001

→ Exponential GLS model seems to fit best
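The numbers in the anova output can be checked by hand, which also shows what the comparison is doing: AIC is −2·logLik plus twice the number of parameters, and the likelihood ratio statistic is twice the difference in log-likelihoods, referred to a chi-squared distribution. The values below are copied from the printed output; the variable names are my own.

```r
# Log-likelihoods and parameter counts as printed by anova(ModelA, ModelC)
logLik_A <- -348.0644; df_A <- 3
logLik_C <- -291.8998; df_C <- 4

aic_A <- -2 * logLik_A + 2 * df_A    # 702.1288
aic_C <- -2 * logLik_C + 2 * df_C    # 591.7996

# Likelihood ratio test of the nested pair (Model A within Model C)
lr <- 2 * (logLik_C - logLik_A)      # 112.3292 (printed as 112.3293 from unrounded values)
p_value <- pchisq(lr, df = df_C - df_A, lower.tail = FALSE)
```

The likelihood ratio test applies only to the nested pair A and C; the ranking among Models B to E rests on AIC alone.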

Other Autocorrelation Models

• Conditional autoregressive (CAR), simultaneous autoregressive (SAR), and moving average (MA) models

– See pp. 229-233 of Fortin & Dale (2005)

– Implemented in R package spdep, as well as SAM (Spatial Analysis for Macroecology) software

• Generalized linear mixed models (GLMMs): R built-in packages MASS, nlme

• But wait, there’s more: see the review paper by Dormann et al. (2007), Ecography 30: 609-628.

Models and Reality

• “Much of the treatment of spatial autocorrelation in the statistical literature is predicated on the simplest AR model, which produces an exponential decline in autocorrelation as a function of distance (Figure 5.16).” – Fortin & Dale (2005, pp. 247-248)

• BUT, simple corrections based on first-order AR don’t account for effects of potentially negative autocorrelation at greater distances

Restricted Randomization

• PROBLEM: randomization tests based on complete spatial randomness will destroy autocorrelation structure

• POTENTIAL SOLUTIONS:

1. “Toroidal shift” randomization (Figure 5.12)

2. Contiguity-constrained permutations (see Legendre et al. 1990 for algorithms)
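The toroidal shift idea can be sketched for two variables measured on the same regular grid: one grid is shifted by a random offset with wrap-around, so its internal spatial structure is preserved while its alignment with the other grid is broken. This is a minimal illustration, not the algorithm from the cited references; the function names and the simulated grids are hypothetical.

```r
# Shift a grid by (dr, dc) cells with wrap-around, as if it lay on a torus
toroidal_shift <- function(m, dr, dc) {
  nr <- nrow(m); nc <- ncol(m)
  rows <- ((seq_len(nr) - 1 - dr) %% nr) + 1
  cols <- ((seq_len(nc) - 1 - dc) %% nc) + 1
  m[rows, cols, drop = FALSE]
}

# Restricted randomization test for the correlation between two gridded variables
toroidal_test <- function(a, b, n_perm = 999) {
  observed <- cor(as.vector(a), as.vector(b))
  null <- replicate(n_perm, {
    dr <- sample.int(nrow(b), 1) - 1    # random row offset, 0 .. nr - 1
    dc <- sample.int(ncol(b), 1) - 1    # random column offset, 0 .. nc - 1
    cor(as.vector(a), as.vector(toroidal_shift(b, dr, dc)))
  })
  p <- (sum(abs(null) >= abs(observed)) + 1) / (n_perm + 1)
  list(observed = observed, p_value = p)
}

set.seed(7)
g1 <- matrix(rnorm(100), 10, 10)          # simulated grids, for illustration only
g2 <- g1 + matrix(rnorm(100), 10, 10)
res <- toroidal_test(g1, g2, n_perm = 199)
```

One design caveat: a grid of nr × nc cells has only nr·nc distinct shifts, so on small grids the reference distribution is coarse and the smallest attainable p-value is limited.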

Conclusion

• Incorporating spatial structure into ecological models was identified by Legendre as a “new paradigm” in 1993, BUT…

• …ecologists are still refining their methods for dealing with spatial autocorrelation

• OUR LAST HOPE?: Dale, M.R.T. and M.-J. Fortin. (in press). Spatial Autocorrelation and Statistical Tests: Some Solutions. Journal of Agricultural, Biological, and Environmental Statistics.

Spatial autocorrelation, don’t make me open this…
