Spatial Modeling and Analysis
Deana D. Pennington, PhD
University of New Mexico
Posted 13-Jan-2016
What is spatial analysis?
Analyses where the data are spatially located and explicit consideration is given to the possible importance of their spatial arrangement in the analysis
Statistical Issues
Valid statistics depend on:
•Temporal stability and causal transience
•Unit homogeneity
•Independence
•Constant effects
BUT Ecology & Earth Science violate all of these!
We study:
•Change with time (no temporal stability)
•Legacies, persistence, recovery (no causal transience)
•Heterogeneity through space and time (no unit homogeneity)
•Spatial structure (no independence)
•Differences in response through space/time (non-constant effects)
•Attributes rather than causal factors, which must be inferred
Issues in Spatial Analysis
•Error
•Small sample sizes compared with size of environmental data sets
•Spatial dependency
•Spatial heterogeneity
•Boundary effects
•Modifiable Areal Unit Problem
Spatial Dependency
Tobler’s Law: All things are related, but nearby things are more related than distant things
Non-independent observations: an autocorrelated observation partly duplicates other observations in the sample set, so it carries less information than an independent observation. This affects the mean, variance, confidence intervals and significance tests
***Field samples tend to be taken from nearby locations, and are almost always spatially autocorrelated***
Spatial Heterogeneity
•Stratification of the landscape (regions, classes, etc.) is problematic due to its gradational nature
•Intra-strata variability, mixtures
•Differences in numbers of observations within strata
Heterogeneity in spatial data
300 x 300 pixels, 192 training pixels out of 90,000 total pixels, 7 mislabeled
•Low % of samples
•Errors in samples
Hyperspectral Example
Training pixels per class:
•Roads 33
•Clouds 23
•River 23
•Riparian 28
•Arid upland 25
•Barren 22
•Agriculture 38
[Figure: true-color and false-color images, labeled 6 km2]
Hyperspectral Results
K-means unsupervised, 10 classes
[Figure: K-means cluster map. Cluster labels: Riparian (4 clusters), Arid upland (2), Semi-arid upland (2), Clouds/barren, River/agriculture. Training-class legend: Clouds, Agriculture, River, Riparian, Arid upland, Barren, Roads, Unclassified]
•Confusion between river & agriculture
•Confusion between clouds and barren
•Unsampled semi-arid upland
•Mislabeled arid upland
•Unsampled variability in riparian
•Road variability
[Figure: K-means unsupervised map shown beside the supervised classifications]
Supervised classification results:
•Maximum Likelihood 89.44%
•Naïve Bayesian 83.33%
•Parallelepiped 82.78%
•Minimum Distance 69.44%
•Support Vector Machine 77.22%
Sources of error:
•Confusion between river & agriculture
•Confusion between clouds and barren
•Unsampled semi-arid upland
•Mislabeled arid upland (4.4%)
•Unsampled variability in riparian
•Road variability
Boundary Effects
•Loss of neighbors in analyses that depend on neighborhood values
•Solution: collect data along a border outside of the analysis area
Modifiable Areal Unit Problem (MAUP)
•Results sensitive to cell size, location, orientation
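The sensitivity to cell size can be illustrated with a small sketch. It is only an illustration under stated assumptions: the helper names `aggregate` and `grid_variance` and the checkerboard grid are hypothetical, not part of the original slides. The same data give a different variance once they are averaged into larger cells.

```python
def aggregate(grid, block):
    """Average a grid into square cells of block x block original cells."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(0, rows - rows % block, block):
        out_row = []
        for c in range(0, cols - cols % block, block):
            cells = [grid[rr][cc]
                     for rr in range(r, r + block)
                     for cc in range(c, c + block)]
            out_row.append(sum(cells) / len(cells))
        out.append(out_row)
    return out


def grid_variance(grid):
    """Population variance of all cell values in a grid."""
    flat = [v for row in grid for v in row]
    mean = sum(flat) / len(flat)
    return sum((v - mean) ** 2 for v in flat) / len(flat)


# Hypothetical 4 x 4 checkerboard of 0/1 values.
checkerboard = [[(r + c) % 2 for c in range(4)] for r in range(4)]
coarse = aggregate(checkerboard, 2)

print(grid_variance(checkerboard))  # variance at the original cell size
print(grid_variance(coarse))        # variance after doubling the cell size
```

Here the fine-scale pattern disappears entirely at the coarser cell size, so any statistic computed on the coarse grid would miss it.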
Components of Spatial Analysis
•Exploratory Spatial Data Analysis (ESDA): finding interesting patterns
•Visualization: showing interesting patterns
•Spatial Modeling: explaining interesting patterns
Spatial Analyses
Things to consider:
•Objective: describe, map, causation
•Data type: binary (Y/N), categorical, continuous
•Expected pattern: gradient, periodic, clustered
•Scale of pattern
•Univariate/multivariate
Spatial Analyses
Biological survey where each point denotes the observation of an endangered species. If a pattern exists, as in this diagram, we may be able to analyze behavior in terms of environmental characteristics
1. Quantify pattern
•Attraction or repulsion
•Directionality
2. Make inferences about process based on observed pattern
Choices
Point pattern analyses
•Single scale of pattern: quadrat analysis, nearest neighbor
•Multiscale pattern: refined nearest neighbor, 2nd order analysis, Ripley’s K
Make maps from points
•Distance interpolation
•Kriging
•Trend surface analysis
•Spline
Test models with space as causal factor
•Mantel test
•Mantel correlogram
•Multivariate analysis
Describe spatial structure
•Gradient, periodic, single scale of pattern: semivariogram, correlogram
•Gradient, periodic, multiscale pattern: spectral analysis
•Edge: wavelet analysis
•Context: adjacency measures
•Cross variogram, cross correlogram
•Self-similarity: fractal dimension
Network analysis
•Path analysis
•Allocation
•Connectivity
Point Pattern Analysis
Clustered (attraction)
Uniform (repulsion)
Point Pattern Analysis
Statistical tests for significant patterns in data, compared with the null hypothesis of random spatial pattern
The standard against which spatial point patterns are compared is a Completely Spatially Random (CSR) point process: a Poisson probability distribution (mean = variance) is used to generate spatially random points
Quadrat Analysis
1. Divide the area up into quadrats
2. Count the number of points in each quadrat
3. Compare counts with expected counts in random distribution
[Figure: number of cells vs. number of points per cell, with curves for expected CSR (null hypothesis), clustered and uniform patterns]
Expected mean #/cell in CSR: $\lambda = N / (\text{number of quadrats})$
For Poisson distribution: $p(x) = e^{-\lambda} \lambda^{x} / x!$
Chi square: $\chi^2 = \sum (\text{observed} - \text{expected})^2 / \text{expected}$

x     O_i   P(x)     E_i
0     2     0.0156   0.39
1     2     0.0649   1.62
2     5     0.1350   3.38
3     1     0.1873   4.68
…
Classes 0 to 2 pooled: E = 5.39, contribution to $\chi^2$ = 2.42

Check Chi square table
If H0 rejected: mean <> variance
•Mean > variance (uniform)
•Mean < variance (clustered)
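The quadrat steps above can be sketched in a few lines of Python. This is a minimal illustration, not the lab's implementation: the function names are hypothetical, the chi-square sum stops at the largest observed count, and it does not pool classes with small expected counts, which a real test should do.

```python
import math


def poisson_pmf(x, lam):
    """p(x) = e^(-lambda) * lambda^x / x! for a Poisson (CSR) process."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


def quadrat_chi_square(counts_per_quadrat):
    """Return (lambda, chi-square) for a list of point counts, one per quadrat."""
    n_quads = len(counts_per_quadrat)
    lam = sum(counts_per_quadrat) / n_quads  # expected mean #/cell = N / # of quads
    chi2 = 0.0
    for x in range(max(counts_per_quadrat) + 1):
        observed = sum(1 for c in counts_per_quadrat if c == x)
        expected = n_quads * poisson_pmf(x, lam)
        chi2 += (observed - expected) ** 2 / expected
    return lam, chi2


def variance_to_mean_ratio(counts_per_quadrat):
    """Below 1 suggests a uniform pattern, above 1 a clustered one."""
    n = len(counts_per_quadrat)
    mean = sum(counts_per_quadrat) / n
    variance = sum((c - mean) ** 2 for c in counts_per_quadrat) / (n - 1)
    return variance / mean


# Hypothetical counts: every quadrat holds 2 points vs. all points in one quadrat.
uniform_counts = [2, 2, 2, 2]
clustered_counts = [0, 0, 0, 8]
print(quadrat_chi_square(uniform_counts))
print(variance_to_mean_ratio(uniform_counts), variance_to_mean_ratio(clustered_counts))
```

The variance-to-mean ratio gives the direction of the departure from CSR once the chi-square test has rejected the null hypothesis.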
Nearest Neighbor Distance
1. Calculate the distance to the nearest neighbor for every point
2. Calculate mean nn distance
3. Calculate expected mean for CSR distribution: $E(d_i) = 0.5\sqrt{A/N}$
4. Compare expected mean to observed mean with Z statistic:
$Z = [\bar{d} - E(d_i)] / \sqrt{0.0683\,A/N^2}$
Look up significance in z-statistic table. If H0 rejected:
•observed mean < expected and Z < 0 => clustered
•observed mean > expected and Z > 0 => uniform
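As a rough sketch of the four steps, assuming a brute-force search over all point pairs and no edge correction (the function name `nearest_neighbor_z` and the sample coordinates are hypothetical):

```python
import math


def nearest_neighbor_z(points, area):
    """Return (observed mean nn distance, expected mean under CSR, Z)."""
    n = len(points)
    nn_distances = []
    for i, (xi, yi) in enumerate(points):
        # Step 1: distance from point i to its nearest neighbor.
        nn_distances.append(min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points) if j != i
        ))
    observed = sum(nn_distances) / n                   # step 2
    expected = 0.5 * math.sqrt(area / n)               # step 3
    standard_error = math.sqrt(0.0683 * area / n ** 2)
    z = (observed - expected) / standard_error         # step 4
    return observed, expected, z


# Hypothetical 10 x 10 study area (A = 100) with two contrasting patterns.
clustered = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
regular = [(2.5, 2.5), (7.5, 2.5), (2.5, 7.5), (7.5, 7.5)]
print(nearest_neighbor_z(clustered, 100.0))  # Z < 0
print(nearest_neighbor_z(regular, 100.0))    # Z > 0
```

With only four points the Z values are illustrative rather than statistically meaningful.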
Ripley’s K
1. Expand a circle of increasing radius around each point
2. Count the number of points within each circle
3. Calculate L(d), a measure of the expected number of points within distance d: $L(d) = [A \sum k_{ij} / (N(N-1))]^{0.5}$, where A = area and $\sum k_{ij}$ = number of points j within distance d of all i points
4. Monte Carlo simulations or t-test
[Figure: L(d) vs. radius, with curves for expected CSR mean, clustered and uniform patterns]
***Note added information – mean clustering distance
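A minimal sketch of step 3 follows the formula as transcribed above. The function name and the `divide_by_pi` flag are assumptions for illustration: many references divide by $\pi$ inside the square root so that L(d) is approximately d under CSR. No edge correction is applied.

```python
import math


def ripley_l(points, area, d, divide_by_pi=False):
    """L(d) = sqrt(A * sum(k_ij) / (N (N - 1))), counting ordered pairs i != j."""
    n = len(points)
    pair_count = 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dist = math.hypot(points[i][0] - points[j][0],
                              points[i][1] - points[j][1])
            if dist <= d:
                pair_count += 1  # point j lies within the circle around point i
    k = area * pair_count / (n * (n - 1))
    if divide_by_pi:
        k /= math.pi  # common variant, not shown in the formula above
    return math.sqrt(k)


# Hypothetical points in a study area of A = 4.
pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
for radius in (0.5, 1.0, 2.0):
    print(radius, ripley_l(pts, 4.0, radius))
```

In practice L(d) would be computed over a range of radii and compared with an envelope from Monte Carlo simulations of CSR, as step 4 indicates.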
Lab #12A
Point pattern analysis
Analysis of Continuous Data
1. Variation in mean values
2. Describe local variability & spatial dependence
Mean trends
•Focal
•Zonal
•Global: single value (surface analysis) or table
[Figure: input and output grids for each type of analysis]
Grid Analysis: Focal Analysis
Spatial filters: output value for each cell is calculated from neighboring cells (moving windows)
[Figure: neighborhood shapes]
Neighborhood statistics: majority, maximum, mean, median, minimum, range, standard deviation, sum, variety
Example: Species A habitat and Species B habitat; range of Species A = 4 cells; Species A depends on B
•Low pass: smoothing, removing noise
•High pass: emphasize local variation
•Edge enhancement
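A moving-window (focal) mean is one of the simplest low-pass filters. The sketch below assumes a square neighborhood and truncates the window at the grid edge, which is one way the boundary effect shows up in practice; the function name `focal_mean` and the sample grid are hypothetical.

```python
def focal_mean(grid, radius=1):
    """Replace each cell with the mean of its (2*radius+1)^2 square neighborhood."""
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Edge cells get a truncated window because neighbors are missing.
            window = [grid[rr][cc]
                      for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                      for cc in range(max(0, c - radius), min(cols, c + radius + 1))]
            out[r][c] = sum(window) / len(window)
    return out


# Hypothetical grid with a single noisy spike in the center.
spike = [[0, 0, 0],
         [0, 9, 0],
         [0, 0, 0]]
for row in focal_mean(spike):
    print(row)
```

Swapping `sum(window) / len(window)` for `max(window)`, `min(window)` or another statistic gives the other focal functions listed above.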
Grid Analysis: Zonal Analysis
Example zones:
•Vegetation class A or land use A
•Vegetation class B or land use B
•Vegetation class C or land use C
Zonal functions: area, centroid, geometry, perimeter, majority, maximum, mean, median, minimum, range, statistics, standard deviation, sum, thickness, variety
Output is:
a) grid with same value in each cell for a given zone
b) table with values by zone
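Both output forms can be produced in one pass. The following is a sketch under the assumption that the zone grid and value grid are aligned cell for cell; `zonal_mean` and the sample grids are hypothetical names and data.

```python
from collections import defaultdict


def zonal_mean(zone_grid, value_grid):
    """Return (table of mean value by zone, grid with the zone mean in each cell)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for zone_row, value_row in zip(zone_grid, value_grid):
        for zone, value in zip(zone_row, value_row):
            totals[zone] += value
            counts[zone] += 1
    table = {zone: totals[zone] / counts[zone] for zone in totals}  # output b)
    out_grid = [[table[zone] for zone in zone_row] for zone_row in zone_grid]  # output a)
    return table, out_grid


# Hypothetical zones (vegetation class A and B) and cell values.
zones = [["A", "A"],
         ["B", "B"]]
values = [[1, 3],
          [10, 20]]
table, grid = zonal_mean(zones, values)
print(table)
print(grid)
```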
Lab 12B Neighborhood and Zone Analysis
Geostatistics Basics
Parametric stats
•Univariate (x): mean, variance
•Multivariate (x, y): correlation, covariance
Spatial stats
•Univariate (x, h): semi-variance, lag correlation, lag covariance
•Multivariate (x, y, h): cross-semivariance (variogram), cross correlation, cross covariance (correlogram); the variogram and correlogram are inverse
h = lag (time or space)
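Semi-variance $\gamma_h$
Variance: $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \bar{x})^2$
Semi-variance: $\gamma_h = \frac{1}{2N_h}\sum_{i=1}^{N_h} (x_i - x_{i+h})^2$
Local mean w.r.t. study extent
1. Slide x through space to get $\gamma_h$
2. Vary h
[Figure: cells $x_i$ and $x_{i+h}$ separated by lag h]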
Semi-variance $\gamma_h$
Semi-variance: $\gamma_h = \frac{1}{2N_h}\sum_{i=1}^{N_h} (x_i - x_{i+h})^2$
[Figure: window of cells $x_i$ and $x_{i+h}$ with its local mean, sliding to the next x]
Number of cells N = 10
Number of windows $N_h$ = # cells – h
•h = 1: $N_h$ = 9
•h = 5: $N_h$ = 5
Limit h to 1/3 of study extent
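The sliding-window calculation can be sketched for a 1-D transect of equally spaced cells. The function names and the sample transect are hypothetical; the maximum lag follows the rule of thumb of one third of the study extent.

```python
def semivariance(values, h):
    """gamma_h = sum((x_i - x_{i+h})^2) / (2 * N_h), with N_h = number of cells - h."""
    n_h = len(values) - h
    if h < 1 or n_h < 1:
        raise ValueError("lag h must satisfy 1 <= h < number of cells")
    return sum((values[i] - values[i + h]) ** 2 for i in range(n_h)) / (2 * n_h)


def semivariogram(values):
    """Semi-variance for every lag up to one third of the transect length."""
    max_lag = len(values) // 3
    return {h: semivariance(values, h) for h in range(1, max_lag + 1)}


# Hypothetical transect of N = 10 cells forming a steady gradient.
transect = list(range(10))
print(semivariance(transect, 1))  # N_h = 9
print(semivariance(transect, 5))  # N_h = 5
print(semivariogram(transect))    # lags 1 to 3
```

For a pure gradient like this one the semi-variance keeps rising with h, so the plot never levels off at a sill.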
Semi-variogram
If $x_i$ is similar to $x_{i+h}$, $\gamma_h$ is small, and they are spatially correlated
If $x_i$ is not similar to $x_{i+h}$, $\gamma_h$ is large, and they are not spatially correlated
=> $\gamma_h$ measures heterogeneity
•Nugget – value of $\gamma_h$ at distance 0 (not in data) – measure of unexplained variability
•Range – distance h of leveling off – below the range, heterogeneity is increasing in a predictable manner; above the range, heterogeneity is constant – measure of independence
•Sill – measure of maximum heterogeneity in data ($\gamma_{max}$)
[Figure: semi-variogram of $\gamma_h$ vs. h showing nugget, sill and range, with spatial dependence below the range and independence above it]
Semi-variograms
[Figure: two semi-variograms of $\gamma_h$ vs. h]
•Periodic, cyclic. Examples: timber harvest, forest age (range: harvest area; sill: rotation)
•Gradient: no sill or range
Lag Covariance: Geary’s C
Centered around mean values of x, $\bar{x}$
Lag covariance: $C_h = \frac{1}{N_h}\sum_{i=1}^{N_h} (x_i - x_{i-h})(x_i - x_{i+h})$
[Figure: cells $x_{i-h}$, $x_i$ and $x_{i+h}$ around the local mean]
Correlograms have the inverse shape of semi-variograms
•If the center value $x$, $x_{i+h}$ and $x_{i-h}$ are all the same, $C_h = 0$
•If values are increasing or decreasing through space ($x_{i-h} < x < x_{i+h}$, or $x_{i-h} > x > x_{i+h}$), one term is negative and $C_h$ is negative: things are not similar. Otherwise $C_h$ is positive: things are similar
Lag Correlation: Moran’s I
Centered around mean values of x, $\bar{x}$
Standardized against sample variation
Lag covariance: $C_h = \frac{1}{N_h}\sum_{i=1}^{N_h} (x_i - x_{i-h})(x_i - x_{i+h})$
Lag correlation: $P_h = C_h / (S_{x-h}\, S_{x+h})$
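Comparison
•Semi-variance $\gamma_h$: $0 < \gamma_h < \infty$
•Lag covariance (Geary’s C) $C_h$: $-\infty < C_h < \infty$
•Lag correlation (Moran’s I) $P_h$: $-1 < P_h < +1$
[Figure: $\gamma_h$, $C_h$ and $P_h$ plotted against h; the range is similar in all three, with correlated values below it and independent values (zero for $C_h$ and $P_h$) above it]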
Lab 12C Correlograms
Surface Analysis
Spatial distribution of surface information in terms of a three-dimensional structure
Surfaces do not have to be elevation, but could be population density, species richness, or any other measured attribute
Surface Analysis
Given geolocated point data, calculate values at regular intervals between points
Inverse distance weighting
•Can’t create extremes (ridges, valleys)
•Isotropic influence (not ridge preserving)
•Best with dense samples
Kriging
•Uses semi-variogram to determine relative importance (weighting) of data at different distances
•Uses global variation; only works well if semi-variogram captures variation across entire map
Trend analysis
•Calculates a best-fit polynomial equation using linear regression
•Recalculates all positions using equation (lose original data)
•Smoothing depends on polynomial order
Spline
•Calculates a 2-D minimum curvature surface that passes through every input point
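Of the four methods, inverse distance weighting is the easiest to sketch. The version below assumes a power of 2 and uses every sample for every estimate; the function name `idw` and the sample values are hypothetical. Because the estimate is a weighted average, it can never fall outside the range of the sampled values, which is why the method cannot create ridges or valleys.

```python
import math


def idw(samples, x, y, power=2.0):
    """Inverse-distance-weighted estimate at (x, y) from (sx, sy, value) samples."""
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        dist = math.hypot(x - sx, y - sy)
        if dist == 0.0:
            return value  # exactly on a sample point: return the measured value
        weight = 1.0 / dist ** power  # same weight in every direction (isotropic)
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical samples: (x, y, measured value).
samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
print(idw(samples, 1.0, 0.0))  # halfway between the two samples
print(idw(samples, 5.0, 0.0))  # beyond both samples, still between 10 and 20
```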
Surface Analysis: Streams
Network Analysis
Designed specifically for line features organized in connected networks; typically applies to transportation problems and location analysis
•Streams
•Dispersal vectors
•Community interactions
Network Analysis
•Pathfinding: shortest or least cost
•Allocation of network areas to a center based on supply, demand and impedance
•Connectivity
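Least-cost pathfinding is commonly done with Dijkstra's algorithm. The sketch below is one possible implementation, not necessarily what a given GIS package uses; the network is a hypothetical dictionary mapping each node to its neighbors and the impedance of each link.

```python
import heapq
import math


def least_cost_path(graph, start, goal):
    """Dijkstra's algorithm: return (total impedance, list of nodes) or (inf, [])."""
    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, impedance in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + impedance, neighbor, path + [neighbor]))
    return math.inf, []  # goal is not connected to start


# Hypothetical stream network: node -> {downstream node: impedance}.
network = {
    "A": {"B": 1.0, "C": 5.0},
    "B": {"C": 1.0},
    "C": {},
}
print(least_cost_path(network, "A", "C"))
print(least_cost_path(network, "C", "A"))
```

The second call also serves as a simple connectivity check: an infinite cost means no route exists in that direction.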
Integrated Analysis
[Diagram: grid layers (DEM, hydro model, watershed, land cover, soil) and field data (vector: gauge points, samples) feed grid processing, statistics, and modeling (regression, et al.)]
Lab 12D Correlation
Sampling
•Spatial dependency must be considered in sample design
  •Non-independent observations
  •Fewer degrees of freedom
  •Differences within groups will appear small => overestimate significance of between-group variation
•Spatial structure & heterogeneity can affect experimental results: response due to treatments or due to inherent spatial structure?
•Solutions:
  •Include space as an explanatory variable (Mantel test)
  •Sample at greater distance than the variogram range
Example: Integrating Species Occurrence Points and Images
1. Semantics
2. Compatible scales
3. Reproject
4. Resample grain
5. Clip extent
6. Sample occurrence points
[Diagram: occurrence samples from an Access file and an Excel file are combined with image layers for elevation (m), vegetation cover type and mean annual temperature (C)]
Input samples:
•Sample 1, lat, long, species, presence
•Sample 2, lat, long, species, presence
•Sample 3, lat, long, species, absence
Integrated data:
•P, juniper, 2200m, 16C
•P, pinyon, 2320m, 14C
•A, creosote, 1535m, 22C
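Step 6, sampling the image layers at the occurrence points, can be sketched as a lookup from map coordinates to grid row and column. It assumes the layers are already reprojected, resampled and clipped to a common north-up grid (steps 3 to 5). The function name, grid origin, cell size, point coordinates and the fourth elevation value are all hypothetical.

```python
def sample_grid(grid, x_min, y_max, cell_size, x, y):
    """Return the grid value at map coordinate (x, y), or None outside the extent."""
    col = int((x - x_min) // cell_size)
    row = int((y_max - y) // cell_size)  # row 0 is the northern edge
    if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
        return grid[row][col]
    return None


# Hypothetical 2 x 2 elevation layer (m) with its upper-left corner at (0, 2).
elevation = [[2200, 2320],
             [1535, 1600]]

# Hypothetical occurrence points: (x, y, presence "P" or absence "A").
occurrences = [(0.5, 1.5, "P"), (1.5, 1.5, "P"), (0.5, 0.5, "A")]

integrated = [
    (status, sample_grid(elevation, 0.0, 2.0, 1.0, x, y))
    for x, y, status in occurrences
]
print(integrated)
```

Repeating the lookup for each layer (vegetation cover type, mean annual temperature) would yield one integrated record per occurrence point.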
ENM Results
Geographic patterns of species richness of 17 native rodent species.
Sanchez-Cordero and Martinez-Meyer, 2000
Model building and testing. a) training data; b) predictive model.
Peterson, Ball and Cohoon, 2002