geog_404_lecture9

7/28/2019 GEOG_404_Lecture9

1/23

An overview of Geostatistical

Concepts & Examples

Lecture 9


2/23

Geostatistics

Geostatistics combines practical conceptual thoughts that

facilitate the modeling of spatial variability with mathematical

and statistical methods.

It is rigorous and has the ability to:

analyze and integrate different types of spatial data

measure spatial autocorrelation by incorporating the statistical distribution

measure spatial relationships between the sample data

perform spatial prediction

assess uncertainty.

Geostatistics predicts the value at unsampled locations from the observed nearby samples, using the defined relationships.


3/23

Geostatistics vs. Classical Statistics

Geostatistics assumes:

there is spatial autocorrelation in a random function consisting of random variables spatially distributed in a 2-dimensional space

data values of a random function at different locations are spatially autocorrelated with each other.

Classical statistics assumes there is no spatial autocorrelation in a random variable; that is, data values of a random variable at different locations are independent.

Regionalized variables

In geostatistics, the random variables are called regionalized variables:

the closer the locations of the data, the more similar the data values

the similarity becomes weaker as the separation distance between data locations increases, and

it disappears when the distance reaches a certain value called the range.


4/23

Geostatistics (Example)

Let's suppose we want to measure variables like rainfall and temperature.

This is possible through meteorological stations located at specified locations. But it is impossible to put monitoring stations everywhere.

Therefore we establish spatial relationships between the known values at our observed locations and use these relationships to make predictions at unobserved locations. This is where geostatistics plays a role.


5/23

Regionalized variables

A variable that takes on values according to its spatial location is known as a regionalized variable.

Considering a variable z measured at location i, we can partition the total variability in z into three components:

$$z(i) = f(i) + s(i) + \varepsilon$$

where f(i) is some coarse-scale forcing or trend in the data, s(i) is the local spatial dependency, and $\varepsilon$ is the error term (presumed normal).
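As a rough illustration of this three-part decomposition, the sketch below simulates values along a 1-D transect as trend plus a locally dependent component plus uncorrelated noise. The function name, the linear trend, the moving-average construction of s(i), and all parameter values are assumptions made for illustration; they are not taken from the lecture.

```python
import numpy as np

def simulate_regionalized(n=200, slope=0.05, window=15, noise_sd=0.3, seed=0):
    """Simulate z(i) = f(i) + s(i) + eps on a 1-D transect (illustrative only)."""
    rng = np.random.default_rng(seed)
    i = np.arange(n, dtype=float)
    f = slope * i                                    # coarse-scale trend f(i)
    # Moving average of white noise: neighbouring values share inputs,
    # so s(i) is correlated over roughly `window` steps.
    white = rng.normal(size=n + window - 1)
    kernel = np.ones(window) / np.sqrt(window)
    s = np.convolve(white, kernel, mode="valid")     # length n
    eps = rng.normal(scale=noise_sd, size=n)         # uncorrelated error term
    return i, f, s, eps, f + s + eps

i, f, s, eps, z = simulate_regionalized()
```

Plotting f, s, and eps separately against i gives a picture of the structural, spatially correlated, and random noise components.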


6/23


[Figure: blue dots represent the data. Panels: the structural component (e.g., a linear trend); the random noise component (non-fitted); the spatially correlated component.]


7/23


Regionalized variables are variables that fall between random variables and completely deterministic variables.

Typical regionalized variables are functions describing variables that have geographic distributions (example: elevation of the ground surface).

Unlike random variables, regionalized variables exhibit spatial continuity, but the change in the variable is so complex that it cannot be described by any deterministic function.

The variogram is used to describe regionalized variables.


8/23

Variograms (Basic Concepts)

Variogram: A visual exploratory tool for characterizing the spatial continuity of the variable.

Sill: The plateau that the variogram reaches. For the variogram it is the average squared difference between paired data values and is approximately equal to twice the variance of the data; for the semivariogram (half the variogram) it is approximately equal to the variance of the data.

Range: The distance at which the variogram reaches the sill.

Nugget effect: The vertical height of the discontinuity at the origin. It is the combination of:

(1) short-scale variations that occur at a scale smaller than the closest sample spacing; and

(2) sampling error due to the way the samples were collected, prepared, and analyzed.
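To show how nugget, sill, and range fit together, here is a minimal sketch of one commonly used semivariogram model, the spherical model. The choice of model, the function name, and the parameter values are assumptions for illustration; the slide itself does not name a model.

```python
import numpy as np

def spherical_semivariogram(h, nugget, partial_sill, range_):
    """Spherical model: jumps to the nugget just above h = 0, then rises
    to the sill (nugget + partial_sill) at h = range_ and stays there."""
    h = np.asarray(h, dtype=float)
    ratio = np.clip(h / range_, 0.0, 1.0)            # flat beyond the range
    gamma = nugget + partial_sill * (1.5 * ratio - 0.5 * ratio**3)
    return np.where(h == 0.0, 0.0, gamma)            # gamma(0) = 0 by definition

# Hypothetical parameters: nugget 0.2, sill 1.0, range 150 m.
g = spherical_semivariogram([0, 75, 150, 200], nugget=0.2, partial_sill=0.8, range_=150)
```

With these values the model is 0 at h = 0, equals the sill (1.0) at and beyond 150 m, and lies between the nugget and the sill at intermediate distances.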


9/23


Kriging: The process of computing the best linear unbiased estimate of a value at a point or of an average over a volume.

Isotropic (semi)variogram: This is when the spatial pattern is identical in all directions. In this case, the fitted semivariogram model depends only on the (Euclidean) distance between locations.

Anisotropic (semi)variogram: This is when the spatial pattern is strongly biased towards a specific direction. Variograms computed in this setting are at times referred to as directional variograms, because the weighting scheme depends on both distance and direction.
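One way to build a directional variogram is to keep only the pairs of locations whose separation direction lies close to a chosen azimuth. The sketch below selects such pairs; the function name, the azimuth convention (degrees clockwise from north, with north as +y), and the tolerance parameter are assumptions for illustration.

```python
import numpy as np

def directional_pair_mask(coords, azimuth_deg, tolerance_deg):
    """Return index arrays (iu, ju) of all unique pairs and a boolean mask
    marking pairs whose direction is within tolerance_deg of azimuth_deg."""
    coords = np.asarray(coords, dtype=float)
    iu, ju = np.triu_indices(len(coords), k=1)       # each pair once
    dx = coords[ju, 0] - coords[iu, 0]
    dy = coords[ju, 1] - coords[iu, 1]
    # Directions are axial (east is the same axis as west), so fold into [0, 180).
    angle = np.degrees(np.arctan2(dx, dy)) % 180.0
    diff = np.abs(angle - (azimuth_deg % 180.0))
    diff = np.minimum(diff, 180.0 - diff)
    return iu, ju, diff <= tolerance_deg

# Three points: the pair (0, 1) lies east-west, the pair (0, 2) north-south.
iu, ju, east_west = directional_pair_mask([[0, 0], [1, 0], [0, 1]], 90.0, 10.0)
```

Comparing semivariograms computed from the pairs selected for several azimuths is a simple check for anisotropy.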


10/23

Variograms

[Figure: example variogram. x-axis: distance between data locations h (m), 0 to 200; y-axis: variance, 0 to 1.2. Annotations: nugget, range, structure; sill = nugget + structure; maximum distance for spatial autocorrelation = 150 m.]


11/23


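In mathematical terms, the semivariogram is:

$$\gamma(h) = \frac{1}{2N(h)} \sum_{\alpha=1}^{N(h)} \left[ z(u_{\alpha}) - z(u_{\alpha} + h) \right]^{2}$$

where h represents a distance vector and N(h) is the number of data pairs separated by h.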
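The semivariogram estimator (half the average squared difference between values of data pairs separated by about h) can be sketched in a few lines. This is a minimal isotropic version that groups pairs into equal-width distance classes; the function name, the binning rule, and the parameters lag_width and n_lags are assumptions for illustration.

```python
import numpy as np

def empirical_semivariogram(coords, values, lag_width, n_lags):
    """Isotropic empirical semivariogram with equal-width lag bins.

    Returns bin centers, gamma per bin (NaN where a bin has no pairs),
    and the number of pairs N(h) in each bin.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    iu, ju = np.triu_indices(len(values), k=1)        # each pair once
    dist = np.linalg.norm(coords[iu] - coords[ju], axis=1)
    sqdiff = (values[iu] - values[ju]) ** 2
    bins = np.floor(dist / lag_width).astype(int)     # bin k covers [k*w, (k+1)*w)
    keep = bins < n_lags
    counts = np.bincount(bins[keep], minlength=n_lags)
    sums = np.bincount(bins[keep], weights=sqdiff[keep], minlength=n_lags)
    gamma = np.full(n_lags, np.nan)
    filled = counts > 0
    gamma[filled] = sums[filled] / (2.0 * counts[filled])
    centers = (np.arange(n_lags) + 0.5) * lag_width
    return centers, gamma, counts

centers, gamma, counts = empirical_semivariogram(
    [[0, 0], [1, 0], [2, 0]], [1.0, 3.0, 6.0], lag_width=1.0, n_lags=3
)
```

Returning the pair counts alongside gamma makes it easy to see which lag classes are supported by too few pairs to be trusted.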


12/23

Variograms


13/23

Variograms (ArcGIS Geostatistical Analyst)


14/23

Variograms

Statistical assumptions:

Stationarity: the mean and variance are not a function of location. Second-order stationarity is required: the variance between values is a function of the separation distance only.

Isotropy: no directional trends occur in the data (as contrasted with anisotropy). However, you can compute directional variograms in order to assess directional trends in the data.

Use trend surface analysis to remove global trends in the data (to transform a non-stationary variable [mean varies across space] into a stationary one).

Lag distances: typically we group the distance intervals into classes so that we have enough pairs of sample points within any one distance class (typically 30 is suggested as the minimum number). As a result, small-scale (high-resolution) variation, at the resolution implied by the original sampling scheme, may not be detectable.
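Removing a global trend before computing the variogram can be sketched as a least-squares fit of a first-order trend surface, keeping the residuals for the variogram analysis. The function name and the choice of a first-order (planar) surface are assumptions for illustration; higher-order surfaces would add more columns to the design matrix.

```python
import numpy as np

def detrend_first_order(coords, values):
    """Fit z = a + b*x + c*y by least squares.

    Returns the coefficients (a, b, c) and the residuals z - fitted.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    design = np.column_stack([np.ones(len(values)), coords[:, 0], coords[:, 1]])
    coef, *_ = np.linalg.lstsq(design, values, rcond=None)
    residuals = values - design @ coef
    return coef, residuals

# Hypothetical data lying exactly on the plane z = 2 + 3x - y.
xy = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
z = [2.0, 5.0, 1.0, 4.0, 7.0]
coef, residuals = detrend_first_order(xy, z)
```

For real data the residuals are not zero; they carry the spatially correlated and noise components on which the variogram is then computed.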


15/23

Variograms

The technique can provide:

a quantification of the scale of variability exhibited by natural patterns of resource distributions, and

an identification of the spatial scale at which the sampled variable exhibits maximum variance.

At larger lag distances harmonic effects can be noted, in which the variogram peaks or dips at lag distances that are multiples of the natural scale.

Given the noise present in natural environmental data sets, it is unlikely that you will be able to clearly identify multiple scales. One approach might be to fit a semivariogram model to the data and to examine the residuals for the presence of multiple patterns of scale.


16/23

Variograms


17/23

Variograms


18/23

Variogram models


19/23

Kriging

Kriging is a spatial interpolation technique based on semivariograms.

Unlike most other spatial interpolation techniques, kriging also provides a map that shows you the uncertainty associated with the prediction.


20/23

Kriging

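[Figure: sample data z(u_α) at locations u_α; the cell u to be estimated (marked "?"); the neighborhood used to estimate cell u.]

Ordinary kriging variance:

$$\sigma^{2}_{OK}(u) = C(0) - \sum_{\alpha=1}^{n(u)} \lambda^{OK}_{\alpha}(u)\, C(u_{\alpha} - u) - \mu_{OK}(u)$$

Ordinary kriging estimate:

$$z^{*}_{OK}(u) = \sum_{\alpha=1}^{n(u)} \lambda^{OK}_{\alpha}(u)\, z(u_{\alpha})$$

Constraint on the weights:

$$\sum_{\alpha=1}^{n(u)} \lambda^{OK}_{\alpha}(u) = 1$$

Here n(u) is the number of data in the neighborhood of u, the λ are the kriging weights, C is the covariance function, and μ is the Lagrange parameter.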
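A minimal sketch of ordinary kriging at a single location follows: it solves the covariance-form system for weights that sum to one, then forms the weighted estimate and the kriging variance C(0) minus the weighted covariances to the target minus the Lagrange parameter. The function names, the exponential covariance model, and its parameters are assumptions for illustration, and no search neighborhood is applied (all data are used).

```python
import numpy as np

def exponential_cov(h, sill=1.0, range_=50.0):
    """Exponential covariance with practical range range_ (assumed model)."""
    return sill * np.exp(-3.0 * np.asarray(h, dtype=float) / range_)

def ordinary_kriging(coords, values, target, cov):
    """Ordinary kriging at one target location.

    cov is a callable giving the covariance C(h) for an array of distances.
    Returns (estimate, kriging variance, weights).
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    target = np.asarray(target, dtype=float)
    n = len(values)
    d_data = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    d_target = np.linalg.norm(coords - target, axis=1)
    # Bordered system: [C 1; 1' 0] [weights; mu] = [c; 1]
    lhs = np.zeros((n + 1, n + 1))
    lhs[:n, :n] = cov(d_data)
    lhs[:n, n] = 1.0
    lhs[n, :n] = 1.0
    rhs = np.append(cov(d_target), 1.0)
    solution = np.linalg.solve(lhs, rhs)
    weights, mu = solution[:n], solution[n]
    estimate = weights @ values
    variance = cov(np.zeros(1))[0] - weights @ cov(d_target) - mu
    return estimate, variance, weights

# Four hypothetical samples at the corners of a square, target at the center.
pts = [[0, 0], [10, 0], [0, 10], [10, 10]]
obs = [1.0, 2.0, 3.0, 4.0]
estimate, variance, weights = ordinary_kriging(pts, obs, [5, 5], exponential_cov)
```

By symmetry the four weights are equal at the center of the square, and kriging at a sampled location returns the observed value with zero variance because this covariance model has no nugget.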


21/23

Kriging

Kriging produces the best linear unbiased estimate of an attribute at an unmeasured site, once the variogram has been modeled.

Ordinary kriging: used when there is no drift in the data.

Universal kriging: accounts for drift (in ArcGIS, drift is modeled by a constant, linear, second-order, or third-order equation).

Punctual kriging: produces values for non-sampled points.

Block kriging: produces values for areas instead of points. Estimates for blocks have lower variance because several point values are averaged to get the estimated value for one block. This averaging smooths the small-scale fluctuations of the function [Z(x)] over the area of the block.

Co-kriging: uses two or more variables that are correlated with each other in the estimation of values for one of them (e.g., soil bulk density and soil water content).


22/23

Geostatistics

Geostatistical analysis is highly useful for addressing the small population problem and for spatial prediction and analysis (it tends to give better local estimates).

The main basis of geostatistical analysis is regionalized variable theory.

A geostatistical analysis must be properly implemented, based on a solid knowledge of mathematical and statistical methods.


23/23

References & Examples of application

Goovaerts, P. 1997. Geostatistics for Natural Resources Evaluation. Oxford University Press.

Wang, G., T. Oyana, M. Zhang, S. Adu-Prah, S. Zeng, H. Lin, and J. Se. 2009. Mapping and spatial uncertainty analysis of forest vegetation carbon by combining national forest inventory data and satellite images. Forest Ecology and Management 258(7):1275-1283.

Wang, G., G.Z. Gertner, H. Howard, and A.B. Anderson. 2008. Optimal spatial resolution for collection of ground data and multi-sensor image mapping of a soil erosion cover factor. Journal of Environmental Management 88:1088-1098.

Wang, G., G.Z. Gertner, and A.B. Anderson. 2007. Sampling and mapping a soil erosion relevant cover factor by integrating stratification, model updating and cokriging with images. Environmental Management 39(1):84-97.

Oyana, T.J. 2004. Statistical comparisons of positional accuracies of geocoded databases for use in medical research. In Egenhofer, M., Freksa, C., and Miller, H. (eds.): Proceedings of the Third International Geographic Information Science, GIScience 2004, October 20-23, 2004. Regents of the University of California: pp. 309-313.

Robertson, G.P. 1987. Geostatistics in ecology: interpolating with known variance. Ecology 68(3):744-748.

Yarus, J.M. and Chambers, R.L. 2006. Practical geostatistics: An armchair overview for petroleum reservoir engineers. Distinguished Author Series, JPT, Society of Petroleum Engineers.
