gis geostatistics
7/28/2019 GIS Geostatistics
Environmental and Ecological Statistics 8, 361-377, 2001
GIS and geostatistics: Essential partners
for spatial analysis
P. A. BURROUGH
Utrecht Centre for Environment and Landscape Dynamics (UCEL),
Faculty of Geographical Sciences, Utrecht University, Post Box 80.115, 3508 TC Utrecht,
The Netherlands
E-mail: [email protected]
Received June 1999; Revised May 2001
Initially, geographical information systems (GIS) concentrated on two issues: automated map
making, and facilitating the comparison of data on thematic maps. The first required high quality
graphics, vector data models and powerful data bases, the second is based on grid cells that can be
manipulated by suites of mathematical operators collectively termed ``map algebra''. Both kinds of
GIS are widely available and are taught in many universities and technical colleges. After more than
20 years of development, most standard GIS provide both kinds of functionality and good quality
graphic display, but until recently they have not included the methods of statistics and geostatistics as
tools for spatial analysis.
Recently, standard statistical packages have been linked to GIS for both exploratory data analysis
and statistical analysis and hypothesis testing. Standard statistical packages include methods for the
analysis of random samples of cases or objects that are not necessarily co-located in space; if the
results of statistical analysis display a spatial pattern then that is because the underlying data also
share that pattern.
Geostatistics addresses the need to make predictions of sampled attributes (i.e., maps) at
unsampled locations from sparse, often expensive data. To make up for lack of hard data
geostatistics has concentrated on the development of powerful methods based on stochastic theory.
Though there have been recent moves to incorporate ancillary data in geostatistical analyses,
insufficient attention has been paid to using modern methods of data display for the visualization of
results.
GIS can serve geostatistics by aiding geo-registration of data, facilitating spatial exploratory data
analysis, providing a spatial context for interpolation and conditional simulation, as well as
providing easy-to-use and effective tools for data display and visualization. The value of
geostatistics for GIS lies in the provision of reliable interpolation methods with known errors,
methods of upscaling and generalization, and for supplying multiple realizations of spatial patterns
that can be used in environmental modeling. These stochastic methods are improving understanding
of how errors in models of spatial processes accrue from errors in data or incompleteness in the
structure of the models.
New developments in GIS, based on ideas taken from map algebra, cellular automata and image
analysis are providing high level programming languages for modeling dynamic processes such as
erosion or the development of alluvial fans and deltas. Research has demonstrated that these models
need stochastic inputs to yield realistic results. Non-stochastic tools such as fuzzy subsets have been
shown to be useful for spatial analysis when probabilistic approaches are inappropriate or
impossible. The conclusion is that in spite of differences in history and approach, the linkage of GIS,
statistics and geostatistics provides a powerful, and complementary suite of tools for spatial analysis
in the agricultural, earth and environmental sciences.
1352-8505 © 2001 Kluwer Academic Publishers
Keywords: geographic information systems, geostatistics, statistical methods, spatial analysis,
environmental modeling, map algebra, fuzzy sets
1. Introduction: GIS, statistics and geostatistics
Geographical information systems, in the sense of computer tools for handling spatial data
(Burrough and McDonnell, 1998), have been used since the late 1960s (Coppock and
Rhind, 1991). Their initial development was mainly in North America, stimulated by the
need to map, plan and manage large areas of terrain, but major contributions came also
from Britain and other European countries, and from Japan and Australasia. Initially there
were two different kinds of GIS. The first kind, dominated by cartographers, aimed at
automating the map making process: ultimately this was to replace the paper map by the
much more flexible electronic database. Initially, the essential ingredients of this approach
were geometrical accuracy, and elegant hard copy output. The second approach, pioneered
by the Harvard Laboratory for Computer Graphics, focused on spatial analysis, in
particular the overlaying of different thematic maps so that relations and conflicts in land
use could be resolved. Whereas the first approach was an automated version of the
cartographer's eye, arm and hand, and insisted on full cartographic design standards, the
Harvard approach concentrated on the clever combination of data linked to a gridded
division of space. As the computer output devices of the time were limited to line printers
having a unit cell measuring 1/6 × 1/10 inch, differences in values could only be indicated
by overprinting different alphanumeric characters, so the gridded (or raster) maps were not
at all pretty. GIS around 1980 consisted of two opposing camps, the one with expensive,
beautiful, but essentially dumb products that were the electronic equivalent of paper maps;
the other, a sort of mapping spreadsheet, in which spatial analysis could be carried out with
great mathematical flexibility, but ugly results and huge demands on the then limited
computer memories. Developments in computer technology and the analysis of remotely
sensed images have reinforced the gridded approach for environmental study. Initially,
however, the differences in budgets and apparatus between the remote sensing
professionals and environmental scientists ensured that raster GIS and the classification
and display of remotely sensed images remained separate areas of development.
Technical advances since the 1980s have ensured that the division of GIS practitioners
into two opposing camps has largely disappeared, and the input of gridded maps and
remotely sensed images to GIS has now become standard practice. True, there are still
arguments today as to whether the raster (gridded) or the vector (point, line, polygon)
approach is better, but the discussion now focuses on the correct choice of spatial
paradigm for a given application, and not on the limitations of the approaches per se
(Burrough and McDonnell, 1998). Today, most commercial GIS provide facilities for
working with raster or vector data, either individually, or in combination. They also
provide database facilities for storing, retrieving and modifying the attributes of the spatial
entities that have been recognized for the given application, and many also include their
own internal programming languages which allow the user to treat the spatial data as
inputs to a virtually unlimited range of environmental models (Burrough, 1996).
In brief, GIS are sets of computer tools for the storage, retrieval, analysis and display of
spatial data. GIS may also be required to supply data to numerical models of
environmental processes (e.g., air quality, water quality and quantity, plant-soil-
environment responses, etc.) and display the results of these models as cartographically
acceptable screen or hard copy images. By convention, GIS analyses are almost
exclusively deterministic and data are assumed to be exact. With the exception of specialists (e.g.,
Heuvelink and Lemmens, 2000), the GIS community has shown little regard for issues of
uncertainty and spatio-temporal variability beyond geometric precision. This is not
because of computational problems, but because market forces have determined that many
GIS applications need not address these issues.
1.1 GIS and statistics
Statistical theory and practice for describing the average properties of samples, and for
hypothesis testing are well known in environmental science. Conventionally, the
geographical location of the individual observations is not taken into account, but if
these methods are used for attributes of spatially located objects then one may be able to
set up and test hypotheses as to whether geographically separate, but eponymous objects
(e.g., instances of soil series, land use classes) really share the same sets of attributes.
Statistical spatial data analysis (SSDA) (Wise et al., 2001) treats the objects in the spatial
data base (points, lines, areas, pixels) as though they and their attributes were samples
from a larger population. As Wise et al. (2001) point out, two main approaches have been
developed: exploratory spatial data analysis (ESDA) and confirmatory spatial data
analysis (CSDA). ESDA is a spatial extension of Tukey's (1977) methods for robust and
visual analysis of data: the accent is on descriptive univariate and multivariate statistics
(means, deviations, ranges, correlations, principal components) in which one searches for
outliers or oddities in the value patterns of the spatial objects under consideration. In
CSDA, attention is focused on building empirical regression models and/or the testing of
hypotheses.
Several standard statistical packages (SPSS, S-plus, etc.) include a wide range of
methods for EDA and CDA, though they may not include all the hyper data links
envisaged by the developers of ESDA (e.g., Wise et al., 2001). Nevertheless, today it is
comparatively easy to link a statistical analysis of tabular attribute data to a set of
geographical objects in a GIS like ARC-VIEW, either via a DBase file (e.g., using SPSS)
or embedded links (using S-plus).
As an example of simple descriptive statistical analysis linked to GIS, consider Fig. 1,
which shows a soil map with three soil types and 126 sample locations.
In the study area the soil is usually less than 100 cm thick over bedrock. In a GIS
analysis we might want to test the hypothesis that there is no significant difference in soil
thickness between the three soil types so that the map pattern may be simplified without
loss of information. Visual inspection of the right-hand panel of Fig. 1 suggests that the different
soil types do have different soil thickness, and this is easily confirmed by extracting the
observed thickness data for each site and carrying out an ANOVA across all soil
types. As Table 1 shows, the mean soil thickness per soil type does differ significantly; the
analysis returns an F-value of 22.67 with p < 0.001. A post-hoc Scheffé test suggests that all
three soil types have significantly different means (Table 2) so there is little point in
simplifying the soil map.
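As a rough check on the ANOVA result, the F-value can be approximated from the summary statistics in Table 1 alone, without the raw observations. The sketch below is illustrative only: the function name is hypothetical, and it assumes that each group variance can be recovered from the tabulated standard error as SE² × N. Because the table values are rounded, the result only approximates the reported F-value.

```python
# Summary statistics copied from Table 1: (N, mean thickness in cm, standard error in cm)
groups = {
    "Ct": (36, 51.15, 4.20),
    "Cr": (31, 67.02, 4.48),
    "Ia": (59, 32.76, 2.79),
}

def anova_from_summary(groups):
    """One-way ANOVA F statistic from per-group N, mean and standard error."""
    n_total = sum(n for n, _, _ in groups.values())
    grand_mean = sum(n * m for n, m, _ in groups.values()) / n_total
    # Between-group sum of squares: spread of group means around the grand mean
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups.values())
    # Within-group sum of squares: group variance is SE^2 * N (assumption stated above)
    ss_within = sum((n - 1) * (se ** 2) * n for n, _, se in groups.values())
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

f_value, df1, df2 = anova_from_summary(groups)
print(f"F({df1}, {df2}) = {f_value:.2f}")
```

The value obtained this way lies close to the reported 22.67; a full analysis would of course use the 126 individual observations and a statistics package to obtain the p-value and the post-hoc comparisons.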
As another example of straightforward statistical analysis using a linked statistics
package, Fig. 2 presents the results of carrying out a multivariate discriminant analysis on
all the 20 attributes of the soil collected at each of the 126 sample sites. This clearly shows
that though the centroids of the three soil types clearly differ in multivariate space, there is
considerable overlap.
1.2 GIS and geostatistics
As noted, the standard GIS approach to recording and analyzing the attributes of pre-defined
objects implies no spatial variation within an object, and all change occurs at
object boundaries. In many applications (hydrology, oceanography, earth sciences, soil
Figure 1. Left: Soil profile classes at sample sites (dot is unit Cr, small circle with dot is unit Ct and
large circle with dot is unit Ia). Right: Soil thickness at sample sites (dot is 0-40 cm, small flag is
40-80 cm, and large flag is > 80 cm).
Table 1. Descriptive statistics of soil thickness (cm) for each soil type.
Soil Type   N     Mean    Std. Error
Ct          36    51.15   4.20
Cr          31    67.02   4.48
Ia          59    32.76   2.79
Total       126   46.45   2.42
science to name but a few), this approach is not always sensible and it is better to consider
the variation of the attribute in terms of a continuous, but noisy surface. This surface is
often constructed by interpolation from sets of point data. Though there are many methods
for interpolation (see Burrough and McDonnell, 1998), most of these treat the data as if
they can be modeled by a smooth, differentiable surface and no attention is paid to the
uncertainty of the results. The methods of geostatistics (Matheron, 1965; Journel, 1996;
Goovaerts, 1997) use the stochastic theory of spatial correlation both for interpolation
and for apportioning uncertainty.
Although still unfamiliar to many GIS users, in terms of technical development, the
Table 2. Post hoc Scheffé test indicates that all three soil types have significantly different
thicknesses.
                  Subset for alpha = 0.05
Soil Type   N     1         2         3
3 (Ia)      59    32.7617
1 (Ct)      36              51.1528
2 (Cr)      31                        67.0232
Sig.              1.000     1.000     1.000
Figure 2. Plot of discriminant functions for all 126 soil observations compared with map classes.
methods of geostatistics are of similar age to GIS, but have different roots. Whereas GIS
was seen as a way to automate the creation of exact, deterministic models of the world in a
dominantly cartographic context, geostatistics is about making predictions under
conditions of uncertainty and limited information. The path of geostatistics from its
founders Krige and Matheron in the 1960s and 1970s to present day exponents such as
Journel, Goovaerts and others emphasizes the role of chance in spatial prediction. Where
GIS ignores statistical variation, geostatistics uses the understanding of statistical variation
as an important source of information for improving predictions of an attribute at
unsampled points, given a limited set of measurements. Geostatistics are therefore a very
useful ``add on'' or extension to the GIS toolkit for spatial analysis.
A central aspect of geostatistics is the use of spatial autocovariance structures, often
represented by the (semi)variogram, or its cousin the autocovariogram, which differentiate
different kinds of spatial variation. The semivariance indicates the degree of similarity of
values of a regionalized variable $Z$ over a given sample spacing or lag, $h$. Semivariograms
(Fig. 3) are graphs of the semivariance $\gamma(h)$ against sample spacing or lag, $h$: they are
defined as:
$$\gamma(h) = \tfrac{1}{2}\,\mathrm{Var}\{Z(x_i) - Z(x_i + h)\} \qquad (1)$$
and estimated from sampled data by:
$$\hat{\gamma}(h) = \frac{1}{2n} \sum_{i=1}^{n} \{z(x_i) - z(x_i + h)\}^2 \qquad (2)$$
where $n$ is the number of pairs of sample points separated by lag $h$, and $z(x_i)$, $z(x_i + h)$ are
the measurements at the two points of each pair.
In practice, $\gamma(h)$ is estimated from sets of point samples which can be extracted from the
GIS data base. Because experimentally derived semivariances do not always follow a
smooth increase with sample spacing, a theoretical variogram model is fitted to the data
(Burrough and McDonnell, 1998; Deutsch and Journel, 1998; Goovaerts, 1997). The
interpolation weights for predicting the value of attribute z at unsampled locations x are
derived with the help of this fitted model and the method is known as ordinary point
kriging (OPK) after its first exponent. Predictions can also be computed for units of land
(blocks) larger than those sampled, thereby smoothing out local variations; this is known
as block kriging. Much practical geostatistics is concerned with the estimation and fitting
of variograms to experimental data (Pannatier, 1996) followed by interpolation or
conditional simulation of gridded surfaces (Pebesma and Wesseling, 1998). Besides
interpolation, kriging provides information on interpolation errors. Knowledge of the
spatial correlation structures may also be used to generate sets of equiprobable realizations
(simulations) of the attribute z that can be of great value for studying error propagation
through spatial models that may be linked to the GIS.
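The estimator in equation (2) can be sketched in a few lines. The example below is a minimal illustration, not the procedure of any particular package: the function name, the lag-binning rule and the synthetic transect are all assumptions made for the sake of the example. Pairs of points are grouped into lag classes of equal width, and half the mean squared difference is returned for each class together with the number of pairs, as annotated in Fig. 3.

```python
import math

def experimental_semivariogram(points, lag_width, n_lags):
    """Estimate semivariances from (x, y, z) samples.

    Returns a list of (mean pair distance, semivariance, number of pairs),
    one entry per non-empty lag class.
    """
    sq_diff = [0.0] * n_lags   # sum of squared differences per lag class
    dist_sum = [0.0] * n_lags  # sum of pair distances per lag class
    pairs = [0] * n_lags       # number of pairs per lag class
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            h = math.hypot(xi - xj, yi - yj)
            k = int(h // lag_width)          # index of the lag class
            if k < n_lags:
                sq_diff[k] += (zi - zj) ** 2
                dist_sum[k] += h
                pairs[k] += 1
    return [(dist_sum[k] / pairs[k], sq_diff[k] / (2 * pairs[k]), pairs[k])
            for k in range(n_lags) if pairs[k] > 0]

# Hypothetical transect: ten points one unit apart with a linear trend in z
transect = [(float(i), 0.0, float(i)) for i in range(10)]
for lag, gamma, n_pairs in experimental_semivariogram(transect, 1.0, 4):
    print(f"lag {lag:.1f}: semivariance {gamma:.2f} from {n_pairs} pairs")
```

For this trend-only transect the semivariance keeps growing with lag and never levels off at a sill, which is the usual sign that a trend should be removed or modeled before a variogram model is fitted.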
For many users of GIS, kriging is no more than an alternative method of interpolation
(see Burrough and McDonnell, 1998 for references). Indeed, many statisticians and
geographers use other methods for statistical spatial analysis (cf. Bailey and Gatrell, 1995;
Cressie, 1991). The general lack of appreciation of geostatistics by the GIS community
during the seminal years from the mid-1970s to the mid-1990s was due to many factors,
including the publication of Matheron's original treatise in French (Matheron, 1965),
which made it inaccessible to most native English speakers. Until the mid-1990s, the
high prices charged for geostatistics software packages and their almost exclusive use by
mining corporations made it difficult to teach geostatistics in many universities. Of course,
a contributing factor to the lack of interest in geostatistics by the GIS practitioner is its
grounding in mathematical statistics which clearly baffles those of us who have little
feeling for the statistical treatment of sampling, variance analysis and correlation and
regression.
2. The mutual benefits of linking GIS, statistics and geostatistics
In this Section I present some examples of the ways in which GIS, statistics and
geostatistics complement each other in spatial analysis.
2.1 The value of GIS for geostatistics
Besides acting as a spatial database, GIS provides several benefits to statisticians and
geostatisticians that are largely concerned with the correct geometric registration of
sample data, prior data analysis, the linking of hard and soft data, and the presentation of
results.
Geo-registration. As with all spatial data, spatial analysis must be carried out on data
that have been collected with reference to a properly defined coordinate system. GIS can
Figure 3. Example of a semivariogram fitted to experimental data. The numbers indicate the
numbers of pairs of points used at each lag.
provide the means to register the locations of samples directly (via GPS or other methods),
or to convert local coordinates to standard coordinates. The use of standard coordinates
ensures that data collected at different times can be properly combined and overlaid on
conventional maps. The use of standard coordinate systems is particularly important when
international databases are created from different sources, such as occurs in Europe, for
example.
Exploratory spatial data analysis. As already noted, ESDA is a useful toolkit for
examining data prior to analysis. For geostatisticians, the presence and location of spatial
outliers, or other irregularities in the data may have important consequences for the fitting
of variograms, or for determining whether data should be transformed to logarithms. GIS
often provide search engines that can be linked to statistical packages to determine
whether any given data set contains anomalies or unexpected structure. The underlying
reasons for such anomalies may sometimes be easily seen when these data are displayed
on a map together with other information. Not all users of ESDA in GIS use conventional
geostatistics, however, and other measures of spatial autocorrelation such as Moran's I
statistic are often used (Pereira et al., 1998).
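Moran's I, mentioned above, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, and the spatial weights are taken to be binary (1 for any pair of distinct points no further apart than a chosen distance band, 0 otherwise), which is only one of several common weighting schemes. Values near +1 indicate clustering of similar values, values near -1 indicate alternation.

```python
import math

def morans_i(points, max_dist):
    """Moran's I for (x, y, z) samples with binary distance-band weights."""
    n = len(points)
    mean = sum(z for _, _, z in points) / n
    dev = [z - mean for _, _, z in points]
    cross = 0.0      # sum over i != j of w_ij * dev_i * dev_j
    w_total = 0.0    # sum of all weights
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = math.hypot(points[i][0] - points[j][0],
                           points[i][1] - points[j][1])
            if d <= max_dist:   # neighbours get weight 1, all others 0
                cross += dev[i] * dev[j]
                w_total += 1.0
    return (n / w_total) * (cross / sum(d * d for d in dev))

# Two hypothetical 4 x 4 grids: alternating values and two homogeneous halves
checkerboard = [(float(c), float(r), float((r + c) % 2))
                for r in range(4) for c in range(4)]
two_halves = [(float(c), float(r), 0.0 if c < 2 else 1.0)
              for r in range(4) for c in range(4)]

print(morans_i(checkerboard, 1.0))  # strongly negative autocorrelation
print(morans_i(two_halves, 1.0))    # positive autocorrelation
```

With a distance band of one grid unit only edge-sharing cells count as neighbours, so the alternating grid gives the most negative value possible and the grid with two homogeneous halves gives a clearly positive one.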
Spatial context and the use of external information. Increasingly, the suite of
geostatistical methods currently available allows the user to incorporate external
information that can be used to modify, and possibly improve, the predictions or
simulations required. Geostatisticians term the external information ``secondary'',
because they believe that the ``hard data'' measured at the sample locations is most
important. But GIS practitioners might prefer to reserve the term ``primary data'' for that which
separates a landscape into its main components (different soils, rock types, or land
cover classes), regarding the sampled data as merely filling in the details that were not
apparent at the smaller map scale. In any case, GIS makes it possible to incorporate data
from other aspects of the environment with the geostatistical study of autocorrelation
structures, so that differentiated knowledge of different patterns of variation can be used to
best effect. For example, in the c. 5 × 2 km study area used in Principles of Geographical
Information Systems (Burrough and McDonnell, 1998) the distribution of heavy metals
(zinc) in the topsoils of the river alluvium was clearly influenced by flooding regime,
which in turn is affected by factors such as distance from the river and the relative
elevation of the floodplain. Fig. 4 shows how the extra information may be used in several
ways. Stratified kriging involves dividing the original set of 155 soil samples into classes
based on flooding frequency (a simple ``point-in-polygon'' search in GIS) to yield three
strata. Variograms were estimated for each stratum and used to interpolate within each stratum, to yield a
single map (Fig. 4b). In a second approach, a multiple regression model was computed
from the triplets of zinc level, elevation and distance to river measured at all data points
(Fig. 4c). A third approach, known as ``universal kriging'', directly incorporates the trend
in the estimation of the interpolation weights (Fig. 4d), and Fig. 4e illustrates how both stratification
and trends may be combined.
The results clearly show the differences in the patterns obtained with and without the
ancillary data. The single, or combined incorporation of external information through
stratification and strata-specific trends yielded maps with good levels of prediction and a
spatial resolution that was better than could have been obtained from ordinary point
kriging alone. Other examples are given in Goovaerts (1997, 1999).
Display and visualization: 2D, 3D, plus time. Who is the recipient of a geostatistical
interpolation? If a geostatistician, or statistician, then simple maps and tables of numbers
may suffice, but environmental managers need to see how the results relate to other aspects
of the terrain. Today it is easy to import the results of a kriging interpolation into a GIS and
display the results in conjunction with a scanned topographic map, or display them in 3D
over a digital elevation model (DEM) of the landscape from which the samples were taken
(Fig. 5). Such presentation invites visual interpretation, the re-evaluation of results and the
discovery of more information, and therefore is an essential part of the spatial analysis
process.
Figure 4. Results of interpolating the ln(Zinc) levels of topsoils (0-10 cm) in a frequently flooded
part of the Maas floodplain, Limburg, NL. a: ordinary point kriging, b: OPK within different flooding
strata, c: using a regression model based on elevation and distance from the river, d: universal kriging
with a single trend, e: universal kriging with stratification and different trends for each stratum.
2.2 The value of geostatistics for GIS
Besides providing powerful means of interpolating point data to areas, there are many
useful ways in which statistics and geostatistics can bring major improvements to the
understanding of uncertainty and error in GIS-based spatial analyses. This is particularly
so for most kinds of GIS-based environmental modeling where a priori we are dealing
with incomplete data and uncertainty. Indeed, to pretend, as the standard GIS paradigms
do, that all data are exactly known, and exactly located, is not to recognize reality.
Geostatistics provides at least the following attractive options for environmental GIS
and environmental decision support systems: interpolation from point data and estimates
of error bounds, estimates of error propagation and uncertainty ranges for spatial and
temporal modeling, and data reduction and generalization.
Interpolation errors. Although surfaces interpolated by kriging are smooth, all forms of
kriging yield estimates of the estimation uncertainty or kriging error. Such values can be
mapped to provide error surfaces which can be combined with other information. Kriging
errors depend on the form of the variogram and the disposition of observations: the more
Figure 5. 3-Dimensional display of interpolation results obtained from stratified kriging on a digital
elevation model with shading and transparency floated above a scanned topographic map. Dark gray
zones indicate heavy metal concentrations.
data surrounding an unsampled location, and the stronger the autocorrelation structure, the
lower the estimation variance.
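The dependence of the kriging error on the variogram and on the layout of the observations can be illustrated with a small ordinary point kriging sketch. Everything specific here is an assumption made for illustration: the function names, the spherical variogram model and its parameters (nugget 0, sill 1, range 20), and the four sample values. The system solved is the standard ordinary kriging system written in terms of semivariances, with a Lagrange multiplier forcing the weights to sum to one.

```python
import math

def spherical(h, nugget, sill, rng):
    """Spherical variogram model; sill is the total sill and gamma(0) = 0."""
    if h == 0.0:
        return 0.0
    if h >= rng:
        return sill
    r = h / rng
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def ordinary_kriging(samples, x0, gamma):
    """Return (prediction, kriging variance) at x0 from (x, y, z) samples."""
    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])
    n = len(samples)
    # Semivariances between samples, bordered by the unbiasedness constraint
    a = [[gamma(dist(samples[i], samples[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [gamma(dist(s, x0)) for s in samples] + [1.0]
    solution = solve(a, b)
    weights, mu = solution[:n], solution[n]
    prediction = sum(w * s[2] for w, s in zip(weights, samples))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + mu
    return prediction, variance

# Hypothetical samples at the corners of a 10 x 10 square
samples = [(0.0, 0.0, 1.0), (10.0, 0.0, 2.0), (0.0, 10.0, 3.0), (10.0, 10.0, 4.0)]
gamma = lambda h: spherical(h, 0.0, 1.0, 20.0)
for target in [(0.0, 0.0), (5.0, 5.0), (100.0, 100.0)]:
    pred, var = ordinary_kriging(samples, target, gamma)
    print(target, round(pred, 3), round(var, 3))
```

In this sketch the kriging variance is zero at a sample location, moderate at the centre of the square, and largest at the distant target that lies beyond the variogram range of every sample, which mirrors the statement above.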
Error propagation in spatial models. When data from interpolated surfaces are used as
inputs to numerical models, the error surfaces associated with kriging interpolation may be
used to understand the propagation of errors through spatial models. Heuvelink (1998) gives
both theory and examples of using Taylor series expansion on interpolated data to compute
error propagation through cartographic models; see also Burrough and McDonnell (1998).
An increasingly popular alternative to the Taylor expansion method is to use methods of
conditional simulation (Pebesma and Wesseling, 1998) to provide sets of multiple
realizations of data surfaces for inputs to numerical models like the 3D groundwater model
``MODFLOW'', so that error propagation and model sensitivity can be followed using
Monte Carlo methods (e.g., Bierkens, 1994; Gomez-Hernandez and Journel, 1992).
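The contrast between the Taylor series approach and the Monte Carlo approach can be sketched for a single grid cell. The example is deliberately simplified and rests on assumptions that are not in the text: the model is a hypothetical exponential back-transform, the input is one kriged value with a given kriging standard deviation, and the draws are independent normal deviates rather than conditional simulations of a whole surface, which would also preserve the spatial correlation between cells.

```python
import math
import random

def model(ln_value):
    """Hypothetical non-linear model: back-transform a log value to a concentration."""
    return math.exp(ln_value)

def monte_carlo(mean, sd, n_draws, seed=1):
    """Push random draws of the uncertain input through the model."""
    rng = random.Random(seed)
    outputs = [model(rng.gauss(mean, sd)) for _ in range(n_draws)]
    out_mean = sum(outputs) / n_draws
    out_var = sum((y - out_mean) ** 2 for y in outputs) / (n_draws - 1)
    return out_mean, math.sqrt(out_var)

def first_order_taylor(mean, sd):
    """First-order approximation: f(mean) and |f'(mean)| * sd (for exp, f' = f)."""
    return model(mean), model(mean) * sd

# Assumed kriging prediction and kriging standard deviation for one cell
prediction, kriging_sd = 6.0, 0.4
mc_mean, mc_sd = monte_carlo(prediction, kriging_sd, 20000)
taylor_mean, taylor_sd = first_order_taylor(prediction, kriging_sd)
print(f"Monte Carlo: mean {mc_mean:.1f}, sd {mc_sd:.1f}")
print(f"Taylor:      mean {taylor_mean:.1f}, sd {taylor_sd:.1f}")
```

Because the model is non-linear, the first-order Taylor mean underestimates the mean of the model output, whereas the Monte Carlo estimate approaches the exact lognormal mean as the number of draws grows.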
Monte Carlo techniques using conditional simulation may also be useful for comparing
data collected at different times and locations within the same area. Recent work on the
redistribution of ¹³⁷Cs fallout from the Chernobyl nuclear disaster in 1986 has shown that
the normal decay of radiocaesium levels and uptake rates in cow's milk can be temporally
reversed if the cows are grazing on recently ooded, poorly drained peat soils (Burrough
and McDonnell, 1998; Burrough et al., 1999a). The data for these studies consisted of
radionuclide determinations made on bulked soil samples taken in 1988 and 1993.
Unfortunately, the samples were collected at different sites in the two years, so it was
difficult to use the raw data to test the hypothesis that the flood events had really enhanced
radiocaesium levels near the rivers. However, by computing the variograms for the data
sets from both years and using these to compute sets of conditional simulations of the
normalized differences of radiocaesium in the topsoil between the two sampling times and
at all sampled sites, it was possible to establish a clear relation between the incidence of
flooding and flood-induced enhancement of radiocaesium which could enter the food
chain (Burrough et al., 1999a). Fig. 6 shows clearly that although there seem to be
systematic differences between the two years (mean values for 1993 exceed those for 1988
by 0.5-1.0 standard errors) sites within 1.5 km of a flooding river are not only more
variable, but many have higher levels of radiocaesium.
Data reduction and spatial generalization. In some applications there may be too much
data, which may need to be reduced to manageable proportions or common coordinates.
An example is the need to compare the yields of different crops over several years on the
same plot when yields have been recorded using data loggers and GPS. For example,
Burrough and Swindell (1997) report the collection of annual yield data for three
successive crops on a 5 ha field at the experimental farm of the Royal College of
Agriculture, Cirencester, UK. Data were collected on wheat, barley and oilseed rape in
successive years by a combine harvester fitted with a data logger whose location was
pinpointed by locally referenced GPS. The spatial resolution of the sample was
approximately 4 m (the width of the harvester) × 2.5 m (along the cut), and each survey
yielded some 2000 samples or more.
Because of locational noise in the GPS and errors in the amount of crop cut each 2.5 m
by the harvester, it was not possible to relate the yields of the three crops directly to
location in the field nor to investigate links between crop yields and soil conditions. To
generalize and smooth the data, for each year an isotropic variogram was computed: the
data were then interpolated to a common grid of 2.5 m resolution using block kriging with
units of 25 × 25 m. Each annual map was normalized to give a map showing relative
yield; these three maps were then combined to give a three year, normalized average.
Comparison of the normalized average yield map with a computer enhanced, scanned
aerial image of the site (Fig. 7) demonstrates clear relations between site conditions and
normalized crop yields that otherwise were not apparent.
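The normalization and averaging step can be sketched as simple cell-by-cell map algebra. The text does not say how the annual maps were normalized, so the sketch assumes the simplest choice, dividing each map by its own field mean; the function names and the tiny 2 × 2 grids of yields are hypothetical.

```python
def normalize_by_mean(grid):
    """Divide every cell by the map mean, so a relative yield of 1.0 is average."""
    cells = [value for row in grid for value in row]
    mean = sum(cells) / len(cells)
    return [[value / mean for value in row] for row in grid]

def average_maps(maps):
    """Cell-by-cell arithmetic mean of equally sized grids."""
    rows, cols = len(maps[0]), len(maps[0][0])
    return [[sum(m[r][c] for m in maps) / len(maps) for c in range(cols)]
            for r in range(rows)]

# Hypothetical block-kriged yields (t/ha) on a common grid, one map per year
wheat = [[8.0, 9.0], [7.0, 8.0]]
barley = [[6.0, 7.0], [5.0, 6.0]]
rape = [[3.0, 3.5], [2.5, 3.0]]

relative = [normalize_by_mean(m) for m in (wheat, barley, rape)]
three_year_average = average_maps(relative)
for row in three_year_average:
    print([round(value, 3) for value in row])
```

Normalizing first puts crops with very different absolute yields on a common scale, so that cells which are consistently above or below the field average stand out in the combined map.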
Figure 6. Plots of conditional simulations for the 1988-1993 normalized differences of ¹³⁷Cs at data
points, with distance to rivers that flood.
Figure 7. Comparison between aerial photo image of field A and, displayed on its right, the average,
standardized crop yields as interpolated using block kriging.
Geostatistics and remote sensing. The application of geostatistical methods in the
analysis of remotely sensed images is a topic in itself. Here I refer the reader to the
issue of Photogrammetric Engineering and Remote Sensing (January, 1999) for a recent
compilation of research. Remote sensing applications of geostatistics have less to do with
interpolation from sparse data (the images are complete unless masked by cloud cover, in
which case interpolation could be used to fill in the gaps) than with the description and analysis
of gridded, stochastic surfaces and the simulation of multiscale data sets.
3. Stochastic inputs to the modeling of spatial processes
As already indicated, geostatistical methods of conditional simulation are useful for
following the propagation of errors through spatial models that may be linked to, or run
from GIS. Recent research in the modeling of dynamic spatial processes (van Deursen,
1995; Takeyama and Couclelis, 1997; Wesseling et al., 1996) indicates the value of
including an understanding of errors and roughness in many models of dynamic spatial
processes, particularly when processes are non-linear.
Stability of the topology of drainage nets. The automatic derivation of surface topology
from a gridded digital elevation model (DEM) is now a standard operation in GIS that are used for
hydrological projects (Fig. 8a). The usual procedure is to use thin plate splines to
interpolate a DEM from digitized contours to a fine grid so that
the resulting topological net is free from discontinuities (Mitasova and Hofierka, 1993).
Unfortunately, although smooth interpolators guarantee continuity in surface topology,
they also constrain the topology to a single set of drainage lines, which may result in
serious artefacts in hydrological derivatives such as wetness indices (see Burrough and
McDonnell, 1998 for definitions). Simple methods, such as the D8 algorithm, for deriving
drainage nets from gridded surfaces, produce a unique solution in which the main stream
line is only one cell wide (e.g., Fig. 8a). Large differences in the size of the upstream
contributing catchment area between a cell on the main drainage line and its off-line
neighbor may arise. This is counter-intuitive, because we expect cells close to each other
to have similar conditions and contributing areas, especially in the bottoms of valleys. A
Figure 8. a: Single realization of a drainage network derived from a smooth DEM; b: average image
computed from 100 realizations derived from the initial DEM plus 10 cm root mean square (RMS)
error.
better idea of surface water drainage may be obtained by considering the average
properties of a suite of possible drainage nets that are obtained when surface roughness is
added to the DEM. The roughness can easily be modeled by a small Gaussian noise which
is added to each cell (a standard deviation equal to 0.1% of the maximum relief difference
in the area is enough as a rst approximation); the result yields one possible realization of
the net. Repeating the procedure for 1001000 times with different random values for
roughness creates an average probability density map of the cumulative contributing area
(Fig. 8b) which appears to be more realistic than the single deterministic solution. Note
that one cannot compute Fig. 8b by passing a moving window smoothing function over
Fig. 8a.
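The averaging procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used to produce Fig. 8: the function names, the synthetic valley surface, and the simple treatment of pits and edges are all hypothetical, and only the standard library is used. The noise standard deviation follows the 0.1% of maximum relief suggested in the text.

```python
import random

# Row/column offsets of the eight neighbours considered by the D8 rule.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def d8_accumulation(dem):
    """Contributing area per cell (in cells, counting the cell itself)."""
    nrows, ncols = len(dem), len(dem[0])
    acc = [[1.0] * ncols for _ in range(nrows)]
    # Visit cells from highest to lowest: flow only moves to strictly lower
    # cells, so each cell has collected all upstream flow before passing it on.
    order = sorted(
        ((dem[r][c], r, c) for r in range(nrows) for c in range(ncols)),
        reverse=True,
    )
    for z, r, c in order:
        best_slope, target = 0.0, None
        for dr, dc in NEIGHBOURS:
            rr, cc = r + dr, c + dc
            if 0 <= rr < nrows and 0 <= cc < ncols:
                slope = (z - dem[rr][cc]) / (dr * dr + dc * dc) ** 0.5
                if slope > best_slope:
                    best_slope, target = slope, (rr, cc)
        if target is not None:  # pits and edge outlets keep their total
            acc[target[0]][target[1]] += acc[r][c]
    return acc


def mean_accumulation(dem, n_realizations=100, relief_fraction=0.001, seed=0):
    """Average D8 accumulation over DEM realizations with added Gaussian noise."""
    rng = random.Random(seed)
    nrows, ncols = len(dem), len(dem[0])
    relief = max(map(max, dem)) - min(map(min, dem))
    sigma = relief_fraction * relief  # 0.1% of maximum relief by default
    total = [[0.0] * ncols for _ in range(nrows)]
    for _ in range(n_realizations):
        noisy = [[z + rng.gauss(0.0, sigma) for z in row] for row in dem]
        acc = d8_accumulation(noisy)
        for r in range(nrows):
            for c in range(ncols):
                total[r][c] += acc[r][c]
    return [[v / n_realizations for v in row] for row in total]


# Hypothetical test surface: a gently sloping valley with a flat floor three cells wide.
valley = [[0.01 * (9 - r) + max(abs(c - 4) - 1, 0) for c in range(9)] for r in range(10)]
single = d8_accumulation(valley)
average = mean_accumulation(valley, n_realizations=100)
print("single realization, outlet row:", [round(v) for v in single[-1]])
print("mean of 100 realizations, outlet row:", [round(v, 1) for v in average[-1]])
```

In the single realization the valley-floor cells carry sharply different contributing areas, whereas the averaged map spreads the flow across the floor. The averaging is done over complete drainage nets, which is why it cannot be reproduced by smoothing a single net after the fact.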
The effects of small errors on the derived flow paths may be effectively demonstrated by
displaying the whole set as a movie, when the amplitudes and locations of the swings of
drainage paths resulting from the minor errors will become very apparent. Though this
example uses spatially uncorrelated noise for each realization of the DEM surface, one
could of course examine the effects of spatially correlated noise on the model by first
creating a set of conditional simulations based on a known or assumed variogram.
Repeating the analysis for multiple realizations and displaying these using dynamic
visualization enhances understanding of the results.
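As a rough illustration of what spatially correlated noise involves, the sketch below draws one unconditional Gaussian realization on a small grid by Cholesky factorization of a covariance matrix. Several assumptions are made here that are not in the text: an exponential covariance model, the parameter names `sill` and `range_param`, and the function names. The simulation is also unconditional; a conditional simulation would additionally have to honor the observed data values, for example through a kriging step, which is omitted.

```python
import math
import random


def cholesky(a):
    """Lower-triangular factor L of a symmetric positive-definite matrix, a = L L^T."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def correlated_noise(nrows, ncols, sill, range_param, rng):
    """One unconditional Gaussian realization with covariance sill * exp(-h / range_param)."""
    cells = [(r, c) for r in range(nrows) for c in range(ncols)]
    cov = [[sill * math.exp(-math.dist(p, q) / range_param) for q in cells] for p in cells]
    low = cholesky(cov)
    white = [rng.gauss(0.0, 1.0) for _ in cells]
    # Multiplying independent standard normals by L imposes the target covariance.
    values = [sum(low[i][k] * white[k] for k in range(i + 1)) for i in range(len(cells))]
    return [values[r * ncols:(r + 1) * ncols] for r in range(nrows)]


rng = random.Random(42)
noise = correlated_noise(nrows=8, ncols=8, sill=0.01, range_param=3.0, rng=rng)
```

Each call with the same `rng` object yields a new realization, so a set of noise surfaces can be added to the DEM one at a time and the drainage analysis repeated for each. The dense covariance matrix limits this approach to small grids; dedicated geostatistical software uses more efficient algorithms.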
Adding stochasticity to make a deterministic process model work properly. In certain
situations it appears to be necessary to add roughness to a surface so that a well-known
deterministic process can be modeled effectively, and this is illustrated using the example
of the creation of an alluvial fan. If a hillside is modeled as a smooth inclined plane, then
the topology consists merely of a set of parallel lines that run from top to bottom, much
like the way rain falling on the windscreen of a stationary car runs off in parallel streams.
These streams can be ``forced'' to merge if the initial surface is roughened (e.g., Liverpool
and Edwards, 1995). In the case of the alluvial fan, each ``event'' by which material falls
down the slope and is added to the fan modifies the surface roughness in a way that is very
difficult to predict, but which must not be ignored. So the initial roughness is modified by
feedback from the sedimentation process so that for each cycle there is a new surface for
the flow and deposition. If the deposits are sufficiently large, the surface topology changes
with each cycle.
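The contrast between a smooth and a roughened plane can be shown with a toy routing rule. Everything in this sketch is an illustrative assumption: water always moves one row down the slope to the lowest of the three nearest cells, and the deposition rule in `run_cycles` (raising each cell in proportion to the flow it carries) is a hypothetical stand-in for a real sedimentation model, included only to show the feedback loop between flow and surface.

```python
import random


def route_flow(surface):
    """Send each cell's water to the lowest of its (up to) three neighbours in the row below."""
    nrows, ncols = len(surface), len(surface[0])
    flow = [[1.0] * ncols for _ in range(nrows)]
    for r in range(nrows - 1):
        for c in range(ncols):
            candidates = [cc for cc in (c, c - 1, c + 1) if 0 <= cc < ncols]
            # min() returns the first candidate on ties, so a perfectly smooth
            # plane drains straight down in parallel lines.
            target = min(candidates, key=lambda cc: surface[r + 1][cc])
            flow[r + 1][target] += flow[r][c]
    return flow


def run_cycles(surface, n_cycles, deposit=0.05):
    """Hypothetical feedback: each cycle raises cells in proportion to the flow they carry."""
    surface = [row[:] for row in surface]
    for _ in range(n_cycles):
        flow = route_flow(surface)
        peak = max(map(max, flow))
        for r, row in enumerate(flow):
            for c, f in enumerate(row):
                surface[r][c] += deposit * f / peak
    return surface


nrows, ncols = 20, 20
rng = random.Random(1)
smooth = [[float(nrows - r)] * ncols for r in range(nrows)]
rough = [[z + rng.gauss(0.0, 0.1) for z in row] for row in smooth]

smooth_flow = route_flow(smooth)
rough_flow = route_flow(rough)
print("largest stream at the foot, smooth plane:", max(smooth_flow[-1]))
print("largest stream at the foot, roughened plane:", max(rough_flow[-1]))

evolved = run_cycles(rough, n_cycles=5)
```

On the smooth plane every column delivers the same amount of water to the foot of the slope, while on the roughened plane streams merge and a few cells collect much more. Because `run_cycles` recomputes the flow on the modified surface at every cycle, the routing can change from one cycle to the next.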
The need for initial roughness which is modified but maintained during the development
of the delta is a nice example of how a better understanding of the physical process may
arise by linking geostatistics with interactive dynamic modeling. Ongoing research in
Utrecht and elsewhere is beginning to demonstrate the value of conditional simulation in
dynamic, as well as static, models of landscape change (see Karssenberg et al., in press).
4. Non-stochastic tools for analyzing uncertainty in spatial data: fuzzy subsets
In many situations we know there is uncertainty, but we do not know, nor can we construct
probability distributions. We may also be uncertain how to define the geographical objects
in the data base (Burrough and Frank, 1996). The development of fuzzy subsets in
environmental science is increasingly being seen not as a replacement for statistics and
geostatistics, but as a complementary suite of methods for operating in uncertain
conditions. The main uses of fuzzy subsets in GIS are for the selection and retrieval of data
under conditions of uncertainty (e.g., Burrough and McDonnell, 1998; Canters, 1997), and
in creating multivariate classes that overlap (fuzzy k-means) (Burrough et al., 1999b).
Data retrieval using fuzzy subsets has been demonstrated to be less error prone than
conventional Boolean SQL methods (Heuvelink and Burrough, 1993). Fuzzy memberships
can be interpolated using kriging (de Gruijter et al., 1997; Burrough and McDonnell,
1998) and the application of fuzzy k-means to derivatives of digital elevation models
provides convincing and objective methods for classifying terrain (Burrough et al., 2000,
2001). Fuzzy subsets can also be used to address issues of the crispness of spatial
boundaries (e.g., Lagacherie et al., 1996) or the intervisibility across 3D surfaces (Fisher, 1995).
Fuzzy subsets may also be used to define sensible ways to select point data for kriging.
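To make the idea of overlapping classes concrete, the following sketch implements the standard fuzzy k-means (fuzzy c-means) iteration in plain Python. It is a minimal illustration, not the procedure of the cited studies: the function names, the fuzziness exponent `m = 2`, the Euclidean distance, the random initialization, and the small table of terrain attributes are all assumptions made for the example.

```python
import math
import random


def memberships_for(points, centres, m):
    """Fuzzy memberships u[i][j] of point i in class j; each row sums to 1."""
    result = []
    for p in points:
        # Guard against a zero distance when a point coincides with a centre.
        inv = [max(math.dist(p, c), 1e-12) ** (-2.0 / (m - 1.0)) for c in centres]
        total = sum(inv)
        result.append([v / total for v in inv])
    return result


def fuzzy_k_means(points, k, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Alternate membership and centre updates until the centres stop moving."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)  # start from k of the data points
    ndim = len(points[0])
    for _ in range(n_iter):
        u = memberships_for(points, centres, m)
        new_centres = []
        for j in range(k):
            weights = [row[j] ** m for row in u]
            norm = sum(weights)
            new_centres.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / norm
                for d in range(ndim)
            ))
        shift = max(math.dist(a, b) for a, b in zip(centres, new_centres))
        centres = new_centres
        if shift < tol:
            break
    return centres, memberships_for(points, centres, m)


# Hypothetical terrain attributes per cell: (slope in degrees, wetness index).
cells = [(2.0, 9.5), (3.0, 9.0), (2.5, 10.0), (25.0, 4.0), (27.0, 3.5), (26.0, 4.5), (14.0, 7.0)]
centres, u = fuzzy_k_means(cells, k=2)
for cell, row in zip(cells, u):
    print(cell, [round(v, 2) for v in row])
```

Unlike a Boolean classification, every cell receives a membership between 0 and 1 in each class, so an intermediate cell is not forced into a single class. Because the memberships are continuous values, they can be treated like any other attribute, for example as input to kriging.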
5. Conclusions
This review has demonstrated that GIS, statistics and geostatistics have much to give to
each other, particularly when GIS are used for environmental analysis. Geostatistics
benefits from having standard methods of geographical registration, data storage, retrieval
and display, while GIS benefits by being able to incorporate proven methods for testing
hypotheses and for handling and understanding errors in data and illustrating their effects
on the outcomes of models used for environmental management. In some situations,
geostatistics may be supplemented by non-probabilistic methods of handling uncertainty
such as provided by fuzzy subsets.
References
Bailey, T.C. and Gatrell, A.C. (1995) Interactive Spatial Data Analysis, Longman, Harlow, 413 pp.
Bierkens, M.F.P. (1994) Complex Confining Layers: A Stochastic Analysis of Hydraulic Properties at
Various Scales, Royal Dutch Geographical Association (KNAW)/Faculty of Geographical
Sciences, University of Utrecht, Utrecht, NL.
Burrough, P.A. (1996) Opportunities and limitations of GIS-based modeling of solute transport at the
regional scale. In: Application of GIS to the Modeling of Non-Point Source Pollutants in the
Vadose Zone, SSSA Special Publication 48, Soil Science Society of America, Madison, 19–37.
Burrough, P.A. and Frank, A. (1996) (eds), Geographic Objects with Indeterminate Boundaries,
GISDATA Series 2, Taylor and Francis, London.
Burrough, P.A., van Gaans, P.F.M., and MacMillan, R.A. (2000) High-resolution landform
classification using fuzzy k-means. Journal of Fuzzy Sets and Systems, 113, 37–52.
Burrough, P.A., van Gaans, P.F.M., Wilson, J., and Hansen, A.J. (2001) Fuzzy k-means classification
of topo-climatic data as an aid to forest mapping in the Greater Yellowstone Area, USA.
Landscape Ecology, 16, 523–46.
Burrough, P.A. and McDonnell, R.A. (1998) Principles of Geographical Information Systems,
Oxford, Oxford University Press, 330 pp.
Burrough, P.A. and Swindell, J. (1997) Optimal mapping of site-specific multivariate soil properties.
In Precision Agriculture: Spatial and Temporal Variability of Environmental Quality, J. Lake,
G. Bock, and J. Goode (eds), Proc: CIBA Foundation Symposium 210, John Wiley and Sons,
Chichester, pp. 208–20.
Burrough, P.A., van der Perk, M., Howard, B., Prister, B., Sansone, U., and Voitsekhovitch, O.V.
(1999a) Environmental mobility of Radiocaesium in the Pripyat Catchment, Ukraine/Belarus.
Water, Air and Soil Pollution, 110, 35–55.
Canters, F. (1997) Evaluating the uncertainty of area estimates derived from fuzzy land-cover
classification. Photogrammetric Engineering and Remote Sensing, 63, 403–14.
Coppock, J.T. and Rhind, D.W. (1991) The history of GIS. In: Geographical Information Systems,
Vol. 1, Principles, D.J. Maguire, M.F. Goodchild, and D.W. Rhind (eds), Longman Scientific
and Technical, New York, pp. 21–43.
Cressie, N. (1991) Statistics for Spatial Data, Wiley, New York, 900 pp.
De Gruijter, J.J., de Walvoort, D., and van Gaans, P. (1997) Continuous soil maps: a fuzzy set
approach to bridge the gap between aggregation levels of process and distribution models.
Geoderma, 77, 169–95.
Deutsch, C. and Journel, A.G. (1998) GSLIB Geostatistical Handbook, 2nd edition, Oxford.
Fisher, P.F. (1995) An exploration of probable viewsheds in landscape planning. Environment and
Planning B: Planning and Design, 22, 527–46.
Gomez-Hernandez, J.J. and Journel, A.G. (1992) Joint sequential simulation of multigaussian fields.
In: A. Soares (ed), Proc. Fourth Geostatistics Congress, Troia, Portugal. Quantitative Geology
and Geostatistics, (5), 85–94, Dordrecht, Kluwer Academic Publishers.
Goovaerts, P. (1997) Geostatistics for Natural Resources Evaluation, Oxford University Press,
483 pp.
Goovaerts, P. (1999) Using elevation to aid the geostatistical mapping of rainfall erosivity. CATENA,
34, 227–42.
Heuvelink, G.B.M. (1998) Error Propagation in Environmental Modeling, Taylor and Francis,
London, 127 pp.
Heuvelink, G.B.M. and Burrough, P.A. (1993) Error propagation in cartographic modeling using
Boolean logic and continuous classification. Int. J. Geographical Information Systems, 7,
231–46.
Heuvelink, G.B.M. and Lemmens, T. (2000) (eds), Accuracy 2000. Proceedings of the 4th
International Meeting on Accuracy in Spatial Data, Amsterdam, July, Delft University Press,
Delft.
Karssenberg, D.J., Torqvist, T., and Bridges, J. (2001) Conditioning a process-based model of
sedimentary architecture to well data. Journal of Sedimentary Research, 71(6).
Lagacherie, P., Andrieux, P., and Bouzigues, R. (1996) Fuzziness and uncertainty of soil boundaries:
from reality to coding in GIS. In: P.A. Burrough and A.U. Frank (eds), Geographic Objects
with Indeterminate Boundaries, Taylor and Francis, London, pp. 275–86.
Liverpool, T. and Edwards, S. (1995) Modeling meandering rivers. Physical Review Letters, 75,
3016.
Matheron, G. (1965) La Théorie des Variables Régionalisées et ses Applications, Masson, Paris.
Mitasova, H. and Hofierka, J. (1993) Interpolation by regularized spline with tension: Application to
terrain modeling and surface geometry analysis. Mathematical Geology, 25, 657–69.
Pannatier, Y. (1996) Variowin. Software for spatial data analysis in 2D. Statistics and Computing,
Springer Verlag, Berlin, 91 pp.
Pebesma, E. and Wesseling, C.G. (1998) GSTAT: A program for geostatistical modeling, prediction
and simulation. Computers and Geosciences, 24, 17–31.
Pereira, J.M.C., Carreiras, J.M.B., and Perestrello de Vasconcelos, M.J. (1998) Exploratory data
analysis of the spatial distribution of wildfires in Portugal 1980–1989. Geographical Systems,
5, 355–90.
Takeyama, M. and Couclelis, H.M. (1997) Map dynamics: integrating cellular automata and GIS
through Geo-Algebra. International Journal of Geographical Information Science, 11, 73–92.
Tukey, J.W. (1977) Exploratory data analysis, Addison-Wesley, Reading, Massachusetts.
Van Deursen, W.P.A. and Wesseling, C.G. (1995) PCRaster, Department of Physical Geography,
Utrecht University.
Wesseling, C.G., Karssenberg, D., Burrough, P.A., and van Deursen, W.P.A. (1996) Integrating
dynamic environmental models in GIS: The development of a dynamic modeling language.
Transactions in GIS, 1, 40–8.
Wise, S., Haining, R., and Ma, J. (2001) Providing spatial statistical data analysis functionality for
the GIS user. The SAGE project. International Journal of Geographical Information Science,
15, 239–254.
Biographical sketch
Peter A. Burrough has been Professor of Physical Geography and Geographical
Information Systems, Faculty of Geographical Sciences, University of Utrecht, since 1984. Dr.
Burrough is also the Director of the Utrecht Centre for Environment and Landscape
Dynamics (UCEL). He is Chairman of the Interfaculty Centre for Hydrology, Utrecht
(ICHU). He is a member of the advisory committee on Earth Sciences, Physical
Geography and Geology for the Dutch National Science Foundation NWO, and a member
of the Scientific Board for the ``Fonds voor Wetenschappelijk Onderzoek'' (FWO) for
Vlaanderen, Belgium.