Research Collection

Doctoral Thesis

Spatial data management and decision making for soil improvement measures

Author(s): Schnabel, Ute

Publication Date: 2003

Permanent Link: https://doi.org/10.3929/ethz-a-004625319

Rights / License: In Copyright - Non-Commercial Use Permitted

This page was generated automatically upon download from the ETH Zurich Research Collection. For more information please consult the Terms of use.

ETH Library

Diss. ETH No. 15125

Spatial Data Management and Decision Making

for Soil Improvement Measures

A dissertation submitted to

Swiss Federal Institute of Technology Zurich

for the degree of

Doctor of Natural Sciences

presented by

Ute Schnabel

Dipl. Geogr., Technical University of Hannover

born 12 July 1966 in Lehrte

citizen of Germany

accepted on the recommendation of

Prof. Dr. Roland W. Scholz, examiner

Dr. Olaf Tietje, co-examiner

Dr. Martin Fritsch, co-examiner

Prof. Dr. Willy A. Schmid, co-examiner

2003

Summary

The demand to manage the natural environment in a way that takes into account the ecological, economic, and social aspects of development challenges environmental planners, policymakers, and other decision-makers. The decisions needed to support such development often rely on information that is uncertain, incomplete, and yet complex. Sparse information is analyzed and translated from micro spatial scales to macro spatial scales (and vice versa), and both data generation and scale translation introduce a high degree of uncertainty into results and decisions. To cope with these uncertainties, this thesis aims to improve the spatial data analysis, spatial prediction, and evaluation of heavy metals in soil as a basis for decisions on the application of soil remediation methods.

For a combined spatial and statistical data analysis, a non-parametric kernel density regression was used. The results were presented as surfaces ("blankets") drawn from three-dimensional regression, which clearly showed spatial tendencies, hot spots, and outliers. The method thus bridges the gap between exploratory data analysis and three-dimensional visualization of spatially interrelated parameters. Moreover, spatial interpolation with a Gaussian kernel density function is a convenient alternative to ordinary kriging when the spatial, stochastic structure of the underlying data shows a trend.
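
As a rough illustration of the idea, kernel regression with a Gaussian kernel predicts the value at a target location as a distance-weighted average of the measurements. The sketch below is a generic Nadaraya-Watson estimator in plain Python and may differ from the implementation used in the thesis; the function names, the bandwidth `h`, and the sample values are assumptions made for illustration only.

```python
import math


def gaussian_weight(distance, h):
    """Unnormalized Gaussian kernel weight for a distance and bandwidth h."""
    return math.exp(-0.5 * (distance / h) ** 2)


def kernel_regression(samples, target, h):
    """Nadaraya-Watson estimate at `target` from (x, y, value) samples."""
    tx, ty = target
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in samples:
        w = gaussian_weight(math.hypot(x - tx, y - ty), h)
        weighted_sum += w * value
        weight_total += w
    if weight_total == 0.0:
        raise ValueError("all weights underflowed; increase the bandwidth h")
    return weighted_sum / weight_total


# Hypothetical measurements (x in m, y in m, cadmium in mg/kg); not thesis data.
samples = [(0.0, 0.0, 3.0), (100.0, 0.0, 1.0), (0.0, 100.0, 1.0)]

# At a sampled location the nearby value dominates the estimate.
estimate_at_sample = kernel_regression(samples, target=(0.0, 0.0), h=50.0)

# At a point equidistant from all samples the estimate is their plain mean.
estimate_at_center = kernel_regression(samples, target=(50.0, 50.0), h=50.0)
```

The bandwidth controls the smoothing: a small `h` reproduces local hot spots, while a large `h` pulls every estimate toward the overall mean.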

With a second data set, the spatial distribution of cadmium was predicted with lognormal ordinary kriging and conditional simulation. The results showed a geometric mean of 1.08 mg cadmium per kg soil and a standard deviation of 2.27 mg/kg. Since the cadmium was emitted from a metal smelter, the data exploration focused on trend models and anisotropy. However, taking the contamination pattern into account did not significantly improve the spatial prediction of cadmium. Despite the high mean uncertainty for the total area, the likelihood of exceeding a given threshold could be presented for single points. Although the prediction of cadmium yielded no reliable deterministic estimate, handling the uncertainties as probability knowledge led to usable and trustworthy results for the decision maker.
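
To make the threshold idea concrete: if the cadmium content at a point is modelled as lognormal, the probability of exceeding a threshold follows from the normal distribution of the log-values. The sketch below assumes that a mean `mu_log` and a standard deviation `sigma_log` of the log-concentration are available at the point, for example from kriging of log-transformed data; these names and the numbers used are illustrative and are not taken from the thesis.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def exceedance_probability(threshold, mu_log, sigma_log):
    """P(Z > threshold) for a lognormal Z with ln Z ~ N(mu_log, sigma_log**2)."""
    if threshold <= 0.0 or sigma_log <= 0.0:
        raise ValueError("threshold and sigma_log must be positive")
    z = (math.log(threshold) - mu_log) / sigma_log
    return 1.0 - normal_cdf(z)


# Hypothetical point whose median concentration equals the threshold of 2 mg/kg:
# the chance of exceeding the threshold is then one half.
p_at_median = exceedance_probability(2.0, mu_log=math.log(2.0), sigma_log=0.5)

# Hypothetical point with median 1 mg/kg and sigma_log = ln 2:
# the threshold lies one standard deviation above the median on the log scale.
p_one_sigma = exceedance_probability(2.0, mu_log=0.0, sigma_log=math.log(2.0))
```

A map of such probabilities can support a decision even where the point estimate itself is too uncertain to be used deterministically.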

The uncertainty from prediction was used in a multi-criteria utility analysis to evaluate decisions on the remediation alternatives soil washing, phytoremediation, and no remediation. An overall utility was calculated from linear and non-linear utility functions for the criteria i) human health, ii) long-term productivity of soil, iii) cost of techniques, and iv) change of market value. When the single criteria were not weighted by a decision maker's preferences, 'human health' and 'long-term productivity of soil' were the decisive arguments for all remediation alternatives.

The uncertainty of the decision was quantified by the lognormal probability distribution of the predicted cadmium content. The expected utility for any true value (or hypothesis) beyond the prediction assesses the quality of the decision. Furthermore, the goodness of the decision (hit, false alarm, miss, or correct rejection) describes the deviation of the hypothesized from the predicted contamination. The evaluation revealed that soil washing was preferred within a distance of 300 m of the source of contamination, whereas no remediation was recommended beyond this distance. Phytoremediation was superior only for concentrations around 3 mg cadmium per kg soil.
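
The four outcome labels can be read in the usual signal-detection sense. The sketch below assumes that remediation is triggered when the predicted concentration reaches a threshold and that the hypothesized true value is judged against the same threshold; this decision rule, the function name, and the example values are assumptions for illustration, and the thesis defines its own decision types in Chapter 4.

```python
def classify_decision(predicted, true_value, threshold):
    """Label a decision by comparing prediction and true value to a threshold."""
    remediate = predicted >= threshold        # action suggested by the prediction
    contaminated = true_value >= threshold    # state of the (hypothesized) truth
    if remediate and contaminated:
        return "hit"
    if remediate:
        return "false alarm"
    if contaminated:
        return "miss"
    return "correct rejection"


# Hypothetical concentrations in mg/kg with a threshold of 2 mg/kg.
outcomes = [
    classify_decision(predicted=2.5, true_value=3.0, threshold=2.0),
    classify_decision(predicted=2.5, true_value=1.0, threshold=2.0),
    classify_decision(predicted=1.5, true_value=3.0, threshold=2.0),
    classify_decision(predicted=1.5, true_value=1.0, threshold=2.0),
]
```

Misses and false alarms are the two error types: a miss leaves contamination untreated, while a false alarm spends remediation effort where it was not needed.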

The investigated methods improve scaling and transmission of information

and data. They are suitable for a spatial delineation of varying demands in the

context of soil improvement measures and advanced decision-making.

Summary (translated from the German Zusammenfassung)

In promoting the sustainable development of the natural environment, decision-makers in environmental planning and politics face numerous decisions that entail interventions in ecosystems, landscapes, and other natural resources. The necessary decisions, however, often have to rely on uncertain and incomplete yet complex information. Incomplete data sets are analyzed for spatial questions and transformed across several spatial scales. Data analysis and data transformation then produce a high degree of uncertainty in the results and decisions. This thesis contributes to improving spatial data analysis, spatial interpolation, and the evaluation of heavy metals in soil, with the aim of reliable decision-making on the application of soil improvement measures.

For the spatial-statistical analysis of heavy metal concentrations in the topsoil of a Swiss agglomeration landscape (Furttal, Canton of Zurich), a kernel density regression is used. With this method, a three-dimensional visualization of spatial trends, clusters, and outliers can be calculated and displayed. The method thus closes a gap between data analysis on the one hand and the visualization of spatially interdependent parameters on the other. In addition, spatial interpolation with a Gaussian kernel density function is a suitable alternative to ordinary kriging when the spatial, stochastic structure of the underlying data shows a trend.

With a second data set (Dornach, Canton of Solothurn), the spatial distribution of cadmium is interpolated with lognormal ordinary kriging and a conditional simulation. The results give a geometric mean of 1.08 mg cadmium per kg soil with a standard deviation of 2.27 mg/kg. The geostatistical data analysis examines spatial trends and anisotropy, because the cadmium was emitted by a non-ferrous metal works and transported by wind into the topsoil. Taking the contamination pattern into account, however, does not significantly improve the spatial prediction. Despite the high mean uncertainty for the entire study area, the probabilities of exceeding threshold values can be predicted for every single point of the study area. Although the interpolation of cadmium does not yield reliable deterministic results, handling the uncertainties in the form of probability predictions leads to a usable and reliable result for the decision-maker.

In a further step, the uncertainties from the data analysis are integrated into the decision-making process. First, a multi-criteria utility analysis determines the overall utility of the remediation alternatives phytoremediation, soil washing, and 'no remediation' by calculating linear and non-linear utility functions. The functional utility of a cadmium reduction in soil was estimated for i) human health, ii) the long-term productivity of the soil, iii) the cost of the techniques, and iv) the change in the market price of the land. Without a weighting of the individual criteria by the decision-makers, human health and the long-term productivity of the soil are decisive arguments for or against applying the three remediation alternatives. The uncertainty of the decision, which can be quantified through a lognormal probability distribution of the predicted contamination, is modeled by the expected utility for every possible true value of the contamination beyond the prediction. Finally, the quality of the decision taken can be validated by the deviation of a hypothetical possible contamination from the calculated prediction.

The methods investigated present a transparent and consistent approach to handling uncertainties in data analysis and data use. The results presented can thus contribute to an improved spatial interpretation and representation of a wide range of demands in the context of soil improvement measures and decision-making.

Table of Contents

List of Figures
List of Tables

1 Introduction
1.1 Motivation
1.1.1 Soil improvement measures
1.1.2 Spatial management of sparse data
1.1.3 Geostatistical interpolation techniques
1.1.4 Decision making under uncertainty
1.2 Aim of this study
1.3 References

2 Explorative data analysis of heavy metal contaminated soil using multidimensional spatial regression
2.1 Introduction
2.2 Data
2.3 Interpolation methods
2.3.1 Mollifier approach
2.3.2 Geostatistical approach
2.4 Results
2.4.1 Comparing Mollifier with kriging interpolation
2.4.2 Explorative Data Analysis with the Mollifier approach
2.4.3 Including co-variables for spatial prediction
2.5 Discussion and Conclusion
2.6 References

3 Uncertainty assessment for management of soil contaminants with sparse data
3.1 Introduction
3.2 Data
3.3 Methods
3.3.1 Analysis of spatial dependency
3.3.2 Ordinary kriging
3.3.3 Geostatistical conditional simulation
3.3.4 Uncertainty assessment
3.4 Results
3.4.1 Spatial estimation
3.4.2 Uncertainty assessment
3.4.3 Relevance for decisions on soil remediation
3.5 Conclusion
3.6 Literature Cited
3.7 Annotations

4 Decision making under uncertainty in case of soil remediation
4.1 Introduction
4.2 The case of contaminated land
4.3 Probability knowledge on the contamination
4.4 Decision alternatives on soil remediation
4.5 Utilities of soil remediation
4.6 Utility functions for decision alternatives
4.6.1 Construction of utility functions
4.6.2 The alternative ex situ remediation (soil washing) Aex
4.6.3 The alternative in situ remediation (phytoremediation) Ain
4.6.4 The alternative no remediation Am
4.7 Results
4.7.1 The utility functions
4.7.2 The overall utility of remediation alternatives
4.7.3 The expected utility at coordinate x
4.7.4 Deviation from predicted to hypothesized contamination
4.8 Discussion
4.9 Conclusion
4.10 References

5 Conclusion
5.1 Multiple spatial regression improves data management
5.2 Predictability versus uncertainty
5.3 'Correct' decisions for soil improvement measures
5.4 Outlook

Acknowledgements (Danksagung)
Curriculum Vitae

List of Figures

Figure 1.1.1 Scaling processes in spatial data assessment
Figure 2.2.1 Area of the investigation and spatial distribution of measurements
Figure 2.3.1 One-dimensional Gaussian density for window size 0=1
Figure 2.3.2 Probability weight for an arbitrarily selected grid point
Figure 2.4.1 Comparison of Mollifier interpolation and lognormal kriging
Figure 2.4.2 Spherical variogram of logarithmic transformed cadmium
Figure 2.4.3 Cadmium depending on lead/zinc, lead/copper and chromium/nickel
Figure 2.4.4 Cadmium dependency on copper (x-axis) and zinc (y-axis)
Figure 2.4.5 Interpolated cadmium concentration and its uncertainty
Figure 2.4.6 Cadmium concentration depending on soil pH
Figure 2.4.7 Cadmium concentration depending on altitude and land use
Figure 3.2.1 Spatial distribution of sampling points in Dornach
Figure 3.3.1 Variogram map of the logarithmically transformed cadmium data
Figure 3.4.1 Anisotropic variograms in the directions D1-D4
Figure 3.4.2 Isotropic variogram
Figure 3.4.3 Spatial distribution of cadmium after conditional simulation
Figure 3.4.4 A south-north transection of one singular realization
Figure 3.4.5 Map of the likelihood of exceeding the threshold value C=2 mg/kg
Figure 4.2.1 Map of a cadmium contaminated land
Figure 4.3.1 Model for the true value of contamination
Figure 4.6.1 Two examples how to derive the ecological benefit from remediation
Figure 4.7.1 Utility functions derived for remediation options
Figure 4.7.2 Contribution of each utility dimension to the overall utility
Figure 4.7.3 Sum functions of utilities from four dimensions
Figure 4.7.4 Expected utility and feasible decisions
Figure 4.7.5 The evaluation of decision

List of Tables

Table 2.1 Statistical description of the heavy metal concentration
Table 2.2 Different types of kernel density functions
Table 2.3 Comparative descriptive statistics for predicted cadmium
Table 2.4 Multiple R² from multiple regression analysis
Table 2.5 Root mean square difference (RMSD) for Mollifier
Table 3.1 Statistic properties of the data
Table 3.2 Test of normality with a One-Sample Kolmogorov-Smirnov Test
Table 3.3 Legal threshold values for remediation of cadmium in soil
Table 3.4 Mean square error from cross validation for five different trends
Table 3.5 Geostatistical properties of the variograms
Table 3.6 Statistical properties of the interpolations (isotropic and anisotropic)
Table 3.7 Statistic properties of the interpolation for the core area of pollution
Table 4.1 Different types of decisions depending on the estimation z(x)
Table 4.2 Three principles for the construction of utility functions
Table 4.3 Mean probabilities of occurrence for decisions

1 Introduction

1.1 Motivation

Today, the growing population and increasing consumption of natural resources create a shortage of usable land, land quality, and biodiversity of ecosystems. In Switzerland, the settlement area increased by about 13 % from 1985 to 1997, while the area in agricultural use decreased over the same period. Soil resources are contaminated with heavy metals and other pollutants, and soil fertility is constrained in more than 9 % of these resources. With ongoing pollution, degradation, and destruction of soil resources, only 40 % of the naturally grown surface will have retained its ecological functionality in 2050 (Desaules 1998; Häberli et al. 2002). Clearly, these and other impacts on resources, ecosystems, and landscapes challenge environmental planners, policymakers, and other decision-makers. They are confronted with the demand to manage the natural environment in a way that takes into account the ecological, economic, and social aspects of development.

However, the decisions needed to support such development often have to rely on uncertain, incomplete, and complex information. As a result, sparse information has to be assessed, analyzed, and translated from micro spatial scales to macro spatial scales (and vice versa). Both data generation and scale translation create a high degree of uncertainty that increases throughout the investigation. Since decision-making is the final step after the assessment of impacts on agents, uncertainties may lead to decisions with unintended legal, medical, and economic consequences, e.g. when a pollutant accumulates in higher concentrations than predicted.

Reducing uncertainty with supplementary information, e.g. through additional data sampling, is a widely discussed issue (Scholz et al. 1994; Van Groenigen 2000; Wagner et al. 2001). A benefit from the information gain is not guaranteed, while auxiliary sampling incurs additional costs, which often exceed the financial budget of the project.

In order to cope with the uncertainties from data analysis and data interpretation within the context of soil improvement measures, this thesis focuses on

a) the usefulness of a data analysis method for land management purposes (Chapter 2),

b) the reliability of an interpolation method (Chapter 3), and

c) the trustworthiness of decisions under uncertainty (Chapter 4).

1.1.1 Soil improvement measures

Several research activities have addressed different types of remediation techniques for the spatial management of heavy metal polluted soil and land (Hammer 2002; Genske 2003; Sasek 2003). 'Harsh' remediation methods such as soil washing, thermal treatment, and electrokinetic removal irreversibly change soil structure and properties while removing the pollutant. Since these methods are costly, they are mostly applied to small areas with high contamination. For large areas with low contamination, 'harsh' remediation techniques are inefficient and not economically viable. As an alternative, the potential and feasibility of 'gentle' removal techniques have been studied intensively in recent years (Krebs 1996; Wu et al. 1999; Kayser et al. 2001). 'Gentle' soil remediation is defined as a remediation technique that does not constrain soil properties. Two variants are distinguished:

a) methods that aim to immobilize heavy metals in soil through the addition of binding agents (Wenger 2000), and

b) phytoremediation, which uses the ability of heavy metal tolerant plants to extract heavy metals from the soil and translocate them to the aboveground parts of the plant (Kayser 2000).

Comprehensive studies concluded that phytoremediation is suitable for large areas with small and widely spread amounts of heavy metal contamination (McGrath et al. 2002). Moreover, the method is appropriate where a decrease in contamination improves soil fertility. Hepperle (2002) points out that phytoremediation has a precautionary character. Soil remediation that reduces the heavy metal content to concentrations at which all soil functions are maintained saves monitoring costs and guarantees a multifunctional use of soil and land. Hence, the cultivation of plants that accumulate heavy metals from the soil is suitable for improving the quality of soil and land, as also defined by McGrath, Zhao et al. (McGrath et al. 2002) and discussed by Gupta and Wenger (Gupta et al. 2002).

1.1.2 Spatial management of sparse data

Before soil improvement measures are implemented, the scaling of information about the impact on soil is a main issue. Land use and soil management studies scale data from sampling at single points (pedon in Figure 1.1.1) to an area-wide distribution of data. The information is then translated to the scale of decisions and consequences. Management models focus on decisions and functionality instead of detailed modeling of single processes. They try to give reasonable answers with the help of empirical relations and a low degree of complexity (Hoosbeck et al. 1992). The transmission of knowledge from a detailed micro scale to a broad macro scale of policy requirements and decision-making therefore requires indicators, spatial inter- and extrapolations, and evaluations (Dumanski et al. 1998; Hoosbeck et al. 1998). Figure 1.1.1 shows three different levels of scaling, e.g. for a land management study that aims at a decision on the application of soil improvement measures. Soil data sampling is conducted here on an area of 10 square meters (composite sampling). Heavy metals in soil can have a direct environmental impact on the scale¹ of a three-dimensional natural body that represents the natural variability (a pedon).

An interpolation assesses the spatial distribution of heavy metals at the field scale. The "scale of decision", which evaluates economic and social aspects besides the environmental impact on soil, is understood as the macro scale or community scale.

¹ Here, the term scale is not used in the cartographic sense, which defines a large scale as a very small area. The term does embrace larger and smaller spatial areas, but also indicates a hierarchy and aggregation of variables within a landscape and is used similarly to the term level (Addiscott, T.M. 1998).

[Figure 1.1.1: diagram not recoverable from the extraction. Legible labels: scale levels "pedon", "field", and "community"; "environmental impact"; "evaluation"; axis "degree of computation" from qualitative to quantitative.]

Figure 1.1.1 Scaling processes in spatial data assessment and context of soil improvement measures. A pedon is understood as a three-dimensional natural body large enough to represent its nature and variability, as defined by Hoosbeck and Bryant (1993). The degree of complexity explains the relation between variables on this level. Arrows indicate the research chain and the transfer of knowledge.

In most European countries, data are stored in land registers, soil databases, and analog or digital maps. These sources differ in quality, collection standards, and spatial geometry (Burrough et al. 1998). Therefore, they rarely provide consistent, comprehensive, and area-wide data on soil and land use. International and national land use and soil databases, for example those provided by the Dutch National Institute of Public Health and the Environment (RIVM 1998) or the CORINNE land cover data (European Environmental Agency 2000), store data with a common classification and geometry. However, for local and regional studies these data are too sparse. They do not offer the degree of detail and spatial reference needed to evaluate objectives for the sustainable development of land and soil.

Hence, a great challenge in the management of land quality and other environmental issues is the lack of consistent, homogeneously and densely sampled data for inter- or extrapolation, evaluation, and finally decision-making. As a consequence, upscaling from one level to another causes both a loss of information and an increase in uncertainty that accumulates from different data sources and methods. The final decision based on data scaling and interpretation is then likely to bear high risks of errors, negative consequences, and other drawbacks.

A crucial first step in data assessment is to determine the range of uncertainty that stems from sampling techniques and chemical soil analysis. When soil data are gathered, sampling and sample investigation, such as chemical analysis, are the main sources of uncertainty (Lischer 2001; Wagner et al. 2001). In particular, soil parameters show natural heterogeneity as well as spatial autocorrelation. The latter is taken into account when scaling is conducted with geostatistical interpolation methods.

1.1.3 Geostatistical interpolation techniques

Geostatistical methods were originally proposed by Krige, to predict the metal concentration in blocks of rock in South African gold mines, and by Matheron at the French mining school (Matheron 1963; Krige 1966). In the 1980s, soil scientists discovered that geostatistical interpolation, so-called kriging, is well suited to analyzing and predicting the spatial heterogeneity of soil and soil properties (Burgess et al. 1980; McBratney et al. 1981; Yost et al. 1982). The underlying theory of regionalized variables assumes a spatial model with three components (Journel et al. 1978): a) a structural component with a constant mean or a trend, b) a random component that is spatially correlated, and c) a spatially uncorrelated random noise. However, in environmental case studies the spatial distribution of a soil pollutant is often induced by point sources (such as industrial plants or dumps) and by transport through a medium such as air or water. As a result, concentrations of contaminants vary with distance from the source and do not fulfill the assumption of a random component. Consequently, several studies use non-linear methods such as disjunctive kriging (von Steiger et al. 1996), indicator kriging (Barabas et al. 2001; Goovaerts 2001), and lognormal kriging (Saito et al. 2000) to correctly predict the spatial distribution of contaminants in soil. Additionally, the consideration of trend models in geostatistical prediction is widely discussed (Crawford et al. 1997; Oliver et al. 2001; Saito et al. 2001).
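
The regionalized-variable model with its components a) to c) is commonly written as shown below. The symbols follow common textbook notation and are assumed here rather than taken from the thesis: m(x) is the mean or trend, ε'(x) the spatially correlated random component, and ε'' the uncorrelated noise. The second line gives the semivariogram, which describes the spatial correlation when the mean is constant.

```latex
% Regionalized variable at location x: trend + correlated part + noise
Z(\mathbf{x}) = m(\mathbf{x}) + \varepsilon'(\mathbf{x}) + \varepsilon''

% Semivariogram for lag h (constant mean assumed)
\gamma(\mathbf{h}) = \tfrac{1}{2}\,\mathrm{E}\!\left[\bigl(Z(\mathbf{x}) - Z(\mathbf{x}+\mathbf{h})\bigr)^{2}\right]
```

When m(x) varies systematically with distance from a point source, the differences Z(x) - Z(x+h) mix trend and random variation, which is one reason why trend models are discussed for such data.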

1.1.4 Decision making under uncertainty

At the end of an environmental case study, the information gathered throughout the investigation is used to support a rational choice among different remediation alternatives. Moreover, decision-makers have to deal with uncertainties about the state of the real world and are faced with probabilistic information, e.g. about the concentration of contaminants. To achieve a 'good' decision, decision-makers trade uncertainties against the consequences of the decision. Keeney and Raiffa (Keeney et al. 1976) defined the paradigm of decision-making in five steps:

a) the pre-analysis shapes the problem to be decided and identifies the various alternatives,

b) the structural analysis asks for the information that is needed for the decision and formalizes the structure of the decision and its choices within a decision tree,

c) the uncertainty analysis assigns probability knowledge from the empirical data analysis and the subjective judgments of the decision maker to the branches of the decision tree,

d) the utility analysis associates consequences (utilities) with the implications of a decision, and finally

e) the optimization analysis calculates and maximizes the expected utility.
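
Steps c) to e) can be sketched in a few lines: probabilities are attached to the possible states, each alternative receives a utility per state, and the alternative with the highest expected utility is selected. The states, probabilities, and utility values below are hypothetical and serve only to show the calculation; they are not results from the thesis.

```python
def expected_utility(probabilities, utilities):
    """Probability-weighted sum of the utilities over all states."""
    return sum(p * u for p, u in zip(probabilities, utilities))


def best_alternative(probabilities, utility_table):
    """Return the alternative with the highest expected utility and all scores."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("state probabilities must sum to 1")
    scores = {
        name: expected_utility(probabilities, utilities)
        for name, utilities in utility_table.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical states: low, medium, and high contamination at one location.
probabilities = [0.5, 0.3, 0.2]

# Hypothetical utilities of each alternative in each state (0 = worst, 1 = best).
utility_table = {
    "soil washing": [0.2, 0.5, 0.9],
    "phytoremediation": [0.4, 0.7, 0.5],
    "no remediation": [0.9, 0.3, 0.0],
}

best, scores = best_alternative(probabilities, utility_table)
```

With these numbers, low contamination is the most likely state, so the alternative that performs best in that state obtains the highest expected utility; shifting probability toward high contamination would change the ranking.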

The construction of utilities for decision alternatives provides an information

gain and hence increases the ability to make a correct or "good" decision

(Papazoglou et al. 2000; Schmoldt et al. 2001). Although utilities are subjective

measures, they can be seen as a compromise between accurate scientific

knowledge and concise information (von Winterfeldt et al. 1986; Beinat 1997).

However, the set of attributes assigned for different utility functions has to be

sufficient and relevant with respect to the objectives of the study (Scholz et al.

2002).

1.2 Aim of this study

The overall aim of this thesis is to improve the spatial data analysis, prediction, and evaluation of sparse data for the purpose of land and soil use management. In particular, the following questions were addressed in three publications:

• Does the use of additional co-variables such as pH, soil use, or land quality indicators improve analysis and interpolation based on kernel density regression? (Chapter 2)

• What are the differences between the spatial prediction of cadmium concentrations with a kernel density regression and with kriging? (Chapter 2)

• How can distant, irregularly distributed data be efficiently interpolated for environmental management purposes? (Chapter 3)

• What kind of uncertainty assessment is appropriate in practical applications, and how can the uncertainty in an interpolation be assessed accordingly? (Chapter 3)

• What are the opportunities and limits of the interpolation method and its resulting uncertainty for practical purposes? (Chapter 3)

• How can uncertain spatial data on soil contamination be used in a decision-making process on soil remediation methods? (Chapter 4)

• How can a multi-criteria utility be derived for the evaluation of remediation alternatives, including the cost of remediation, impacts on human health, agricultural productivity, and economic gain? (Chapter 4)


1.3 References

Barabas, N., Goovaerts, P. and Adriaens, P. (2001) Geostatistical assessment and validation of

uncertainty for three-dimensional dioxin data from sediments in an estuarine river.

Environmental Science and Technology 35: 3294-3301.

Beinat, E. (1997) Value functions for environmental management, Environment & Management, 7.

Dordrecht Boston London, Kluwer. 241 pp.

Burgess, T.M. and Webster, R. (1980) Optimal interpolation and isarithmic mapping of soil

properties. Journal of Soil Science 31: 315-331.

Burrough, P.A. and McDonnell, R.A. (1998) Principles of geographical information systems. In:

Burrough, P. A., Goodchild, M. F., McDonnell, R. A. et al., Spatial information systems and

geostatistics. New York, Oxford University Press. 333 pp.

Crawford, C.A.G. and Hergert, G.W. (1997) Incorporating spatial trends and anisotropy in

geostatistical mapping of soil properties. Journal of Soil Science Society of America 61,1:

298-309.

Desaules, A. (1998) Nationale Bodenbeobachtung (NABO). Informationshefte Raumplanung 4: 8-

10.

Dumanski, J., Pettapiece, W.W. and McGregor, R.J. (1998) Relevance of scale dependent

approaches for integrating biophysical and socio-economic information and development of

agroecological indicators. Nutrient Cycling in Agroecosystems 50: 13-22.

European Environmental Agency (2000) CORINNE Land Cover. European Topic Centre on

Terrestrial Environment, available at:

http://dataservice.eea.eu.int/dataservice/othersources.asp.

Genske, D. (2003) Urban land: Degradation, investigation, remediation. Berlin, Springer. 331 pp.

Goovaerts, P. (2001) Geostatistical modelling of uncertainty in soil science. Geoderma 103: 3-26.

Gupta, S.-K., Wenger, K., Tietje, O. and Schulin, R. (2002) Von der Idee der sanften Sanierung

zum nachhaltigen Umgang mit schwermetallbelasteten Böden. Terra Tech 2: 53-54.

Häberli, R., Gessler, R., Grossenbach-Mansuy, W. and Lehmann Pollheimer, D. (2002) Vision

Lebensqualität. Zürich, vdf Hochschulverlag. 345 pp.

Hammer, D. (2002) Application of phytoextraction to contaminated soils: Factors, which influence

heavy metal phytoavailability. Ph D Thesis, Ecole Polytechnique Fédérale de Lausanne.

169 pp.

Hepperle, E. (2002) Nachhaltiger Umgang mit Böden in der Schweiz: Rechtliche Instrumente.

Terra Tech 2: 49-52.

Hoosbeek, M.R. and Bouma, J. (1998) Obtaining soil and land quality indicators using research

chains and geostatistical methods. Nutrient Cycling in Agroecosystems 50: 35-50.

Hoosbeek, M.R. and Bryant, R.B. (1992) Towards the quantitative modeling of pedogenesis - a

review. Geoderma 55: 183-210.

Journel, A.G. and Huijbregts, C.J. (1978) Mining Geostatistics. London, Academic Press. 599 pp.


Kayser, A. (2000) Evaluation and enhancement of phytoextraction of heavy metals from

contaminated soils. Ph D Thesis, Swiss Federal Institute of Technology Zurich. 149 pp.

Kayser, A., Schröder, T.J., Grünwald, A. and Schulin, R. (2001) Solubilization and plant uptake of

zinc and cadmium from soils treated with elemental sulfur. International Journal of

Phytoremediation 3,381-400.

Keeney, R.L. and Raiffa, H. (1976) Decisions with multiple objectives: Preferences and value

tradeoffs. New York, John Wiley & Sons. 505 pp.

Krebs, R. (1996) In situ immobilization of heavy metals in polluted agricultural soil - an approach to

gentle soil remediation. Ph D Thesis, Swiss Federal Institute of Technology. 110 pp.

Krige, D.G. (1966) Two-dimensional weighted moving average trend surfaces for ore-evaluation.

Journal of the South African Institute of Mining and Metallurgy 66: 13-38.

Lischer, P. (2001) Quantifying uncertainty of the reference sampling procedure used at Dornach

under different soil conditions. The Science of the Total Environment 264: 119-126.

Matheron, G. (1963) Principles of geostatistics. Economic Geology 58: 1246-1266.

McBratney, A.B., Webster, R. and Burgess, T.M. (1981) The design of optimal sampling schemes

for local estimation and mapping of regionalized variables. Computers & Geoscience 7,4:

331-334.

McGrath, S.P., Zhao, F.J. and Lombi, E. (2002) Phytoremediation of metals, metalloids, and

radionuclides. Advances in Agronomy 75:1-56.

Oliver, M.A., Frogbrook, Z.L., et al. (2001) Exploring the spatial variation in wheat quality using

geostatistics. Aspects of Applied Biology 64: 207-208.

Papazoglou, I.A., Bonanos, G. and Briassoulis, H. (2000) Risk informed decision making in land

use planning. Journal of Risk Research 3,1: 69-92.

RIVM (1998). RIVM Environmental research 1998: World regions and sub regions. Bilthoven.

Saito, H. and Goovaerts, P. (2000) Geostatistical interpolation of positively skewed and censored

data in a dioxin-contaminated site. Environmental Science and Technology 34: 4228-4235.

Saito, H. and Goovaerts, P. (2001) Accounting for source location and transport direction into

geostatistical prediction of contaminants. Journal of Environmental Science and

Technology 35: 4823-4829.

Sasek, V. (2003) The utilization of bioremediation to reduce soil contamination: problems and

solutions, NATO science series. Series IV, Earth and Environmental Sciences, 19.

Dordrecht, Kluwer. 417 pp.

Schmoldt, D.L., Kangas, J. and Mendoza, G.A. (2001) Basic principles of decision making in natural

resources and the environment. In: Schmoldt, D. L., Kangas, J. and Mendoza, G. A., The

analytic hierarchy process in natural resource and environmental decision making.

Dordrecht Boston London, Kluwer. 3: 1-13.

Scholz, R.W., Nothbaum, N. and May, T.W. (1994) Fixed and hypothesis-guided soil sampling

methods - principles, strategies, and examples. In: Markert, B., Environmental sampling for

trace analysis. Weinheim, VCH. pp. 335-345.


Scholz, R.W. and Tietje, O. (2002) Embedded case study methods. Thousand Oaks, London, New

Delhi, Sage Publications. 392 pp.

Van Groenigen, J.W. (2000) The influence of variogram parameters on optimal sampling schemes

for mapping by kriging. Geoderma 97: 223-236.

VBBo (1998). Verordnung über Belastungen des Bodens vom 1. Juli 1998 (Swiss Ordinance on the

Pollution of Soils). Bern,

von Steiger, B., Webster, R., Schulin, R. and Lehmann, R. (1996) Mapping heavy metals in

polluted soil by disjunctive kriging. Environmental Pollution 94,2: 205-215.

von Winterfeldt, D. and Edwards, W. (1986) Decision analysis and behavioral research, Cambridge

University Press. 604 pp.

Wagner, G., Desaules, A., et al. (2001) Harmonisation and quality assurance in pre-analytical steps

of soil contamination studies - conclusions and recommendations of the CEEM Soil project.

The Science of the Total Environment 264:103-117.

Wenger, K. (2000) Ligand-assisted phytoextraction of heavy metals from contaminated soils using

common crop plants. Ph D Thesis, Swiss Federal Institute of Technology Zurich. 130 pp.

Wu, J., Hsu, F.C. and Cunningham, S.D. (1999) Chelate-assisted Pb phytoextraction:Pb

availability, uptake and translocation constraints. Environmental Science and Technology

33:1898-1904.

Yost, S.R., Uehara, G. and Fox, R.L. (1982) Geostatistical analysis of soil chemical properties of

large land areas. I. Semi-variograms. Soil Science Society of America 46:1028-1032.


2 Explorative data analysis of heavy metal contaminated soil

using multidimensional spatial regression

Ute Schnabel and Olaf Tietje (2003) Environmental Geology 44 (8) 893-904

ABSTRACT/To obtain data on heavy metal contaminated soil requires

laborious and time-consuming data sampling and analysis. Not only has the

contamination to be measured, but also additional data characterizing the soil and

the boundary conditions of the site, such as pH, land use, and soil fertility. For an

integrative approach, combining the analysis of spatial distribution, and of factors

influencing the contamination, and its treatment, the Mollifier interpolation was

used, which is a non-parametric kernel density regression. The Mollifier was

capable of including additional independent variables (beyond the spatial

dimensions x and y) in the spatial interpolation and hence explored the combined

influence of spatial and other variables, such as land use, on the heavy metal

distribution. The Mollifier could also represent the interdependence between

different heavy metal concentrations and additional site characteristics. Although

the uncertainty measure supplied by the Mollifier at first seems somewhat unusual, it

is a valuable feature and supplements the geostatistical uncertainty assessment.


2.1 Introduction

Heavy metal soil pollution, for example with cadmium, copper, and zinc, has

led to an enormous number of investigations studying the spatial distribution of

heavy metal concentrations, statistical relationships between soil properties and

soil use or deriving models for heavy metal behavior in soil, subsoil, and plant

compartments. The spatial distribution of the contaminants is mostly analyzed by

geostatistical methods (Goovaerts et al., 1997; Papritz and Moyeed, 1999;

Schloeder et al., 2001; von Steiger et al., 1996). Other statistical analyses often

apply ANOVA (Brandvold and McLemore, 1998; Yun et al., 2000; Hoosbeek et al.,

1998) and factor analysis (Davies, 1997; Korre, 1999). Models for heavy metals in

soil often rely on empirical relationships between different variables (Diels et al.,

1996; Filius et al., 1998; Krebs et al., 1998). Data analysis for heavy metal

contaminated soil management includes modeling and estimation techniques

(Batjes et al., 1997), and soil evaluation procedures (soil quality assessment)

(Fraters and van Beurden, 1993; Stein and Staritsky, 1995; Tiktak et al., 1998). In

this context Keyzer and Sonneveld (1997) and Albersen and Keyzer (1998)

proposed the Mollifier method for a combined spatial and statistical soil data

analysis including management-related issues such as the integration of

biophysical and socio-economic data. The approach applies a Nadaraya-Watson

kernel density function (Härdle, 1990) as a statistical data smoothing in multiple

dimensions. The kernel density function determines the weights for the

interpolation. It uses the window size (or bandwidth) as smoothing parameter to fit

the function to the data. The Mollifier is a non-parametric method and - applied for

spatial interpolation - an alternative to ordinary kriging or - in case of combining

the spatial dimensions with additional co-variables - to co-kriging. Albersen and

Keyzer (1998) implemented the Mollifier interpolation and its visualization within

the SAS environment and demonstrated its usefulness when estimating soil

erosion (Keyzer and Sonneveld, 1997) by using a dataset for the universal soil

loss equation (Keyzer and Sonneveld, 2001). The method is capable of

investigating smooth functional relations between the independent and dependent

variables. However, Keyzer and Sonneveld (2001) emphasize that the derived

regression model for interpolation is strongly data driven. Moreover, Voortman and


Brouwer (2001) used the Mollifier method in combination with a parametric model

estimation based on autocorrelation and emphasized that the spatial variability of

millet yield is better explained with a spatial autocorrelation function, rather than

with a regression of variables only. Nevertheless, they considered the Mollifier

regression as an essential method for the identification of the functional form,

which fits the data best.

In this investigation the proposed Mollifier method is applied to spatially

distributed heavy metal soil data and compared with a standard kriging

interpolation. The Mollifier method is implemented in a Delphi® software

environment (Borland, 2001) and is extended to cope with different window sizes

for different co-variables. Additionally this implementation is able to visualize the

differences between the interpolation and the individual measurements and

calculates the root mean squared difference for the set of measurements as a

whole (Tietje, 2003). The study shows the potentials and limits of the Mollifier

interpolation by discussing the following issues:

• Does smoothing and mollifying help to analyze the dependence of cadmium

on other heavy metals and on variables known to influence the

contamination, such as soil pH?

• Does the use of co-variables like pH, soil use or a land quality indicator

improve the Mollifier interpolation of cadmium concentrations in soil?

• What are the differences between predicting cadmium concentrations with

the Mollifier and with Kriging?

2.2 Data

The study is based on a data set of heavy metal measurements in topsoil.

The site is located in the agglomeration area in the north west of Zurich

(Switzerland). Different types of land use like agriculture, pasture, forest, dwelling

and a few industrial sites characterize it. The main sources of the heavy metal

content in soil were the application of fertilizer in intensive market gardening,

winegrowing (former use) and private gardening. Local points of high pollution

were found at old industrial places like foundries or dumps (Gsponer, 1996). The

data set was gathered in 1992 and 1993 (Meuli, 1997). It is based on the nested


sampling design proposed by Youden and Mehlich (1937) (compare Figure 2.2.1).

At every point, soil was taken at a depth of 25 cm from five different locations and

combined into a composite sample later in the laboratory. Most of the

measurements were located at a height around 400 m above sea level; the

highest point reached 619 m altitude.

[Figure 2.2.1: map of the sampling locations; coordinates printed at the map corners: 672200 / 253508 and 675807 / 258464]

Figure 2.2.1 Area of the investigation and spatial distribution of measurements in Furttal, Canton Zurich, Switzerland


Statistical parameter     Cadmium   Copper    Zinc      Lead      Chromium  Nickel    pH
(n = 299)                 [mg/kg]   [mg/kg]   [mg/kg]   [mg/kg]   [mg/kg]   [mg/kg]
Mean                      0.36      37.5      66        24        23.3      27.3      6.56
Geometric mean            0.31      25        59.4      22.6      22        25.8      6.44
Variance                  59.2      3152      4126      219.8     68.5      71.4      1.18
Minimum                   0.06      4.4       18.4      9.8       8.2       6.2       3.33
Maximum                   2.7       587.7     1025      215.7     50.8      53.9      7.60
Kurtosis                  35.6      43.9      174.2     117.8     0.78      -0.08     1.77
Skewness                  4.6       5.9       12.3      10        0.99      0.15      -1.7
Guide value (VBBo, 1998)  0.8       40        150       50        50        50        -

Table 2.1 Statistical description of the heavy metal concentration. The guide value indicates the concentration in top soil beyond which negative impacts on soil fertility are assumed (Gysi et al., 1991; VBBo, 1998)
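The descriptive statistics in Table 2.1 can be reproduced along the following lines. This is only a sketch: the thesis does not state which estimators were used, so the sample variance (divisor n - 1) and the moment-based skewness and excess kurtosis below are assumptions, and the function name `describe` is hypothetical.

```python
import math


def describe(values):
    """Descriptive statistics for a list of positive concentrations.

    Assumed estimators (not stated in the source): sample variance with
    divisor n - 1, moment-based skewness, and excess kurtosis (normal = 0).
    """
    n = len(values)
    mean = sum(values) / n
    # Geometric mean via the mean of the logarithms (requires values > 0).
    geometric_mean = math.exp(sum(math.log(v) for v in values) / n)
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    # Central moments with divisor n for the shape statistics.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "geometric_mean": geometric_mean,
        "variance": variance,
        "minimum": min(values),
        "maximum": max(values),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3.0,
    }
```

A strongly positive skewness together with a geometric mean well below the arithmetic mean, as reported for cadmium, copper, zinc and lead, is the usual motivation for a logarithmic transformation before variogram fitting.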

For the interpolation with co-variables, the local office for regional planning and surveying provided spatially distributed data on pH, land use and the soil quality indicator (Amt für Raumordnung und Vermessung, 1996; Jäggli et al., 1998).

2.3 Interpolation methods

2.3.1 Mollifier approach

The Mollifier interpolation approach (Albersen and Keyzer, 1998) uses a

kernel density regression (Nadaraya, 1964; Watson, 1964) for smoothing spatially

distributed data. The regression is

y = R(x) + e (1)

where e is an error term with mean zero, bounded variance, and unknown density.

The regression R(x) is calculated as a weighted mean of the S measurements $y^1, \dots, y^S$:

$R(x) \approx y_\theta(x) = \sum_{s=1}^{S} y^s P_\theta^s(x)$ (2)

For a given value of x the weighting function $P_\theta^s(x)$ determines the weight with which the measurement $y^s$ contributes to the overall estimation of the

dependent variable y at the point x. For spatial interpolation x is the (in our case

two-dimensional) vector of coordinates. But due to the general approach x might

be any multidimensional vector of independent variables. This general approach to


calculate the weights uses a kernel density function (Nadaraya, 1964; Watson, 1964) such as a one-dimensional Gaussian kernel density (XploRe, 1999):

$\Psi(u) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2}u^2\right)$ (3)

with $u = (x - x^s)/\theta$ and $\theta$ a scaling factor (window size).

The kernel density (see Figure 2.3.1) has its maximum at u = 0 (i.e. when the weight is calculated at a value of x for which a measurement $x^s$ is available), decreases with increasing distance to a measurement, and depends on the window size $\theta$. Further characteristics of kernel regression methods and additional kernel density functions can be found in Albersen and Keyzer (1998) and Härdle (1990) as well as on the world wide web (XploRe, 1999). The resulting weight of measurement $y^s$ is

$P_\theta^s(x) = \Psi\big((x - x^s)/\theta\big) / W_\theta(x)$ if $W_\theta(x) > 0$, and $P_\theta^s(x) = 0$ otherwise (4)

The denominator ensures that all the weights sum up to 1:

$W_\theta(x) = \sum_{s=1}^{S} \Psi\big((x - x^s)/\theta\big)$ (5)

Figure 2.3.1 One-dimensional Gaussian kernel density for window size $\theta = 1$ (horizontal axis: distance to a measurement)
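The kernel regression of Equations (2) to (5) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the MolliD implementation: the function names are hypothetical, and for more than one independent variable the sketch assumes a product of one-dimensional Gaussian kernels with one window size per variable, which the text does not spell out.

```python
import numpy as np


def gaussian_kernel(u):
    """One-dimensional Gaussian kernel density, Equation (3)."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)


def mollifier_weights(x, xs, theta):
    """Weights of all S measurements at the point x, Equations (4) and (5).

    x     : point of estimation, shape (d,)
    xs    : measurement locations, shape (S, d)
    theta : window size, scalar or one value per dimension (assumption)
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    xs = np.asarray(xs, dtype=float).reshape(len(xs), -1)
    u = (x - xs) / np.asarray(theta, dtype=float)
    # Assumed multidimensional form: product of 1-D kernels over dimensions.
    k = np.prod(gaussian_kernel(u), axis=1)
    total = k.sum()  # denominator of Equation (4)
    if total <= 0.0:
        return np.zeros(len(xs))
    return k / total


def mollifier_interpolate(x, xs, ys, theta):
    """Weighted mean of the measurements ys at the point x, Equation (2)."""
    return float(np.dot(mollifier_weights(x, xs, theta), ys))
```

Calling `mollifier_interpolate` for every node of a regular grid yields the interpolated "blanket"; passing a vector for `theta` is one way to use different window sizes for spatial coordinates and co-variables.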

The Mollifier interpolation determines a soft blanket between the minimal and

maximal measurements. The kind of density function and the window size

determine the degree of smoothing or "mollifying" of the interpolation. The optimal

window size depends on the density of sample points (see Equations 4 and 5). The


Mollifier approach also provides an uncertainty measure for the interpolation at every estimation point. The likelihood ratio (Albersen and Keyzer, 1998)

$\frac{1}{S\,\Psi(0)} \sum_{s=1}^{S} \Psi\big((x - x^s)/\theta\big)$ (6)

with $\Psi$ the kernel density of Equation (3), estimates the probability of y(x) coinciding with any of the measurements $y^s$. The Mollifier interpolation requires, for each point y(x) to be interpolated and each measurement $y^s$, the calculation of the weight according to Equation (4). For an arbitrarily selected grid point ($x_1$ = 673,985 m, $x_2$ = 255,900 m, interpolated cadmium concentration c = 862.98 µg/kg) the weights $P_\theta^s(x)$ are shown in Figure 2.3.2.

[Figure 2.3.2: measured cadmium concentration (vertical axis, 0 to 3000) plotted against the probability weight P(x, theta) (horizontal axis, 0 to 0.2); legend: measured, interpolated, lower, upper]

Figure 2.3.2 Probability weight for an arbitrarily selected grid point ($x_1$ = 673,985 m, $x_2$ = 255,900 m, interpolated cadmium concentration c = 862.98 µg/kg). The weights $P_\theta^s(x)$ are shown, which are used for the interpolation according to Equation (2). The uncertainty $Q_\theta(x,a)$ (see Equation 7) of the interpolation is measured as the sum of the probability weights of those measurements whose difference to the interpolated value is larger than |a| (with a defined as 200 µg/kg; in this case $Q_\theta(x,a)$ = 0.263)


The sum of the weights of measurements outside a given range (see Figure 2.3.2, where a equals 200 µg/kg) estimates the uncertainty of the interpolation (Albersen and Keyzer, 1998):

$Q_\theta(x,a) = \sum_{s \in S(x,a)} P_\theta^s(x)$ with $S(x,a) = \{\, s : |y^s - y_\theta(x)| > a \,\}$ (7)
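The uncertainty measure of Equation (7) follows directly from the interpolation weights. The sketch below is self-contained and makes the same assumptions as any Gaussian kernel regression: the function name is hypothetical, and a Gaussian product kernel is assumed for more than one dimension. The normalising constant of the kernel is omitted because it cancels in the weights.

```python
import numpy as np


def mollifier_uncertainty(x, xs, ys, theta, a):
    """Return the interpolated value and Q(x, a) as in Equation (7).

    Q(x, a) is the sum of the weights of those measurements that differ
    from the interpolated value by more than a.
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    xs = np.asarray(xs, dtype=float).reshape(len(xs), -1)
    ys = np.asarray(ys, dtype=float)
    u = (x - xs) / np.asarray(theta, dtype=float)
    # Unnormalised Gaussian kernel values (assumed product form over dimensions).
    k = np.exp(-0.5 * np.sum(u ** 2, axis=1))
    w = k / k.sum()
    y_hat = float(np.dot(w, ys))
    outside = np.abs(ys - y_hat) > a
    return y_hat, float(w[outside].sum())
```

A value of Q close to 0 means that nearly all the weight comes from measurements within the tolerance a of the interpolated value; a value close to 1 means the estimate is an average of measurements that individually disagree with it.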

2.3.2 Geostatistical approach

The geostatistical model of spatial variability assumes that the sampling S is one realization of a stochastic process (Cressie, 1993; Webster and Oliver, 2001). The measurements $y(x_s)$ indicate the heavy metal concentration at data coordinates $x_1, \dots, x_S$ and differ in dependence on the distance vector h (Matheron, 1963). The spatial dependency is shown in the variance model (semivariogram) by calculating half the variances $\gamma(h)$ of concentration differences $y(x) - y(x+h)$ of all measurement pairs with the same distance:

$\gamma(h) = \frac{1}{2n} \sum_{i=1}^{n} \big[ y(x_i) - y(x_i + h) \big]^2$ (8)

where h is the (e.g. Euclidean) distance, n the number of pairs and h > 0.
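As an illustration of the empirical semivariogram, the following sketch groups all measurement pairs into distance classes and computes half the mean squared difference per class. The binning into classes and the function name are assumptions made for the example; with irregularly spaced data, pairs rarely share exactly the same distance, so some grouping of this kind is needed in practice.

```python
import numpy as np


def empirical_semivariogram(coords, values, bin_edges):
    """Half the mean squared difference of all pairs per distance class.

    coords    : measurement coordinates, shape (S, d)
    values    : measured values, shape (S,)
    bin_edges : increasing distance class limits, e.g. [0, 250, 500, ...]
    Returns (gamma per class, number of pairs per class).
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)  # every pair exactly once
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sqdiff = (values[i] - values[j]) ** 2
    gammas, counts = [], []
    for lower, upper in zip(bin_edges[:-1], bin_edges[1:]):
        in_class = (dist > lower) & (dist <= upper)
        n = int(in_class.sum())
        counts.append(n)
        gammas.append(sqdiff[in_class].sum() / (2 * n) if n else float("nan"))
    return np.array(gammas), np.array(counts)
```

For skewed data such as the cadmium concentrations, the values passed in would typically be the logarithms of the measurements.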

A variogram model is fitted to the derived empirical semivariogram; in our case of cadmium data (see Table 2.1) this is a spherical function (Armstrong, 1998; Geovariances, 1997; Webster and Oliver, 2001):

$\gamma(h) = c_0 + c_1 \left[ \frac{3h}{2a} - \frac{1}{2}\left(\frac{h}{a}\right)^3 \right]$ for $h \le a$, and $\gamma(h) = c_0 + c_1$ otherwise (9)

where h is the distance between two points in two dimensions and a is the range, the distance of spatial dependency. Other functions (such as Gaussian or exponential) are discussed, for instance, in Cressie (1993), Journel and Huijbregts (1978), Matheron (1963, 1973) and Tietje and Richter (1992). At small distances any discontinuity of $\gamma(h)$ is shown by the nugget effect ($c_0$), that is, where the variogram function (see Figure 2.4.2) intercepts the ordinate. This might be

caused by a lack of data (sparse data sampling) or high local heterogeneity of the


variable (Webster and Oliver, 2001). The interpolation method of ordinary kriging is a weighted average for every estimation point:

$y^*(x_0) = \sum_{s=1}^{n} \lambda_s\, y(x_s)$ (10)

where $y^*(x_0)$ is the estimated value at the coordinate $x_0$, $y(x_s)$ is the measured value at $x_s$, and the $\lambda_s$ are the weights determined by minimization of the variance of the interpolation (Cressie, 1993; Goovaerts, 1997; Journel, 1989). The uncertainty of the interpolation is expressed by the kriging variance $\sigma_K^2(x_0)$, which is

$\sigma_K^2(x_0) = \sum_{s=1}^{n} \lambda_s\, \gamma(x_s, x_0) + \psi(x_0)$ (11)

where $\lambda_s$ are the weights distributed in terms of the data model and $\psi$ is the Lagrangean multiplier (Armstrong, 1998; Goovaerts, 2001). The kriging variance is a weighted sum of the semivariances $\gamma(x_s, x_0)$ between the estimation point $x_0$ and the measured points $x_s$. All weights sum up to one and are determined by minimizing the estimation variance with the help of Lagrangean multipliers. Assuming that measurements near an estimation point are more strongly correlated with it than those more distant, the nearer measurements carry a higher weight. The nugget effect influences the distribution of weights in such a way that more distant measurements carry a slightly larger weight than they would do without a nugget effect (Webster and Oliver, 2001).
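The ordinary kriging estimate and its variance can be obtained by solving one small linear system per estimation point. The sketch below is a minimal textbook formulation in terms of semivariances, not the software used in the study; the function names are hypothetical, all measurements are used as neighbours, and the model parameters are passed as nugget $c_0$, partial sill $c_1$ and range a of the spherical model.

```python
import numpy as np


def spherical(h, nugget, partial_sill, a):
    """Spherical variogram model with gamma(0) = 0 by definition."""
    h = np.asarray(h, dtype=float)
    rising = nugget + partial_sill * (1.5 * h / a - 0.5 * (h / a) ** 3)
    gamma = np.where(h <= a, rising, nugget + partial_sill)
    return np.where(h == 0.0, 0.0, gamma)


def ordinary_kriging(x0, coords, values, nugget, partial_sill, a):
    """Return (estimate, kriging variance, weights) at the point x0."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    n = len(values)
    # Semivariances between all pairs of measurements.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = spherical(d, nugget, partial_sill, a)
    A[n, n] = 0.0
    # Semivariances between measurements and the estimation point;
    # the last entry enforces that the weights sum to one.
    b = np.ones(n + 1)
    d0 = np.linalg.norm(coords - np.asarray(x0, dtype=float), axis=1)
    b[:n] = spherical(d0, nugget, partial_sill, a)
    solution = np.linalg.solve(A, b)
    lam, psi = solution[:n], solution[n]  # weights and Lagrange multiplier
    estimate = float(lam @ values)
    variance = float(lam @ b[:n] + psi)
    return estimate, variance, lam
```

For lognormal kriging the function would be applied to log-transformed values, followed by a back-transformation that is not shown here.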

2.4 Results

The results are divided into three sections. In the first section the

interpolation of cadmium with the Mollifier density regression is compared with

geostatistical interpolation. In the second section results of the Mollifier

interpolation are shown, where the cadmium concentration was analyzed for any

interdependency on copper, lead, nickel, and chromium. In the third section co-

variables like a soil quality indicator, land use and altitude have been included for

spatial prediction of cadmium.


2.4.1 Comparing Mollifier with kriging interpolation

The Mollifier approach as spatial interpolation in two dimensions is

comparable to other interpolation methods like inverse distance interpolation

(Gotway et al., 1996), spline interpolation (Hutchinson and Gessler, 1994; Press et

al., 1989) or kriging (Webster and Oliver, 2001). Heavy metal concentrations in soil

are known to exhibit spatial heterogeneity and geostatistical dependence (Meuli,

1997; Van Groenigen, 2000). Several investigations have shown how ordinary

kriging (Kravchenko and Bullock, 1999; Saito and Goovaerts, 2000;

Schloeder et al., 2001), lognormal kriging (Papritz and Moyeed,

1999), indicator kriging (Goovaerts et al., 1997; Stein et al., 2001), and

co-kriging (Castrignanò et al., 2000; Einax and Soldt, 1998; Yu-Pin, 2002) can

lead to a reasonable spatial assessment. For this reason, a comparison of the

Mollifier interpolation with kriging is a kind of benchmarking for the Mollifier

approach.

The Mollifier depends on the number of independent variables (see below),

the type of kernel density function and, its main parameter, the window size. In

principle any measurable decreasing function with its maximum at the origin can

be used for mollifying. For practical purposes there are three types of kernel density

functions:

1. Piecewise continuous functions on a finite window, which are best

applicable to non-continuous (ordinal or even categorical) data, such as

land use type or classes of soil fertility.

2. Continuously differentiable functions within a finite window, which yield a

more 'soft' interpolation and hence are best applicable to continuous

(metric) data, such as soil characteristics like the pH or contaminant

concentrations

3. Continuously differentiable functions within a principally infinite window,

which is mainly the Gaussian density, leading to the softest interpolation

'blankets'.


Table 2.2 gives an overview of tested kernel density functions.

Mollifier kernel      Tested       Comment
density function      dimension
Uniform square        2 and 3      Piecewise continuous within a finite window
Uniform circle        2            Piecewise continuous within a finite window
Cone                  2            Piecewise continuous within a finite window
Epanechnikov          2            Continuously differentiable within a finite window
Quartic               2            Continuously differentiable within a finite window
Triweight             2            Continuously differentiable within a finite window
Fourweight            2 and 3      Continuously differentiable within a finite window
Gaussian              2 and 3      Continuously differentiable within a principally infinite window
Cosine                2            Continuously differentiable within a finite window

Table 2.2 Different types of kernel density functions implemented and tested in MolliD (Tietje, 2003). A more detailed description of the kernel functions can be found in Parzen (1961) and Cressie (1993)
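Several of the kernels in Table 2.2 have well-known one-dimensional textbook forms, sketched below on the scaled distance u with a finite window |u| <= 1 (except the Gaussian). These are the standard normalised definitions and are given as an assumption; whether MolliD uses the same normalisation, and how it defines the two-dimensional "cone", "uniform circle" and "fourweight" kernels, is not stated in the text, so those are not reproduced (the triangle kernel is included as the one-dimensional analogue of the cone).

```python
import numpy as np


def _inside(u):
    """Indicator of the finite window |u| <= 1."""
    return (np.abs(u) <= 1.0).astype(float)


# Standard 1-D kernel densities; each integrates to one.
KERNELS = {
    "uniform": lambda u: 0.5 * _inside(u),
    "triangle": lambda u: (1.0 - np.abs(u)) * _inside(u),
    "epanechnikov": lambda u: 0.75 * (1.0 - u ** 2) * _inside(u),
    "quartic": lambda u: 15.0 / 16.0 * (1.0 - u ** 2) ** 2 * _inside(u),
    "triweight": lambda u: 35.0 / 32.0 * (1.0 - u ** 2) ** 3 * _inside(u),
    "cosine": lambda u: np.pi / 4.0 * np.cos(np.pi * u / 2.0) * _inside(u),
    "gaussian": lambda u: np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi),
}
```

All of these have their maximum at u = 0 and decrease with distance, which is the property required for mollifying; they differ mainly in how abruptly the weight drops to zero at the edge of the window.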

The window size is the parameter determining the spatial range in between

measurements that is taken into account for interpolation. For a small window size

the interpolation results are heterogeneous and tend to map the sampling points

only. For a window size lower than half of the distance between the measurements

an interpolation is not possible. Then, the Mollifier interpolation shows 'holes' in

the interpolated blanket (like kriging interpolation would do, if there were grid

points outside the geostatistical range of any measurements). For large window

sizes the interpolation is smoothed. If the window size is larger than the

interpolated area, then the result is an almost constant value throughout the area.

Figure 2.4.1 shows on the right-hand side a Mollifier interpolation with a

Gaussian density and a window size of 200 m. The 'blanket' is largely 'mollified'

with a reduced heterogeneity, leaving some large differences between the

measurements and the interpolation. Please note that this is a non-parametric

regression with the value of the window size determining the degree of data

smoothing. This parameter has to be defined according to the purpose of

interpolation.
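The effect of the window size described above can be reproduced with a small one-dimensional example. The sketch uses an Epanechnikov kernel as a representative finite-window kernel; the function names and the two-point data set are purely illustrative.

```python
import numpy as np


def epanechnikov(u):
    """Finite-window kernel: zero for |u| > 1."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u ** 2), 0.0)


def mollify_1d(x, xs, ys, theta):
    """Kernel-weighted mean at x; NaN where no measurement lies in the window."""
    k = epanechnikov((x - np.asarray(xs, dtype=float)) / theta)
    total = k.sum()
    if total <= 0.0:
        return float("nan")  # a 'hole' in the interpolated blanket
    return float(np.dot(k, ys) / total)


# Two hypothetical measurements 100 m apart.
xs, ys = [0.0, 100.0], [100.0, 300.0]
hole = mollify_1d(50.0, xs, ys, 40.0)         # window < half the spacing: NaN
midpoint = mollify_1d(50.0, xs, ys, 60.0)     # both points contribute equally
flattened = mollify_1d(0.0, xs, ys, 1.0e6)    # very large window: close to the mean
```

With a window of 40 m the midpoint cannot be interpolated, with 60 m it receives the average of both measurements, and with a very large window every location receives almost the same value.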


[Figure 2.4.1, panel A: Lognormal kriging, RMSD = 163.68 (n = 299); panel B: Mollifier, RMSD = 141.00 (n = 299), RRMSD = 0.10]

Figure 2.4.1 A: Comparison of Mollifier interpolation and lognormal kriging of total cadmium concentration (between 100 and 2700 µg/kg) on a spatial grid from 672,000 m to 676,000 m (easting) and 253,000 m to 259,000 m (northing). The vertical bars represent the cadmium measurements above (light gray bar), below (dark gray bar), or equal (black dots) to the interpolation. B: Lognormal kriging interpolation with spherical variogram, range = 2177, sill = 0.2, nugget = 0.08


For the geostatistical analysis the data were logarithmically transformed and fitted to a spherical variogram (Equation 9, Figure 2.4.2). The data show a spatial dependency within a distance of 2,177 m. The model is discontinuous near the origin and reveals a nugget effect of 0.08 (µg/kg)² and a sill of 0.22 (µg/kg)² (both on the log-scale). Lognormal ordinary kriging was performed with the parameters of the variogram discussed above.

[Figure 2.4.2: semivariance γ(h) (0 to 0.3) plotted against distance in meters (0 to 5000)]

Figure 2.4.2 Spherical variogram of logarithmic transformed cadmium with a sill of 0.22 (µg/kg)², a nugget effect of 0.08 (µg/kg)² and a spatial range of 2,177 m

The comparison of both methods, kriging and the Mollifier interpolation,

reveals a similar smoothing of the data. Both blankets are presented in Figure

2.4.1. Table 2.3 shows the descriptive statistics of both interpolations. If a

variogram model (such as the spherical model in Figure 2.4.2) can be well fitted to

the empirical variogram (Equation 10), then kriging is superior, because it

consistently explains the stochastic spatial structure of the data. Otherwise the

Mollifier interpolation is preferred, because it does neither rely on nor indicate a

questionable geostatistical characteristic of the data. If the Mollifier software


[Figure: panels "Cadmium interpolation" and "Uncertainty Q(x, 200) of cadmium interpolation"; axes: copper, zinc, nickel]

Figure 2.4.3 Cadmium depending on lead/zinc, lead/copper and chromium/nickel. For copper and zinc
the window size is 10 mg/kg, for all other variables 5 mg/kg. Note the different scales of the
independent variables, the cadmium interpolation, and the sum of weights Q(x,a)


became more widespread, it could easily be applied to the screening of spatial
data. The three-dimensional representations of the interpolated grid together with the
resulting data deviations (gray and black bars) largely support the interpretation of
both interpolation techniques. This is especially helpful because a good fit of the variogram
(Figure 2.4.2) could be falsely interpreted as indicating a good fit of the kriging interpolation
(RMSD = 163, Equation 12 and Figure 2.4.1B), which might not be better than the
Mollifier interpolation (RMSD = 141, Equation 12 and Figure 2.4.1A).

                      Mollifier interpolation       Lognormal ordinary kriging
                      cadmium [µg/kg]               cadmium [µg/kg]
Fitting parameter     Gaussian density function,    spherical model, sill 0.22,
                      window size 200               nugget 0.08, range 2,177 m
Mean                  308                           275
Variance              41,997                        21,819
Standard Deviation    205                           148
Maximum               1,898                         1,458
Minimum               95                            72
Geomean               268                           249

Table 2.3 Comparative descriptive statistics for predicted cadmium concentrations with both
interpolation methods

2.4.2 Explorative Data Analysis with the Mollifier approach

For the explorative data analysis, the dependency of cadmium on other heavy
metals in soil (such as copper, lead, nickel and chromium) is analyzed with the Mollifier
interpolation. Figure 2.6 shows the results of the regression analysis for the relevant
heavy metals copper, zinc and lead.

Comparing the blankets for the independent variables lead/zinc, lead/copper
and chromium/nickel shows that the last pair yields the smallest deviations from the
measurements (indicated as gray and black bars below and above the blanket). The
cadmium content seems to increase with increasing chromium and nickel content.
For lead and copper, only a few measurements determine the increase of cadmium.


Figure 2.4.4 Cadmium dependency on copper (x-axis) and zinc (y-axis). The window size on the
right-hand side is 10 times smaller than on the left-hand side. The regression is calculated with a
Gaussian density function. The cadmium axis is denoted in µg/kg


Although calculated with the same fitting parameters (density function, window size),
the influence of copper as an independent variable on the prediction of cadmium differs
from that of zinc. By comparison, the squared multiple correlation (multiple R squared)
from a multiple regression analysis between cadmium, copper, zinc and lead reveals only
a weak dependency between the variables, and it indicates a similar explained portion of
variance in the dependent variable for the copper/lead and zinc/lead models (see Table 2.4).
In contrast, the prediction of cadmium with the Mollifier results in different blankets for the
independent variables copper/lead and zinc/lead.

                      Cu/Pb    Zn/Pb    Cr/Ni    Cu/Zn
Multiple R squared    0.569    0.576    0.230    0.477

Table 2.4 Multiple R² from multiple regression analysis for the explorative variable cadmium. As a
measure of fit for the regression model it indicates the explainable proportion of variation in the
dependent variable cadmium

A detailed view of the distribution of probability weights in Figure 2.4.3 shows
that the highest interpolated values of the dependent variable are calculated from widely
scattered measurement points. The higher the weights, the more the interpolated
value differs from the associated measurement point. In this case, the probability that the
difference between the interpolated point and any measurement point is larger than 200
µg/kg is summed up for the uncertainty measure Q(x, a) (compare Equation 7). The
highest cadmium contents reveal the highest probabilities of being outside a range of
200 µg/kg around the interpolated value. This is due to sparse data sampling
(compare Figure 2.4.3). Although the results are uncertain for high cadmium contents,
the distribution of the uncertainty and the contributing measurements are clearly
visible. Therefore an explorative data analysis with the Mollifier gives a
comprehensible view of the data points.
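The summation described above can be sketched as follows. The sketch assumes that the weights are normalized Gaussian kernel weights and that the uncertainty measure adds up the weights of all measurements that deviate from the interpolated value by more than a tolerance a. The function names and the one-dimensional setting are hypothetical simplifications and do not reproduce the Mollifier program.

```python
import math


def gaussian_weights(x, xs, window):
    """Normalized Gaussian kernel weights of the measurements xs at location x."""
    raw = [math.exp(-0.5 * ((x - xi) / window) ** 2) for xi in xs]
    total = sum(raw)
    return [r / total for r in raw]


def interpolate_with_uncertainty(x, xs, ys, window, tolerance):
    """Return the kernel-weighted mean at x and the summed weight of all
    measurements that lie more than `tolerance` away from that mean
    (assumed reading of the uncertainty measure in the text)."""
    weights = gaussian_weights(x, xs, window)
    y_hat = sum(w * y for w, y in zip(weights, ys))
    q = sum(w for w, y in zip(weights, ys) if abs(y - y_hat) > tolerance)
    return y_hat, q


# Hypothetical data: two distant measurements with very different cadmium values.
y_hat, q = interpolate_with_uncertainty(
    x=0.5, xs=[0.0, 1.0], ys=[100.0, 1000.0], window=1.0, tolerance=200.0
)
print(y_hat, q)
```

With only two widely differing measurements, both contribute to the estimate but neither lies within the tolerance, so the measure approaches 1. This mirrors the high uncertainty reported for sparsely sampled high cadmium contents.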

However, in order to analyze any interdependency, the interpolation parameters
have to be chosen with care. On the left side of Figure 2.4.4 the
interpolation turned into an extrapolation, because two single measurements
determine a large area without any further data, whereas most of the data
are concentrated in the lower left corner with marginal influence on the interpolated area.
In this case the window size and the scale (size) of the interpolated area have to be


[Figure: panels "Cd [µg/kg]" and "Q(x, 200)"; axes: Zn [mg/kg] and pH]

Figure 2.4.5 Interpolated cadmium concentration and its uncertainty Q(x,a) as dependent on pH and zinc


reduced. The result is visible on the right-hand side of Figure 2.4.4. The
correlation of cadmium with copper and zinc shows up as the rise of the
blanket in the upper right corner. All regressions were computed with a Gaussian
kernel density function.

In a next step, soil and land use data are tested for any interdependency with
cadmium. Total cadmium and zinc contents in soil behave similarly in the same pH
milieu. Both heavy metals are soluble at pH values below 7 (Filius, Streck, Richter,
1998; Geiger, Schulin, 1995). Therefore the dependency of cadmium on pH value
and zinc is tested in Figure 2.4.5. The interpolation shows that the cadmium
content in soil tends to rise with increasing pH value and zinc content. High
cadmium concentrations are correlated with high zinc contents above a pH value
of 6. For further information on the spatial and stochastic distribution of cadmium,
additional land and land use properties are taken into account. Figure 2.4.6 shows
that pH values below 5 occur mainly at altitudes above 500 m (above sea
level), whereas high cadmium concentrations occur at the lowest altitudes with high
pH values. This is mainly due to agricultural use in the bottom of the valley and to
forest and winegrowing plots at higher altitudes.

A blanket (Figure 2.4.7) of the independent variables 'altitude' and 'land use'
against cadmium concentrations confirms the results from Figure 2.4.6. High
cadmium contents are interpolated for land uses at low altitude, like agriculture,
dwelling and industry, while the cadmium concentration decreases with increasing
altitude and mainly under forest. This also explains low cadmium contents in soil
with pH values below 5 (see Figure 2.4.6), which are common for soils under
forestry use. Note that this blanket is calculated with a Uniform square kernel
density function and a small window size for the dimension of land use (0.1, see
the caption of Figure 2.4.7), so that the interpolations of different land uses do not
influence each other. This piecewise continuous kernel function (see Table 2.2),
applied within a small finite window size, results in a more reasonable interpolation
of ordinal scaled data than the Gaussian kernel function does for the same data
(blanket not shown). Nevertheless, the blanket shows general tendencies in the spatial
distribution of the dependent variable. In all land use categories the interpolated cadmium
concentration is slightly above the measured value, except for a few locations with
large underestimations. This reflects the lognormal distribution of cadmium, which
was not accounted for in the non-parametric Mollifier interpolation.
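The effect of the kernel choice on a categorical axis can be illustrated with a small sketch. It assumes that the interpolation behaves like a kernel-weighted average (Nadaraya-Watson type regression), and it uses invented land use codes and cadmium values. All names and numbers are hypothetical and serve only to show why a uniform kernel with a small window keeps categories separate.

```python
import math


def gaussian_kernel(u):
    """Gaussian kernel: positive weight at every distance."""
    return math.exp(-0.5 * u * u)


def uniform_kernel(u):
    """Uniform kernel: constant weight inside the window, zero outside."""
    return 1.0 if abs(u) <= 1.0 else 0.0


def kernel_regression(x, xs, ys, window, kernel):
    """Kernel-weighted average of ys at x; None if no point has weight."""
    weights = [kernel((x - xi) / window) for xi in xs]
    total = sum(weights)
    if total == 0.0:
        return None
    return sum(w * y for w, y in zip(weights, ys)) / total


# Hypothetical land use codes (1, 2, 3) and cadmium values in µg/kg.
land_use = [1, 1, 2, 2, 3, 3]
cadmium = [400.0, 500.0, 200.0, 300.0, 100.0, 120.0]

# Uniform kernel, window 0.1: only category 2 contributes (mean of 200 and 300).
uniform_at_2 = kernel_regression(2, land_use, cadmium, 0.1, uniform_kernel)
# Gaussian kernel, window 1.0: neighbouring categories leak into the estimate.
gaussian_at_2 = kernel_regression(2, land_use, cadmium, 1.0, gaussian_kernel)
print(uniform_at_2, gaussian_at_2)
```

In this toy example the uniform kernel returns the plain mean of category 2, whereas the Gaussian kernel mixes in values from categories 1 and 3, whose numeric codes carry no metric meaning.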


[Figure: cadmium [µg/kg] (0 to 2500) plotted against pH and altitude]

Figure 2.4.6 Cadmium concentration depending on soil pH (window size 0.5) and altitude (elevation
above sea level, window size 20 m)

[Figure: cadmium [µg/kg] plotted against altitude (400 to 700 m) and land use category; legible category label: 6: Unused Area]

Figure 2.4.7 Cadmium concentration depending on altitude (400 - 700 m, window size 20 m) and
against land use (window size 0.1). The kernel density function is Uniform square


2.4.3 Including co-variables for spatial prediction

In order to improve the interpolation of cadmium, agricultural soil suitability is
included as a third independent variable. However, the results show only a marginal
change of the interpolated value compared to mollifying without co-variables
(assuming the same grid, function and window size). A root mean squared difference
criterion serves as a general measure of the goodness of the interpolation:

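$$\mathrm{RMSD} = \left( \frac{1}{n} \sum_{i=1}^{n} \left( y_p^i - y_m^i \right)^2 \right)^{1/2} \qquad (12)$$

where the differences between the predicted (interpolated) concentrations $y_p^i$ and the measured
cadmium concentrations $y_m^i$ of the n = 299 measurements are assessed.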
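A minimal sketch of this criterion is given below. It assumes that the mean is taken over all n measurement points (division by n), which follows from the name of the criterion but cannot be verified from the printed equation. The function name and the sample values are hypothetical.

```python
import math


def rmsd(predicted, measured):
    """Root mean squared difference between predicted and measured values.

    Assumption: the squared differences are averaged over all n pairs.
    """
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(predicted)
    squared = sum((p - m) ** 2 for p, m in zip(predicted, measured))
    return math.sqrt(squared / n)


# Hypothetical interpolated and measured cadmium values in µg/kg.
interpolated = [310.0, 280.0, 450.0]
observed = [300.0, 300.0, 400.0]
print(rmsd(interpolated, observed))
```

Because the differences are squared before averaging, a few strongly underestimated hot spots can dominate the criterion.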

Although the interpolation reveals similar results, the use of the co-variable
allows a more precise prediction in the vicinity of the measurements if the window
size is varied for the third independent variable. Using a separate
window size for the co-variable also yields a less smoothed blanket as a
consequence of the higher precision (see Table 2.5). The uncertainty of the
prediction can be reduced further by using a density function with finite window
size (compare Table 2.2). But the choice of density function has to be made with
respect to the data (compare Figure 2.4.6).

                                               Co-Mollifier              Mollifier
                                               x1 = latitude,            x1 = latitude,
                                               x2 = longitude,           x2 = longitude
                                               x3 = soil suitability
Kernel Density Function   Window size          RMSD                      RMSD
Uniform square            200 m, 200 m, 5      106.10                    109.61
                          200 m, 200 m, 1       96.16                    -
Fourweight                200 m, 200 m, 1       47.47                     55.24
Gaussian                  200 m, 200 m, 1      128.82                    141.00
                          100 m, 100 m, 0.1     51.55                     63.50
                          50 m, 50 m, 0.1       42.91                     47.95

Table 2.5 Root mean square difference (RMSD) for Mollifier with and without co-variable according
to change of density function and window size. Soil suitability is classified from 1 to 11
(nondimensional) according to the soil map (Jäggi et al. 1996)


2.5 Discussion and Conclusion

The results from different interpolations with kernel density regression show
that explorative data analysis is a main use of the Mollifier approach. Especially for
multivariate explorative analysis of spatially distributed data, the three-dimensional
blanket of dependency between variables is able to show general tendencies,
outliers, hot spots, and point-wise dependencies. It is thus a useful tool
between a Geographical Information System (GIS, like Arc/Info) (Burrough and
McDonnell, 1998; Hendriks et al., 1998) and descriptive statistics using, e.g.,
scatter plot matrices or regression analysis. A GIS is often able to spatially
interpolate single variables (e.g., with spline interpolation, inverse distance
weighting or kriging), to create overlay maps, and to query first-order statistics of
different variables. The results are mostly shown in two dimensions only (Fedra,
1998; Kooistra et al., 2001). In contrast, a correlation between variables is simple
to analyze with descriptive statistics, but the analysis gets more difficult when more
than two variables and the spatial extent are of interest. Within this gap, the Mollifier
yields convenient results, as seen for example in Figure 2.4.1 and Figure
2.4.3. Moreover, the method is easy to apply and results are quickly visible and
usable. The comparison with a multiple regression analysis (Table 2.4) showed
that the variance in the predicted variable is visible with the Mollifier regression at
single points, which yields a more detailed analysis. As opposed to the
Mollifier version used by Albersen and Keyzer (1998), the Mollifier program used
in this study (see Tietje, 2003) shows the deviation of the data from the
interpolated blanket as black and gray bars at the sampling points. This allows a
valid interpretation of the blanket with respect to the measured values. The
interpretation is additionally supported by the uncertainty measure (Equation 7).
Moreover, the precision of the interpolation (in terms of the root mean squared
difference between the measurements and the interpolation) can be improved with
an additional independent variable, especially around the measurements. The
improvement is even larger when the choice of the density function and the
window size is adapted to the data.

Compared with the widely used geostatistics and kriging (Andronikov et al.,
2000; Goovaerts, 2000; Juang and Lee, 1998), the Mollifier approach is
conveniently usable for spatial prediction. The method strongly depends on the


density of the data, and therefore the kind of density function and the window size
are important choices. The type of the kernel density function
determines the degree of smoothing, and the window size does so even more.
Therefore both parameters have to be chosen carefully and appropriately for the data
structure and the purpose of the interpolation. The analysis of the interdependency of
cadmium and land use shows that a continuously differentiable density function,
like the Gaussian, is recommended for quantitative continuous data. In most
cases the interpolation of qualitative ordinal data is better calculated with a
continuously differentiable kernel density function with finite window size. In the
case of sparse sampling the flexibility of the non-parametric approach turns into a
disadvantage, because the measures of fit, Q and A (Equations 6 and 7), are not
reliable for low data density (Stahel, 2000). In environmental case studies the
spatial distribution of a contaminant is often caused by point sources (like plants or
dumps) and transport through a medium like wind or water. As a result,
concentrations of pollutants differ with distance to the source and therefore do not
fulfill the assumption of a Gaussian random distribution with a stationary mean.
However, this assumption is needed for (geo)statistical predictions (Limpert et al.,
2001). In that case, non-linear methods have to be used instead, which are reported to
interpolate with a loss of precision (Papritz and Moyeed, 1999). The Mollifier
approach may serve as an appropriate alternative to geostatistical methods if the
required assumptions of stationarity cannot be validated.

Acknowledgments

We are grateful to Günter Fischer and Silvia Prieler at the International Institute of Applied
System Analysis, Austria, for the first contact and help with the Mollifier; to M. Keyzer and B.
Sonneveld at the Centre for World Food Studies, The Netherlands, for helpful comments on a first
manuscript of this paper; and to P.J. Albersen at the same institute for help with the Mollifier
programme. Last but not least, we would like to thank Martin Fritsch, R.W. Scholz and the
colleagues at UNS for discussions and support of our work. The project was funded by the Swiss
National Science Foundation (Project Number 50011-44758) and the Swiss Federal Institute of
Technology (Project Number 0048-41-2609-5).


2.6 References

Albersen PJ, Keyzer MA (1998) Documentation Mollifier Programm. Draft, pp 19

Amt für Raumordnung und Vermessung (1996) Bodenkarte Raum Furttal. Digital map, CD-Rom

Andronikov SV, Davidson DA, Spiers RB (2000) Variability in contamination by heavy metals:
sampling implications. Water, Air and Soil Pollution 120:29-45

Armstrong M (1998) Basic Linear Geostatistics, Springer, Berlin, Heidelberg

Batjes NH, Fischer G, Nachtergaele FO, Stolbovoy VS, van Velthuizen HT (1997) Soil data
derived from WISE for use in global and regional AEZ Studies. Interim Report, IR-97-025/May.
International Institute of Applied System Analysis, Laxenburg, pp 27

Borland (R) (2001) Delphi(TM) 6, Development Environment. Borland Software Corporation,
Inc., Langen, Germany

Brandvold L, McLemore V (1998) A study of the analytical variation of sampling and analysis of

stream sediments from areas contaminated by mining and milling. Journal of Geochemical

Exploration 64:185-196

Burrough PA, McDonnell RA (1998) Principles of Geographical Information Systems, Oxford

University Press, New York

Castrignanò A, Giugliarini L, Risaliti R, Martinelli N (2000) Study of spatial relationships among
some soil physico-chemical properties of a field in central Italy using multivariate
geostatistics. Geoderma 97:39-60

Cressie NAC (1993) Statistics for spatial data, John Wiley, New York

Davies BE (1997) Heavy metal contaminated soils in an old industrial area of Wales, Great Britain:
source identification through statistical data interpretation. Water, Air and Soil Pollution
94:85-98

Diels J, Van Orshoven J, Vanclooster M, Feyen J (1996) Using Soil Map and Simulation Modeling
for the Analysis of Land Use Scenarios. The Role of Soil Science in
Interdisciplinary Research. SSSA Special Publication: 123-143

Einax J, Soldt U (1998) Multivariate geostatistical analysis of soil contaminations. Fresenius

Journal of Analytical Chemistry 361:10-14

Fedra K (1998) Integrated risk assessment and management: overview and state of the art.
Journal of Hazardous Materials 61:5-22

Filius A, Streck T, Richter J (1998) Cadmium Sorption and Desorption in Limed Topsoils as

Influenced by pH. Isotherms and Simulated Leaching. Journal of Environmental Quality

27:12-18

Fraters D, van Beurden AUCJ (1993) Cadmium mobility and accumulation in soils of the European

Communities. National Institute of Public Health and Environmental Protection, Bilthoven, pp

65

Geiger G, Schulin R (1995) Risikoanalyse, Sanierungs- und Überwachungsvorschläge für das
schwermetallbelastete Gebiet von Dornach. 2. Volkswirtschaftsdepartement des Kantons
Solothurn, Amt für Umweltschutz, Solothurn, pp 79


Geovariances (1997) Isatis. 3.1.2, Avon

Goovaerts P (2000) Geostatistical approaches for incorporating elevation into the spatial

interpolation of rainfall. Journal of Hydrology 228:113-129

Goovaerts P (2001) Geostatistical modelling of uncertainty in soil science. Geoderma 103:3-26

Goovaerts P (1997) Geostatistics for Natural Resources Evaluation. Oxford University Press, New

York Oxford

Goovaerts P, Webster R, Dubois JP (1997) Assessing the risk of soil contamination in the Swiss

Jura using indicator geostatistics. Environmental and Ecological Statistics 4:31-48

Gotway CA, Ferguson RB, Hergert GW, Peterson TA (1996) Comparison of Kriging and Inverse-

Distance Methods for Mapping Soil Parameters. American Journal of Soil Science Society

60:1237-1247

Gsponer R (1996) Ursachendifferenziertes Vorgehen zur verdachtsorientierten Erkundung von

Schwermetallbelastungen im Boden, Ph.D. Thesis, Zürich, pp 207

Gysi C, Gupta S, Jäggi W, Neyroud JA (1991) Wegleitung zur Beurteilung der

Bodenfruchtbarkeit. Eidgenössische Forschungsanstalt für Agrikulturchemie und

Umwelthygiene, Liebefeld-Bern, pp 89

Härdle W (1990) Applied nonparametric regression, Cambridge University Press, Cambridge

Hendriks LAM, Leummens H, Stein A, De Bruin PJ (1998) Use of Soft Data in a GIS to Improve

Estimation of the Volume of Contaminated Soil. Water, Air and Soil Pollution 101:217-234

Hoosbeek MR, Stein A, van Reuler H, Janssen BH (1998) Interpolation of agronomic data from plot
to field scale: using a clustered versus a spatially randomized block design. Geoderma
81:265-280

Hutchinson MF, Gessler PE (1994) Splines - more than just a smooth interpolator. Geoderma

62:45-67

Jäggli F, Peyer K, Pazeller A, Schwab P (1998) Grundlagenbericht zur Bodenkartierung des
Kantons Zürich. Eidgenössische Forschungsanstalt für Agrarökologie und Landbau, Zürich,
pp 207

Journel AG (1989) Fundamentals of Geostatistics in Five Lessons. American Geophysical Union,

Washington D.C.

Journel AG, Huijbregts CJ (1978) Mining Geostatistics, Academic Press, London

Juang KW, Lee DY (1998) A Comparison of Three Kriging Methods Using Auxiliary Variables in

Heavy-Metal Contaminated Soils. Journal of Environmental Quality 27:355-363

Keyzer MA, Sonneveld BGJS (1997) Using the mollifier method to characterize datasets and

models: the case of the universal soil loss equation. ITC 3/4:263-272

Kooistra L, Leuven R, Nienhuis P (2001) A Procedure for Incorporating Spatial Variability in

Ecological Risk Assessment of Dutch River Floodplains. Environmental Management

28:359-373

Korre A (1999) Statistical and spatial assessment of soil heavy metal contamination in areas of
poorly recorded, complex sources of pollution. Part 1: Factor analysis for contamination
assessment. Part 2: Canonical correlation analysis and GIS for the assessment of


contamination sources. Stochastic Environmental Research and Risk Assessment 13:260-

316

Kravchenko A, Bullock DG (1999) A Comparative Study of Interpolation Methods for Mapping Soil
Properties. Agronomy Journal 91:393-400

Krebs R, Gupta SK, Furrer G, Schulin R (1998) Solubility and Plant Uptake of Metals with and

without Liming of Sludge-Amended Soils. Journal of Environmental Quality 27:18-23

Limpert E, Stahel WA, Abbt M (2001) Log-normal distributions across the Sciences: Keys and

Clues. Bioscience 51:341-352

Matheron G (1963) Principles of Geostatistics. Economic Geology 58:1246-1266

Matheron G (1973) The intrinsic random functions and their applications. Adv. Appl. Prob. 5:439-468

Meuli RG (1997) Geostatistical Analysis of Regional Soil Contamination by Heavy Metals, Ph.D.

Thesis Zürich, pp 191

Nadaraya EA (1964) On estimation regression. Theory Prob.Appl. 10:186-190

Papritz A, Moyeed RA (1999) Linear and Non-Linear Kriging Methods: Tools for Monitoring Soil

Pollution. In Barnett V, Stein A, Turkman KF (eds) Statistics for Environment 4: Pollution

Assessment and Control, John Wiley & Sons, London, pp 305-339

Parzen E (1961) On estimation of a probability density function and mode. Annals of Math. Stat.
33:1065-1076

Press WH, Flannery BP, Teukolsky SA, Vetterling WT (1989) Numerical recipes. The art of

scientific computing, Cambridge University Press, Cambridge

Saito H, Goovaerts P (2000) Geostatistical Interpolation of Positively Skewed and Censored Data

in a Dioxin-Contaminated Site. Environmental Science and Technology 34:4228-4235

Schloeder CA, Zimmermann NE, Jacobs MJ (2001) Comparison of Methods for Interpolating Soil

Properties Using Limited Data. American Journal of Soil Science Society 65:470-479

Stahel WA (2000) Statistische Datenanalyse, Vieweg, Braunschweig/Wiesbaden

Stein A, Riley J, Halberg N (2001) Issues of scale for environmental indicators. Agriculture,
Ecosystems & Environment 87:215-232

Stein A, Staritsky IG (1995) Spatial variability of soil contamination and the consequences for

environmental risk assessment. Report, Vol. 4. The Netherlands Integrated Soil Research

Programme, Wageningen

Tietje O (2003) MolliD - Mollifier Interpolation (Version 1.1). Zürich: o.tietje.Systaim GmbH,

www.systaim.ch

Tietje O, Richter O (1992) Stochastic modeling of the unsaturated water flow using autocorrelated
spatially variable hydraulic parameters. Modeling Geo-Biosphere Processes 1:163-183

Tiktak A, Alkemade JRM, van Grinsven JJM, Makaske GB (1998) Modelling cadmium

accumulation at a regional scale in the Netherlands. Nutrient cycling in Agroecosystems

50:209-222

Van Groenigen JW (2000) The influence of variogram parameters on optimal sampling schemes

for mapping by kriging. Geoderma 97:223-236


VBBo (1998) Verordnung über Belastungen des Bodens vom 1. Juli 1998 (Swiss Ordinance on the

Pollution of Soils). SR 814.12. Bern, pp 12

von Steiger B, Webster R, Schulin R, Lehmann R (1996) Mapping heavy metals in polluted soil by

disjunctive kriging. Environmental Pollution 94:205-215

Voortman RL, Brouwer J (2001) An Empirical Analysis of the Simultaneous Effects of Nitrogen,
Phosphorus and Potassium in Millet Production on Spatially Variable Fields in SW Niger.
Staff Working Paper, WP-01-04. Centre for World Food Studies, Amsterdam, pp 36

Watson GS (1964) Smooth regression analysis. Sankhya Series A:359-372

Webster R, Oliver MA (2001) Geostatistics for Environmental Scientists, John Wiley & Sons,
Chichester

XploRe (1999) Combining statistical Methods and Data - the ideal platform for applied data

analysis. Available at web site:

www.xplorestat.de/tutorials/smoothernode2.html, Update 2002

Youden WJ, Mehlich A (1937) Selection of efficient methods for soil sampling. Contributions of the

Boyce Thompson Institute for Plant Research 9:59-70

Yun S, Choi B, Lee P (2000) Distribution of heavy metals (Cr,Cu,Zn,Pb,Cd,As) in roadside

sediments, Seoul Metropolitan City, Korea. Environmental Technology 21:989-1000

Yu-Pin L (2002) Multivariate geostatistical methods to identify and map spatial variations of soil
heavy metals. Environmental Geology 42:1-10


3 Uncertainty assessment for management of soil contaminants

with sparse data

Ute Schnabel, Olaf Tietje, Roland W. Scholz (In press), Environmental Management

ABSTRACT / In order for soil resources to be sustainably managed, it is
necessary to have reliable, valid data on the spatial distribution of their
environmental impact. However, in practice, one often has to cope with spatial
interpolation based on few data that show a skewed distribution and with uncertain
information about soil contamination. We present a case study with 76 soil
samples taken from a site of 15 square km in order to assess the usability of
information gleaned from sparse data. The soil was contaminated with cadmium
as a result of airborne emissions from a metal smelter. The spatial interpolation
applies lognormal anisotropic kriging and conditional simulation for log-transformed
data. The uncertainty of cadmium concentration acquired through data sampling,
sample preparation, analytical measurement, and interpolation is a factor of 2 at
68.3% confidence. Uncertainty predominantly results from the spatial interpolation
necessitated by low sampling density and spatial heterogeneity. The interpolation
data are shown in maps presenting likelihoods of exceeding threshold values as a
result of a lognormal probability distribution. Although the results are imprecise,
this procedure yields a quantified and transparent estimation of the contamination,
which can be used to delineate areas for soil improvement, remediation, or
restricted area use, based on the decision makers' probability safety requirement.
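The two quantities named in the abstract, an uncertainty factor at 68.3% confidence and the likelihood of exceeding a threshold, can be illustrated with a short sketch. It assumes that the concentration at a location is lognormally distributed and that an uncertainty factor of 2 corresponds to a standard deviation of ln 2 on the log scale. The function names, the median of 400 and the threshold of 800 are hypothetical and are not values from the case study.

```python
import math


def exceedance_probability(threshold, log_mean, log_sd):
    """P(X > threshold) when ln X is normal with mean log_mean and sd log_sd."""
    z = (math.log(threshold) - log_mean) / log_sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def multiplicative_interval(median, factor):
    """Interval [median / factor, median * factor] around the median."""
    return median / factor, median * factor


# Hypothetical location: median 400 (arbitrary units), uncertainty factor 2.
median, factor = 400.0, 2.0
log_mean, log_sd = math.log(median), math.log(factor)

low, high = multiplicative_interval(median, factor)
# Probability mass between median / 2 and median * 2 (one log-sd either side).
coverage = (exceedance_probability(low, log_mean, log_sd)
            - exceedance_probability(high, log_mean, log_sd))
# Likelihood of exceeding a hypothetical threshold of 800.
p_exceed = exceedance_probability(800.0, log_mean, log_sd)
print(low, high, round(coverage, 3), round(p_exceed, 3))
```

Under these assumptions the interval from half to twice the median covers roughly 68.3% of the probability mass, and a decision maker could map `p_exceed` per grid cell and delineate areas where it exceeds a chosen safety requirement.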


3.1 Introduction

Land-use management is challenging for the planning departments of both

local and regional governments, especially where land use is extensive and

contaminants are widespread (Roe and van Eeten, 2001). Low concentrations of

pollutants have long-term impacts, e.g., on soil fertility in the case of soil

contaminants (Grunewald, 1997; Gysi and others, 1991). The impact of

contaminants contributes to a shortage of usable land and consequently to

competition of interest among different land uses. As a result, the market value of

usable land may change and thus affect sustainable land-use management as

Prato (2000) demonstrated.

We present a case study involving a location with cadmium-contaminated
soil. For a large part of the area considered, soil measurements show
contamination levels above the guide value, but below the remediation value as
outlined in the Swiss Ordinance Relating to Pollutants in Soil (VBBo, 1998) (Table
3.3 and Table 3.1). If the guide value is exceeded, remediation of the soil may be
considered appropriate where groundwater, crops, or soil fertility
could be endangered. However, in this case, there are no standards mandating a
cleanup. According to the Swiss Ordinance Relating to Pollutants in Soil (VBBo, 1998),
soil contamination levels between the guide value and the remediation value indicate
the necessity of a change in land use, for example, from food production to
residential use. Such changes often provide a less expensive alternative to
remediation. Remediation is only legally obligatory when the remediation value is
exceeded. This legal situation shows that assessment of soil contamination is
relevant for sustainable land management, even when the pollutants pose no risk
to human health. Assessment of soil contamination for precautionary soil
improvement is difficult because of the lack of data and the high degree of uncertainty
that results from sparse data, so decisions on soil use change are difficult to substantiate. In
order to evaluate the affected areas in a valid, reliable, and cost-effective way, a
spatial interpolation that includes appropriate uncertainty information is required.

The common sources of uncertainty in spatial prediction are data sampling

(sparseness and inaccuracy), the preparation and measurement of samples in the

laboratory (Dubois and Schulin, 1993; Muntau and others, 2001), the spatial


interpolation of the data (Goovaerts, 2001; Meuli, 1997), and subsequent modeling

(Hendriks and others, 2000; Keller and others, 2001; 2002). Several geostatistical

models address the amount of uncertainty from interpolation techniques (Barabas

and others, 2001; Bouma and others, 1996; von Steiger and others, 1996).

Modeling uncertainty by means of stochastic simulation is also widely used

(Gotway and Rutherford, 1993; Pan, 1997; Van Meirvenne, 2001). For example,

Goovaerts (2001) compared kriging-based estimation to simulation-based spatial

estimation and concluded that simulation offers several advantages, the most

important one being that it provides a model of spatially local uncertainty. In cases

of lognormally distributed data, ordinary lognormal kriging is easy to implement

and yields better results than other kriging methods (Papritz and Moyeed, 1999;

Saito and Goovaerts, 2000). Other investigations applied indicator kriging (Van

Meirvenne, 2001; Webster and Oliver, 2001) or disjunctive kriging (Papritz and

Moyeed, 1999; von Steiger and others, 1996), because of the complications of
estimating the local probability density function when applying a traditional
backtransformation (Geovariances, 1997; Webster and Oliver, 2001, p. 19).
Therefore, we follow the approach of Limpert and others (2001), who presented
the uncertainty on the raw scale by means of the geometric mean, the
multiplicative standard deviation, and consequently, the multiplicative confidence
intervals for a multitude of lognormally distributed data.

Thompson and Fearn (1990, p.271) discuss the relation between sampling

and cost of analysis as well as the quality of data appropriate for data assessment.

They define fitness for purpose as "the property of data, produced by a

measurement process, that enables a user of the data to make technically correct

decisions for a stated purpose." The decision whether or not the information

supplied by the available data and its degree of uncertainty are useful depends on

the purpose for which the data will be used. Decision-makers have to judge the

appropriateness of using uncertainties as a substantive base for their decision

(Ramsey and others, 1998; Thompson, 1995). It is the decision and its

consequences that matter, rather than the accuracy of the prediction (Goovaerts

and Meirvenne, 2001). Thus, the main objective of this study is to analyze the

usability of the information given by spatial interpolation of few measurements,

combined with probabilistic reasoning based on sparse data for the case of
Dornach (Canton Solothurn, Switzerland).

The following questions will be addressed:

• What kind of spatial prediction is suitable when distant, irregularly

distributed measurements are analyzed for environmental management

purposes?

• How can the uncertainty in an interpolation be best assessed and

presented?

• What possibilities and limitations arise when the interpolation method
and its resulting uncertainty are applied for practical planning
purposes?

3.2 Data

We present a case study from Switzerland where 76 soil samples were taken

in an area of 15 square kilometers. Each was a composite sample from topsoil (20

cm depth) within a range of 10 square meters (Wirz and Winistörfer, 1987). Soil

samples were taken without a specific sample design, and the sites were more or

less randomly distributed. An attempt was made to cover the supposed area of

contamination (Figure 3.2.1) and take local circumstances into account.


The total heavy metal content of the soil was determined in a 2 M HNO3
extract following the Swiss Ordinance Relating to Pollutants in Soil (VBBo, 1998)
and measured with an atomic absorption spectrometer (AAS) using the flame-AAS
method.

Value                    Cd (mg/kg)    ln Cd (mg/kg)
Count                    76            76
Geometric mean           0.93
Average                  1.43          -0.08
Standard deviation       2.69          0.80
Variance                 7.26          0.64
exp(average(ln(Cd)))     0.93
exp(stddev(ln(Cd)))      2.22
Median                   0.86          -0.16
exp(median(ln(Cd)))      0.86
Skewness                 7.36          0.78
Kurtosis                 59.67         2.89
Min                      0.12          -2.12
Max                      23.3          3.15

Table 3.1 Statistical properties of the data

The data analysis in Table 3.1 reveals that the distribution of heavy metal
measurements is highly skewed (skewness 7.36). This is also shown by the difference
between the median (0.86 mg/kg) and the geometric mean (0.93 mg/kg) of the raw
data. For such data, logarithmic transformation may be appropriate before
interpolation. A Kolmogorov-Smirnov test shows that the hypothesis of a normal
distribution cannot be rejected for the logarithmically transformed data (p = 0.35,
Table 3.2). Because the test presumes independence of the observations, a subset of
the data (n = 29), chosen so that each point lies as far as possible from every other
measurement point, was also tested and accepted (p = 0.74, Table 3.2).
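The link between the two columns of Table 3.1 can be illustrated with a minimal Python sketch. It assumes a small hypothetical sample (the values in `sample` are illustrative, not the Dornach measurements) and shows how the geometric mean and the multiplicative standard deviation follow from the mean and standard deviation of the logarithms:

```python
import math
import statistics

def log_scale_summary(values):
    """Return (geometric mean, multiplicative standard deviation) of positive values."""
    logs = [math.log(v) for v in values]
    mean_log = statistics.mean(logs)
    sd_log = statistics.stdev(logs)  # sample standard deviation of the logarithms
    return math.exp(mean_log), math.exp(sd_log)

# Hypothetical cadmium concentrations in mg/kg (illustrative only)
sample = [0.4, 0.8, 0.9, 1.6, 3.2]
geo_mean, mult_sd = log_scale_summary(sample)
print(round(geo_mean, 2), round(mult_sd, 2))
```

Applied to the rounded log-scale values of Table 3.1, exp(-0.08) gives about 0.92 and exp(0.80) about 2.23, which agrees with the tabulated 0.93 and 2.22 up to rounding.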

                                       Cd (mg/kg)   Ln(Cd) (mg/kg)   Ln(Cd) (mg/kg), subsample
N                                      76           76               29
Normal parameters: mean                1.43         -0.08            -0.06
Normal parameters: std. deviation      2.69         0.80             0.98
Most extreme differences: absolute     0.31         0.11             0.13
Most extreme differences: positive     0.30         0.11             0.13
Most extreme differences: negative     0.31         -0.06            0.13
Kolmogorov-Smirnov Z                   2.74         0.94             0.68
Asymp. sig. (2-tailed)                 0.00         0.35             0.74

Table 3.2 Test of normality with a one-sample Kolmogorov-Smirnov test for the log-transformed
data. A subsample of spatially independent data fulfills the assumption of independence (test
distribution is normal, mean and standard deviation are calculated from data)
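The last two rows of Table 3.2 are related through the asymptotic Kolmogorov distribution. The following sketch is an illustration under that assumption, not the procedure of the statistics package used for the table; the function names `ks_z` and `ks_asymptotic_p` are hypothetical.

```python
import math

def ks_z(d_max, n):
    """Kolmogorov-Smirnov Z: largest absolute difference scaled by sqrt(n)."""
    return math.sqrt(n) * d_max

def ks_asymptotic_p(z, terms=100):
    """Two-sided asymptotic p-value from the Kolmogorov distribution series."""
    if z <= 0:
        return 1.0
    total = 0.0
    for k in range(1, terms + 1):
        total += (-1) ** (k - 1) * math.exp(-2.0 * k * k * z * z)
    # Clamp to [0, 1] to guard against rounding in the alternating series
    return min(1.0, max(0.0, 2.0 * total))

for z in (2.74, 0.94, 0.68):
    print(z, round(ks_asymptotic_p(z), 2))
```

With the rounded Z values of the table, the series gives p-values of about 0.00, 0.34, and 0.74. The small difference from the tabulated 0.35 is consistent with the rounding of Z.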


Cadmium was emitted from a metal smelter (Figure 3.2.1, Dornach, Canton
Solothurn, Switzerland) during the past century. Today the pollutant can be found

on the west side of Dornach (Figure 3.1), a result of the west to east dominant

wind direction (Schweizerische Meteorologische Anstalt, 1982-91) and the

northwest facing slope on which the community of Dornach was built (Hesske and

others, 1998). In addition to the airborne contamination, the intrinsic cadmium

content in parent material is above the Swiss average of 0.3 mg per kilogram of

soil (Tuchschmid, 1995; Keller, 2001).

Swiss Ordinance Relating to Pollutants in Soil /                                  Cadmium, total,
Verordnung über Belastungen im Boden (VBBo) of 1 July 1998                        HNO3 soluble [mg/kg]

Guide value / Richtwert                                                           0.8
Trigger value / Prüfwert
- use with probable oral intake / Nutzung mit möglicher direkter Bodenaufnahme    10
- use for food planting / Nahrungspflanzenanbau                                   2
- use for feed planting / Futterpflanzenanbau                                     2
Remediation value / Sanierungswert
- agriculture and horticulture / Landwirtschaft und Gartenbau                     30
- gardening / Haus- und Familiengärten                                            20
- playgrounds / Kinderspielplätze                                                 20

Table 3.3 Legal threshold values for remediation of cadmium in soil in Switzerland (VBBo, 1998)
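The way these thresholds partition a measured concentration can be sketched in a few lines of Python. The dictionary keys and the function name `classify_cadmium` are hypothetical labels chosen for this illustration; only the numerical values are taken from Table 3.3.

```python
GUIDE_VALUE = 0.8  # mg/kg
TRIGGER_VALUES = {  # mg/kg, by type of use (hypothetical key names)
    "oral_intake": 10.0,
    "food_planting": 2.0,
    "feed_planting": 2.0,
}
REMEDIATION_VALUES = {  # mg/kg, by type of use (hypothetical key names)
    "agriculture_horticulture": 30.0,
    "gardening": 20.0,
    "playground": 20.0,
}

def classify_cadmium(concentration, trigger_use, remediation_use):
    """Return the highest threshold category exceeded by a concentration in mg/kg."""
    if concentration > REMEDIATION_VALUES[remediation_use]:
        return "remediation value exceeded"
    if concentration > TRIGGER_VALUES[trigger_use]:
        return "trigger value exceeded"
    if concentration > GUIDE_VALUE:
        return "guide value exceeded"
    return "below guide value"

# Geometric mean of the measurements (Table 3.1) under agricultural use
print(classify_cadmium(0.93, "food_planting", "agriculture_horticulture"))
```

For the geometric mean of the data (0.93 mg/kg), only the guide value is exceeded, which is the situation described in the introduction.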

Geiger and Schulin (1995) found high variability (n = 50, average = 4.38 mg/kg,
variance = 0.72 (mg/kg)², min = 2.37 mg/kg, max = 5.92 mg/kg) on a 40 m transect

in the same area. Soil parameters and contaminants tend to behave randomly on

the small scale (Webster, 2000). Thus, for this case study, it was necessary to

take the variability of the data on a scale of a few tens of meters into account when

we analyzed it.

3.3 Methods

Within the framework of geostatistical analysis, the measured cadmium
concentrations (see above) are assumed to be one realization of a stochastic
process $\{Z(x) : x \in D\}$ (Cressie, 1993; Goovaerts, 1997), where $D$ denotes the
sample space for the investigated case study area of Dornach (see Figure 3.1).
Let $x_1, \ldots, x_n$ denote the data coordinates and $Z(x_1), \ldots, Z(x_n)$ the logarithms of the
measured cadmium concentrations of the 76 measurements. Then, the
difference $Z(x) - Z(x + h)$ with $x, x + h \in D$ depends on the distance vector $h$
(Matheron, 1963). This dependency is expressed by means of the semivariance
$\gamma(h)$, which is defined as half the variance of the concentration differences of all data
pairs separated by the same distance $h$ (Journel, 1989). The range $r$, which describes
the extent of the "spatial dependency", indicates the maximum distance between sites
at which concentrations are still correlated. Beyond this distance,
concentrations are considered to be uncorrelated. Properties and characteristics of
this function are discussed in geostatistical textbooks (e.g., Cressie, 1993, pp.
58ff).
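The definition of the semivariance can be made concrete with a short sketch of an omnidirectional empirical semivariogram. This is a minimal illustration, not the routine of the geostatistical software used in this study; the function name and the binning of pairs by `lag_width` are assumptions.

```python
import math
from collections import defaultdict

def empirical_semivariogram(coords, values, lag_width, n_lags):
    """Return {lag index: (mean distance, semivariance, number of pairs)}.

    coords: list of (x, y) tuples in meters; values: log concentrations.
    """
    # Per lag class: sum of distances, sum of squared differences, pair count
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            h = math.dist(coords[i], coords[j])
            k = int(h // lag_width)
            if k >= n_lags:
                continue
            acc = sums[k]
            acc[0] += h
            acc[1] += (values[i] - values[j]) ** 2
            acc[2] += 1
    # Semivariance is half the mean squared difference within each lag class
    return {
        k: (acc[0] / acc[2], acc[1] / (2.0 * acc[2]), acc[2])
        for k, acc in sorted(sums.items())
    }

# Three hypothetical points on a line, 100 m apart
print(empirical_semivariogram([(0, 0), (100, 0), (200, 0)], [0.0, 1.0, 3.0], 150, 2))
```

A directional version would additionally filter the pairs by the angle of the lag vector, which is what a variogram map summarizes for all directions at once.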

3.3.1 Analysis of spatial dependency

The spatial dependency of the cadmium concentrations is empirically

investigated using standard geostatistical software (Geovariances, 1997). This

investigation includes the (supposed) total area of contamination.

Figure 3.3.1 Variogram map of the logarithmically transformed cadmium data (axes in meters). Each grid cell shows
the semivariance $\gamma(h)$ of the lag vector (x, y). D1-D4 indicate the directions of the anisotropic variograms

A variogram map (see Figure 3.3.1) of the logarithmically transformed

cadmium data shows an empirical semivariance function, which depends on the

length and direction of distance vector h. It shows that in the southeast direction

the range is larger than in other directions. This indicates that an anisotropic


semivariance should be considered. For the variogram model we used a spherical

function (see Figure 3.4.1), which is expressed as

$$\gamma(h, \theta) = c_0 + c_1 \left[ \frac{3h}{2\,\Omega(\theta)} - \frac{1}{2} \left( \frac{h}{\Omega(\theta)} \right)^3 \right] \quad \text{for } 0 < h \le \Omega(\theta) \qquad (1)$$

$$\gamma(h, \theta) = c_0 + c_1 \quad \text{for } h > \Omega(\theta), \qquad \gamma(0) = 0$$

where $\gamma(h, \theta)$ is the semivariance dependent on $h$ (the lag, which is defined as the
Euclidean distance between two points in two dimensions) and on $\theta$, the angle
representing the direction of the lag. $c_0$ is the nugget effect; $c_1$, the sill; and
$\Omega(\theta) = (A^2 \cos^2(\theta - \alpha) + B^2 \sin^2(\theta - \alpha))^{1/2}$, which defines the geometric anisotropy by
means of an ellipsoid with angle $\alpha$, the direction in which the continuity is
greatest; $A$, the maximum diameter; and $B$, the minimum diameter, which is
perpendicular to $A$ (Webster and Oliver, 2001). This expresses the so-called
geometric anisotropy, for which the range (but not the sill) depends on the
direction. Referring to the variogram map, the direction $\alpha$ of the largest diameter
is set to -65° (155° in geographical notation, see Figure 3.3.1). Other kinds of
semivariance functions (like Gaussian or exponential) and zonal anisotropy are
discussed by authors such as Journel and Huijbregts (1978) and Matheron (1973).
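The spherical model with geometric anisotropy (Equation 1) can be sketched as follows. The function names are hypothetical, and the numerical parameters in the usage lines are the values reported for the anisotropic model of this study (nugget 0.125, sill 0.528, direction -65°, diameters 900 m and 500 m), read here under the assumption that the reported sill corresponds to $c_1$.

```python
import math

def omega(theta_deg, alpha_deg, a_max, b_min):
    """Direction-dependent range of the geometric anisotropy."""
    t = math.radians(theta_deg - alpha_deg)
    return math.sqrt((a_max * math.cos(t)) ** 2 + (b_min * math.sin(t)) ** 2)

def spherical_gamma(h, theta_deg, c0, c1, alpha_deg, a_max, b_min):
    """Spherical semivariance for lag length h (m) in direction theta (degrees)."""
    if h <= 0:
        return 0.0  # gamma(0) = 0 by definition
    r = omega(theta_deg, alpha_deg, a_max, b_min)
    if h >= r:
        return c0 + c1  # beyond the range the model stays at c0 + c1
    ratio = h / r
    return c0 + c1 * (1.5 * ratio - 0.5 * ratio ** 3)

# Assumed parameter reading: c0 = 0.125, c1 = 0.528, alpha = -65 degrees
params = dict(c0=0.125, c1=0.528, alpha_deg=-65.0, a_max=900.0, b_min=500.0)
print(spherical_gamma(450.0, -65.0, **params))  # along the direction of greatest continuity
print(spherical_gamma(450.0, 25.0, **params))   # perpendicular to it
```

At the same lag length, the semivariance is lower along the direction of greatest continuity than perpendicular to it, which is the behavior the variogram map suggests.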

Further, for the analysis of the spatial contamination pattern, trend models for the
components of distance to the metal smelter ($d$) and deviation from the dominant
wind direction ($\Delta\theta$) are used as suggested by Saito and Goovaerts (2001). Trend
factors are fitted with five different regressions, and the residuals are used to model
the spatial dependency. The mean square error (MSE) from cross
validation (Goovaerts, 2001; Armstrong, 1998) then indicates the goodness of
prediction with a trend.
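As an illustration of the trend-fitting step, the sketch below fits one regression of the form $a_0 + a_1 \ln(d)$ (the second trend model of Table 3.4) by ordinary least squares and returns the residuals that would then enter the variogram analysis. The function name and the input values are hypothetical; the other trend models would need a multiple regression, which is not shown.

```python
import math

def fit_log_distance_trend(distances, log_values):
    """Least-squares fit of log_value = a0 + a1 * ln(d); returns (a0, a1, residuals)."""
    x = [math.log(d) for d in distances]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(log_values) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, log_values))
    a1 = sxy / sxx
    a0 = mean_y - a1 * mean_x
    # Residuals carry the spatial structure not explained by the trend
    residuals = [yi - (a0 + a1 * xi) for xi, yi in zip(x, log_values)]
    return a0, a1, residuals

# Hypothetical distances to the smelter (m) and log concentrations
a0, a1, res = fit_log_distance_trend([200.0, 500.0, 1200.0, 2500.0], [1.1, 0.4, -0.2, -0.9])
print(round(a0, 2), round(a1, 2))
```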

3.3.2 Ordinary kriging

The interpolation applies an ordinary kriging method, where the data are

logarithmically transformed before kriging, and then transformed back afterwards.

Lognormal ordinary kriging interpolates by means of a weighted average

$$Z^*(x_0) = \sum_{i=1}^{n} \lambda_i Z(x_i) \qquad (2)$$

where $Z^*(x_0)$ is the estimated value at point $x_0$, $Z(x_i)$ is the measured value at $x_i$,
and $\lambda_i$ are the weights (adding up to 1). Nearby measurements are considered to

be more correlated to the estimation point and, therefore, carry a higher weight

than more distant measurements. A moving (ellipsoidal) neighborhood is defined
by the parameters of anisotropy, i.e., a rotation of $\alpha = -65^\circ$ and a range of $A = 900$ m
(direction D1, D2) and $B = 500$ m (direction D3, D4). This distributes the weight of

the sampling points along the preferred direction, thus emphasizing the local

spatial structure of contamination.

The kriging variance $\sigma^2_{Z^*}(x_0)$ is computed as the weighted sum of the
semivariances between the estimation point $x_0$ and the measured points, plus a
Lagrange multiplier (Equation 3). The kriging variance is
then

$$\sigma^2_{Z^*}(x_0) = \sum_{i=1}^{n} \lambda_i \gamma(x_i, x_0) + \psi(x_0) \qquad (3)$$

where $\gamma(x_i, x_0)$ is the semivariance between the data point $x_i$ and the estimation
point $x_0$. The weights $\lambda_i$ are determined from the data model (see above)
under the requirement of unbiasedness. $\psi(x_0)$ is the Lagrange multiplier used for
minimization of the kriging variance (Cressie, 1993; Journel, 1989). The kriging
system is given in detail in the annotations.
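To show how Equations 2 and 3 fit together, the following sketch solves a small ordinary kriging system with a global neighborhood and an isotropic spherical model. It is a simplified illustration under those assumptions (the study itself uses a moving anisotropic neighborhood in dedicated software); all function names and the example data are hypothetical.

```python
import math

def spherical(h, c0, c1, rng):
    """Isotropic spherical semivariance (0 at h = 0)."""
    if h <= 0:
        return 0.0
    if h >= rng:
        return c0 + c1
    return c0 + c1 * (1.5 * h / rng - 0.5 * (h / rng) ** 3)

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def ordinary_kriging(coords, values, x0, gamma):
    """Return (estimate, kriging variance, weights) at location x0."""
    n = len(coords)
    # Semivariances between data points, bordered by the unbiasedness
    # constraint that forces the weights to sum to 1
    a = [[gamma(math.dist(coords[i], coords[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    g0 = [gamma(math.dist(c, x0)) for c in coords]
    sol = solve(a, g0 + [1.0])
    weights, psi = sol[:n], sol[n]
    estimate = sum(w * z for w, z in zip(weights, values))      # Equation 2
    variance = sum(w * g for w, g in zip(weights, g0)) + psi    # Equation 3
    return estimate, variance, weights

# Hypothetical log concentrations at three locations (coordinates in meters)
pts = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0)]
logs = [0.0, 1.0, 2.0]
model = lambda h: spherical(h, 0.0, 1.0, 500.0)
print(ordinary_kriging(pts, logs, (50.0, 50.0), model))
```

With a zero nugget, the estimate at a measurement location reproduces the measured value and the kriging variance vanishes there; between the points, the variance is positive.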

After kriging, the lognormal estimates are transformed back to the original
scale. Several backtransformation methods have been proposed (Cressie, 1993,
p. 136; Saito and Goovaerts, 2000; Webster and Oliver, 2001, p. 180). We
transformed the values back according to Equation (4)

$$Z^*_r = \exp\left( Z^* + \frac{\sigma^2_{Z^*}}{2} \right) \qquad (4)$$

where $Z^*_r$ is the estimated value on the raw scale, and $Z^*$ is the estimated value and
$\sigma^2_{Z^*}$ the kriging variance, both on the logarithmic scale. For the kriging variance the
transformation (Limpert and others, 2001; Stahel, 2000, p. 136) is

$$\sigma^2_{Z^*_r} = m_r^2 \, e^{s_r^2} \left( 1 - e^{-\sigma^2_{Z^*}} \right) \qquad (5)$$

where $m_r$ is the mean, $s_r^2$ is the dispersion variance of the raw variable, and $\sigma^2_{Z^*}$ is
the kriging variance (on the log scale) (Geovariances, 1997). For the statistical
characterization of the interpolated data, we calculated the median of the raw
variable from the mean $Z^*$ on the log scale as $Z_r(x) = \exp(Z^*(x))$
and the multiplicative standard deviation from the untransformed
kriging variance $\sigma^2_{Z^*}$ on the log scale as $\sigma^*(x) = \exp(\sigma_{Z^*}(x))$ (Limpert and others,
2001; Stahel, 2000, p. 136). Please note that $Z_r$ is the median (and not the mean)
of the raw variable and that the multiplicative standard deviation (sometimes called
the geometric standard deviation) is calculated as the exponential of the standard
deviation $\sigma_{Z^*}$ on the log scale (and not as the exponential of the variance on the
log scale).
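The difference between the back-transformed mean, the median, and the multiplicative standard deviation can be sketched as follows. The function name and the `z_score` parameter for the width of the multiplicative interval are assumptions made for this illustration.

```python
import math

def back_transform(z_log, var_log, z_score=1.0):
    """Back-transform a log-scale estimate and its kriging variance.

    Returns the raw-scale mean (Equation 4), the median, the multiplicative
    standard deviation, and the interval median / sd**z_score to median * sd**z_score.
    """
    sd_log = math.sqrt(var_log)
    mean_raw = math.exp(z_log + var_log / 2.0)
    median_raw = math.exp(z_log)
    mult_sd = math.exp(sd_log)  # exponential of the standard deviation, not the variance
    lower = median_raw / mult_sd ** z_score
    upper = median_raw * mult_sd ** z_score
    return mean_raw, median_raw, mult_sd, (lower, upper)

# Hypothetical log-scale estimate and kriging variance at one grid node
print(back_transform(0.1, 0.4))
```

With `z_score=1.0` the interval covers about 68.3 % of a lognormal distribution, and the raw-scale mean always lies above the median when the variance is positive.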

3.3.3 Geostatistical conditional simulation

To characterize the small-scale variability without a smoothing effect, a

random realization of a stochastic process is generated. The idea is to compute a

multitude of such random realizations and characterize the distribution at each

estimation point through the average and standard deviation. All calculations are

conducted using the logarithmically transformed data.

The stochastic simulation was performed using the turning bands method
(Matheron, 1973; Tietje, 1993), which is one of the two available methods (the other being
sequential Gaussian simulation; Deutsch and Journel, 1998). The turning bands
method simulates realizations of a one-dimensional Gaussian process
$\{Y(tu) : t \in \mathbb{R}\}$ along the $L$ lines $L_1, \ldots, L_L$ through the origin defined by the unit
vectors $u_1, \ldots, u_L$. The lines all have the same arbitrary origin (Gotway, 1994) and
are uniformly distributed within the (in this case) two-dimensional space. If $u_1, \ldots, u_L$
are unit vectors in the directions $L_1, \ldots, L_L$, the simulated value $Z_s(x)$, $x \in D$, is an
average of the values generated at the orthogonal projections $t_{u_i}(x)$ of
the terminal point $x$ onto the lines of direction $u_i$, $\{Y(t_{u_i}(x)\,u_i) : i = 1, \ldots, L\}$. Finally, the
generated value at point $x$ is


$$Z_s(x) = \frac{1}{\sqrt{L}} \sum_{i=1}^{L} Y(t_{u_i}(x)\,u_i) \qquad (6)$$


A detailed discussion of the method can be found in Gotway (1994) and
Journel (1974). Conditioning the simulation means forcing the simulated surface
to pass through the measurements. Hence, the simulated value is obtained
through the formula (Journel and Huijbregts, 1978)

$$Z_{cs}(x) = Z^*(x) + \left[ Z_s(x) - Z_s^*(x) \right] \qquad (7)$$

where $Z^*(x)$ is the kriged value obtained from the data $Z(x_i)$ (measurements
$i = 1, \ldots, n$), $Z_s(x)$ is the simulated value from Equation (6), and $Z_s^*(x)$ is a kriged
value obtained from $Z_s(x_i)$ using the realizations at the coordinates of the
measurements. The conditional simulation was conducted with 100 turning bands
for each of 100 realizations at each gridpoint $x \in D$. A simulation post-processing
averages the results of the simulations according to Equation 7 for each grid node

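$$\bar{Z}_{cs}(x) = \frac{1}{100} \sum_{k=1}^{100} Z_{cs}^{(k)}(x) \qquad (8)$$

where $Z_{cs}^{(k)}(x)$ denotes the $k$-th conditional realization. At each gridpoint $x \in D$ the mean is used to interpolate the cadmium
concentration, and the standard deviation

$$\sigma_{cs}(x) = \sqrt{\frac{1}{100} \sum_{k=1}^{100} \left( Z_{cs}^{(k)}(x) - \bar{Z}_{cs}(x) \right)^2} \qquad (9)$$

is used for uncertainty assessment. Similar to the lognormal kriging method, the
following equations are used for calculation of the median and the multiplicative
standard deviation:

$$Z_r(x) = \exp(\bar{Z}_{cs}(x)) \qquad (10)$$

which is the median of the conditional simulation of the raw variable and

$$\sigma^*_{cs}(x) = \exp(\sigma_{cs}(x)) \qquad (11)$$

which is the multiplicative standard deviation of the conditional simulation of the raw variable.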
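The conditioning step and the post-processing of the realizations can be sketched for a single grid node as follows. This is a minimal illustration with hypothetical function names and input values; it assumes that the standard deviation is taken with divisor n (population form), which may differ from the software used in this study.

```python
import math
import statistics

def condition(z_kriged, z_sim, z_sim_kriged):
    """Conditional value as in Equation 7: kriged data plus the simulated residual."""
    return z_kriged + (z_sim - z_sim_kriged)

def summarize_realizations(log_values):
    """Summarize conditional realizations (log scale) at one grid node."""
    mean_log = statistics.fmean(log_values)
    sd_log = statistics.pstdev(log_values)  # assumption: divisor n, not n - 1
    return {
        "mean_log": mean_log,
        "sd_log": sd_log,
        "median_raw": math.exp(mean_log),  # median on the raw scale
        "mult_sd": math.exp(sd_log),       # multiplicative standard deviation
    }

# Hypothetical log-scale realizations at one node
print(summarize_realizations([-0.3, 0.1, 0.4, 0.0, -0.2]))
```

At a measurement location the kriged simulation equals the simulated value, so `condition` returns the kriged (measured) value, which is what forces each realization through the data.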


3.3.4 Uncertainty assessment

To assess the overall uncertainty of the interpolated concentrations, we

consider the following sources of error: soil sampling, sample preparation for

analytical measurement, analytical measurement itself, and interpolation. One

model for these sources of error is the sum of variances induced throughout the

assessment. Following the approach of the Analytical Methods Committee (1995)
and Ramsey and Argyraki (1997), we treat the uncertainty $U(x)$ (Equation 12)
as a random field

$$U(x) = \sigma^2_{\mathrm{constant}} + V(x) \quad \text{for all } x \in D \qquad (12)$$

which includes a constant part $\sigma^2_{\mathrm{constant}}$ and a spatially variable part $V(x)$.
$V(x)$ indicates the error due to interpolation. $\sigma^2_{\mathrm{constant}}$ indicates the uncertainty
induced by soil sampling and measuring. The latter is considered to be the sum of
the uncertainties of soil sampling (sampling pattern, density, composite or spot
sample), preparation of the soil sample (fragment removal, drying, grinding,
sieving, splitting), and measurement of the cadmium content (flame atomic
absorption spectrometry, AAS) itself (Muntau and others, 2001; Theocharopoulos and
others, 2001).

$$\sigma^2_{\mathrm{constant}} = \sigma^2_{\mathrm{sampling}} + \sigma^2_{\mathrm{sample\ preparation}} + \sigma^2_{\mathrm{measurement}} \qquad (13)$$

The project "European methods in soil sampling" (CEEM) (Desaules and

others, 2001; Wagner and others, 2001a) has evaluated the sources of uncertainty

in soil pollutants assessment. It demonstrated that sampling and/or sample
preparation are the main sources of uncertainty; the degree depends on the
element analyzed and its concentration level (Wagner and others, 2001b).
Moreover, the investigation emphasized that the deviation from sampling is higher
than the analytical deviations. This is especially true for cadmium. From the results
of the CEEM project, we derived an uncertainty of 0.3 mg/kg for sampling and
sampling posttreatment for the constant part of the overall uncertainty ($\sigma^2_{\mathrm{constant}}$)
(Wagner and others 2001b, p. 87). This uncertainty might also be caused by the

use of different sampling techniques (with variations in sampling depth, sieving,

etc.), the presence of small-scale heterogeneity (about 40% (see Wagner and

others, 2001b, p.92)), and diversity of conditions associated with type of land use


(variation coefficients of 8.1 %, 7.2 %, and 2.7 % for forest, agriculture, and
grassland use (Desaules and others, 2001)). Von Holst (1997) claims that
sampling itself causes the highest degree of error, followed by sample
preparation. Wächter (1997) even infers that the error of the measurement
instruments is negligible relative to the large error introduced by sampling.

Concerning the spatially variable part of the uncertainty ($V(x)$, Equation 12),

the geostatistical theory (Armstrong, 1998; Webster and Oliver, 2001) states that

the kriging variance is zero at measurement points and minimized (Equations 2,3)

between the sampling points. The magnitude of the kriging variance, therefore,

depends on the distribution of the sampling points (note that the kriging

characteristics - neighborhoods, anisotropy, the support, etc. - have to be set

carefully). Because the sampling in the investigated area is sparse, large parts of

the investigated area are expected to show a high kriging variance. The maximum

of the kriging variance is related to the sill of the variogram and hence depends on

the overall variance of the data (see above).

Finally, the overall uncertainty is calculated as $U(x) = \exp(\ln \sigma^*(x) + \ln 1.35)$,
where $\sigma^*(x)$ is the multiplicative standard deviation of the interpolation. $U(x)$
is used to estimate the spatially distributed uncertainty. Due to the
multiplicative nature (and hence the multiplicative confidence intervals) of the
results, it is called the uncertainty factor throughout the text. In conclusion, the
probability of exceeding a threshold concentration $C$ is estimated by solving

$$P(Z^* > C) = 1 - \Phi\left[ \left( \ln C - Z^*_{\log}(x) \right) / \sigma_U(x) \right] \qquad (14)$$

for the probability $p_x$, where $\Phi$ denotes the standard normal distribution function.
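The calculation of the uncertainty factor and of the exceedance probability (Equation 14) can be sketched with the Python standard library. The function names are hypothetical, and the sketch assumes that the standard deviation entering Equation 14 is the natural logarithm of the uncertainty factor; this reading is an assumption, not a statement of the original implementation.

```python
import math
from statistics import NormalDist

def overall_uncertainty_factor(mult_sd, constant_factor=1.35):
    """Combine interpolation and constant uncertainty: exp(ln a + ln b) = a * b."""
    return math.exp(math.log(mult_sd) + math.log(constant_factor))

def exceedance_probability(threshold, z_log, sigma_log):
    """P(Z > threshold) for a lognormal variable with log mean z_log and log sd sigma_log."""
    return 1.0 - NormalDist().cdf((math.log(threshold) - z_log) / sigma_log)

# Hypothetical node: median 1.2 mg/kg, multiplicative sd of the interpolation 1.6
factor = overall_uncertainty_factor(1.6)
sigma = math.log(factor)  # assumption: log-scale sd equals ln of the uncertainty factor
print(round(factor, 2), round(exceedance_probability(2.0, math.log(1.2), sigma), 3))
```

A threshold equal to the median times one uncertainty factor is exceeded with a probability of about 16 %, which corresponds to the 84th percentile used in the results.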

3.4 Results

3.4.1 Spatial estimation

The estimation of the spatial distribution of the cadmium concentration relies

on the geostatistical analysis of the data. The trend analysis reveals no significant

spatial trend. Besides a linear trend (Webster and Oliver, 2001), the distance to

the metal smelter and/or the main wind direction is additionally computed as

regression parameters (Saito and Goovaerts, 2001). Residuals from the regression are
used for kriging and cross validation. Table 3.4 shows the mean square error of
prediction with these trends compared to ordinary kriging without a trend. The

results of Table 3.4 justify a spatial prediction without a trend in the investigated

area.

                                                                     Model parameters
Trend model                                     Mean square error    Range [m]    Nugget [(mg/kg)²]
$t_0 = a_0 + a_1 x + a_2 y$                       0.48                 666          0.10
$t_1 = a_0 + a_1 \ln(d)$                          0.55                 631          0.14
$t_2 = a_0 + a_1 \ln(d) + a_2 \Delta\theta$               0.50                 737          0.10
$t_3 = a_0 + a_1 (\ln(d) \cdot \Delta\theta)$             0.53                 597          0.12
$t_4 = a_0 + a_1 \Delta\theta + a_2 (\ln(d) \cdot \Delta\theta)$  0.46                 696          0.23
Ordinary kriging                                0.50                 813          0.30

Table 3.4 Mean square error from cross validation for five different trend models and ordinary
kriging (isotropic variogram) for 76 data. $d$ denotes the distance to the metal smelter and $\Delta\theta$ is
the deviation of wind direction from west to east; $x$, $y$ are data coordinates. All models are
calculated in accordance with Saito and Goovaerts (2001). Models of spatial dependency are derived
from the residuals of the regression
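The mean square error from cross validation can be sketched as a leave-one-out loop around any spatial predictor. In the sketch below, inverse-distance weighting is used only as a simple stand-in for kriging so that the example stays self-contained; it is not the predictor used in Table 3.4, and the function names and data are hypothetical.

```python
import math

def loo_mse(coords, values, predict):
    """Leave-one-out mean square error of a spatial predictor.

    predict(train_coords, train_values, target) must return an estimate.
    """
    errors = []
    for i in range(len(coords)):
        train_c = coords[:i] + coords[i + 1:]
        train_v = values[:i] + values[i + 1:]
        errors.append((predict(train_c, train_v, coords[i]) - values[i]) ** 2)
    return sum(errors) / len(errors)

def idw(train_coords, train_values, target, power=2.0):
    """Inverse-distance weighting; assumes no duplicate coordinates."""
    weights = [1.0 / math.dist(c, target) ** power for c in train_coords]
    return sum(w * v for w, v in zip(weights, train_values)) / sum(weights)

# Hypothetical log concentrations at four locations (coordinates in meters)
pts = [(0.0, 0.0), (300.0, 0.0), (0.0, 400.0), (500.0, 500.0)]
logs = [0.2, -0.1, 0.5, -0.4]
print(round(loo_mse(pts, logs, idw), 3))
```

Replacing `idw` by a kriging predictor with or without a fitted trend would give the kind of comparison summarized in the table.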

Spherical models are suitable for both empirical variograms (see Table 3.5 and
Equation 1; the isotropic spherical function is given in the annotations).

                 Isotropic variogram    Anisotropic variogram
Model            spherical              spherical
Direction        omnidirectional        four directions, D1-D4
Range            813 m                  -65°: 900 m; -20°, +20°, +70°: 500 m
Nugget effect    0.297 (mg/kg)²         0.125 (mg/kg)²
Sill             0.332 (mg/kg)²         0.528 (mg/kg)²

Table 3.5 Geostatistical properties of the variograms, all data shown on log scale

In both cases, the sill approximately equals the data variance. The

anisotropic and isotropic variograms on the log scale are shown in Figures 3.4.1

and 3.4.2. For the interpolation of cadmium, we use the anisotropic model
displayed in Figure 3.4.1, because the variability of the cadmium contents depends
on direction, as shown in the variogram map in Figure 3.3.1. The range is from

500 to 900 meters, encompassing the range of the isotropic model. The

anisotropic model is computed for lag classes of 230 m and is discontinuous near

the origin showing a nugget effect of 0.125 (mg/kg)2 (log scale).

Figure 3.4.1 Anisotropic variograms $\gamma(h)$ in the directions D1-D4; their ranges are 900 m to 500 m, the
nugget effect is 0.13 (mg/kg)², and the sill is 0.53 (mg/kg)². Number of pairs and lag value for each
lag are shown in the annotations

Figure 3.4.2 Isotropic variogram $\gamma(h)$ with a range of 813 m, a nugget effect of 0.297 (mg/kg)², and a sill
of 0.332 (mg/kg)²

The interpolated cadmium concentrations are displayed in Figure 3.4.3,

which is a cutout of the interpolated area. The statistical properties of the

interpolations (isotropic and anisotropic) and the conditional simulation

(anisotropic) are shown in Table 3.6.

                                      Data     Isotropic     Anisotropic    Conditional
                                               estimation    estimation     simulation
Counts                                76       11878         10802          10802
Geometric mean [$e^{\mu}$ mg/kg]          0.93     1.05          1.08           1.09
Standard deviation [$e^{\sigma}$ mg/kg]   2.22     2.33          2.27           2.33
Minimum [mg/kg]                       0.12     0.12          0.12           0.14
Maximum [mg/kg]                       23.3     4.36          8.81           9.03

Table 3.6 Statistical properties of the interpolations (isotropic and anisotropic) and the conditional
simulation

Differences between kriging of cadmium with isotropic and anisotropic
variograms mainly result from their different nugget effects. The larger nugget
effect leads to a smoother isotropic kriging interpolation (Webster and Oliver,
2001) with less variance and an underestimation of cadmium concentrations.

The statistical properties of the anisotropic interpolation (Table 3.7) have
been calculated for the area indicated to be the core area in Figure 3.2.1.

Figure 3.4.3 Spatial distribution of cadmium concentrations in topsoil after conditional simulation
(legend classes: 0-0.8, 0.8-1.3, 1.3-2, and 2-7.3 mg/kg Cd; measurement locations are marked).
The median of 100 realizations on the raw scale [$Z_r$] is displayed, taken only from the core area
indicated in Figure 3.2.1

Anisotropic modeling                  Core area                   Total area
                                      Cadmium      Cadmium        Cadmium      Cadmium
                                      estimated    simulated      estimated    simulated
Counts                                3870         3870           10802        10802
Geometric mean [$e^{\mu}$ mg/kg]          1.03         1.03           1.08         1.09
Standard deviation [$e^{\sigma}$ mg/kg]   1.95         1.97           2.27         2.33
Minimum [mg/kg]                       0.28         0.28           0.12         0.14
Maximum [mg/kg]                       8.81         9.03           8.81         9.03

Table 3.7 Statistical properties of the interpolation for the core area of pollution (indicated in Figure 3.2.1)
and the total area of pollution


3.4.2 Uncertainty assessment

An uncertainty of factor 2 (the estimate multiplied or divided by two, i.e., by one
multiplicative standard deviation), as postulated by the Analytical Methods Committee (1995), Ramsey and
Argyraki (1997), and Fresenius and others (1995), can only be achieved at
a probability of 68.3 %, whereas demanding an accuracy of $p_x$ = 97.5 % leads to an
uncertainty factor of about 4.2. The uncertainty decreases only in the vicinity of the
measurements.

The rationale of the spatially variable part $V(x)$ of the overall uncertainty is
illustrated by Figure 3.4.4, which shows a north-south transect of Dornach. A

single realization of the conditional simulation demonstrates considerable

variability and randomly high cadmium concentrations.

Figure 3.4.4 A south-north transect (northing coordinate from 257800 m to 259000 m) with the results of one
single realization and the 16th, 50th, and 84th percentiles of the conditional simulation and lognormal
kriging. The 50th percentiles are estimated
by taking the exponential of the mean values on the logarithmic scale. The other percentiles are
calculated as the exponential of the percentiles on the logarithmic scale,
$\exp(Z^*_{\log}(x) \pm \Phi(p)\,\sigma_{\log}(x))$, with the 16th and 84th percentiles of the normal distribution,
$\Phi(16\mathrm{th}) = -1$ and $\Phi(84\mathrm{th}) = 1$


As can be seen from Figure 3.4.4, in reality some cadmium concentrations

beyond the 68.3% confidence interval of the interpolation and simulation could

occur. The results for the spatially variable part are given in Table 3.6 as the
geometric standard deviation [$e^{\sigma}$ mg/kg].

3.4.3 Relevance for decisions on soil remediation

To come to any decision, the results of the kriging interpolation and the
uncertainty assessment have to be compared with the standards set out in the
Swiss Ordinance Relating to Pollutants in Soil (VBBo, 1998) (Table 3.3). This
document contains three threshold values: the concentration associated with a long-term fertility
goal (guide value), the concentration beyond which a more detailed investigation is
required (trigger value), and the concentration beyond which a remediation is
prescribed (remediation value).

Figure 3.4.5 Map of the likelihood of exceeding the threshold value c = 2 mg/kg

Figure 3.4.5 presents the probability of exceeding 2 mg cadmium per kg soil (trigger value for food and feed planting). The probability of exceeding any other threshold value can be presented in the same way. In practice, decision makers have to agree on a probability statement that they consider reliable and safe enough for the specific


case. Figure 3.4.4 shows one randomly chosen realization out of the 100 realizations from which the simulated values are derived. As can be seen, some local cadmium concentrations in this realization exceed the 84th percentile of the prediction. Presentations such as the one in Figure 3.4.5 make it easier to understand the meaning and consequences of results that are stated as "there is a 16 % likelihood that a specific cadmium concentration will be exceeded". In addition, it is possible to delineate corresponding areas, which may be subjected to further investigations (e.g., additional measurements). Combining the spatially distributed probability of occurrence with other data of a geographic information system (GIS), such as land use classifications, can support a well-founded decision. However, if a high degree of reliability and confidence is required (e.g., 95 %) and the costs of soil treatment are very high, Figure 3.4.5 suggests that further measurements should be collected. In general, knowledge about the probability of exceeding legal threshold values helps to identify the actions needed for sustainable soil improvement, depending on the decision maker's risk level. For the case considered in this study, the following conclusions became evident from the interpolation and the uncertainty assessment:

• Cadmium contamination above the legal remediation value (for all kinds of use) is unlikely at any site (the probability is below 50 % everywhere)

• A small area is contaminated above the trigger value with a probability of more than 90 %

• Cadmium contamination above the guide value is widespread with more than 50 % probability

• High probabilities of cadmium contamination above 0.8 mg/kg are delineated mainly around the emission source and

• Only a small part of the case area is not contaminated above the guide value of the Swiss Ordinance Relating to Pollutants in Soil (VBBo, 1998)

Since the contamination from different sources such as sewage sludge or fertilizer is ongoing (Keller and others, 2001), decision makers should also bear in mind that cadmium might still be accumulating and that any kind of soil improvement might help to sustain the multi-functional use of soil (Bouma, 2002).


3.5 Conclusion

The case study shows that even sparse data are suitable for an assessment of contaminated land. In our case we state an uncertainty factor of 2 at the 68.3 % confidence level, and we can make statements such as "The probability that the soil contamination at this point exceeds 2 mg cadmium per kg soil is less than 40 %". We infer from the preceding analysis that statements of this kind can be made reliably.

For the spatial interpolation of the cadmium concentrations, lognormal ordinary kriging (Papritz and Moyeed, 1999; Matheron, 1963) and a conditional simulation (Goovaerts, 2000; Journel, 1974) were applied. The ordinary kriging was improved by the use of an anisotropic variogram, which accounts for different autocorrelation ranges in different directions. Several two-dimensional stochastic realizations were produced, and the conditional simulation was used to analyze the local variability. Because the stochastic realizations are forced to honor the measured data, the conditional simulation also yielded an estimate of the local uncertainty. A logarithmic transformation of the data and a back-transformation after interpolation according to Limpert and others (2001) were used to obtain the geometric mean and the multiplicative standard deviation. The geometric mean of all realizations approximated the geometric mean of the (lognormal) ordinary kriging estimate. Moreover, the standard deviation of all realizations approximated the standard deviation of the (lognormal) ordinary kriging, i.e. the square root of the kriging variance (see Table 3.5). The lognormal distribution was also used to obtain the percentiles of the probability density function and to predict the conditional cumulative distribution function including the uncertainty. With this approach, the uncertainties involved in the interpolation are conveniently estimated. Hence, this procedure can be applied for the uncertainty assessment of lognormally distributed spatial data.
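The summary of realizations by a geometric mean and a multiplicative standard deviation can be illustrated as follows. This is a hedged sketch only: the function name is hypothetical, and the 100 values are drawn from an assumed lognormal distribution rather than taken from the Dornach simulation.

```python
import math
import random
from statistics import mean, stdev


def geometric_summary(values):
    """Return geometric mean, multiplicative standard deviation, and the
    interval (gm / gsd, gm * gsd) that corresponds to mean +/- one standard
    deviation on the logarithmic scale."""
    logs = [math.log(v) for v in values]
    gm = math.exp(mean(logs))
    gsd = math.exp(stdev(logs))
    return gm, gsd, (gm / gsd, gm * gsd)


# Assumed stand-in for 100 simulated cadmium values [mg/kg] at one location.
random.seed(0)
realizations = [random.lognormvariate(math.log(1.5), math.log(2.0)) for _ in range(100)]
gm, gsd, (low, high) = geometric_summary(realizations)
```

The interval is multiplicative: it is obtained by dividing and multiplying the geometric mean by the multiplicative standard deviation, not by adding and subtracting.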

The uncertainty assessment shows that the main part of the overall uncertainty is due to the sparseness of the data underlying the spatial interpolation. The second most important source of uncertainty is sampling and sample preparation. Because corresponding data are hard to obtain (see e.g. CEEM Project, Wagner and others, 2001b), we roughly estimated this uncertainty as 35 % of the estimated cadmium concentration, using quantitative (von Holst, 1997; Wächter,


1997; Wagner and others, 2001b) and qualitative results (Ramsey, 1998; Tiktak and others, 1999) as a reference. For the purpose of this investigation, other sources of uncertainty, such as the analytical measurement, were judged to be negligible. In conclusion, the resulting overall uncertainty is small enough to identify regions with a high risk of contamination. The interpolation combined with the uncertainty assessment is sufficiently reliable to delineate potentially contaminated sites in "maps of probability of occurrence".

Acknowledgements

The work reported in this paper was conducted in the context of the Integrated Project Soil of the Swiss Priority Program Environment. We are grateful to Martin Fritsch, who initiated the project and guided it until 2000, and to the colleagues at UNS and within the IP Soil for the helpful discussions. We also appreciate the comments of three anonymous reviewers. The project was funded by the Swiss National Science Foundation (Project Number 5001-44758) and the Swiss Federal Institute of Technology (Project Number 0048-41-2609-5).


3.6 Literature Cited

Analytical Methods Committee 1995. Uncertainty of measurement: implications of its use in

analytical science. Analyst 120:2303-2308

Armstrong, M. 1998. Basic Linear Geostatistics. Springer, Berlin, Heidelberg, 146 pp.

Barabas, N., P. Goovaerts, and P. Adriaens. 2001. Geostatistical assessment and validation of

uncertainty for three-dimensional dioxin data from sediments in an estuarine River.

Environmental Science and Technology 35:3294-3301.

Bouma, J. 2002. Land quality indicators of sustainable land management across scales.

Agriculture, Ecosystems and Environment 88 (2): 129-136.

Bouma, J., H. W. G. Booltink, and A. Stein. 1996. Reliability of soil data and risk assessment of data applications. Soil Science Society of America Special Publication 47:677.

Cressie, N. A. C. 1993. Statistics for spatial data. John Wiley, New York, 900 pp.

Desaules, A., J. Sprengart, G. Wagner, H. Muntau, and S. Theocharopoulos. 2001. Description of the test area and reference sampling at Dornach. The Science of the Total Environment 264:17-26.

Deutsch, C.V., Journel, A.G. 1998. GSLIB Geostatistical Software Library and User's Guide. New

York, Oxford, 369pp.

Dubois, J.-P., and R. Schulin. 1993. Sampling and analytical techniques as limiting factors in soil monitoring, in R. Schulin, A. Desaules, R. Webster, and B. von Steiger, eds., Soil Monitoring - Early Detection and Surveying of Soil Contamination and Degradation. Basel, Birkhäuser, pp. 271-273.

Fresenius, W., A. Kettrup, and R. W. Scholz. 1995. Grössenordnung Faktor 2: Zur Genauigkeit von Messungen bei Altlasten. altlasten spektrum 4:217-218.

Geiger, G., and R. Schulin. 1995. Risikoanalyse, Sanierungs- und Überwachungsvorschläge für

das schwermetallbelastete Gebiet von Dornach. Berichte 2. Volkswirtschaftsdepartement

des Kantons Solothurn, Amt für Umweltschutz, Solothurn, 79 pp.

Geovariances. 1997. Isatis. Geostatistical software. Version 3.1.2. Avon

Goovaerts, P. 1997. Geostatistics for natural resources evaluation. Oxford University Press, New

York Oxford, 483 pp.

Goovaerts, P. 2000. Estimation or simulation of soil properties? An optimization problem with

conflicting criteria. Geoderma 97:165-186.

Goovaerts, P. 2001. Geostatistical modelling of uncertainty in soil science. Geoderma 103:3-26.

Goovaerts, P., and V. Meirvenne. 2001. Delineation of hazard areas and additional sampling strategy in presence of a location specific threshold, in A. Soares, J. Gomez-Hernandez, and J. Froidevaux, eds., geoENV III - Geostatistics for Environmental Applications. Dordrecht, Kluwer Academic Publishers, p. 125-136.

Goovaerts, P., R. Webster, and J.-P. Dubois. 1997. Assessing the risk of soil contamination in the

Swiss Jura using indicator geostatistics. Environmental and Ecological Statistics 4:31-48.


Gotway, C. A. 1994. The use of conditional simulation in nuclear-waste-site performance

assessment. Technometrics 36:129-140.

Gotway, C. A., and B. M. Rutherford. 1993. Stochastic simulation for imaging spatial uncertainty:

Comparison and evaluation of available algorithms. Pages 1-21 in Geostatistical

Simulations. Fontainebleau, France.

Grunewald, K. 1997. Grossräumige Bodenkontamination. Springer, Berlin, Heidelberg, New York,

250 pp.

Gysi, C., S. Gupta, W. Jäggi, and J.-A. Neyroud. 1991. Wegleitung zur Beurteilung der Bodenfruchtbarkeit. Eidgenössische Forschungsanstalt für Agrikulturchemie und Umwelthygiene, Liebefeld-Bern, 89 pp.

Hendriks, C., Obernosterer, R., Muller, D., Kytzia, S., Baccini, P. and Brunner, P.H. 2000. Material flow analysis. Case studies on the city of Vienna and the Swiss lowlands. Local Environment 5:311-328.

Hesske, S., M. Schärli, O. Tietje, and R. W. Scholz. 1998. Zum Umgang mit Schwermetallen im

Boden: Falldossier Dornach. Pabst Science Publishers, Lengerich, 145 pp.

Journel, A. G. 1974. Geostatistics for Conditional Simulation of Ore Bodies. Economic Geology

69:673-687.

Journel, A. G. 1989. Fundamentals of Geostatistics in Five Lessons. American Geophysical Union,

Washington, D.C., 40 pp.

Journel, A. G., and C. J. Huijbregts. 1978. Mining Geostatistics. Academic Press, London, 599 pp.

Keller, A., B. von Steiger, S. E. A. T. M. van der Zee, and R. Schulin. 2001. A stochastic empirical model for regional heavy-metal balances in agroecosystems. Journal of Environmental Quality 30:1976-1989.

Keller, A., K. Abbaspour, and R. Schulin. 2002. Assessment of Uncertainty and Risk in Modeling Regional Heavy Metal Accumulation in Agricultural Soils. Journal of Environmental Quality 31:175-187.

Limpert, E., W. A. Stahel, and M. Abbt. 2001. Log-normal distributions across the Sciences: Keys and Clues. Bioscience 51:341-352.

Matheron, G. 1963. Principles of Geostatistics. Economic Geology 58:1246-1266.

Meuli, R., O. Atteia, J. P. Dubois, R. Schulin, B. von Steiger, J.-C. Vedy, and R. Webster. 1994.

Geostatistical Interpolation in Regional Mapping of Soil Contamination by Heavy Metals. A

Case Study Comparing two Rural Regions in Switzerland, in FOEFL, ed., Regional Soil

Contamination Surveying, Environmental Documentation. Berne, Federal Office of

Environment, Forests and Landscape, p. 43.

Meuli, R. G. 1997. Geostatistical Analysis of Regional Soil Contamination by Heavy Metals.

Dissertation ETH No.12121, Swiss Federal Institute of Technology Zürich, 191 p.

Mowrer, H. T. 2000. Uncertainty in natural resource decision support systems: sources,

interpretations, and importance. Computers and Electronics in Agriculture 27:139-154.


Muntau, H., A. Rehnert, A. Desaules, G. Wagner, S. Theocharopoulos, and P. Quevauviller. 2001. Analytical aspects of the CEEM soil project. The Science of the Total Environment 264:27-49.

Pan, G. 1997. Conditional Simulation as a Tool for Measuring Uncertainties in Petroleum

Exploration. Nonrenewable Resources 6:285-293.

Papritz, A., and R. A. Moyeed. 1999. Linear and Non-Linear Kriging Methods: Tools for Monitoring

Soil Pollution. In V. Barnett, A. Stein, K. Feridun Turkman (eds.) Statistics for Environment 4:

Pollution Assessment and Control. 4. John Wiley & Sons, London, pp 304-340

Prato, T. 2000. Multiple attribute evaluation of landscape management. Journal of Environmental

Management 60:325-337.

Ramsey, M. 1998. Sampling as a source of measurement uncertainty: techniques for quantification and comparison with analytical sources. Journal of Analytical Atomic Spectrometry 13:97-104.

Ramsey, M. H., A. Argyraki, and S. Squire. 1998. Measurement uncertainty arising from sampling contaminated land: a tool for evaluating fitness-for-purpose. Pages 221-240 in ConSoil'98. Edinburgh, UK.

Ramsey, M. H., and A. Argyraki. 1997. Estimation of measurement uncertainty from field sampling: implications for the classification of contaminated land. The Science of the Total Environment 198:243-257.

Roe, E., and M. van Eeten. 2001. Threshold-Based Resource Management: A Framework for

Comprehensive Ecosystem Management. Environmental Management 27:195-214.

Saito, H., and P. Goovaerts. 2000. Geostatistical Interpolation of Positively Skewed and Censored

Data in a Dioxin-Contaminated Site. Environmental Science and Technology 34:4228-4235.

Schulin, R., R. Webster, and R. Meuli. 1994. Technical Note on Objectives, Sampling Design, and Procedures in Assessing Regional Soil Pollution and the Application of Geostatistical Analysis in such Surveys. Federal Office of Environment, Forests and Landscape, Bern, 31 pp.

Schweizerische Meteorologische Anstalt. 1982-91. Klimaatlas der Schweiz. Atlas climatologique de la Suisse. Atlante climatologico della Svizzera. Lfg. 1-4, Verlag des Bundesamtes für Landestopographie, Wabern, Bern.

Stahel, W. A. 2000. Statistische Datenanalyse. Vieweg, Braunschweig/Wiesbaden, 380 pp.

Theocharopoulos, S. P., G. Wagner, J. Sprengart, M.-E. Mohr, A. Desaules, H. Muntau, M. Christou, and P. Quevauviller. 2001. European soil sampling guidelines for soil pollution studies. The Science of the Total Environment 264:51-62.

Thompson, M. 1995. Uncertainty in an uncertain world. Analyst 120:117-118.

Thompson, M., and T. Fearn. 1990. What exactly is Fitness for Purpose in Analytical Measurement? Analyst 121:273-278.

Tietje, O. 1993. Räumliche Variabilität bei der Modellierung der Bodenwasserbewegung in der ungesättigten Zone. Ph.D. thesis, Technische Universität Braunschweig, 117 p.


Tiktak, A., A. Leijnse, and H. Vissenberg. 1999. Uncertainty in a Regional-Scale Assessment of

Cadmium Accumulation in the Netherlands. Journal of Environmental Quality 28:461-470.

Tuchschmid, M. P. 1995. Quantifizierung und Regionalisierung von Schwermetall- und Fluorgehalten bodenbildender Gesteine der Schweiz. Bundesamt für Umwelt, Wald und Landschaft, Umwelt-Materialien Nr. 32, Bern, 130 pp.

Van Meirvenne, M. 2001. Evaluating the probability of exceeding a site-specific soil cadmium

contamination threshold. Geoderma 102:75-100.

VBBo. 1998. Verordnung über Belastungen des Bodens vom 1. Juli 1998 (Swiss Ordinance on the Pollution of Soils) SR 814.12, Bern.

von Holst, C. 1997. Beurteilung der Qualität von Analysendaten mit statistischen Methoden unter Berücksichtigung der Probennahme und der analytischen Bestimmung. Pages 139-145, in S. Schulte-Hostede, R. Freitag, A. Kettrup, and W. Fresenius (eds.) Altlasten-Bewertung, Datenanalyse und Gefahrenbewertung, ecomed, Landsberg.

von Steiger, B., R. Webster, R. Schulin, and R. Lehmann. 1996. Mapping heavy metals in polluted soil by disjunctive kriging. Environmental Pollution 94:205-215.

Wächter, H. 1997. Möglichkeiten und Grenzen der Routineanalytik bei der Überprüfung von Schwellenwerten - Erfahrungen aus der Praxis eines Umweltlabors. Pages 30-43, in S. Schulte-Hostede, R. Freitag, A. Kettrup, and W. Fresenius (eds.) Altlasten-Bewertung, Datenanalyse und Gefahrenbewertung, ecomed, Landsberg.

Wagner, G., P. Lischer, H. Theocharopoulos, H. Muntau, A. Desaules, and P. Quevauviller.

2001b. Quantitative evaluation of the CEEM soil sampling intercomparison. The Science of

the Total Environment 264:73-101.

Wagner, G., M.-E. Mohr, J. Sprengart, A. Desaules, H. Muntau, S. Theocharopoulos, and P. Quevauviller. 2001. Objectives, concept and design of the CEEM soil project. The Science of the Total Environment 264:3-15.

Webster, R. 2000. Is soil variation random? Geoderma 97:149-163.

Webster, R., and M. A. Oliver. 2001. Geostatistics for Environmental Scientists. John Wiley & Sons, Chichester, 270 pp.

Wirz, E., and D. Winistörfer. 1987. Bericht über Metallgehalte in Boden- und Vegetationsproben

aus dem Raum Dornach. Kantonales Laboratorium Solothurn, Solothurn, 49 pp.


3.7 Annotations

I. Number of pairs and semivariances for the single lags of the variogram, calculated with Equation (1) and shown in Figure 3.3

          Number of pairs        Value (mg/kg)
          D1   D2   D3   D4      D1    D2    D3    D4
Lag 1     28   35   31   31      0.38  0.34  0.28  0.43
Lag 2     50   72   60   53      0.5   0.51  0.8   0.51
Lag 3     74   77   75   72      0.34  0.67  0.89  0.82
Lag 4     74   84   80   72      0.73  0.7   0.74  0.83
Lag 5     82   86   87   74      0.55  0.67  0.58  0.61
Lag 6     49   95   75   54      0.53  0.66  0.48  0.59
Lag 7     53   84   77   42      0.44  0.6   0.61  0.4
Lag 8     28   97   56   25      0.39  0.66  0.54  0.62
Lag 9     18   90   46   10      0.29  0.61  0.81  0.58
Lag 10    11   92   39    2      0.38  0.76  0.62  1.84
Lag 11     3   70   39    2      0.3   0.84  0.52  0.19

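II. The formal kriging system is

$$A\,\lambda = b$$

with

$$A = \begin{pmatrix} \gamma(x_1,x_1) & \gamma(x_1,x_2) & \cdots & \gamma(x_1,x_N) & 1 \\ \gamma(x_2,x_1) & \gamma(x_2,x_2) & \cdots & \gamma(x_2,x_N) & 1 \\ \vdots & \vdots & & \vdots & \vdots \\ \gamma(x_N,x_1) & \gamma(x_N,x_2) & \cdots & \gamma(x_N,x_N) & 1 \\ 1 & 1 & \cdots & 1 & 0 \end{pmatrix}, \quad \lambda = \begin{pmatrix} \lambda_1 \\ \lambda_2 \\ \vdots \\ \lambda_N \\ \psi(x_0) \end{pmatrix}, \quad \text{and} \quad b = \begin{pmatrix} \gamma(x_1,x_0) \\ \gamma(x_2,x_0) \\ \vdots \\ \gamma(x_N,x_0) \\ 1 \end{pmatrix}$$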
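How such a system is assembled and solved can be sketched in plain Python. The sketch assumes the standard ordinary kriging layout (semivariances bordered by ones, with a Lagrange multiplier as the last unknown); the function names, the three sample points, and the linear variogram are illustrative assumptions and are not taken from the study, which used dedicated geostatistical software.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def ordinary_kriging_weights(points, x0, gamma):
    """Return the kriging weights and the Lagrange multiplier for target x0."""
    n = len(points)
    A = [[gamma(math.dist(p, q)) for q in points] + [1.0] for p in points]
    A.append([1.0] * n + [0.0])  # unbiasedness constraint: weights sum to 1
    b = [gamma(math.dist(p, x0)) for p in points] + [1.0]
    sol = solve(A, b)
    return sol[:n], sol[n]


def linear(h):
    """Assumed linear variogram, used here only to keep the example small."""
    return h


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]  # hypothetical sample locations
weights, psi = ordinary_kriging_weights(points, (0.25, 0.25), linear)
```

The estimate at the target location is then the weighted sum of the (log-transformed) measurements; the last row of the system forces the weights to sum to one.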


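III. Spherical function calculated for Figure 3.5 and Tables 3.4 and 3.5

$$\gamma(h) = c_0 + c_1\left[\frac{3}{2}\,\frac{h}{a} - \frac{1}{2}\left(\frac{h}{a}\right)^{3}\right] \quad \text{for } h \le a$$

$$\gamma(h) = c_0 + c_1 \quad \text{for } h > a$$

where $c_0$ is the nugget effect, $c_1$ is the sill, $h$ is the lag, and $a$ is the range.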
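The spherical model can be written as a short function. This is a minimal sketch; the parameter values in the example call are placeholders and not the fitted values of Tables 3.4 and 3.5.

```python
def spherical_variogram(h, c0, c1, a):
    """Spherical semivariogram with nugget c0, sill contribution c1, and range a."""
    if h <= 0.0:
        # Convention not stated in the text: the semivariance at lag zero is 0;
        # the nugget c0 appears as a jump for any lag h > 0.
        return 0.0
    if h <= a:
        r = h / a
        return c0 + c1 * (1.5 * r - 0.5 * r ** 3)
    return c0 + c1  # beyond the range the model stays at its plateau


# Placeholder parameters (assumed): nugget 0.3, c1 0.4, range 600 m.
value_at_half_range = spherical_variogram(300.0, 0.3, 0.4, 600.0)
```

At the range the two branches meet, so the function is continuous there and constant for larger lags.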

IV. Accuracy plot for the variogram model calculated in Equation (1). Dots indicate outliers outside the 99 % confidence limit of a normal distribution.

[Figure: accuracy plot; both axes are labeled ln Z*(x) in mg/kg and range from about -2 to 3.]


4 Decision making under uncertainty in case of soil remediation

Ute Schnabel and Roland W. Scholz

Submitted to Journal of Environmental Management

ABSTRACT / Decisions on soil remediation are among the most difficult management issues of municipal and state agencies. The assessment of contamination is uncertain, the costs of remediation are high, and the impacts on the environment are manifold. The paper presents a general, transparent, and consistent method for decision making among remediation alternatives. Soil washing, phytoremediation, and no remediation are considered as examples. Multi-criteria utility functions including a) the cost of remediation, b) the impact on human health and agricultural productivity, and c) the economic gain after remediation are constructed for a case in which the contamination at each site coordinate is represented by a probability distribution. Different types of i) correct decisions, such as a hit or a correct rejection, and ii) erroneous decisions, such as a false alarm or a miss, lead to an evaluation of each state of contamination and its expected utility. The case study reveals the nonlinear structure of multi-criteria decision making: if one alternative scores best for high and low concentrations, a second alternative can be best for medium concentrations. Typically, doing nothing was best for very low concentrations.


Chapter 4

4.1 Introduction

In most European countries, brownfields and contaminated sites are currently recorded in registers of contaminated land. Municipal and state agencies face the challenge of managing soil contamination resulting from the industrial age (Bell, F.G., Genske, Bell, A.W., 2000; U.S. Environmental Protection Agency, 2000). The problem of locating sites of interest leads to the question of how to cope with the uncertainty of the data stored in these land registers and, in particular, how to evaluate the different types of errors and, complementarily, the different correct decisions that can result (Gsponer, 1996). For instance, overlooking or ignoring a highly contaminated site may have severe effects. Conversely, acting on a false alarm might result in unwarranted high investments in remediation procedures. Thus, a decision maker has to balance the consequences of correct and false decisions against the probabilities with which these correct and false decisions occur. This leads the practitioner into the core of decision theory (Kleindorfer, Kunreuther, Schoemaker, 1993).

However, the handling of different types of errors is relevant not only in the case of soil remediation. For instance, when visiting a developing country with cholera incidence, one has to decide whether to get vaccinated or not. There is always a chance of 40-50 % of becoming infected in spite of vaccination (Levine and Kaper, 1996). On the other hand, without inoculation there is a higher chance of becoming infected during the journey, with even worse consequences because medical care is likely to be poorer while travelling. In this case, as in many other situations of life, the costs of the negative consequences resulting from missed action (becoming infected given no vaccination) and from false alarms (being inoculated and not getting infected) have to be weighed against each other. The procedure presented in this paper will thus also be relevant for other decision situations.

In the case of soil contamination, the decision on which remediation alternative to apply is a crucial step after a comprehensive analysis and assessment of contaminants and their impacts on protected assets. The ability to make a correct or "good" decision develops with increasing experience and knowledge about the consequences of the decision and with the capability to comprehend


and to structure the decision problem (Scholz, 1981; Papazoglou, Bonanos, Briassoulis, 2000). Clearly, gathering supplementary knowledge about the contaminated area through additional soil samples is one way to improve decision making (Scholz, Nothbaum, May, 1994; Van Groenigen, 2000; Wagner, Desaules, Muntau, Theocharopoulos, Quevauviller, 2001). However, soil sampling incurs additional costs, which often exceed the budget of the project or at least the costs planned for data sampling. In the following, a method is introduced that allows explicit deliberation about different types of correct and false decisions on soil remediation by referring to signal detection theory (Green and Swets, 1988; Tuzlukov, 2001). For the sake of simplicity, only one criterion, e.g. one contaminant, is considered significant for causing negative impacts. However, a multitude of consequences and their impacts are considered by means of multi-criteria utility theory. Utility functions provide evaluations of the consequences of actions and can be seen as a compromise between accurate scientific knowledge and concise information (Beinat, 1997). Utilities require rigorous scaling, as utility functions are supposed to have a zero point so that differences can be compared in relative terms (von Winterfeldt and Edwards, 1986; Parson and Wooldridge, 2002). In the following case study, a procedure is proposed that combines the utilities of decision options with the related probabilities of errors.

4.2 The case of contaminated land

The study was applied to an area of heavy metal soil contamination in Dornach, Canton Solothurn, Switzerland. The soil was polluted by a metal smelter over several decades (Hesske, Schärli, Tietje, Scholz, 1998). Several studies assessed the spatial extent of the contamination and the potential for sustainable remediation. An overview of research studies and results can be found in various publications (Wirz and Winistörfer, 1987; Geiger, Federer, Sticher, 1993; von Steiger, Webster, Schulin, Lehmann, 1996; Schulin and Borer, 1997; Fritsch, Tietje, Hepperle, Schnabel, 1999; Keller, Jauslin, Schulin, 1999; Gupta, Wenger, Tietje, Schulin, 2002; Schulin and Gupta, 2002). The cadmium contamination was spatially interpolated with lognormal ordinary kriging (Schnabel, Tietje, Scholz, 2002). Results were presented in terms of a probability that certain cadmium


contents in topsoil occur (Figure 4.2.1). The estimate has an uncertainty of a factor of 2. This means that the 68.3 % confidence interval ranges from half to double the estimated value. A confidence level of 97.5 % is only achievable with a factor of 4.2 times the estimated mean contaminant level.

[Figure 4.2.1: map legend. Estimated cadmium [mg/kg] in the classes 0-0.5, 0.5-0.8, 0.8-1.5, 1.5-2, 2-2.5, 2.5-3, 3-4, 4-6, 6-10, and No Data; line symbol for the probability [%] to exceed a threshold value of 2 mg/kg in soil; symbol for the metal smelter.]

Figure 4.2.1 Map of the cadmium-contaminated land and the probability that a threshold value of 2 mg/kg soil is exceeded

Our study evaluates cadmium contamination in soil and its consequences according to the threshold values of the Swiss Ordinance Relating to Pollutants in Soil (VBBo, 1998), which are as follows:

• The guide value of 0.8 mg/kg cadmium in soil indicates a loss in soil fertility, especially impairment of the microbial activity of soil organisms (Laczko, Rudaz, Aragno, 1996). Local authorities are advised to find the cause of the impact and to prevent a further increase of contamination. The guide value denotes the threshold for a lasting loss of soil function and hence of resources.

• The trigger value is a threshold value that differs between kinds of land use. Soil use for food and feed planting is judged to be without any consequences for humans, animals, and plants up to a cadmium content of 2 mg/kg soil. For uncovered soil, where oral intake (especially by children and animals) is possible (Ruck, 1990), the threshold value is 10 mg/kg. If a trigger value is exceeded, local authorities have to examine the actual endangerment of humans, animals,


and plants. If a health risk is evident, land use must be restricted. Current research and experience show that above the trigger values a health risk for humans, animals, and plants is likely (Harris, Karlen, Mulla, 1996; VBBo, 1998).

• Remediation values also differ between types of land use. The threshold value for gardening and playgrounds is set to 20 mg/kg, while agricultural and horticultural use is prohibited when more than 30 mg/kg cadmium in soil has been measured. If the remediation value is exceeded, the land use is prohibited, and remediation is obligatory where spatial planning intends a use for gardening, playgrounds, agriculture, or horticulture.

The regional governmental planning organizations have to decide on the treatment of contaminated soil. This is a spatial choice that depends on the probabilities of exceeding the threshold value, which vary across the area. Since the population of the community of Dornach is growing (Hesske et al., 1998), the area is of interest for housing and home gardening.

4.3 Probability knowledge on the contamination

Consider the data coordinates $x \in X$ of a certain area. Let $z(x)$ denote the predicted cadmium value at each point $x$ in the considered area. The predicted value $z(x)$ is derived from the measurements $x_n$ by means of geostatistical procedures based on the available data (Schnabel et al., 2002) and is the mean of a lognormally distributed random variable at each point $x$ (Limpert, Stahel, Abbt, 2001). Each predicted value $z(x)$ has a probability (density value) $f(z(x))\,dz(x)$ that can be considered the best prediction for the "true value", $p(z(x)) = \int f(t(x))\,dt(x) = E(t(x))$. Hence the probability of occurrence of every possible state of contamination at each $x$ is known to the decision maker, if we interpret "known" as the best available information or best model on the occurrence of the true value $t(x)$ of contamination.

A threshold value c separates contaminated from uncontaminated areas and

has to be considered as a prior decision threshold (see Figure 4.3.1). If the


predicted value z(x) is below the threshold value c, the decision maker usually will
consider the coordinate x to be not contaminated. Consequently the probability of
contamination, p(C) := p(z(x) ≥ c), or of non-contamination, p(¬C) := p(z(x) < c),
can be calculated at each x by the following formulas for the lognormal distribution:

\[ p(C) := p(z(x) \ge c) = 1 - \Phi\left[ \frac{\ln c - z_{\ln}(x)}{\sigma_{\ln}(x)} \right], \quad z_{\ln}(x) \sim N(\mu, \sigma^2) \qquad (1) \]

and

\[ p(\neg C) := p(z(x) < c) = \Phi\left[ \frac{\ln c - z_{\ln}(x)}{\sigma_{\ln}(x)} \right], \quad z_{\ln}(x) \sim N(\mu, \sigma^2) \qquad (2) \]
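Equations (1) and (2) can be evaluated with the standard normal distribution function alone. The following is a minimal sketch in Python, assuming that z_ln(x) and σ_ln(x) are the mean and standard deviation of the log-transformed prediction at x. The function names and all numeric values are hypothetical and not taken from the case study.

```python
import math


def normal_cdf(u: float) -> float:
    """Standard normal distribution function Phi(u)."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def prob_contaminated(z_ln: float, sigma_ln: float, c: float) -> float:
    """p(C) as in Equation (1): 1 - Phi[(ln c - z_ln) / sigma_ln]."""
    return 1.0 - normal_cdf((math.log(c) - z_ln) / sigma_ln)


def prob_not_contaminated(z_ln: float, sigma_ln: float, c: float) -> float:
    """p(not C) as in Equation (2): Phi[(ln c - z_ln) / sigma_ln]."""
    return normal_cdf((math.log(c) - z_ln) / sigma_ln)


# Hypothetical example: the log-scale prediction equals ln(c), so that
# contamination and non-contamination are equally likely.
c = 2.0                 # threshold in mg/kg (assumed for illustration)
z_ln = math.log(2.0)    # assumed log-scale mean at x
sigma_ln = 0.5          # assumed log-scale standard deviation at x
p_c = prob_contaminated(z_ln, sigma_ln, c)
p_not_c = prob_not_contaminated(z_ln, sigma_ln, c)
```

By construction the two probabilities sum to one at every coordinate x.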

[Figure 4.3.1: four probability density panels over the true value t(x), each with the threshold c marked: CORRECT REJECTION (t(x) < c), MISS (t(x) ≥ c), FALSE ALARM (t(x) < c), HIT (t(x) ≥ c). Each panel gives the corresponding probability as an integral of f(t(x)) dt(x).]

Figure 4.3.1 Model for the true value of contamination. The probability density functions are derived
from the geostatistical estimation of the contaminant. The probability of falling below or above the
threshold value c serves as a prediction of the true value t(x) [x-axis] and leads to four types of
decisions (see text).

A decision on a treatment at coordinate x is based on the estimation z(x)
and the corresponding probability distribution, i.e. the model for the true value
(Green and Swets, 1988). The decision maker does not know with certainty
whether t(x) ≥ c or t(x) < c. Two situations can appear (see Figure 4.3.1):
the true value t(x) can be (a) smaller than c or (b) greater than or equal to c.


Case (a) is displayed in the lower left density function of Figure 4.3.1 (correct
rejection), case (b) analogously in the lower right part of Figure 4.3.1 (hit).
Given z(x) ≥ c, the probabilities of t(x) exceeding c (Equation 3) and not
exceeding c (Equation 4) are

\[ p(\text{hit}) := p(t(x) \ge c \mid z(x) \ge c) = \int_c^{\infty} f(z(x)) \, dz(x) \qquad (3) \]

\[ p(\text{false alarm}) := p(t(x) < c \mid z(x) \ge c) = \int_0^{c} f(z(x)) \, dz(x) \qquad (4) \]

Analogously, given z(x) < c, the conditional probabilities of t(x) being above c
(Equation 5) and being below c (Equation 6) are

\[ p(\text{miss}) := p(t(x) \ge c \mid z(x) < c) = \int_c^{\infty} f(z(x)) \, dz(x) \qquad (5) \]

\[ p(\text{correct rejection}) := p(t(x) < c \mid z(x) < c) = \int_0^{c} f(z(x)) \, dz(x) \qquad (6) \]

As described in signal detection theory (Tuzlukov, 2001) and shown in Table
4.1, the decision maker is facing either a hit or a false alarm (if z(x) ≥ c) or a
miss or a correct rejection (if z(x) < c).

              z(x) ≥ c        z(x) < c
  t(x) ≥ c    Hit             Miss
  t(x) < c    False alarm     Correct rejection

Table 4.1 Different types of decisions depending on the estimation z(x) and the hypothesis (the true

value t(x))
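The four outcomes of Table 4.1 follow from comparing both the estimate and the true value with the threshold. A small sketch in Python is given here for illustration; the function name and the example values are hypothetical. In practice t(x) is unknown, so the function only illustrates the classification logic.

```python
def decision_outcome(z: float, t: float, c: float) -> str:
    """Classify the pair (estimate z, true value t) against the threshold c."""
    predicted_contaminated = z >= c
    truly_contaminated = t >= c
    if predicted_contaminated:
        # Estimate at or above the threshold: hit or false alarm.
        return "hit" if truly_contaminated else "false alarm"
    # Estimate below the threshold: miss or correct rejection.
    return "miss" if truly_contaminated else "correct rejection"


# Hypothetical example with a threshold of 2 mg/kg.
outcome = decision_outcome(z=2.5, t=1.5, c=2.0)
```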

Deliberating between different remediation alternatives, the decision maker
has to consider each possible t(x) and has to trade off the utilities and
probabilities of hits and false alarms in case of z(x) ≥ c and, analogously, the
utilities and probabilities of misses and correct rejections in case of z(x) < c.
Thus, given z(x), the decision maker gets a utility score for each (possible) true
value and for each alternative. When looking for the better alternative, these
utility scores have to be integrated along the density function f(z(x))dz(x). This is
shown below after introducing the multi-criteria utility framework.


4.4 Decision alternatives on soil remediation

In practice, decision makers are supposed to be confronted with different
decision alternatives A_i. In general there will be a finite set of alternatives A_i,
i = 1, ..., n, a countable set of alternatives i = 1, ..., ∞, or a continuous set of
alternatives, for instance if the time of a treatment is considered. For the sake of
simplicity the subsequent analysis is restricted to three alternatives, A_no, A_ex and
A_in.

• No remediation A_no: The contaminated land will not be remediated. The
pollutants remain in the soil, and safeguarding is the main action that follows the
decision for this alternative. Agricultural yield and human health remain
endangered, depending on the level of contamination. The market value of the site
presumably decreases due to the unknown risk from pollutants.

• Ex situ remediation A_ex: We assume a soil washing treatment as an example
of ex situ remediation. The contaminated soil is removed, treated off site, and
replaced with unpolluted soil taken from another site. Environmental
and human health are no longer endangered. A decrease of agricultural production
depends on the land reclamation method (i.e. the way the soil is substituted)
and on the quality of the reclamation practice. Once an appropriate topsoil has
been put in place, the physical properties of the soil develop towards the original
state within a few years (Geiger et al., 1993; Haering, Daniels, Roberts, 1993).

• In situ remediation A_in: We assume an in situ remediation method like
phytoremediation (Kumar, Dushenkov, Motto, Raskin, 1995; Cunningham and Ow,
1996; McGrath, Zhao, Lombi, 2002). The soil is not excavated but cleaned up
on the spot. However, phytoremediation methods are not able to remove large
amounts of pollutants, and the degree of removal depends on the level of
contamination (McGrath et al., 2002). Agricultural yield is influenced or even
improved by phytoremediation, and the cost of applying the method
is usually low compared to other remediation methods (Eiermann, 1999; Ebiox,
2003), but the market value of the area changes considerably due to the long
remediation time.


4.5 Utilities of soil remediation

We propose four dimensions of utilities (partial utilities) that are relevant for a

decision on soil remediation for each alternative. The choice of dimensions reflects

the impact on human living from soil contamination, the economic impact from

polluted soil and technology that is applied for remediation, and the ecological

impact on soil.

The utility for human health impact u^h(A_i): An impact on human health is
particularly relevant in case of high contamination for most contaminants like
cadmium, copper, zinc, etc. (Oliver, 1997). Human health is mainly affected
through the ingestion of contaminated food or through hand-to-mouth contact with
contaminated soil. Cadmium is a contaminant that accumulates in the human body
due to its half-life of at least 20 years (Kazemi, 1983). Therefore, even low
concentrations are relevant for human health impact. In decision makers' practice,
it will be impossible to assess a precise and crisp utility function of human health
impact, as the conditions and data of exposure and the kinetics of intake are not
completely known (Beinat, 1997; Paustenbach, 2002, pp. 207 et seqq.). The utility
u^h(A_i) is derived from the improvement of human health after the application of
remediation method A_i.

It is assumed that cadmium is a natural resource up to a certain range of

concentrations (Bachmann, Schmidt, Bannick, 1996; McGrath et al. 2002).

Consequently, there is small or even zero utility after a remediation in case of low

concentration (e.g. below the guide value of Swiss Ordinance Relating to

Pollutants in Soil (VBBo, 1998)). The slope of the utility function for health impacts

is a result of the toxicity of contaminant (i.e. cadmium, in this case).

The costs of remediation technology u^v(A_i): The costs of remediation are
significant in the decision maker's choice and become particularly relevant when
different alternatives A_i are deliberated for the decision (Rivett, Petts, Butler,
Martin, 2002). Costs of remediation arise from the specific site conditions (e.g.
type of soil), the size of the remediation area, and the remediation technology.
Finally, the costs are a function of demand and supply. In case of soil washing, the
costs rise with the level of contamination and the size of the contaminated area,
but also depend on the disposal of remaining residuals after remediation (Eberhard,


1996). Analogous considerations can be assumed in case of phytoremediation for
the deposition and breeding of plants. For the alternative 'no remediation', costs of
safeguarding arise for a specific time span. For the sake of simplicity a simple
rating of the costs of the different remediation alternatives is used in the
construction of utility functions: u^v(A_in) < u^v(A_no) < u^v(A_ex), read as an
ordering of the costs from the cheapest to the most expensive alternative. The
maximum utility for each alternative is then u^v(A_in) = 1, u^v(A_no) = 0.8,
u^v(A_ex) = 0.5. Provided that all other properties of an alternative remain
unchanged, it is assumed that the lower the costs, the higher the utility of the
application (Janikowski, Kucharski, Sas-Nowosielska, 2000). The utility of costs for
each spatial unit is calculated with 1/10 000 of the assumed costs.

The long-term productivity of soil u^p(A_i) evaluates soil quality for agricultural
production after remediation. The chosen remediation technologies A_i have
different effects on the improvement of agricultural long-term productivity. Soil
fertility is a sensitive indicator of agricultural productivity that is difficult to
measure or to assess (Johnson et al., 1997; Herrick, 2000). Soil scientists assume a
noticeable decrease of soil fertility at a certain level of heavy metal
concentrations (Schmitt and Sticher, 1986; Hämmann and Gupta, 1997). However, a
change in agricultural productivity is a multi-factorial process in soil (Hess et al.,
2000). Hence, for the construction of a utility after remediation it is assumed that
the productivity decreases non-linearly with increasing cadmium content.

The market value of land after action u^m(A_i): For the assessment of the
economic impact of remediation, financial experts calculate the costs caused by a)
remediation, b) the loss of land quality during and after remediation, and c) the
loss of market value due to missing liquidity of the landowner. The latter is taken
into account if the landowner is forced to sell the property because he or she
cannot afford the remediation costs. The construction of an economic utility is
based on the loans that banks will give to landowners for the contaminated
land. According to a survey run by Sell, Weber and Scholz (2000), the credit
officers of small and medium banks show reservations about the success of
phytoremediation. Based on this survey it is reasonable to assume that
the application of phytoremediation leads to a 50% higher reduction of market
value than the application of soil washing does (Sell et al., 2000, p. 25). The derivation of


utility functions is based on the assumption that the market value is reduced
proportionally to the contamination and increases again after remediation. A high
reduction of market value does not necessarily result in a low utility; the economic
utility depends on the success of the remediation method. For example, after the
application of phytoremediation only half of the market value before remediation is
achievable. Consequently the method gains less economic utility than more
successful methods.

4.6 Utility functions for decision alternatives

4.6.1 Construction of utility functions

Referring to utility theory as proposed by Keeney and Raiffa (1976) and

discussed and applied by von Winterfeldt and Edwards (1986), there are various

principles to construct a utility function. As shown in Table 4.2 and described

below utilities are displayed in one dimensional versus multidimensional, discrete

versus continuous, and deterministic versus stochastic utility functions. In general

utility functions have an absolute scale (von Winterfeldt and Edwards, 1986). This

means that they have a zero point and different values can be proportionally

compared (e.g. a utility 5 is five times bigger than a utility 1 as 50 is compared to

10).


a) Dimensionality of utility: The simplest case is a one-dimensional utility
function. For the variable that is affected by the decision, a utility u_{z(x)}(A_i) of
any alternative A_i is determined for each value z(x). The utility evaluates
consequences and outcomes of applying A_i and has to be derived from experts'
knowledge. For a multi-dimensional utility type, the decision maker differentiates
between different types of consequences (e.g. costs, environmental benefits or
agricultural yield). This leads to partial utilities u^j_{z(x)}(A_i) for each value z(x).
The overall utility for an alternative A_i is then attained by the sum of these partial
utilities:

\[ U_{z(x)}(A_i) = \sum_j u^j_{z(x)}(A_i) \]

b) Continuity of utility: Figure 4.6.1a shows an example of a discrete utility
function. We present a simple function for the remediation case, assigning
u_{z(x)}(A_i) = 0.9 to all contaminations between the Guide and the Trigger value and
u_{z(x)}(A_i) = 0.5 to contaminations between the Trigger and the Remediation value
(for an explanation of the threshold values see above or Harris et al. (1996)). In
this case the values of the utility function are degrees of goal attainment in legal
compliance with the standards given in the Swiss Ordinance Relating to Pollutants
in Soil (VBBo, 1998).

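[Figure 4.6.1: panel a) discrete and panel b) continuous utility function; x-axis: cadmium in mg/kg with ticks at 0.8, 2 and 20; y-axis: utility, with 0.5 marked.]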

Figure 4.6.1 Two examples of how to derive the ecological benefit from remediation. The utility
between one and zero is the degree of goal attainment in legal compliance. Threshold values for
cadmium are described in the Swiss Ordinance Relating to Pollutants in Soil (Guide value = 0.8 mg/kg;
Trigger value = 2 mg/kg; Remediation value = 20 mg/kg).


Schärli, Scholz, Tietje, Heitzer and Hesske (1997) and Sell et al. (2000)
provide an example of a continuous utility function (see Figure 4.6.1b). They
assess the ecological benefit (utility) of a remediation option for different types of
land use, referring to the above-cited legal compliance. A contamination exceeding
the Remediation value yields a utility of 0; the utility rises continuously up to 1 as
the soil contamination falls to the Guide value. The utility at the Trigger value is
supposed to be 0.5, and linear functions are proposed between the Remediation and
the Trigger value and between the Trigger and the Guide value.
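The continuous utility function of Figure 4.6.1b can be written down directly from the three threshold values. The sketch below is one possible reading in Python, assuming a piecewise linear shape with utility 1 at or below the Guide value, 0.5 at the Trigger value and 0 at or above the Remediation value. The function and constant names are hypothetical.

```python
GUIDE = 0.8          # Guide value for cadmium, mg/kg (VBBo, 1998)
TRIGGER = 2.0        # Trigger value, mg/kg
REMEDIATION = 20.0   # Remediation value, mg/kg


def compliance_utility(z: float) -> float:
    """Piecewise linear degree of goal attainment for a cadmium content z."""
    if z <= GUIDE:
        return 1.0
    if z <= TRIGGER:
        # Linear decrease from 1 at the Guide value to 0.5 at the Trigger value.
        return 1.0 - 0.5 * (z - GUIDE) / (TRIGGER - GUIDE)
    if z < REMEDIATION:
        # Linear decrease from 0.5 at the Trigger value to 0 at the Remediation value.
        return 0.5 - 0.5 * (z - TRIGGER) / (REMEDIATION - TRIGGER)
    return 0.0
```

The discrete variant of Figure 4.6.1a would replace the two linear branches by constant values.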

However, the assumption of a linear continuous utility function is not
reasonable for the behavior of most environmental variables. For example, the
remediation of heavy metals by plants (i.e. phytoremediation) is influenced by
several interacting variables in a complex ecosystem (Kumar et al., 1995; Filius,
Streck, Richter, 1998; Ensley, 2000). Therefore, instead of assuming a linear uptake,
Kayser (2000) suggests first-order kinetics for the decontamination of lead by
plants (i.e. B. juncea), taking into account that the depletion of phytoavailable lead
is faster than the re-supply from strongly bound lead in the soil matrix.

Although the interrelation between variables under consideration is often not

known, the assumption of a linear relationship is often not appropriate for variables

related to environmental issues (Limpert et al., 2001). Therefore, we construct

non-linear utility functions, if the rationale of interrelation of the variable under

consideration is known, as e.g. in the kinetics of heavy metal intake by plants.

c) Assessment of utility: A deterministic utility function is constructed if the
application of each alternative A_i results with certainty in a single utility
u_{z(x)}(A_i) or in partial utilities u^j_{z(x)}(A_i). If there is uncertainty about the
utility, e.g. about the success of an alternative, a stochastic utility function can be
appropriate (Tietje and Scholz, 1996). In this case, the decision maker has to
discuss reasonable arguments about the distribution of the utilities (Quiggin, 1989).
However, the assumption of a probabilistic utility function usually exceeds the
degree of complexity and available knowledge that is manageable in practice.

In this paper, multidimensional, continuous and deterministic utility functions

(see below) are proposed for decisions on soil remediation. The utility function is

normalized to a range between zero and one, when referring to the maximum and

minimum possible valuation. Each utility function is an evaluation of the


consequences of a treatment A_i for all possible true values of contamination
t(x). For each alternative and each coordinate x we construct the overall utility
U(A_i) by the sum of partial utilities:

\[ U(A_i) = u^h(A_i) + u^v(A_i) + u^p(A_i) + u^m(A_i) \qquad (7) \]

In practical decision-making, stakeholders or experts assign different
weights to the utility functions. Besides the selection of criteria, the weighting of
the different utility dimensions is the most critical aspect in stakeholder analysis
(Saaty, 1990; Scholz and Tietje, 2002). We do not discuss the methods of
weighting in this paper and instead assume that all weights are equal.
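The additive overall utility with equal weights can be sketched in a few lines of Python. The dictionary keys, the function name and the numeric partial utilities are hypothetical; the optional weights argument is an assumption added for illustration and is not used in the study, where all weights are equal.

```python
from typing import Dict, Optional


def overall_utility(partials: Dict[str, float],
                    weights: Optional[Dict[str, float]] = None) -> float:
    """Weighted sum of partial utilities; unit weights give the plain sum."""
    if weights is None:
        # Equal weights: every utility dimension counts once.
        weights = {name: 1.0 for name in partials}
    return sum(weights[name] * value for name, value in partials.items())


# Hypothetical partial utilities of one alternative at one coordinate x.
partials = {"health": 1.0, "cost": 0.5, "productivity": 1.0, "market": 0.25}
u_total = overall_utility(partials)
```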

The decision support is given with the overall utility U(A_i) of a remediation
alternative and a density value f(z(x))dz(x) that describes the uncertainty of
the cadmium estimation. Based on the probability knowledge of the estimation, the
decision maker decides for each remediation alternative at each coordinate x
with the help of the density function from the estimation and the overall utility. The
overall utility U(A_i) for a true value t(x) and the density function derived from the
estimated point describe the expected utility:

\[ E[U(A_i)] = \int_0^{\infty} U(A_i) \, f(z(x)) \, dz(x) \qquad (8) \]
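The expected utility integral can be approximated numerically. The sketch below is a hedged illustration in Python, not the procedure used in the study: it assumes that the density is lognormal with a log-scale mean mu_ln and standard deviation sigma_ln (hypothetical parameter names), substitutes y = ln t, and applies a midpoint rule over eight standard deviations on either side of the mean.

```python
import math
from typing import Callable


def expected_utility(utility: Callable[[float], float],
                     mu_ln: float, sigma_ln: float,
                     n_steps: int = 4000, width: float = 8.0) -> float:
    """Approximate the integral of utility(t) * f(t) dt for a lognormal density f."""
    lower = mu_ln - width * sigma_ln
    step = 2.0 * width * sigma_ln / n_steps
    norm = 1.0 / (sigma_ln * math.sqrt(2.0 * math.pi))
    total = 0.0
    for i in range(n_steps):
        y = lower + (i + 0.5) * step   # midpoint on the log scale
        density = norm * math.exp(-0.5 * ((y - mu_ln) / sigma_ln) ** 2)
        total += utility(math.exp(y)) * density * step
    return total


# Hypothetical example: a step utility that is 1 at or above 1 mg/kg.
def step_utility(t: float) -> float:
    return 1.0 if t >= 1.0 else 0.0


e_u = expected_utility(step_utility, mu_ln=0.0, sigma_ln=0.5)
```

With a step utility the expected utility reduces to the exceedance probability of the step location, which gives a simple plausibility check for the integration.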

4.6.2 The alternative ex situ remediation (soil washing) A_ex

The utility for human health u^h_{z(x)}(A_ex): Soil washing results in a complete
removal of the contaminant at one time and for the total amount of contamination.
Consequently, the utility is derived from either having cleaned up and having no
health impacts or not having cleaned up and having health impacts. There is no
impact on human health below the natural background level N of cadmium content
in soil. Consequently soil washing results in zero utility when it is applied in the
presence of contaminations below N. As a result the value 1 or 0, respectively, is
assumed and displayed with a discrete function for the utility from health impacts:


\[ u^h_{z(x)}(A_{ex}) = \begin{cases} 0 & \text{for } z(x) < N \\ 1 & \text{for } z(x) \ge N \end{cases} \qquad (9) \]

where N denotes the natural background level of the contaminant. In the case
study, the natural background level N is set to 0.3 mg/kg total cadmium content
(Tuchschmid, 1995; Bachmann et al., 1996).

The utility of cost u^v_{z(x)}(A_ex): The calculation of cost is based on the
assumption of 100 Swiss francs/t (72 US $) for excavation (Heitzer, Scholz,
Stäubli, Stünzi, 1997; Ebiox, 2003). The estimate for a spatial unit used in our
case study (= 1600 m²) is 120,000 Swiss francs (86,880 US $). The costs of
remediation rise with increasing contamination, but the cost utility reaches a
maximum score of 0.5 due to the simple rating of costs of the remediation
alternatives, that is: u^v(A_in) < u^v(A_no) < u^v(A_ex) (see also above). With 1/10 000
of the assumed costs as the scaling factor, this results in a linear utility function up
to a cadmium contamination of 6 mg/kg:

\[ u^v_{z(x)}(A_{ex}) = \begin{cases} \frac{1}{12} \, z(x) & \text{for } z(x) < 6 \text{ mg/kg} \\ 0.5 & \text{for } z(x) \ge 6 \text{ mg/kg} \end{cases} \qquad (10) \]

The utility gained for long-term productivity u^p_{z(x)}(A_ex): The utility is derived
from the improvement achieved by an exchange of soil. Assuming that an
appropriate unpolluted soil is exchanged, the comprehensive functionality of the
soil system will have been fully developed within an unspecified period after
remediation (Haering et al., 1993). Comprehensive functionality is difficult to
assess, and therefore a very simple utility function is derived. The utility of the soil
washing procedure is zero when agricultural productivity was not negatively
affected before remediation. Consequently the utility function discriminates
between no improvement (u^p_{z(x)}(A_ex) = 0) and improvement (u^p_{z(x)}(A_ex) = 1):

\[ u^p_{z(x)}(A_{ex}) = \begin{cases} 0 & \text{for } z(x) < S \\ 1 & \text{for } z(x) \ge S \end{cases} \qquad (11) \]

where S denotes a threshold value of impact on soil functionality. The
threshold value S is 0.8 mg/kg total cadmium in soil, which is the Guide value of
the Swiss legislation (see above).


The utility gained for the change of market value u^m_{z(x)}(A_ex): The derivation of
an economic utility is based on the assumption that the market value of land after
soil washing equals the market value before contamination. The correction of
market value increases linearly with increasing contamination. The utility for the
maximum concentration under consideration is assumed to be 1, presuming the
market value of land increases after soil washing. For the calculation a maximum
contamination of z(x)_max = 30 mg/kg is assumed.

\[ u^m_{z(x)}(A_{ex}) = z(x) \cdot \frac{1}{z(x)_{\max}} \quad \text{for } z(x) \ge 0 \qquad (12) \]

4.6.3 The alternative in situ remediation (phytoremediation) A_in

The utility for human health u^h_{z(x)}(A_in): The application of phytoremediation
continuously reduces the contamination in soil. However, the reduction is not a
linear process due to several interacting factors in soil, for example the mobility
of heavy metals, the species selection of plants, and the influence of cropping
systems. The process of heavy metal extraction from soil to plant is investigated
by Wenger, Gupta, Furrer, Schulin (2002), Kayser (2000) and others. Kayser
(2000, p. 130) emphasizes that the heavy metal removal rate decreases with
decreasing phytoavailable concentrations of heavy metals. Hence, from a
conservative, risk-averse point of view (as also taken in u^h_{z(x)}(A_no)), there is a
remaining impact on human health after applying phytoremediation. Therefore, we
propose a utility function that increases non-linearly with increasing contamination:

\[ u^h_{z(x)}(A_{in}) = \begin{cases} \text{[nonlinear term illegible in source]} & \text{for } z(x) \le R \\ 1 & \text{for } z(x) > R \end{cases} \qquad (13) \]

where R denotes the value of toxicity for human health (20 mg/kg, Remediation
value of Swiss legislation (VBBo, 1998)).

The utility of cost u^v_{z(x)}(A_in): The calculation of cost is based on an estimate of
2 Swiss francs/m²/a (~1.50 US $) for phytoremediation (Scholz et al., 1999). The


estimate for a spatial unit (= 1600 m²) during 10 years¹ is 32,000 Swiss francs
(~23,168 US $). The costs of remediation rise with increasing contamination, and
the cost utility reaches a maximum score of 1 due to the simple rating of costs of
the remediation alternatives, that is: u^v(A_in) < u^v(A_no) < u^v(A_ex) (see also above).
This results in a linear utility function up to a cadmium contamination of 3.2 mg/kg:

\[ u^v_{z(x)}(A_{in}) = \begin{cases} \frac{1}{3.2} \, z(x) & \text{for } z(x) < 3.2 \text{ mg/kg} \\ 1 & \text{for } z(x) \ge 3.2 \text{ mg/kg} \end{cases} \qquad (14) \]

The utility gained for long-term productivity u^p_{z(x)}(A_in): The utility is derived
from the improvement achieved by the application of phytoremediation (A_in). The
remediation method is an in situ method that sustains the structure
and functionality of the polluted soil (Krebs, 1996). For the construction of a utility
function, we distinguish three different levels of improvement after applying
phytoremediation: i) the range of contamination where the method is not able to
reduce the contamination below a critical impact value for agricultural
productivity, ii) the range of contamination where the method reduces the cadmium
concentration below a critical impact value, and iii) the range of concentration
where phytoremediation gains no improvement for agricultural productivity due to
low cadmium loads. In the study area, plants with a biomass production of 10 t/ha
and an uptake of 45 mg/kg cadmium are able to reduce the contamination by 50%
of the total contamination (Kayser et al., 2000). However, environmental scientists
dispute whether there are plants with such a performance. Some studies report
plants with high uptakes of heavy metals (Kumar et al., 1995; Ebbs and
Kochian, 1998; Lombi, Zhao, Dunham, McGrath, 2001), which could attain the
necessary level of performance. The cadmium uptake depends on soil,
climate, plant species and other properties (McGrath et al., 2002). Therefore it is
not reasonable to assume a removal of pollutants in a reasonable time span and

no maximal utility can be expected for case i). In this case the residual
contaminants will cause a remaining effect on long-term productivity after
remediation. For case ii) a certain removal rate yields maximum utility, because
the amount of removed cadmium provides a cadmium load below a critical impact
value. For case iii) one has to acknowledge that low cadmium concentrations have
no effects on plant growth, and therefore phytoremediation gains no improvement
for agricultural production. Based on these arguments the utility function for
long-term soil productivity after phytoremediation is

\[ u^p_{z(x)}(A_{in}) = e^{-(\ln z(x) - 0.9)^2}, \quad z(x) \in \mathbb{R}^+ \setminus \{0\} \qquad (15) \]

¹ The reader has to acknowledge that residents and landowners are not prone to accept the
application of in situ soil remediation if the procedure lasts longer than five years (Weber, Scholz,
Bühlmann, Grasmück, 2001). Thus, we assume a time span of at most 10 years for the
assessment of the utility function.
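The shape of the productivity utility for phytoremediation can be checked numerically. The sketch below reads Equation (15) as exp(-(ln z(x) - 0.9)^2), which is an interpretation of the printed formula; the function name is hypothetical. Under this reading the utility peaks where ln z(x) = 0.9, i.e. at roughly 2.46 mg/kg, and falls towards zero for both very low and very high contamination, matching the cases iii), ii) and i) described above.

```python
import math


def productivity_utility_phyto(z: float) -> float:
    """exp(-(ln z - 0.9)^2) for a cadmium content z > 0 in mg/kg."""
    if z <= 0.0:
        raise ValueError("z must be positive")
    return math.exp(-(math.log(z) - 0.9) ** 2)


# Concentration at which the utility reaches its maximum of 1.
z_peak = math.exp(0.9)
u_peak = productivity_utility_phyto(z_peak)
```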

The utility gained for change of market value u^m_{z(x)}(A_in): The derivation of the
utility function is based on the assumption that only half of the reduced market
value can be corrected after phytoremediation (Sell et al., 2000). Therefore the
maximum achievable utility is set to 0.5. For simplicity, the correction of market
value increases linearly with increasing contamination. In accordance with the
utility gained for soil washing, the following function is suggested:

\[ u^m_{z(x)}(A_{in}) = z(x) \cdot \frac{1}{2 \, z(x)_{\max}} \quad \text{for } z(x) \ge 0 \qquad (16) \]

4.6.4 The alternative no remediation A_no

The utility for human health u^h_{z(x)}(A_no): The proposed utility function of a
human health impact is derived from a conservative, risk-averse point of view.
Because cadmium accumulates in the human body over a long time period,
contaminations below the legal threshold values are also taken into account
for an impact on human health (see also above). May, Scholz and Nothbaum
(1991) calculate the human health impact with dose-response curves that consider
environmental harm from cadmium exposure and intake by linear-logistic
functions. In this study the utility for a human health impact is constructed as a
nonlinear, polynomial function of the predicted cadmium contamination. Utilities
decrease nonlinearly with increasing contamination, because this remediation
alternative gains no improvement for human health.


\[ u^h_{z(x)}(A_{no}) = \begin{cases} 1 - \text{[term illegible in source]} & \text{for } z(x) < R \\ 0 & \text{for } z(x) \ge R \end{cases} \qquad (17) \]

where R denotes the value of toxicity for human health (20 mg/kg, Remediation
value of Swiss legislation (VBBo, 1998)). The Remediation value for agriculture and
horticulture is not considered, because the study area is mainly located within the
municipal zone (Hesske et al., 1998).

The utility of cost u^v_{z(x)}(A_no): The calculation of cost is based on an estimate of
3 Swiss francs/m²/a (~2 US $) for safeguarding the contaminated area. The
estimate of costs for a spatial unit used in our case study (= 1600 m²) is
48,000 Swiss francs (~34,750 US $), assuming 10 years of safeguarding. Further,
the maximum utility is set to 0.8 due to the simple rating of costs of the
remediation alternatives: u^v(A_in) < u^v(A_no) < u^v(A_ex) (see also above). The utility
function is presented as a linear, deterministic function, simply assuming that
increasing contamination causes increasing costs.

\[ u^v_{z(x)}(A_{no}) = \begin{cases} \frac{1}{4.8} \, z(x) & \text{for } z(x) < 3.84 \text{ mg/kg} \\ 0.8 & \text{for } z(x) \ge 3.84 \text{ mg/kg} \end{cases} \qquad (18) \]
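The three cost utility functions share one construction: the slope is the reciprocal of 1/10 000 of the assumed costs per spatial unit, and the function is capped at the maximum utility of the alternative. The Python sketch below illustrates this reading; the function names and the dictionary layout are hypothetical, while the costs (120,000, 32,000 and 48,000 Swiss francs per unit, the latter two from 2 and 3 Swiss francs/m²/a over 1600 m² and 10 years) and the caps (0.5, 1 and 0.8) are the values stated in the text.

```python
def cost_utility(z: float, cost_per_unit: float, max_utility: float) -> float:
    """Linear cost utility z / (cost / 10 000), capped at max_utility (z >= 0)."""
    scale = cost_per_unit / 10_000.0
    return min(z / scale, max_utility)


def kink(cost_per_unit: float, max_utility: float) -> float:
    """Cadmium content (mg/kg) at which the cap is reached."""
    return max_utility * cost_per_unit / 10_000.0


# (cost per spatial unit in Swiss francs, maximum utility) per alternative.
COST_MODEL = {
    "ex": (120_000.0, 0.5),            # soil washing
    "in": (2.0 * 1600 * 10, 1.0),      # phytoremediation: 32,000
    "no": (3.0 * 1600 * 10, 0.8),      # safeguarding: 48,000
}

kinks = {name: kink(cost, cap) for name, (cost, cap) in COST_MODEL.items()}
```

Under these assumptions the caps are reached at 6, 3.2 and 3.84 mg/kg, which agrees with the case distinctions in the cost utility equations.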

The utility gained for long-term productivity u^p_{z(x)}(A_no): Plant growth becomes
increasingly impaired and soil fertility decreases with increasing
contamination (Geiger et al., 1993). However, the impact of increasing cadmium in
soil and plants is not linear, and therefore the proposed utility function decreases
nonlinearly with increasing content of contaminants. The utility is equal to one for
concentrations without any effect on soil fertility (low concentrations), but
decreases continuously and is zero for high levels of contamination.

\[ u^p_{z(x)}(A_{no}) = e^{\text{[exponent illegible in source]}}, \quad z(x) \in \mathbb{R}^+ \qquad (19) \]

The utility gained from change of market value u^m_{z(x)}(A_no): In case of no
remediation, the lenders show reservations about the profitability of this decision
alternative due to the remaining risk from contamination and the unforeseeable
time span for a reduction of contamination (Sell et al., 2000). Consequently, this


alternative results in no economic benefit and a utility of zero for all concentrations
under consideration: u^m_{z(x)}(A_no) = 0.
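The market value utilities of the three alternatives can be compared side by side. The sketch below is a hedged Python illustration, reading the functions for soil washing and phytoremediation as linear in z(x) with a maximum contamination of 30 mg/kg and maximum utilities of 1 and 0.5. The function name and the alternative labels are hypothetical.

```python
Z_MAX = 30.0   # maximum contamination under consideration, mg/kg


def market_utility(z: float, alternative: str) -> float:
    """Market value utility for 'ex' (soil washing), 'in' (phyto) or 'no'."""
    if alternative == "ex":
        return z / Z_MAX            # reaches 1 at Z_MAX
    if alternative == "in":
        return z / (2.0 * Z_MAX)    # reaches 0.5 at Z_MAX
    if alternative == "no":
        return 0.0                  # no economic benefit at any concentration
    raise ValueError(f"unknown alternative: {alternative}")


# Hypothetical comparison at 15 mg/kg.
at_15 = {name: market_utility(15.0, name) for name in ("ex", "in", "no")}
```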

4.7 Results

4.7.1 The utility functions

We construct twelve different utility functions for three different remediation
alternatives and four utility dimensions, as displayed in Figure 4.7.1. The utility
functions were constructed from the available knowledge about cadmium in soil and
its impact on human health, the influence on long-term productivity of soil, the cost
of remediation and the change of market value after remediation. This leads to
continuous, non-linear functions for four utility functions (impact on human health
and long-term productivity for the in situ and no remediation alternatives). Three
utility functions had to be derived in a discrete way, like the impact on human
health or long-term productivity of soil after soil washing. For all other utility
functions a linear or partially linear function appropriately describes the impacts
from cadmium in soil.

[Figure 4.7.1: matrix of utility function plots. Rows: soil washing, phytoremediation,
no remediation. Columns: human health, costs, long-term productivity, change of market value.]

Figure 4.7.1 Utility functions derived for remediation options and four dimensions of evaluation. The
x-axis denotes cadmium in topsoil from 0-30 mg/kg. The y-axis denotes the utility score from zero
to one.

4.7.2 The overall utility of remediation alternatives

The overall utility $U(A_i)$ was calculated at each coordinate (10799 spatial units of
1600 m² each) and for each remediation alternative. For the Dornach area,
phytoremediation reaches the highest overall utility score at only 69 spatial units.
Soil washing (5422 spatial units) and no remediation (5308 spatial units) are preferred
in a similar number of units, but their spatial distributions differ. Because the maximum
contamination is located around the metal smelter, the units with the highest overall
utility scores for soil washing concentrate around the source of contamination. The
alternative no remediation yields the best results at a distance of at least 300 m from the
metal smelter, where the cadmium concentration in soil is low.
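
The selection rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the overall utility is taken to be the unweighted sum of the four partial utilities, and the partial utility functions below are hypothetical placeholders that merely mimic the qualitative shapes described in the text; they are not the functions of Figure 4.7.1.

```python
import math
from collections import Counter

# Hypothetical partial utilities u(z) for cadmium content z [mg/kg].
# These are illustrative placeholders, NOT the functions of Figure 4.7.1.
# Order of dimensions: human health, costs, long-term productivity,
# change of market value.
PARTIAL_UTILITIES = {
    "no remediation": (
        lambda z: math.exp(-0.5 * z),   # assumed nonlinear decay
        lambda z: 1.0,                  # no remediation expenses
        lambda z: math.exp(-0.3 * z),   # assumed nonlinear decay
        lambda z: 0.0,                  # no economic benefit
    ),
    "soil washing": (
        lambda z: 1.0,
        lambda z: 0.0,
        lambda z: 0.5,
        lambda z: min(1.0, z / 30.0),
    ),
    "phytoremediation": (
        lambda z: math.exp(-0.3 * z),
        lambda z: 0.8,
        lambda z: math.exp(-0.2 * z),
        lambda z: 0.0,
    ),
}


def overall_utility(alternative, z):
    """Assumed aggregation: unweighted sum of the four partial utilities."""
    return sum(u(z) for u in PARTIAL_UTILITIES[alternative])


def best_alternative(z):
    """Return the alternative with the highest overall utility at content z."""
    return max(PARTIAL_UTILITIES, key=lambda a: overall_utility(a, z))


def count_preferences(predicted_contents):
    """Count how often each alternative wins over a list of spatial units."""
    return Counter(best_alternative(z) for z in predicted_contents)
```

Applied to the predicted cadmium content of every spatial unit, `count_preferences` would return the kind of tally reported above (units per preferred alternative); the actual counts depend entirely on the utility functions used.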

Considering the contribution of the single utility dimensions to the overall utility, it
is obvious that the utility of the change of market value is not relevant for any of
the remediation alternatives (Figure 4.7.2). For all remediation options, the utility
gained from improved long-term productivity of soil is a decisive argument.
However, the main cause of the highest scores for soil washing is the
improvement in the impact on human health.

[Figure 4.7.2: bar chart. x-axis: no remediation, soil washing, phytoremediation.
Bars: human health, cost, long-term productivity, change of market value.]

Figure 4.7.2 Contribution of each utility dimension to the overall utility for each remediation option.
Displayed is the mean utility $u(A_i)$ [y-axis] of all considered areas (10799 spatial units of
1600 m² each) for each utility dimension and remediation option [x-axis].

The overall utility for phytoremediation is mainly influenced by the arguments
of (low) cost and long-term productivity, while the impact on human health is
less important. On average, each single utility dimension for phytoremediation yields
the lowest utility scores of all remediation options. Note that this holds
for the mean partial utilities but may differ for single spatial units.

The sum function of $U(A_i)$ for each remediation alternative, as displayed in
Figure 4.7.3, explains why phytoremediation has the highest utility at only 69
spatial units. There is a small range of highest overall utility scores for
phytoremediation around a concentration of 3 mg/kg cadmium in soil. This is
mainly influenced by the utility function that describes the improvement of the
long-term productivity of soil (compare Figure 4.7.3). According to the utility function
used, long-term productivity is attained only within a small range of
contamination, because phytoremediation removes only a limited amount of pollutants.

[Figure 4.7.3: line plot. x-axis: cadmium [mg/kg]. Curves: no remediation, soil washing,
phytoremediation.]

Figure 4.7.3 Sum functions of utilities from four dimensions and for each remediation alternative

4.7.3 The expected utility at coordinate x

The expected utility can be calculated for any cadmium concentration that could
occur at point x with estimate z(x). The expected utility is therefore a tool to
depict the uncertainty of the estimation and its consequences for the decision.
As an example, Figure 4.7.4 shows three estimated values of z(x) and, for each
remediation alternative, the expected utility at two possible true values t(x). The decision
preference is indicated and is derived from the expected utility for the predicted
cadmium content z(x).
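
As a rough numerical sketch of this idea, the expected utility at a point can be approximated by averaging a utility function over a distribution of possible true values t(x). The distributional form is an assumption made here for illustration only: ln t(x) is taken to be normally distributed around ln z(x) with a log-scale standard deviation `sigma_log`, and the function and parameter names are hypothetical.

```python
import math


def expected_utility(utility, z_pred, sigma_log, steps=2001):
    """Approximate E[U(t)] assuming ln t ~ Normal(ln z_pred, sigma_log**2).

    The integral is evaluated with a midpoint rule over the standard normal
    variable s in [-6, 6], where t = z_pred * exp(sigma_log * s).
    """
    lower, upper = -6.0, 6.0
    ds = (upper - lower) / steps
    weighted_sum = 0.0
    weight_total = 0.0
    for k in range(steps):
        s = lower + (k + 0.5) * ds
        weight = math.exp(-0.5 * s * s)          # unnormalised normal density
        t_true = z_pred * math.exp(sigma_log * s)
        weighted_sum += weight * utility(t_true)
        weight_total += weight
    # Dividing by the summed weights normalises the truncated density.
    return weighted_sum / weight_total
```

Calling `expected_utility` once per remediation alternative, each with its own overall utility function, gives the values that are compared to reach the decision at a given coordinate.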

[Figure 4.7.4: three panels for predicted cadmium contents z(x) = 2.55, 3.1, and 0.78 mg/kg.
Each panel lists the expected utilities E[U(A_soil washing)], E[U(A_phytoremediation)], and
E[U(A_no remediation)] at z(x) and at two possible true values t(x) (1.5 and 4.5 mg/kg for
z(x) = 2.55 and 3.1; 0.3 and 1.5 mg/kg for z(x) = 0.78), the resulting decision, and the
density distribution around z(x) relative to the threshold c = 2 mg/kg, with
p(false alarm) = 0.33 for z(x) = 2.55, p(hit) = 0.8 and p(false alarm) = 0.2 for z(x) = 3.1,
and p(correct rejection) = 0.93 and p(miss) = 0.07 for z(x) = 0.78.]

Figure 4.7.4 Expected utility and feasible decisions for three examples of cadmium contamination
and different remediation alternatives $A_i$. The true value [t(x)] denotes a possible state of
contamination beyond the prediction [z(x)]. The density distribution at the predicted value pictures
the probability of occurrence for different types of decisions (defined in Table 1) referring to a
threshold value of 2 mg/kg cadmium in soil.

In accordance with Figure 4.7.3, the difference between the expected utility
for soil washing and phytoremediation is small and in some cases even negligible.
Nonetheless, the variation in expected utility for different states of contamination is
obvious and provides transparency for the decision between remediation
alternatives. Another transparent way of judging the decision is given by the
density distribution shown in Figure 4.7.4. The probability of any true value falling
below or above the threshold value of 2 mg/kg cadmium in soil serves as a quality
criterion for the decision. For the example values of z(x), the probabilities
of a correct decision are $p(t(x) > 2 \mid z(x) = 2.55) = 0.67$ (hit),
$p(t(x) > 2 \mid z(x) = 3.1) = 0.8$ (hit), and $p(t(x) < 2 \mid z(x) = 0.78) = 0.93$ (correct
rejection). In other words, the true value is likely to lie on the same side of the
threshold value as the prediction. More generally, Table 4.3 records the
goodness of decisions for all predicted and analyzed spatial units. The results are
valid for a threshold value of 2 mg/kg cadmium in soil and present mean
probabilities of occurrence for the correspondence of prediction and hypothesis.
Besides the fact that only 69 spatial units show a preference for phytoremediation,
the high mean probability of a correct decision for the alternative no
remediation is obvious.
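
The threshold-based quality criterion can be illustrated with a short sketch. It assumes, purely for illustration, that ln t(x) is normally distributed around ln z(x) with a log-scale standard deviation `sigma_log`; in a geostatistical setting this spread would differ from unit to unit, and the values used in any call are hypothetical.

```python
import math


def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def exceedance_probability(z_pred, threshold, sigma_log):
    """P(t(x) > threshold), assuming ln t ~ Normal(ln z_pred, sigma_log**2)."""
    standardized = (math.log(threshold) - math.log(z_pred)) / sigma_log
    return 1.0 - normal_cdf(standardized)


def decision_quality(z_pred, threshold, sigma_log):
    """Classify the decision implied by z_pred and give its probabilities."""
    p_above = exceedance_probability(z_pred, threshold, sigma_log)
    if z_pred > threshold:
        # Prediction above the threshold: correct if the true value is too.
        return {"correct": "hit", "error": "false alarm",
                "p_correct": p_above, "p_error": 1.0 - p_above}
    # Prediction below the threshold: correct if the true value is below it.
    return {"correct": "correct rejection", "error": "miss",
            "p_correct": 1.0 - p_above, "p_error": p_above}
```

Averaging `p_correct` and `p_error` over all units that prefer a given alternative would give mean probabilities of the kind summarized in Table 4.3.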

                    Spatial units     Mean probability for         Mean probability for
                    (1600 m² each)    erroneous decision           correct decision
                                      [p(false alarm); p(miss)]    [p(hit); p(corr. rejection)]
Soil washing        5422              0.29                         0.70
Phytoremediation    69                0.27                         0.73
No remediation      5308              0.05                         0.94

Table 4.3 Mean probabilities of occurrence for decisions above or below a threshold value of 2
mg/kg cadmium in soil. The mean probability of occurrence indicates the goodness of decision for
all cases in which the remediation alternative reveals the highest utility score

4.7.4 Deviation from predicted to hypothesized contamination

Alternatively, the goodness of a decision can be determined by a confidence
interval around the predicted value (within or outside which the true value is likely to
occur) or simply by a certain distance above or below the predicted value, as shown in
Figure 4.7.5.

[Figure 4.7.5: three panels (phytoremediation, soil washing, no remediation). x-axis: cadmium
[mg/kg]. Marked regions: false alarm, correct rejection, acceptable true value.]

Figure 4.7.5 The evaluation of a decision as a function of the deviation between prediction and
hypothesis of contamination

The distance or deviation between the predicted value and the accepted true
value then determines the goodness of the decision (a hit, a false alarm, a miss, or a correct
rejection). The defined distance to the accepted true value may differ between
remediation alternatives because of the defined utilities and the effectiveness of each
alternative. For a correct decision on phytoremediation, the distance may be smaller than the
distance to the accepted true value for soil washing or no remediation (see Figure
4.7.5). Clearly, the acceptable difference between prediction and hypothesis is
subject to negotiation between decision makers and case agents. It also
determines the magnitude of the failure costs resulting from erroneous decisions.
Erroneous decisions (miss and false alarm) and their consequences and failure

costs are multiple, different for each remediation alternative, and mainly case

dependent.

4.8 Discussion

The presented decision procedure integrates the uncertainty of the cadmium
estimate in soil with multi-criteria utility in order to reach a proper decision on soil
remediation alternatives. The method uses a geostatistical prediction of cadmium
content in soil that is based on sparse measurements and shows high
estimation uncertainty. The uncertainty results from sparse sampling and the
overall variance of the sampled data. The reason for the high variance in the data can
be found in the air-borne distribution of cadmium. Although there is no trend in the
data, the variability of the data is a function of distance. Therefore, certainty at
single points is likely to increase after additional sampling, but the overall
variance of all sample data will remain high. Furthermore, the data will still contain
uncertainty from the sampling procedure (sampling pattern, sampling density,
composite or spot sample), which is estimated at about 35% of the predicted
cadmium concentration (Wagner et al., 2001; Schnabel et al., 2002).

The probability density function derived from the geostatistically predicted mean values
for cadmium serves as a model for the true value that is likely to occur around the
predicted value. The model is used as a quality criterion for four kinds of
decisions (hit, false alarm, correct rejection, miss). In our case study the decision
is evaluated against a legally binding threshold value from Swiss
legislation. This may not be useful in every practical application.

Utility functions are mathematical representations of the decision-maker's
mental model and therefore a subjective view of the world. However, the concept
of multidimensional utilities, as used in this paper, supports a rational and
transparent decision process. Considering different aspects of impacts follows the
demand for complete and decomposable, but non-redundant and comparable
criteria for the decision (Werner and Scholz, 2002). The constructed utilities require
only simple criteria when discrete and continuous functions are used. Using
deterministic instead of stochastic functions guarantees
manageability and is adequate for the investigated case. Introducing stochastic
utility functions (from empirical surveys) will increase the degree of uncertainty and

complexity and may ultimately lead to non-transparent decisions. In such a case, the
utility that is most likely to occur (for example with 95% certainty) for the predicted
value or another expected value might be defined as the minimal acceptable (and
most certain) utility. The intersection of the uncertainty of the utility value and the
uncertainty of the true value is then defined as the expected utility.

In decision-making practice, the success of the introduced method depends
on the available data and knowledge about the contamination and its impact.
Especially in the case of sparse measurements, the method is able to generate
transparent decisions that account for high uncertainty. However, for a final decision on
soil remediation the relative importance of each utility for the different stakeholders
has to be determined. The weighting algorithm was not a subject of this paper,
but it is an important step towards the selection of the best alternative. Examples
of weighting methods and ranking algorithms for case studies can be found in
Prato (2003) and Scholz and Tietje (2002).

Finally, the introduced method supports the management of contaminated
land, for example within the context of regional planning. Since potentially
contaminated plots are stored in a register because of a suspicious land use
history, the uncertainty about the real state of contamination is high. Every plot within
these registers is either contaminated or not, and either suspicious (of contamination)
or not. Four different cases can therefore occur for a single plot. An
unsuspicious and uncontaminated plot resembles a correct rejection (of
suspicion) in the decision heuristic presented. Correspondingly, a suspicious and
contaminated plot is a hit. Erroneous decisions on a registered plot are
characterized by a suspicious but uncontaminated case (false alarm) or a
contaminated but unsuspicious case (miss). In this situation the additional application
of Bayes' theorem makes it possible to estimate the share of suspicious plots among
all contaminated plots (hit rate) and the share of actually contaminated plots among
all suspicious plots (diagnosis). This
resembles the decision heuristic introduced in this paper. It allows for transparency
and helps to prevent erroneous decisions on contaminated land.
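
A small worked sketch may make the Bayesian step concrete. The function name and all numbers in the usage note are hypothetical; the sketch simply applies Bayes' theorem to obtain the probability that a plot is actually contaminated given that it is registered as suspicious.

```python
def posterior_contaminated(prior, hit_rate, false_alarm_rate):
    """P(contaminated | suspicious) by Bayes' theorem.

    prior            -- assumed share of contaminated plots among all plots
    hit_rate         -- P(suspicious | contaminated)
    false_alarm_rate -- P(suspicious | not contaminated)
    """
    p_suspicious = hit_rate * prior + false_alarm_rate * (1.0 - prior)
    if p_suspicious == 0.0:
        raise ValueError("no plot would ever be registered as suspicious")
    return hit_rate * prior / p_suspicious
```

For instance, with an assumed prior of 0.2, a hit rate of 0.9, and a false alarm rate of 0.3, the call `posterior_contaminated(0.2, 0.9, 0.3)` evaluates 0.18 / (0.18 + 0.24), that is, 3/7 of the suspicious plots would be expected to be contaminated.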

4.9 Conclusion

We introduced a simple approach to generating utility functions for
situations of soil contamination, which resemble many other decision situations.
Although the proposed methodology implies some strong assumptions, it offers a
practical and reasonable approach to supporting decisions on soil remediation.

Further, we presented four different types of goodness in order to validate the
decision for a remediation alternative. The density value for the overall utility at each
true value around a crucial decision threshold sheds light on fuzzy results
of prediction and evaluation. With this, the decision-maker is able to deliberate
between remediation alternatives at each possible state of contamination and
is therefore likely to prevent erroneous decisions.

The applied technique increases the decision-makers' knowledge, provides
insight into the comparison of different remediation options, and presents a formal
and consistent approach for decisions on contaminated land management.

Acknowledgments

We would like to thank Olaf Tietje and the colleagues at UNS for discussions and support of
our work. The project was funded by the Swiss National Science Foundation (Project Number
50011-44758) and the Swiss Federal Institute of Technology (Project Number 0048-41-2609-5).

4.10 References

Bachmann, G., Schmidt, S., Bannick, C.G., 1996. Hintergrundwerte für Schwermetalle in
Böden. Wasser & Boden 48 (4), 9-11.

Beinat, E., 1997. Value functions for environmental management, Environment & Management,

Kluwer, Dordrecht Boston London, 241 pp.

Bell, F.G., Genske, D.D., Bell, A.W., 2000. Rehabilitation of industrial areas: Case histories from

England and Germany. Environmental Geology 40 (1-2): 121-134.

Cunningham, S.C., Ow, D.W., 1996. Promises and prospects of phytoremediation. Plant

Physiology 110, 715-719.

Ebbs, S.D., Kochian, L.V., 1998. Phytoextraction of zinc by Oat (Avena sativa), Barley (Hordeum

vulgare), and Indian Mustard (Brassica juncea). Environmental Science and Technology

32, 802-806.

Eberhard, H., 1996. Stoffflüsse mit technischen Sanierungsmöglichkeiten. In: Baudirektion des

Kantons Zürich in Zusammenarbeit mit ETH-UNS, Grundsätze, Modelle und Praxis der

Altlastenbearbeitung im Kanton Zürich: Referate zur Altlastentagung 1996, Zürich.

Ebiox, A., 2003. Altlastenentsorgung: Aktueller Preisspiegel in der Schweiz. Luzern, Ebiox AG,

Pioniere in Bioremediation.

Eiermann, D., 1999. "Biosanierung erfordert Gefühl für den Boden". BioTeCH forum 2,12-13.

Ensley, B.D., 2000. Rationale for use of phytoremediation. In: Raskin, I., Ensley, B. D.,

Phytoremediation of toxic metals. John Wiley & Sons, New York, pp. 287.

Filius, A., Streck, T., Richter, J., 1998. Cadmium sorption and desorption in limed topsoils as

influenced by pH. Isotherms and simulated leaching. Journal of Environmental Quality 27,

12-18.

Fritsch, M., Tietje, O., Hepperle, E., Schnabel, U., 1999. Landnutzungsmodelle als

Planungsinstrumente für den Einsatz von sanften Bodensanierungen. TerraTech 6, 24-27.

Geiger, G., Federer, P., Sticher, H., 1993. Reclamation of heavy metal-contaminated soils: Field
studies and germination experiments. Journal of Environmental Quality 22, 201-207.

Green, D.M., Swets, J.A., 1988. Signal detection theory and psychophysics. Peninsula Publishing,

Los Altos, pp 505.

Gsponer, R., 1996. Ursachendifferenziertes Vorgehen zur verdachtsorientierten Erkundung von

Schwermetallbelastungen im Boden. Dissertation. Eidgenössische Technische

Hochschule, Zürich, pp. 207.

Gupta, S.-K., Wenger, K., Tietje, O., Schulin, R., 2002. Von der Idee der sanften Sanierung zum

nachhaltigen Umgang mit schwermetallbelasteten Böden. Terra Tech 2, 53-54.

Haering, K.C., Daniels, W.L., Roberts, J.A., 1993. Changes in mine soil properties resulting from

overburden weathering. Journal of Environmental Quality 22 (1), 194-200.

Hämmann, M., Gupta, S.K. 1997. Herleitung von Prüf- und Sanierungswerten für anorganische

Schadstoffe im Boden. Bundesamt für Umwelt, Wald und Landschaft, Bern, pp. 100.

Harris, R.F., Karlen, D.L., Mulla, D.J., 1996. A conceptual framework for assessment and

management of soil quality and health. Soil Science Society of America, Special

Publication 49, 61-82.

Heitzer, A., Scholz, R.W., Stäubli, B., Stünzi, J., 1997. Is bioremediation a viable option for

contaminated site treatment? In: Sayler, G. S., Davis, K. L., Sanseverino, J., Biotechnology

in the sustainable environment. Plenum Press, New York, pp.107-126.

Herrick, J.E., 2000. Soil quality: An indicator of sustainable land management? Applied Soil

Ecology 15,75-83.

Hess, G.R., Campbell, C.L., Fiscus, D.A., Hellkamp, A.S., McQuaid, B.F., Munster, M.J., Peck,
S.L., Shafer, S.R., 2000. A conceptual model and indicators for assessing the ecological

condition of agricultural lands. Journal of Environmental Quality 29, 728-737.

Hesske, S., Schärli, M., Tietje, O., Scholz, R.W., 1998. Zum Umgang mit Schwermetallen im

Boden: Falldossier Dornach. Pabst Science Publishers, Lengerich, pp. 145.

Janikowski, R., Kucharski, R., Sas-Nowosielska, A., 2000. Multi-criteria and multi-perspective

analysis of contaminated land management methods. Environmental Monitoring and

Assessment 60, 89-102.

Johnson, D.L., Ambrose, S.H., Bassett, T.J., Bowen, M.L., Isaacson, J.S., Crummey, D.E.,

Johnson, D.N., Lamb, P., Saul, M., Winter-Nelson, A.E., 1997. Meaning of environmental

terms. Journal of Environmental Quality 26, 581-589.

Kayser, A., 2000. Evaluation and enhancement of phytoextraction of heavy metals from

contaminated soils. PhD Thesis. Swiss Federal Institute of Technology, Zurich, pp. 149.

Kayser, A., Wenger, K., Keller, A., Attinger, W., Felix, H. R., Gupta, S., Schulin, R., 2000.

Enhancement of phytoextraction of Zn, Cd and Cu from calcareous soil: The use of NTA

and sulphur amendments. Environmental Science and Technology 34,1778-1783.

Keeney, R.L., Raiffa, H. 1976. Decisions with multiple objectives: Preferences and value tradeoffs.

John Wiley & Sons, New York, pp 569.

Keller, A., Jauslin, M., Schulin, R. 1999. Bodendaten und Stoffflussanalysen im

Schwermetallbelastungsgebiet Dornach. Volkswirtschaftsdepartement des Kantons

Solothurn, Amt für Umweltschutz, Solothurn, pp. 43.

Kleindorfer, P.R., Kunreuther, H.C., Schoemaker, P.J.H., 1993. Decision sciences. An integrative

perspective. Cambridge University Press, Cambridge, pp. 470.

Krebs, R., 1996. In situ immobilization of heavy metals in polluted agricultural soil - an approach to

gentle soil remediation. PhD Thesis. Swiss Federal Institute of Technology, Zurich, pp.

110.

Kumar, P.B.A.N., Dushenkov, V., Motto, H., Raskin, I., 1995. Phytoextraction: The use of plants to

remove heavy metals from soils. Environmental Science and Technology 29,1232-1238.

Laczko, E., Rudaz, A., Aragno, A., 1996. Mikrobielle Diversität in Böden - Stabilität von
anthropogen beeinflussten oder gestörten Mikroorganismen in Böden. Mitteilungen der

Deutschen Bodenkundlichen Gesellschaft 81, 279-281.

Levine, M.M., Kaper, J.B., 1996. Cholera: pathogenesis and vaccine development. In: Drasar, B. S.

and Forrest, B. D., Cholera and the ecology of vibrio cholerae. Chapman & Hall, London,

pp. 126-186.

Limpert, E., Stahel, W.A., Abbt, M., 2001. Log-normal distributions across the sciences: Keys and

clues. Bioscience 51 (6), 341-352.

Lombi, E., Zhao, F.J., Dunham, S.J., McGrath, S.P., 2001. Phytoremediation of heavy metal-

contaminated soils: Natural hyperaccumulation versus chemically enhanced

phytoextraction. Journal of Environmental Quality 30 (6), 1919-1926.

May, T.W., Scholz, R.W., Nothbaum, N., 1991. Die Anwendung des Donator-Akzeptor-Modells am

Beispiel Cadmium für den Pfad "Boden-Weizen-Mensch". In: Bütow, E., Ableitung von

Sanierungswerten für kontaminierte Böden. Erich Schmidt Verlag, Berlin, pp. 99-182.

McGrath, S.P., Zhao, F.J., Lombi, E., 2002. Phytoremediation of metals, metalloids, and

radionuclides. Advances in Agronomy 75,1-56.

Oliver, M.A., 1997. Soil and human health: A review. European Journal of Soil Science 48, 573-

592.

Papazoglou, I.A., Bonanos, G., Briassoulis, H., 2000. Risk informed decision making in land use

planning. Journal of Risk Research 3 (1), 69-92.

Parsons, S., Wooldridge, M., 2002. An introduction to game theory and decision theory. In: Parsons,
S., Gmytrasiewicz, P., Wooldridge, M., Game theory and decision theory in agent-based

systems. Kluwer, Boston Dordrecht London, pp. 1-26.

Paustenbach, D.J., 2002. Exposure assessment. In: Paustenbach, D.J., Human and Ecological

Risk Assessment. Wiley Interscience, New York, pp. 189-271.

Prato, T., 2003. Multiple-attribute evaluation of ecosystem management for the Missouri River

system. Ecological Economics 45,297-309.

Quiggin, J., 1989. Sure things - dominance and independence rules for choice under uncertainty.

Annals of Operations Research 19, 335-357.

Rivett, M.O., Petts, J., Butler, B., Martin, I. 2002. Remediation of contaminated land and

groundwater: Experience in England and Wales. Journal of Environmental Management

65, 251-268.

Ruck, A., 1990. Bodenaufnahme durch Kinder - Abschätzungen und Annahmen. In: Rosenkranz,

D., Einsele, G. and Harress, H. M., Bodenschutz: Ergänzbares Handbuch der

Massnahmen und Empfehlungen für Schutz, Pflege und Sanierung von Böden, Landschaft

und Grundwasser. Erich Schmidt Verlag, Berlin

Saaty, T.L., 1990. The analytic hierarchy process. RWS, Pittsburgh, pp. 235.

Schmitt, H.W., Sticher, H., 1986. Prediction of heavy metal contents and displacement in soils.

Zeitschrift für Pflanzenernährung und Bodenkunde 149, 157-171.

Schaerli, M., Scholz, R.W., Tietje, O., Heitzer, A., Hesske, S. 1997. Integrale Beurteilung sanfter

Bodensanierungen bei grossflächigen Schwermetallbelastungen: Die ökologische
Perspektive. Mitteilungen der Deutschen Bodenkundlichen Gesellschaft 85 (2), 765-768.

Schnabel, U., Tietje, O., Scholz, R.W., 2002. Uncertainty assessment for management of soil

contamination with sparse data. Environmental Management, in press.

Scholz, R.W., 1981. Stochastische Problemaufgaben - Analysen aus didaktischer und

psychologischer Perspektive. Materialien und Studien 23, Institut für Didaktik der

Mathematik, Universität Bielefeld, Bielefeld, pp. 130.

Scholz, R.W., Nothbaum, N., May, T.W., 1994. Fixed and hypothesis-guided soil sampling methods

- Principles, strategies, and examples In: Markert, B., Environmental sampling for trace

analysis, VCH, Weinheim, pp. 335-345.

Scholz, R.W., Sell, J., Tietje, O., Hesske, S., Semadini, M., Weber, O., 1999. Perspektiven

integraler Bewertung zur Sanierung schwermetallbelasteter Böden. Terra Tech 6,21-23.

Scholz, R.W., Tietje, O. 2002. Embedded case study methods. Sage Publications, Thousand Oaks,

London, New Delhi, pp 392.

Schulin, R., Borer, F. 1997. Schwermetall-Belastungen von Kulturböden im Siedlungsraum: das

Beispiel Dornach. Altlasten und Raumplanung, ORL-Berichte, Eidgenössische Technische

Hochschule, Zürich, pp. 141-151.

Schulin, R., Gupta, S.-K. 2002. Sanfte Sanierung von schwermetallbelasteten Böden - der Fall

Dornach. Terra Tech 2, 36-38.

Sell, J., Weber, O., Scholz, R.W., 2000. Liegenschaftsschatzungen und Bodenbelastungen. Natural

and Social Science Interface, Working Paper 25, Swiss Federal Institute of Technology,

Zürich, pp. 25.

Tietje, O., Scholz, R.W. 1996. Wahrscheinlichkeitskonzepte und Umweltsysteme. In: Gheorghe, A.,

Seiler, H., Was ist Wahrscheinlichkeit? vdf Hochschulverlag AG., Zürich, pp. 31-49.

Tuchschmid, M.P., 1995. Quantifizierung und Regionalisierung von Schwermetall- und

Fluorgehalten bodenbildender Gesteine der Schweiz. Bundesamt für Umwelt, Wald und

Landschaft, Bern, pp. 130.

Tuzlukov, V., 2001. Signal detection theory. Birkhäuser, Boston Basel Berlin, pp. 723.

U.S. Environmental Protection Agency, 2000. Brownfields liability and cleanup issues. U.S.
Environmental Protection Agency. Available at: http://www.epa.gov/swerosps/bf/liab.htm

Van Groenigen, J.W., 2000. The influence of variogram parameters on optimal sampling schemes

for mapping by kriging. Geoderma 97, 223-236.

VBBo, 1998. Verordnung über Belastungen des Bodens vom 1. Juli 1998 (Swiss Ordinance on the

Pollution of Soils). Bern,

von Steiger, B., Webster, R., Schulin, R., Lehmann, R., 1996. Mapping heavy metals in polluted

soil by disjunctive kriging. Environmental Pollution 94 (2), 205-215.

von Winterfeldt, D., Edwards, W., 1986. Decision analysis and behavioral research. Cambridge

University Press, Cambridge, pp. 604.

Wagner, G., Desaules, A., Muntau, H., Theocharopoulos, S., Quevauviller, Ph., 2001.

Harmonisation and quality assurance in pre-analytical steps of soil contamination studies -

conclusions and recommendations of the CEEM Soil project. The Science of the Total

Environment 264,103-117.

Wagner, G., Lischer, P., Theocharopoulos, H., Muntau, H., Desaules, A., Quevauviller, P., 2001.

Quantitative evaluation of the CEEM soil sampling intercomparison. The Science of the

Total Environment 264, 73-101.

Weber, O., Scholz, R. W., Bühlmann, R., Grasmück, D., 2001. Risk perception of heavy metal soil

contamination and attitudes toward decontamination strategies. Risk Analysis 21 (5), 967-

977.

Wenger, K., Gupta, S.K., Furrer, G., Schulin, R., 2002. Zinc extraction potential of two common

crop plants, Nicotiana tabacum and Zea mays. Plant and Soil 242, 217-225.

Werner, F., Scholz, R.W., 2002. Ambiguities in decision-oriented life cycle inventories. International

Journal of Life Cycle Assessment 7 (6), 330-338.

Wirz, E., Winistörfer, D., 1987. Bericht über Metallgehalte in Boden- und Vegetationsproben aus

dem Raum Dornach. Kantonales Laboratorium, Solothurn, pp. 49.

5 Conclusion

The major goal of this thesis was to improve spatial data analysis and the
prediction and evaluation of heavy metals in soil for the purpose of decision-making
on the application of soil remediation methods. To accomplish this
objective, the spatial distribution of cadmium concentrations in soil was analyzed
for two different areas in Switzerland. The study started with the evaluation of a
method for multidimensional exploratory data analysis and then focused on the
application of geostatistical measures to predict cadmium contamination in
unsampled areas. The prediction yielded uncertain, but nonetheless suitable
results. The suitability was demonstrated by the evaluation of different soil
remediation methods. Finally, a spatial choice of remediation methods was
delineated based on the benefits gained after removal of pollutants. In the
following, the main results are described in more detail.

5.1 Multiple spatial regression improves data management

Data smoothing, or mollifying, with a kernel density regression is used for
a combined spatial and statistical analysis. The Mollifier method was applied to
a data set of cadmium, copper, zinc, nickel, and chromium samples obtained from
the Furttal, northwest of Zurich (Switzerland). The evaluation of the method
produced two main results that are suitable for improving data analysis and
the prediction of cadmium content in soil and of other variables:

a) Blankets drawn from three-dimensional regression clearly visualize spatial
interrelations between different heavy metals. Further, spatially relevant boundary
variables such as land use, pH value, and altitude support the identification of spatial
tendencies, hot spots, and outliers. In this way, the Mollifier method bridges a gap

between explorative data analysis and three-dimensional visualization of spatially

interrelated parameters.

b) The spatial interpolation with a Gaussian kernel density function achieved

results as good as those of ordinary kriging (the prediction for cadmium

yields a mean of 275 µg per kg soil from ordinary kriging and a mean of 308 µg

per kg soil from the Mollifier interpolation). Moreover, the Mollifier method is a

convenient alternative to kriging when the stochastic spatial structure of the

underlying data shows a spatial trend.

However, the practicability of the Mollifier method strongly depends on the

density of data. Therefore, a careful fit of regression parameters, such as

window size and regression function, was essential.
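
The kernel-weighted interpolation summarized above can be illustrated with a minimal sketch. It assumes a Nadaraya-Watson estimator with a Gaussian kernel, which is one common form of kernel regression and not necessarily the exact Mollifier formulation used in this thesis. The function name `kernel_smooth`, the window size `h`, and the sample values are hypothetical.

```python
import math

def kernel_smooth(samples, x0, y0, h):
    """Nadaraya-Watson estimate at (x0, y0) with a Gaussian kernel of window size h.

    samples: list of (x, y, value) tuples; names and data are illustrative only.
    """
    num = 0.0
    den = 0.0
    for x, y, value in samples:
        d2 = (x - x0) ** 2 + (y - y0) ** 2      # squared distance to the target point
        w = math.exp(-0.5 * d2 / h ** 2)        # Gaussian weight, decays with distance
        num += w * value
        den += w
    if den == 0.0:
        raise ValueError("no sample carries weight at this location; increase h")
    return num / den

# Hypothetical cadmium samples: (x in m, y in m, concentration in µg/kg)
samples = [(0.0, 0.0, 250.0), (100.0, 0.0, 300.0),
           (0.0, 100.0, 280.0), (100.0, 100.0, 330.0)]
estimate = kernel_smooth(samples, 50.0, 50.0, h=75.0)
```

A small window size makes the estimate follow individual samples, while a large one pulls it toward the overall mean, which is why the choice of window size matters most when data are sparse.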

5.2 Predictability versus uncertainty

The second data set consists of 76 samples, spread over an area of 15

square kilometers in Dornach (Canton Solothurn, Switzerland). The area-wide

distribution of cadmium was assessed with geostatistical methods. Since cadmium

was emitted from a metal smelter, the data exploration focused on trend models

and anisotropy. However, the consideration of contamination patterns revealed no

significant improvement for spatial prediction. Lognormal ordinary kriging and

conditional simulation were applied for interpolation. The outcome was a geometric

mean distribution of 1.08 mg cadmium per kg soil and a standard deviation of 2.27

mg/kg. An assessment of the overall uncertainty showed that the high

uncertainty predominantly resulted from the spatial interpolation and was

induced by low sampling density and spatial heterogeneity of contaminants.

Despite the high mean uncertainty for the total area of 15 square kilometers, the

use of probability information for single small areas provided practicable and

usable results. In conclusion, a small part of the investigated area is contaminated

with more than 2 mg cadmium per kg soil with a probability of more than 90 %;

however, contamination above the legal remediation value (20 or 30 mg/kg) is

not likely at any site. Cadmium concentrations above 0.8 mg/kg soil are

widespread, with more than 50 % probability of occurrence. The exact spatial

allocation of these results was presented in maps showing the likelihood of exceeding

the legally defined threshold values. The probability information

can be applied to any other threshold value, taking into account the decision

maker's safety requirements. The estimation can thus be used to delineate

areas for soil improvement, remediation, or restricted area use.
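
How a probability of exceeding a threshold follows from a predicted distribution can be sketched as follows, assuming that the local prediction is lognormal with a given median and a log-scale standard deviation. The function name `exceedance_probability` and the numerical inputs are hypothetical and are not taken from the Dornach results.

```python
import math

def exceedance_probability(median, sigma_log, threshold):
    """P(Z > threshold) for a lognormal Z with the given median and log-scale sd."""
    if median <= 0 or threshold <= 0 or sigma_log <= 0:
        raise ValueError("median, threshold and sigma_log must be positive")
    # Standardize the threshold on the log scale
    z = (math.log(threshold) - math.log(median)) / sigma_log
    # Standard normal CDF expressed through the error function
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return 1.0 - cdf

# Hypothetical location: median 1.0 mg/kg, log-scale sd 0.8
p_at_median = exceedance_probability(1.0, 0.8, 1.0)   # threshold equal to the median
p_trigger = exceedance_probability(1.0, 0.8, 2.0)     # threshold of 2 mg/kg
```

Because the threshold is an argument, the same calculation can be repeated for any value a decision maker considers relevant.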

5.3 'Correct' decisions for soil improvement measures

The application of soil improvement measures was evaluated with a multi-

criteria utility analysis for soil washing, phytoremediation and the alternative of no

remediation. Considering different aspects of impacts, the constructed utility

functions required simple criteria when discrete and continuous functions were

used. Under some strong assumptions, a linear utility function was modeled

to express the benefit from 'cost of remediation techniques' and 'change of market

value'. Non-linear utility functions were developed for the impact on 'human health'

and 'long-term productivity of soil' after the application of remediation methods.

Without weighting single criteria with a decision maker's preference, 'human

health' and 'long-term productivity' of soil were decisive arguments for all

remediation alternatives.

The evaluation with an additive overall utility revealed a spatial structure of

highest scores for soil washing and no remediation. Soil-washing methods were

preferred within a distance of 300 m of the source of contamination, whereas no

remediation was recommended beyond this distance. Phytoremediation was

superior only in a few cases with a cadmium concentration around 3 mg/kg soil. This

is a consequence of the limited capacity of phytoremediation to extract cadmium

from soil into plants, which has been accounted for in the utility function for

long-term productivity of soil.
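
The additive overall utility described above can be sketched as an unweighted sum of criterion utilities per alternative. The utility scores below are invented for illustration and do not reproduce the utility functions of this thesis; `overall_utility` and `best_alternative` are hypothetical names.

```python
CRITERIA = ("cost", "market_value", "human_health", "long_term_productivity")

def overall_utility(scores):
    """Unweighted additive utility over all criteria (each score assumed in [0, 1])."""
    return sum(scores[c] for c in CRITERIA)

def best_alternative(alternatives):
    """Return the name of the alternative with the highest overall utility."""
    return max(alternatives, key=lambda name: overall_utility(alternatives[name]))

# Hypothetical utilities for one heavily contaminated location
location = {
    "soil_washing": {"cost": 0.2, "market_value": 0.9,
                     "human_health": 0.9, "long_term_productivity": 0.6},
    "phytoremediation": {"cost": 0.7, "market_value": 0.5,
                         "human_health": 0.5, "long_term_productivity": 0.7},
    "no_remediation": {"cost": 1.0, "market_value": 0.2,
                       "human_health": 0.1, "long_term_productivity": 0.3},
}
choice = best_alternative(location)
```

Evaluating such a comparison at every predicted location yields a map of preferred alternatives, which is the kind of spatial structure reported above.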

The uncertainty of decisions was validated with the concept of expected

utility. The probability density function from geostatistically predicted mean values

for cadmium served as a model for any true value that is likely to occur other than

the predicted value. The model was used as a quality criterion for four kinds of

decisions (hit, false alarm, correct rejection, miss). The results were calculated for

the trigger value of the Swiss Ordinance Relating to Pollutants in Soil, which is 2 mg

cadmium per kg soil. The probability of the true value falling below or above

the trigger value then determines the magnitude of the failure cost.
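
The link between exceedance probability and failure cost can be sketched as follows. The sketch assumes that a decision to remediate risks a false alarm and a decision not to remediate risks a miss; the cost figures, function names, and the simple product form are illustrative assumptions, not the thesis's cost model.

```python
def expected_failure_cost(p_exceed, remediate, cost_false_alarm, cost_miss):
    """Expected cost of a wrong decision given P(true value > trigger value).

    remediate=True risks a false alarm (true value below the trigger value);
    remediate=False risks a miss (true value above the trigger value).
    """
    if not 0.0 <= p_exceed <= 1.0:
        raise ValueError("p_exceed must be a probability")
    if remediate:
        return (1.0 - p_exceed) * cost_false_alarm
    return p_exceed * cost_miss

def classify(remediate, true_value, trigger=2.0):
    """Label a decision against a (normally unknown) true concentration in mg/kg."""
    exceeds = true_value > trigger
    if remediate:
        return "hit" if exceeds else "false alarm"
    return "miss" if exceeds else "correct rejection"

# Hypothetical costs in arbitrary units, exceedance probability of 90 %
cost_if_remediate = expected_failure_cost(0.9, True, cost_false_alarm=100.0, cost_miss=400.0)
cost_if_not = expected_failure_cost(0.9, False, cost_false_alarm=100.0, cost_miss=400.0)
```

With a high exceedance probability, the expected failure cost of not remediating dominates, so remediation is the less risky decision in this example.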

In conclusion, the investigated methods improve scaling and transmission of

sparse information in the context of soil improvement measures. Although the

prediction of cadmium yielded no reliable deterministic estimate, the handling of

uncertainties with probability information led to usable and trustworthy results for

the decision maker. Therefore, the choice of the most suitable soil improvement

measures is transparent and avoids ambiguous decisions.

5.4 Outlook

This thesis showed that, despite a sparse data set, data exploration and

prediction are not only feasible but also result in convenient and reliable decisions.

The experience gathered throughout the study concerning data, prediction, and

remediation methods leads to several desirable foci for future research.

In Switzerland, national soil observation services (NABO and others) collect

data about pollutants in soil. Since these data are sampled at single random

points, a method for area-wide prediction of pollutants in soil has so far been missing.

With the methods and results demonstrated in this study, the collected data can be

reliably analyzed and spatially predicted. Thus, it becomes possible to delineate

priority areas for soil improvement measures or soil remediation.

This is of particular interest for a nationally consistent processing of measures to

sustain soil resources. The compilation of a national standardized soil database

may benefit from the methods of data analysis and interpolation demonstrated in

this thesis. Future work that aims at an interpretation of sparse data should

include boundary variables such as land use and the use of a Geographical Information

System (GIS). From a technical point of view, the Mollifier method should be

integrated into a GIS, not only to enable an efficient and user-friendly

explorative data analysis but also to bridge the gap between three-dimensional

data analysis and visualization of the spatial interdependency of variables.

Future work should focus on the application of phytoremediation as a soil

improvement measure. Since phytoremediation can remove only a limited

amount of heavy metals from soil, it is better suited as a

tool to improve soil quality, e.g. in regional planning, than as a sufficient

remediation method. In the context of regional planning, phytoremediation can be

used, e.g., as an ecological buffer area or for a periodical decontamination of slightly

contaminated agricultural areas. In this way, the aim of sustaining soil fertility,

as required by Swiss legislation, could be realized in practice.

Finally, the results presented in this thesis may contribute to a breakthrough in

achieving sustainable soil protection and the further sustainable development

of the natural environment.

Acknowledgments

Numerous people have accompanied and supported me in many different ways over the past years while I worked on this dissertation. The first two years at the Institute for Land Improvement and Water Management were marked by an atmosphere that was in every respect very open and scholarly, and that familiarized me with the institutional, strategic, and political everyday life at ETH. The last three years at the Natural and Social Science Interface were a great thematic enrichment, during which I could concentrate on the essential questions of my work. I warmly thank all colleagues and friends for their advice, help, and friendship. My very special thanks go to the following people:

Prof. Dr. Roland W. Scholz, for acting as examiner and for the opportunity to complete the thesis at his chair. I greatly appreciated the working atmosphere, marked by trust and generosity, and the dialogue that was possible at any time.

Dr. Olaf Tietje, for the comprehensive, far-reaching, and steady collaboration over the last five years. Olaf Tietje managed to teach a fish to ride a bicycle by patiently steering my thoughts, again and again, onto the right mathematical tracks. I would especially like to thank him for the numerous mathematical discussions.

Dr. Martin Fritsch, for the visionary idea behind the present project, through which I was able to gain valuable experience during a research stay at the International Institute for Applied Systems Analysis (IIASA).

Prof. Dr. Willy Schmid, for kindly acting as co-examiner and for the financial support.

Hélène Renard-Demougeot gave me valuable geostatistical tips and tricks. Merci beaucoup!

I thank the members of the assessment group of the Integrated Project Soil (IP Boden, Priority Programme Environment), in particular Armin Keller and Achim Kayser, for the stimulating discussions on soil evaluation and phytoremediation.

From the bottom of my heart I would like to thank my friends, who again and again gave me the perspective that the dissertation is only one half of my life:

Christoph Jung, for the never-ending confidence, the patience, and the encouragement that eased more than just the end of this work.

Dani Güttinger, for the warmth, the cordiality, the open ear at any time, and the best wine in the world in good times and bad.

Marcus Fenchel, for the boundless cheerfulness, the (won!) sparkling-wine bets, and for driving out the morning grouch in me.

Anita Jost, Sonja Wipf, and Verena Schatanek, for the feeling of being at home in a foreign country. Thank you for the deep friendship and the belief in the rightness of what I do!

Julia Bolzek, for the inspiration, the constructive criticism, and the everlasting friendship, despite many a void I had to leave behind.

Markus Gallus, for the unconditional, deep friendship and for being the saving anchor in every situation.

Very special thanks are due to my parents, Klaus and Ruth Schnabel, who at all times encouraged me to follow the path that I believe is right. Without their support I could not have written this thesis.

Curriculum Vitae

1966 Born in Lehrte, Germany

1972-1976 Primary School, Lehrte

1977-1983 Secondary School, Lehrte

1983-1988 Bookseller, Schmorl und von Seefeld, Hannover

1987-1988 Advanced technical college certificate in economics,

Economy Secondary School, Hannover

1988-1990 University entrance diploma by second-chance education,

Hannover-Kolleg, Hannover

1990-1997 Studies of Geography, Soil Science and Geology

Georg-August Universität Göttingen and Technical

University of Hannover

Diploma Thesis: Bewertung von Landschaftseingriffen in

Skigebieten der französischen Alpen (Tarentaise)/Tignes

- La Plagne - Méribel unter besonderer Berücksichtung

des Sommertourismus

2000 Participant in the Young Scientist Summer Program,

Land Use and Land Cover Change Project,

International Institute for Applied Systems Analysis,

Laxenburg

1998-2002 PhD Student

Institute for Land Improvement and Water Management

and Natural and Social Science Interface

Swiss Federal Institute of Technology, Zurich

Publications

Ute Schnabel and Olaf Tietje (2003): Explorative data analysis of heavy metal

contaminated soil using multidimensional spatial regression, Environmental

Geology 44 (8), 893-904

Ute Schnabel, Olaf Tietje and Roland W. Scholz (in press): Uncertainty

assessment for management of soil contaminants with sparse data,

Environmental Management

Ute Schnabel, Olaf Tietje and Roland W. Scholz (2002): Using the power of

information of sparse data for soil improvement management, Working Paper

33, Natural and Social Science Interface, Swiss Federal Institute of

Technology Zurich, 27 pp.

Martin Fritsch, Olaf Tietje, Erwin Hepperle and Ute Schnabel (1999):

Landnutzungsmodelle als Planungsinstrumente für den Einsatz von sanften

Bodensanierungen, Terra Tech 6, 24-27

Manuscripts

Ute Schnabel and Roland W. Scholz (submitted): Decision making under

uncertainty in case of soil remediation, Journal of Environmental Management

Ute Schnabel (2000): Mollifier or B.L.U.E. A comparison of two interpolation

methods, Final report for the Young Scientist Summer Program, International

Institute for Applied Systems Analysis, Laxenburg, 28 pp., unpublished

Ute Schnabel (1998): Methodische Grundlagen zur Ausscheidung von

Sanierungseinheiten kontaminierter Böden im städtischen Umfeld, internal

report, Institut für Kulturtechnik, Eidgenössische Technische Hochschule,

Zürich, 16 pp., unpublished
