estimating variogram uncertainty (2004)


  • P1: KEEMathematical Geology [mg] PP1377-matg-496103 November 3, 2004 7:27 Style file version June 25th, 2002

    Mathematical Geology, Vol. 36, No. 8, November 2004 (© 2004)

    Estimating Variogram Uncertainty¹

    B. P. Marchant² and R. M. Lark²

    The variogram is central to any geostatistical survey, but the precision of a variogram estimated from sample data by the method of moments is unknown. It is important to be able to quantify variogram uncertainty to ensure that the variogram estimate is sufficiently accurate for kriging. In previous studies theoretical expressions have been derived to approximate uncertainty in both estimates of the experimental variogram and fitted variogram models. These expressions rely upon various statistical assumptions about the data and are largely untested. They express variogram uncertainty as functions of the sampling positions and the underlying variogram. Thus the expressions can be used to design efficient sampling schemes for estimating a particular variogram. Extensive simulation tests show that for a Gaussian variable with a known variogram, the expression for the uncertainty of the experimental variogram estimate is accurate. In practice, however, the variogram of the variable is unknown and the fitted variogram model must be used instead. For sampling schemes of 100 points or more this has only a small effect on the accuracy of the uncertainty estimate. The theoretical expressions for the uncertainty of fitted variogram models generally overestimate the precision of fitted parameters. The uncertainty of the fitted parameters can be determined more accurately by simulating multiple experimental variograms and fitting variogram models to these. The tests emphasize the importance of distinguishing between the variogram of the field being surveyed and the variogram of the random process which generated the field. These variograms are not necessarily identical. Most studies of variogram uncertainty describe the uncertainty associated with the variogram of the random process. Generally, however, it is the variogram of the field being surveyed which is of interest. For intensive sampling schemes, estimates of the field variogram are significantly more precise than estimates of the random process variogram. It is important, when designing efficient sampling schemes or fitting variogram models, that the appropriate expression for variogram uncertainty is applied.

    KEY WORDS: ergodic, nonergodic, error, simulation tests.

    INTRODUCTION

    The variogram characterizes the structure of spatial correlation of a variable and is central to any geostatistical survey. It expresses the variance of the difference between two observations of the variable as a function of the lag vector that separates them. A variogram estimate, expressed as a mathematical function, is required to krige or simulate a spatially correlated variable (Webster and Oliver, 2001). However, both of these techniques assume that the variogram of the variable is known, whereas in reality the variogram must be estimated from the available data. Therefore there is some unavoidable uncertainty associated with the variogram estimate. In this paper, we discuss methods of quantifying this uncertainty for variograms estimated by the method of moments.

    ¹ Received 12 November 2003; accepted 18 May 2004.
    ² Silsoe Research Institute, Wrest Park, Silsoe, Bedford, MK45 4HS, United Kingdom; e-mail: [email protected]

    0882-8121/04/1100-0867/1 © 2004 International Association for Mathematical Geology

    Variogram uncertainty has been considered previously in a number of different contexts. Webster and Oliver (1992) measured the uncertainty of variograms estimated from different sampling schemes to determine whether the sampling schemes were adequate for variogram estimation. Müller and Zimmerman (1999) and Bogaert and Russo (1999) have suggested techniques for designing sample schemes where the sample points are positioned to minimize the value of a theoretical expression of variogram uncertainty. The same theoretical expressions are used to fit variogram models in a way that accounts for the difference in accuracy of the experimental semivariance at each lag distance (Cressie, 1985). Some measure of variogram uncertainty is also important when considering the reliability of simulated or kriged estimates derived from the estimated variogram (Brooker, 1986; Todini, 2001; Todini, Pellegrini, and Mazzetti, 2001).

    We draw attention to three possible problems with previous approaches for estimating variogram uncertainty. First, the reliability of the theoretical expressions of variogram uncertainty used by Müller and Zimmerman (1999) and Bogaert and Russo (1999) has not been tested comprehensively, yet the expressions are only approximate and rely upon certain statistical assumptions. Second, it is generally the error in estimating the variogram of the field being surveyed which is of interest. However, the theoretical expressions used by Müller and Zimmerman (1999) and Bogaert and Russo (1999) quantify the expected error in the experimental variogram as an approximation to the variogram of the random process which generated the field. Finally, the theoretical expressions to determine the uncertainty depend on the variogram of the random process. When applying these expressions, the variogram of the random process must be approximated by a model fitted to the experimental variogram. Thus this approach to the estimation of variogram uncertainty is circular.

    Here, through experiments on simulated data sets, we assess the impact of each of these concerns. We follow Brus and de Gruijter (1994) in referring to the variogram averaged over all realizations of the underlying random process as the ergodic variogram, and the exhaustive variogram of the single realization or field being sampled as the nonergodic variogram. Journel and Huijbregts (1978) refer to these as the theoretical and local variograms respectively. First, we test the accuracy of the theoretical expressions for the uncertainty of the method of moments variogram as an estimate to a known ergodic variogram. Second, we consider the uncertainty associated with an experimental variogram estimate to a nonergodic variogram when (for the purpose of assessing uncertainty) the ergodic variogram is known. In addition we compare the magnitude of the errors when using the experimental variogram as an estimate of the ergodic and nonergodic variogram. Third, we test the accuracy of the theoretical expressions for the uncertainty of the method of moments variogram estimates to an unknown ergodic variogram. In this case the uncertainty expressions are calculated using a model fitted to the experimental variogram, rather than the correct ergodic variogram.

    We denote the experimental variogram estimate by $\hat{\gamma}(h)$, the ergodic variogram by $\gamma(h)$, and the nonergodic variogram by $\gamma_{NE}(h)$. We assume that the variograms are isotropic and therefore functions of the scalar separation distance $h$. We now present the three problems being addressed in more detail and describe previous studies of them.

    Uncertainty of Estimates to a Known Ergodic Variogram

    Previous studies of variogram uncertainty have mostly concentrated on estimates of the ergodic variogram. Cressie (1985), Ortiz and Deutsch (2002), and Pardo-Igúzquiza and Dowd (2001a) suggested similar expressions for the covariance matrix of experimental variogram estimates to the ergodic variogram. These expressions are functions of both the sampling scheme and the ergodic variogram. The elements of the main diagonal of the covariance matrix represent the variance of the experimental variogram estimates at each separating distance. For convenience, we refer to the standard error at each separating distance as the ergodic error. The ergodic error is the result of two different types of fluctuation. We are most concerned with the sampling error, that is the expected difference between the variogram estimate $\hat{\gamma}(h)$ and the nonergodic variogram of the realization being sampled, $\gamma_{NE}(h)$. However, the ergodic error also includes the effect of fluctuations between the ergodic variogram $\gamma(h)$ and the nonergodic variogram $\gamma_{NE}(h)$. Pardo-Igúzquiza and Dowd (2001a) also consider how the uncertainty of the experimental variogram may be incorporated into the uncertainty of fitted variogram parameters. This leads to an expression for the covariance matrix of variogram parameters fitted by generalized least squares (GLS).

    Few previous tests of the reliability of expressions of variogram uncertainty have been carried out. Pardo-Igúzquiza and Dowd (2001a) applied their expression to a particular case study and confirmed qualitatively that variogram uncertainty varied with lag distance in the manner they expected. Ortiz and Deutsch (2002) applied two different methods of simulation to test their expressions of variogram uncertainty. One method simulated multiple values of a random variable at sets of two pairs of locations. The observed covariances between the variogram estimates from each pair were in good agreement with their expressions. In the second test they simulated multiple realizations of the random variable at a set of sampling points and calculated the experimental variogram for each realization. This was referred to as the global simulation method. The observed variances of semivariances from the global method were generally less than those predicted by their expressions. Difficulties in simulating realizations that honoured the variogram function, particularly for long lag distances, were blamed for these discrepancies. McBratney and Webster (1986) used a similar method of simulation to establish confidence intervals on experimental variogram estimates.

    Expected Error in Estimates to the Nonergodic Variogram for a Known Ergodic Variogram

    Muñoz-Pardo (1987) derived expressions for the uncertainty of estimates to the nonergodic variogram. He separated the two components of fluctuation within the ergodic error to approximate the expected error in approximating the nonergodic variogram $\gamma_{NE}(h)$ by the experimental variogram $\hat{\gamma}(h)$. We refer to this quantity, which may be thought of as the sampling error, as the nonergodic error. It is this quantity that is of interest when optimizing sample schemes for variogram estimation. Müller and Zimmerman (1999) and Bogaert and Russo (1999) designed optimal sample schemes for variogram estimation by minimizing the ergodic error. Therefore we investigated both the reliability of Muñoz-Pardo's (1987) expressions and the difference between the ergodic error and the nonergodic error.

    Prior to our investigations, Muñoz-Pardo's (1987) expressions had not been validated comprehensively. Muñoz-Pardo (1987) used his expressions to calculate the expected sampling error of variogram estimates for different sampling schemes, ergodic variograms, and field sizes. He found that the ratio of variogram range, that is the distance over which the variable is spatially correlated, to field size was a critical factor in determining the nonergodic error. Other authors have attempted to establish confidence bands on nonergodic variograms using simulated data. Webster and Oliver (1992) carried out extensive simulation tests in order to estimate the nonergodic error when applying different sampling schemes. Motivated by the findings of Muñoz-Pardo (1987), they examined data sets with different ratios of variogram range to field size. They also varied the basic structure of the ergodic variogram used to simulate the data. They found that between 150 and 225 sampling points are required to estimate the variogram accurately. In each of their simulation tests they sampled the same region several times by translating the sampling grid across the region. Although they ensured that the same point was not sampled by two different versions of the grid, they might have underestimated the expected error of variogram estimates because of correlation between the samples. We discuss this correlation and the effect it has on the error estimates later in this paper. Webster and Oliver (1992) briefly compared their observed confidence intervals with those of Muñoz-Pardo (1987), and saw some similarities.

    Uncertainty of Estimates to an Unknown Ergodic Variogram

    All of the expressions of variogram uncertainty described above are functions of the ergodic variogram. However, in any real survey the ergodic variogram would not be known; it would be approximated by the estimated variogram. The effect of this approximation has neither been accounted for in the theoretical studies nor estimated from simulated data.

    THEORY

    Estimating the Variogram

    In geostatistics we regard the value of a variable at a location $x$ as a realization of a random function $Z(x)$. This random function is assumed to be intrinsically stationary. This is a weak form of second-order stationarity and is met if two conditions hold. The first is that the expected value of the random function, $E[Z(x)]$, is constant for all $x$. Secondly, the variance of the differences between the value of the variable at two different locations depends only on the lag vector separating the two locations and not on the absolute locations. In general, this variance may be a function of both the direction and length of the lag vector. In this study isotropic variograms only are considered. These are purely a function of the length of the vector, which we denote $h$. Thus the relationship between values from different locations is described by the variogram

    $$\gamma(h) = \frac{1}{2}\, E\left[ \left( Z(x) - Z(x + h) \right)^2 \right]. \tag{1}$$

    The variogram is estimated from variable values observed at sampled points, $x_s$, $s = 1, \ldots, n$. The method of moments estimator is the average of squared differences between observations separated by distance $h$. Pairs of observations are divided amongst different bins based upon their separating distance. If the observations are on a regular sampling grid, then bins consisting of pairs with exactly the same separating distance may be chosen. Otherwise a small tolerance must be placed on the separating distances associated with each bin. The experimental variogram $\hat{\gamma}(h_j)$, $j = 1, \ldots, k$, is then estimated by

    $$\hat{\gamma}(h_j) = \frac{1}{2 n(h_j)} \sum_{i=1}^{n(h_j)} \left[ z_{i1}(h_j) - z_{i2}(h_j) \right]^2, \tag{2}$$

    where $n(h_j)$ is the number of pairs in the bin centred on separating distance $h_j$, and $z_{i1}(h_j)$, $z_{i2}(h_j)$ are the $i$th pair of observed values in this bin.
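    Equation (2) can be illustrated with a minimal sketch. The function name `experimental_variogram`, its arguments, and the toy data are assumptions introduced here for illustration; they are not taken from the paper.

```python
import math
from collections import defaultdict


def experimental_variogram(coords, values, lags, tol=0.0):
    """Method-of-moments estimate, Equation (2), for isotropic lag bins.

    coords : list of (x, y) sample positions
    values : list of observations at those positions
    lags   : bin centres h_j
    tol    : half-width of each bin (0 for an exact regular grid)
    Returns {h_j: (semivariance, number_of_pairs)} for non-empty bins.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    n = len(values)
    for a in range(n):
        for b in range(a + 1, n):
            d = math.dist(coords[a], coords[b])
            for h in lags:
                # Assign the pair to the first bin whose centre is within tol.
                if abs(d - h) <= tol + 1e-9:
                    sums[h] += (values[a] - values[b]) ** 2
                    counts[h] += 1
                    break
    return {h: (sums[h] / (2 * counts[h]), counts[h]) for h in lags if counts[h]}


# Hypothetical toy data: a 3 x 3 grid at unit spacing. The values are arbitrary
# and only exercise the arithmetic; they are not a realization of a stationary field.
coords = [(i, j) for i in range(3) for j in range(3)]
values = [float(i) for i, _ in coords]
gamma_hat = experimental_variogram(coords, values, lags=[1.0, 2.0])
```

    With zero tolerance, as on a regular grid, only pairs at exactly the bin distance are used, so diagonal pairs are ignored unless their distances are listed in `lags`.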

    Kriging and simulation require that the variogram is expressed as a mathematical function or model. This function must obey several mathematical constraints to describe random variation and to avoid negative variances. This is achieved typically by fitting a suitable function to the experimental variogram. The mathematical constraints, and the most commonly used authorized functions which obey them, are described by Webster and Oliver (2001). In practice, the model type may be chosen by visual inspection of the experimental variogram or, after fitting the model, by more formal criteria such as the Akaike Information Criterion (McBratney and Webster, 1986). Webster and Oliver (2001) recommend that the model type should be chosen by a procedure which combines visual and statistical assessment.

    Each function has a few parameters that are selected to fit the function to the experimental variogram. Different methods are used to estimate these parameters. Some practitioners do so by eye, but most prefer more objective methods. Cressie (1985) describes three mathematical techniques for fitting the parameter values. The simplest is the least squares method. If $\theta$ is the vector of $p$ variogram parameters, $\gamma(h; \theta)$ the corresponding parameterised variogram function, and $k$ the number of experimental variogram bins, then the method of least squares chooses the $\theta$ that minimizes

    $$\sum_{i=1}^{k} \left( \hat{\gamma}(h_i) - \gamma(h_i; \theta) \right)^2. \tag{3}$$

    However, the reliability of each experimental semivariance $\hat{\gamma}(h_i)$ varies according to the number of point pairs used to describe it and the actual value of $\gamma(h_i)$. Therefore it is better to use weighted least squares and minimize

    $$\sum_{i=1}^{k} w_i \left( \hat{\gamma}(h_i) - \gamma(h_i; \theta) \right)^2, \tag{4}$$

    where $w_i$ is a weighting function. The weighting function may be set proportional to $n(h_i)$ or, in order to account for the inverse relation between the reliability of an estimate of variance and the variance itself,

    $$w_i = \frac{n(h_i)}{\gamma(h_i)^2}, \tag{5}$$

    may be specified.

    The most rigorous of the three techniques described by Cressie (1985) is generalized least squares (GLS). The GLS technique accounts for the accuracy of each bin estimate of the experimental variogram, and the correlation between each estimate. The chosen parameter values minimize

    $$\left( \hat{\gamma}(h) - \gamma(h; \theta) \right)^T \Sigma^{-1}(h; \theta) \left( \hat{\gamma}(h) - \gamma(h; \theta) \right). \tag{6}$$

    Here, $h$ is the length $k$ vector of lag bin centres and $\Sigma(h; \theta)$ is the $k \times k$ covariance matrix of $\hat{\gamma}(h)$. This matrix will be discussed in more detail later in this section. The direct method of minimizing Equation (6) has been shown to be inconsistent (Müller and Zimmerman, 1999). To account for this the following iterative scheme is used:

    $$\theta^{(m+1)} = \arg\min_{\theta} \left( \hat{\gamma}(h) - \gamma(h; \theta) \right)^T \Sigma^{-1}(h; \theta^{(m)}) \left( \hat{\gamma}(h) - \gamma(h; \theta) \right), \qquad \hat{\theta} = \lim_{m \to \infty} \theta^{(m)}. \tag{7}$$

    Here, $\theta^{(m)}$ is the estimate of $\theta$ after $m - 1$ iterations of Equation (7). This iterative scheme requires $\theta^{(1)}$, an initial estimate of the parameter values. This initial estimate may be chosen by weighted least squares [Eqs. (4) and (5)]. The procedure then converges to the GLS parameter estimate in an asymptotically efficient and consistent manner.
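    The weighted least squares criterion of Equations (4) and (5), which can supply the starting value for the iteration, may be sketched as follows. This is only an illustration under stated assumptions: the exponential model, the pair counts, the noise-free data, and the crude grid search over a single parameter are all hypothetical, and $\gamma(h_i)$ in the weights is evaluated from the candidate model.

```python
import math


def exponential(h, c0, c1, r):
    """Exponential variogram model: c0 + c1 * (1 - exp(-h / r)) for h > 0."""
    return c0 + c1 * (1.0 - math.exp(-h / r)) if h > 0 else 0.0


def wls_objective(lags, gamma_hat, n_pairs, params):
    """Equation (4) with the weights of Equation (5).

    Assumption: gamma(h_i) in the weight is taken from the candidate model.
    """
    total = 0.0
    for h, g, n in zip(lags, gamma_hat, n_pairs):
        model = exponential(h, *params)
        total += n / model ** 2 * (g - model) ** 2
    return total


lags = [5.0, 10.0, 15.0, 20.0, 25.0]
n_pairs = [90, 80, 70, 60, 50]                               # hypothetical pair counts
gamma_hat = [exponential(h, 0.0, 1.0, 16.0) for h in lags]   # noise-free toy data

# Crude grid search over the distance parameter r, with c0 and c1 held fixed.
candidates = [float(r) for r in range(7, 31)]
best_r = min(candidates,
             key=lambda r: wls_objective(lags, gamma_hat, n_pairs, (0.0, 1.0, r)))
```

    A practical fit would optimize all parameters jointly with a numerical optimizer; the grid search is used here only to keep the sketch dependency-free.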

    Assessing Variogram Uncertainty

    Several authors (Cressie, 1985; Ortiz and Deutsch, 2002; Pardo-Igúzquiza and Dowd, 2001a) have derived similar expressions for the uncertainty of the experimental variogram. In each case they express this uncertainty in terms of $\Sigma$, the covariance matrix of the experimental variogram. The $pq$th element of this matrix is

    $$[\Sigma]_{pq} = \mathrm{Cov}\left[ \hat{\gamma}(h_p), \hat{\gamma}(h_q) \right], \tag{8}$$

    and the diagonal elements are the variances of semivariances. The expected value of $\hat{\gamma}(h)$ for each lag distance $h$ is equal to $\gamma(h)$. Therefore we refer to the standard deviations of the semivariance at each lag bin, that is the square root of each element on the main diagonal, as the ergodic error and to $\Sigma$ as the ergodic covariance matrix. From the definition of covariance

    $$[\Sigma]_{pq} = E\left[ \hat{\gamma}(h_p)\, \hat{\gamma}(h_q) \right] - \gamma(h_p)\, \gamma(h_q) \tag{9}$$

    $$= \frac{1}{4 n(h_p) n(h_q)} \sum_{i=1}^{n(h_p)} \sum_{j=1}^{n(h_q)} E\left[ \left( z_{i1}(h_p) - z_{i2}(h_p) \right)^2 \left( z_{j1}(h_q) - z_{j2}(h_q) \right)^2 \right] - \gamma(h_p)\, \gamma(h_q). \tag{10}$$

    Muñoz-Pardo (1987) showed that if $Z(x)$ is multivariate Gaussian with an isotropic ergodic variogram $\gamma(h)$, then

    $$E\left[ \left( z_{i1}(h_p) - z_{i2}(h_p) \right)^2 \left( z_{j1}(h_q) - z_{j2}(h_q) \right)^2 \right] = 2 C_{ij}(h_p, h_q) + 4 \gamma(h_p)\, \gamma(h_q). \tag{11}$$

    The function $C_{ij}(h_p, h_q)$ describes the covariance between $[z_{i1}(h_p) - z_{i2}(h_p)]$ and $[z_{j1}(h_q) - z_{j2}(h_q)]$ and may be written

    $$C_{ij}(h_p, h_q) = \left( \gamma(|x_{i1} - x_{j1}|) + \gamma(|x_{i2} - x_{j2}|) - \gamma(|x_{i1} - x_{j2}|) - \gamma(|x_{i2} - x_{j1}|) \right)^2, \tag{12}$$

    where $x_{i1}$, $x_{i2}$, $x_{j1}$, and $x_{j2}$ are the sample points at which the values $z_{i1}(h_p)$, $z_{i2}(h_p)$, $z_{j1}(h_q)$, and $z_{j2}(h_q)$ are measured, and $|\cdot|$ denotes the distance between the sample points. Therefore the $pq$th element of the ergodic covariance matrix is written

    $$[\Sigma]_{pq} = \frac{1}{2 n(h_p) n(h_q)} \sum_{i=1}^{n(h_p)} \sum_{j=1}^{n(h_q)} C_{ij}(h_p, h_q). \tag{13}$$

    Pardo-Igúzquiza and Dowd (2001b) provide Fortran code to calculate this expression. To calculate Equation (12), the program requires the ergodic variogram function as an input. This is best approximated from the fitted variogram model.
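    As an illustration of Equations (12) and (13), the following sketch evaluates the ergodic covariance matrix for a small sampling scheme. It is not the Fortran program cited above; the function names, the single-transect scheme, and the choice of lag bins are assumptions made for this example.

```python
import math


def gamma_exp(h, c0=0.0, c1=1.0, r=16.0):
    """Exponential ergodic variogram (illustrative parameter values)."""
    return c0 + c1 * (1.0 - math.exp(-h / r)) if h > 0 else 0.0


def pairs_at_lag(points, h, tol=1e-9):
    """All pairs of sample points separated by distance h."""
    return [(a, b) for i, a in enumerate(points) for b in points[i + 1:]
            if abs(math.dist(a, b) - h) <= tol]


def c_ij(pair_i, pair_j, gamma):
    """Equation (12) for one pair from bin p and one pair from bin q."""
    (a1, a2), (b1, b2) = pair_i, pair_j
    s = (gamma(math.dist(a1, b1)) + gamma(math.dist(a2, b2))
         - gamma(math.dist(a1, b2)) - gamma(math.dist(a2, b1)))
    return s * s


def ergodic_covariance(points, lags, gamma):
    """Equation (13) evaluated for every pair of lag bins."""
    bins = [pairs_at_lag(points, h) for h in lags]
    k = len(lags)
    sigma = [[0.0] * k for _ in range(k)]
    for p in range(k):
        for q in range(k):
            total = sum(c_ij(pi, pj, gamma) for pi in bins[p] for pj in bins[q])
            sigma[p][q] = total / (2 * len(bins[p]) * len(bins[q]))
    return sigma


# Hypothetical scheme: a single transect of 10 points at interval 5.
points = [(5.0 * i, 0.0) for i in range(10)]
lags = [5.0, 10.0, 15.0]
sigma = ergodic_covariance(points, lags, gamma_exp)
ergodic_error = [math.sqrt(sigma[j][j]) for j in range(len(lags))]
```

    For a bin containing a single pair the expression reduces to $2\gamma(h)^2$, the variance of half a squared Gaussian difference, which gives a quick check on an implementation.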

    If the distribution of semivariances is multivariate Gaussian, it is completely defined by the ergodic variogram $\gamma(h)$ and the covariance matrix $\Sigma$. Furthermore, standard statistical theory (Garthwaite, Jolliffe, and Jones, 1995) states that the quantity

    $$\left( \hat{\gamma}(h) - \gamma(h) \right)^T \Sigma^{-1}(h) \left( \hat{\gamma}(h) - \gamma(h) \right), \tag{14}$$

    has a chi-squared distribution with $k$ degrees of freedom. Confidence sets for $\gamma$, with confidence $(1 - \alpha)$, where $\alpha$ is the significance level, are given by those values of $\gamma$ for which

    $$\left( \hat{\gamma}(h) - \gamma(h) \right)^T \Sigma^{-1}(h) \left( \hat{\gamma}(h) - \gamma(h) \right) \le \chi^2_{k;\alpha}. \tag{15}$$

    Here, $\chi^2_{k;\alpha}$ is the critical value of the chi-squared distribution, with $k$ degrees of freedom, at probability $\alpha$.
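    As a worked special case, offered only as an illustration and assuming that the variance $\sigma^2$ of the semivariance is treated as fixed, consider a single lag bin ($k = 1$). The covariance matrix is then the scalar $\sigma^2$, and with $\alpha = 0.05$ the critical value is $\chi^2_{1;0.05} \approx 3.841 = 1.96^2$, so Equation (15) reduces to the familiar interval

    $$\frac{\left( \hat{\gamma}(h) - \gamma(h) \right)^2}{\sigma^2} \le 3.841 \quad \Longleftrightarrow \quad \hat{\gamma}(h) - 1.96\,\sigma \le \gamma(h) \le \hat{\gamma}(h) + 1.96\,\sigma.$$

    Here $\sigma$ is the ergodic error for that bin. For $k > 1$ the confidence set is an ellipsoid whose orientation reflects the correlation between the semivariances of different bins.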

    The main diagonal of matrix $\Sigma$ describes the expected squared difference between the ergodic variogram and the estimated variogram. In practice, practitioners are more concerned about the difference between the nonergodic variogram, $\gamma_{NE}$, and the estimated variogram. Muñoz-Pardo (1987) derived an expression for $E[(\hat{\gamma}(h) - \gamma_{NE}(h))^2]$. The square root of this may be thought of as the expected sampling error and we refer to it as the nonergodic error. Muñoz-Pardo's (1987) expression may be extended to a nonergodic variance-covariance matrix $\Sigma_{NE}$ where

    $$[\Sigma_{NE}]_{pq} = E\left[ \left( \hat{\gamma}(h_p) - \gamma_{NE}(h_p) \right) \left( \hat{\gamma}(h_q) - \gamma_{NE}(h_q) \right) \right] \tag{16}$$

    $$= E\left[ \hat{\gamma}(h_p)\, \hat{\gamma}(h_q) \right] + E\left[ \gamma_{NE}(h_p)\, \gamma_{NE}(h_q) \right] - E\left[ \gamma_{NE}(h_p)\, \hat{\gamma}(h_q) \right] - E\left[ \hat{\gamma}(h_p)\, \gamma_{NE}(h_q) \right]. \tag{17}$$

    In deriving Muñoz-Pardo's (1987) expression the nonergodic variogram is approximated by the notional variogram estimated from a sampling scheme with $N \gg n$ points spread evenly throughout the region. That is

    $$\gamma_{NE}(h_j) \approx \frac{1}{2 N(h_j)} \sum_{i=1}^{N(h_j)} \left[ z_{i1}(h_j) - z_{i2}(h_j) \right]^2, \tag{18}$$

    where $N(h_j)$ is the number of pairs contained in lag bin $j$. This expression may in turn be used to approximate the terms on the right hand side of Equation (17). For example,

    $$E\left[ \gamma_{NE}(h_p)\, \hat{\gamma}(h_q) \right] \approx \frac{1}{4 N(h_p) n(h_q)} \sum_{i=1}^{N(h_p)} \sum_{j=1}^{n(h_q)} E\left[ \left( z_{i1}(h_p) - z_{i2}(h_p) \right)^2 \left( z_{j1}(h_q) - z_{j2}(h_q) \right)^2 \right]. \tag{19}$$

    Applying Equation (11) yields that

    $$E\left[ \gamma_{NE}(h_p)\, \hat{\gamma}(h_q) \right] \approx \frac{1}{2 N(h_p) n(h_q)} \sum_{i=1}^{N(h_p)} \sum_{j=1}^{n(h_q)} C_{ij}(h_p, h_q) + \gamma(h_p)\, \gamma(h_q). \tag{20}$$


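    Similarly,

    $$E\left[ \gamma_{NE}(h_p)\, \gamma_{NE}(h_q) \right] \approx \frac{1}{2 N(h_p) N(h_q)} \sum_{i=1}^{N(h_p)} \sum_{j=1}^{N(h_q)} C_{ij}(h_p, h_q) + \gamma(h_p)\, \gamma(h_q), \tag{21}$$

    and

    $$E\left[ \hat{\gamma}(h_p)\, \hat{\gamma}(h_q) \right] = \frac{1}{2 n(h_p) n(h_q)} \sum_{i=1}^{n(h_p)} \sum_{j=1}^{n(h_q)} C_{ij}(h_p, h_q) + \gamma(h_p)\, \gamma(h_q). \tag{22}$$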

    Therefore, substituting Equations (20), (21), and (22) into Equation (17) gives

    $$\begin{aligned}
    [\Sigma_{NE}]_{pq} \approx{} & \frac{1}{2 n(h_p) n(h_q)} \sum_{r=1}^{n(h_p)} \sum_{s=1}^{n(h_q)} C_{rs}(h_p, h_q) + \frac{1}{2 N(h_p) N(h_q)} \sum_{r=1}^{N(h_p)} \sum_{s=1}^{N(h_q)} C_{rs}(h_p, h_q) \\
    & - \frac{1}{2 n(h_p) N(h_q)} \sum_{r=1}^{n(h_p)} \sum_{s=1}^{N(h_q)} C_{rs}(h_p, h_q) - \frac{1}{2 N(h_p) n(h_q)} \sum_{r=1}^{N(h_p)} \sum_{s=1}^{n(h_q)} C_{rs}(h_p, h_q).
    \end{aligned} \tag{23}$$

    This expression may be calculated numerically in a similar manner to Equation (13). It is more computationally expensive, however, since the covariances between $N(N - 1)/2$ pairs of points must be considered.
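    Equation (23) can be sketched numerically on a deliberately small example. Everything below is an assumption made for illustration: a one-dimensional "field" of 60 unit-spaced nodes stands in for the dense notional scheme of $N$ points, the sample takes every fifth node, and the variogram is exponential with $c_0 = 0$, $c_1 = 1$, $r = 16$.

```python
import math


def gamma_exp(h, c0=0.0, c1=1.0, r=16.0):
    """Exponential ergodic variogram (illustrative parameter values)."""
    return c0 + c1 * (1.0 - math.exp(-h / r)) if h > 0 else 0.0


def pairs_at_lag(points, h, tol=1e-9):
    """All pairs of points separated by distance h."""
    return [(a, b) for i, a in enumerate(points) for b in points[i + 1:]
            if abs(math.dist(a, b) - h) <= tol]


def mean_c(pairs_p, pairs_q, gamma):
    """Double sum of C_rs (Equation 12) divided by 2 * |pairs_p| * |pairs_q|."""
    total = 0.0
    for a1, a2 in pairs_p:
        for b1, b2 in pairs_q:
            s = (gamma(math.dist(a1, b1)) + gamma(math.dist(a2, b2))
                 - gamma(math.dist(a1, b2)) - gamma(math.dist(a2, b1)))
            total += s * s
    return total / (2 * len(pairs_p) * len(pairs_q))


def nonergodic_cov(sample, field, hp, hq, gamma):
    """Equation (23): entry (p, q) of the nonergodic covariance matrix."""
    n_p, n_q = pairs_at_lag(sample, hp), pairs_at_lag(sample, hq)
    N_p, N_q = pairs_at_lag(field, hp), pairs_at_lag(field, hq)
    return (mean_c(n_p, n_q, gamma) + mean_c(N_p, N_q, gamma)
            - mean_c(n_p, N_q, gamma) - mean_c(N_p, n_q, gamma))


# Hypothetical one-dimensional field of 60 unit-spaced nodes, sampled every 5 units.
field = [(float(i), 0.0) for i in range(60)]
sample = field[::5]
h = 5.0
sample_pairs = pairs_at_lag(sample, h)
ergodic_var = mean_c(sample_pairs, sample_pairs, gamma_exp)   # Equation (13)
nonergodic_var = nonergodic_cov(sample, field, h, h, gamma_exp)
full_census = nonergodic_cov(field, field, h, h, gamma_exp)   # sample == field
```

    When the sample coincides with the dense scheme the four terms cancel, so the sampling error vanishes, while the ergodic variance of Equation (13) does not. The cost grows with the square of the number of dense pairs per bin, which is why the sketch uses such a small field.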

    The most common method of estimating the uncertainty of fitted variogram parameter estimates is by calculating the inverse of the information matrix (Pardo-Igúzquiza and Dowd, 2001a). The $p \times p$ information matrix, $M$, that corresponds to the parameter vector $\theta$ (of length $p$) fitted by GLS is

    $$M = J^T \Sigma^{-1} J. \tag{24}$$

    Here, $J$ is the $k \times p$ Jacobian matrix in which the $ij$th element is $[J]_{ij} = \partial \gamma(h_i) / \partial \theta_j$, evaluated at the GLS estimate of $\theta$. A result from nonlinear inversion theory (Menke, 1984) says that $M^{-1}$ is a leading order Taylor series approximation to the covariance matrix of the parameter estimates. Since this is a leading order approximation it is only accurate for estimates of $\theta$ that are themselves accurate. The approximation assumes that the parameter estimates have a multivariate Gaussian distribution. In this case, and under the assumption that the variogram estimation technique is unbiased, the distribution of the vector of parameter estimates, $\hat{\theta}$, is completely defined. The mean value is given by the parameter vector of the simulated variable, and the covariance matrix by $M^{-1}$.
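    A small numerical sketch of Equation (24) follows. To keep it short it makes two simplifying assumptions that the text does not: $\Sigma$ is taken to be diagonal, so correlations between lag bins are ignored, and only two parameters $(c_1, r)$ of an exponential model with zero nugget are considered. The variances of the semivariances are hypothetical values.

```python
import math


def jacobian_row(h, c1, r):
    """Partial derivatives of gamma(h) = c1 * (1 - exp(-h / r)) w.r.t. (c1, r)."""
    e = math.exp(-h / r)
    return (1.0 - e, -c1 * h * e / r ** 2)


def information_matrix(lags, variances, c1, r):
    """Equation (24), M = J^T Sigma^{-1} J, assuming a DIAGONAL Sigma for brevity."""
    m = [[0.0, 0.0], [0.0, 0.0]]
    for h, var in zip(lags, variances):
        row = jacobian_row(h, c1, r)
        for a in range(2):
            for b in range(2):
                m[a][b] += row[a] * row[b] / var
    return m


def invert_2x2(m):
    """Closed-form inverse of a 2 x 2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


lags = [5.0, 10.0, 15.0, 20.0, 25.0]
variances = [0.01, 0.02, 0.03, 0.04, 0.05]   # hypothetical variances of semivariances
M = information_matrix(lags, variances, c1=1.0, r=16.0)
param_cov = invert_2x2(M)                     # approximate covariance of (c1, r)
std_errors = [math.sqrt(param_cov[0][0]), math.sqrt(param_cov[1][1])]
```

    With a full $\Sigma$ the same product would need a general matrix inverse or a linear solve; the diagonal form is used here only so that the example runs without external libraries.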

    Ortiz and Deutsch (2002) assessed variogram parameter uncertainty by a more arbitrary criterion. They examined the experimental variogram covariance matrix, $\Sigma$, and fitted variograms to what they judged to be extreme realizations of the experimental variogram. These fitted variograms were themselves assumed to have extreme parameter values.

    SIMULATION EXPERIMENTS

    Simulated Fields and Sampling Schemes

    The characteristics of the simulated fields and sampling schemes matched those used by Webster and Oliver (1992). Fields were generated with one of two ergodic variogram models. The first was the exponential variogram model

    $$\gamma(h) = c_0 + c_1 \left( 1 - \exp(-h/r) \right) \quad \text{for } h > 0, \tag{25}$$

    $$\gamma(0) = 0, \tag{26}$$

    where $c_0$ is the nugget variance, $c_1$ the sill of the spatially structured variance, and $r$ the distance parameter of the model. The chosen parameter values were $c_0 = 0$, $c_1 = 1$, and $r = 16$. The other was the spherical variogram model

    $$\gamma(h) = c_0 + c_1 \left( \frac{3h}{2a} - \frac{1}{2} \left( \frac{h}{a} \right)^3 \right) \quad \text{for } 0 < h \le a, \tag{27}$$

    $$\gamma(h) = c_0 + c_1 \quad \text{for } h > a, \tag{28}$$

    $$\gamma(0) = 0, \tag{29}$$

    where $c_0$ and $c_1$ have the same meaning as in Equation (25) and $a$ is the distance parameter. The parameter values were $c_0 = 1/3$, $c_1 = 2/3$, and $a = 50$. The distance parameter, $a$, for the spherical model is the range of the spatial dependence, whereas the exponential model has effective range $3r$. Thus both of the models applied had approximately the same effective range.
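    The two models and the comparison of their ranges can be checked with a few lines of code. The function names are chosen here for illustration; the default parameter values are those quoted above.

```python
import math


def exponential(h, c0=0.0, c1=1.0, r=16.0):
    """Exponential model, Equations (25) and (26)."""
    return 0.0 if h == 0 else c0 + c1 * (1.0 - math.exp(-h / r))


def spherical(h, c0=1 / 3, c1=2 / 3, a=50.0):
    """Spherical model, Equations (27) to (29)."""
    if h == 0:
        return 0.0
    if h <= a:
        return c0 + c1 * (1.5 * h / a - 0.5 * (h / a) ** 3)
    return c0 + c1


# Fraction of the sill reached by the exponential model at its effective range 3r = 48.
fraction_at_3r = exponential(3 * 16.0) / (0.0 + 1.0)
# The spherical model reaches its sill exactly at h = a = 50.
sill_at_a = spherical(50.0)
```

    The exponential model approaches its sill only asymptotically; at $h = 3r$ it has reached $1 - e^{-3}$, roughly 95%, of the sill, which is the usual sense of "effective range".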

    Each field was generated using unconditioned sequential Gaussian simulation (Deutsch and Journel, 1998) and consisted of either 120 × 120 = 14,400 or 256 × 256 = 65,536 values on a square grid at unit interval. Henceforth we refer to the four sets of simulated fields as Sets 1 to 4. Sets 1 and 2 are the sets of large fields with exponential and spherical variograms respectively. Sets 3 and 4 are the sets of small fields with exponential and spherical variograms respectively.

    Table 1. The Number of Sample Points in Each Scheme and the Corresponding Distance Between These Points

    Sample points   25   49   100   144   225   400
    Interval        20   15    10     8     7     5

    The smaller fields were simulated 1000 times, and the larger ones 100 times. The smaller fields, where the effective range was almost half of the length of the field, were sampled using regular square grids with the sample sizes and sampling intervals listed in Table 1. For the larger fields, the effective range of the simulated variogram was less than a fifth of the length of the field. If the whole field had been sampled using a square grid it would have provided little information about the structured part of the variogram, unless the grid was very dense. Therefore the field was sampled along transects. The combinations of sample sizes and sample intervals were the same as those listed in Table 1; for example, a sample size of 25 points was split into 5 transects, with each point separated by distance 5.

    Each of the four sets of fields was sampled with six different sampling schemes. The exact position of the sampling grids or transects was chosen at random, but the same positions were used for each realization within the same field set. All of the theoretical expressions for variogram uncertainty described previously are functions of both the ergodic variogram and the sampling scheme used. Therefore, in the three simulation tests described below, each combination of sampling scheme and field type was tested independently. In each case, the theoretical expressions of variogram uncertainty were calculated. Then each of the realizations was sampled, and from these data an experimental variogram was estimated, and a variogram model fitted by GLS. The fitted variogram model was of the same type as that of the simulated variable. The errors in the variogram estimates were then compared with the expected values. Also, a further simulated approximation of the covariance matrix of fitted parameter values was made by simulating Gaussian realizations of the experimental variogram $\hat{\gamma}(h)$ directly, using the estimated experimental variogram covariance matrix $\Sigma$. A model was fitted to each realization by GLS and the covariance matrix of these simulated parameter estimates was calculated.
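    The direct simulation of experimental variograms mentioned at the end of this paragraph can be sketched as below. The Cholesky-based sampler is one standard way of drawing correlated Gaussian vectors and is an assumption here, since the text does not say how the realizations were generated; the values of `gamma_hat` and `sigma` are hypothetical. The subsequent GLS fit to each realization is omitted.

```python
import math
import random


def cholesky(a):
    """Lower-triangular L with L L^T = a, for a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def simulate_variograms(gamma_hat, sigma, n_sim, seed=0):
    """Draw Gaussian realizations with mean gamma_hat and covariance sigma."""
    rng = random.Random(seed)
    low = cholesky(sigma)
    k = len(gamma_hat)
    sims = []
    for _ in range(n_sim):
        z = [rng.gauss(0.0, 1.0) for _ in range(k)]
        sims.append([gamma_hat[i] + sum(low[i][j] * z[j] for j in range(i + 1))
                     for i in range(k)])
    return sims


# Hypothetical experimental variogram at three lags and its covariance matrix.
gamma_hat = [0.27, 0.46, 0.61]
sigma = [[0.010, 0.006, 0.004],
         [0.006, 0.020, 0.012],
         [0.004, 0.012, 0.030]]
sims = simulate_variograms(gamma_hat, sigma, n_sim=2000)
```

    Fitting a model to each row of `sims` and taking the covariance of the fitted parameters would then give the simulated counterpart of the parameter covariance matrix.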

    Experiment 1

    The first experiment considered the covariance matrices calculated from Equations (13) and (24) which describe the uncertainty of method of moments


    Table 2. The Constraints Placed on the Fitted Parameters for Each Data Set

    Data set   c0 min   c0 max   c1 min   c1 max   a or r min   a or r max
    Set 1      0.0      0.7      0.1      1.6       7.0         30.0
    Set 2      0.0      0.7      0.1      1.6      21.0         80.0
    Set 3      0.0      0.7      0.1      1.6       7.0         30.0
    Set 4      0.0      0.7      0.1      1.6      21.0         80.0

    estimates to the ergodic variogram. These theoretical uncertainty estimates were calculated for each combination of test set and sampling scheme. The ergodic variogram values required by Equations (13) and (24) were taken from the variogram used to simulate the relevant data set.

    The covariance matrices of the experimental variograms and fitted parameters for the sets of simulated data were then derived. The sampling scheme being tested was applied to each realization of the random variable. Experimental variograms

    Figure 1. Comparison between expected ergodic errors and those observed for the Set 1 data set. The continuous lines show the expected ergodic errors for the marked sample size. The ergodic errors from 400 sample points are denoted by , from 100 sample points by +, and from 25 sample points by .


    were calculated for each realization. The tolerance on the lag bins was set at zero. A variogram model of the same type as the simulated variable was fitted to the experimental variogram by a single iteration of the GLS procedure [Eq. (7)]. Limits were placed on the possible fitted values for each data set in order to prevent negative variances and ensure that the range of spatial correlation was not greater than half the length of the field. Variogram estimates for lags greater than half the length of a region are known to be unreliable (Webster and Oliver, 2001). The limits are listed in Table 2. For Sets 1 and 3 the fitting procedure was then repeated with the minimum value of c0 equal to -0.3. Such a model would not be fitted in reality since the variance is negative for small lag distances. It is included here so that the effect of the c0 ≥ 0 constraint on the uncertainty estimates may be separated from other sources of error.
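    The method-of-moments estimate with zero lag tolerance can be sketched in a few lines. The function name, the small numerical slack `eps` used when comparing floating-point distances, and the 3 x 3 toy grid are hypothetical. The sketch pools pairs in all directions, which is appropriate only for an isotropic variable.

```python
import math

def experimental_variogram(points, values, lags, tol=0.0, eps=1e-9):
    """Method-of-moments semivariances at the requested lag distances.

    A pair contributes to lag h only when its separation differs from h by
    at most tol (eps absorbs floating-point rounding in the distances).
    """
    sums = {h: 0.0 for h in lags}
    counts = {h: 0 for h in lags}
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            for h in lags:
                if abs(d - h) <= tol + eps:
                    sums[h] += (values[i] - values[j]) ** 2
                    counts[h] += 1
    # Half the mean squared difference per lag; lags with no pairs are omitted.
    gamma = {h: sums[h] / (2 * counts[h]) for h in lags if counts[h] > 0}
    return gamma, counts

# Toy data: a 3 x 3 unit grid whose value equals the x coordinate.
points = [(float(i), float(j)) for j in range(3) for i in range(3)]
values = [x for x, _ in points]
gamma, counts = experimental_variogram(points, values, lags=[1.0, 2.0])
print(gamma, counts)
```

    With zero tolerance the diagonal pairs of the grid (separation of about 1.41) fall in no bin, so only pairs aligned with the grid axes contribute.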

    In Figures 1-4 the expected ergodic error is compared with that observed from the simulated data sets. There is good agreement for all data sets. The expected

    Figure 2. Comparison between expected ergodic errors and those observed for the Set 2 data set. The continuous lines show the expected ergodic errors for the marked sample size. The simulated standard errors from 400 sample points are denoted by , from 100 sample points by +, and from 25 sample points by .


    Figure 3. Comparison between expected ergodic errors and those observed for the Set 3 data set. The continuous lines show the expected ergodic errors for the marked sample size. The ergodic errors from 400 sample points are denoted by , from 100 sample points by +, and from 25 sample points by .

    variance of fitted parameter estimates (c0, c1, a or r) is compared with the simulated variance of these estimates in Tables 3-6. Here, the minimum permissible value of c0 for Set 1 and Set 3 is -0.3. For sample schemes of fewer than 100 points, the simulated values are less than the expected values. This is due to the theoretical

    Table 3. Comparison of Theoretical Variances of Fitted Variogram Parameters, With Variances Observed From Multiple Simulated Fields and Multiple Simulated Experimental Variograms, for the Set 1 Data Set

            Theoretical                     Simulated field                 Simulated variogram
    Size    c0        c1        a           c0        c1        a           c0        c1        r
    25      2.34e+01  1.99e+01  6.27e+03    1.39e-01  4.36e-01  8.56e+02    1.34e-01  5.67e-01  1.37e+03
    49      1.67e+00  1.31e+00  8.46e+02    1.06e-01  2.72e-01  7.81e+02    1.11e-01  2.99e-01  6.75e+02
    100     1.09e-01  1.33e-01  1.44e+02    5.45e-02  1.92e-01  4.65e+02    7.60e-02  1.45e-01  4.83e+02
    144     3.47e-02  7.87e-02  8.45e+01    2.83e-02  1.78e-01  3.06e+02    4.44e-02  9.11e-02  2.22e+02
    225     1.39e-02  5.59e-02  5.41e+01    1.87e-02  9.95e-02  1.55e+02    2.23e-02  9.42e-02  1.72e+02
    400     2.96e-03  4.23e-02  2.69e+01    5.10e-03  4.91e-02  3.52e+01    5.82e-03  3.70e-02  3.66e+02


    Figure 4. Comparison between expected ergodic errors and those observed for the Set 4 data set. The continuous lines show the expected ergodic errors for the marked sample size. The ergodic errors from 400 sample points are denoted by , from 100 sample points by +, and from 25 sample points by .

    confidence limits on the parameters being wider than the permissible values in Table 2. For larger sampling schemes the simulated variances are larger than the expected values. The variances derived from simulating experimental variograms agree closely with those derived from realizations of the random variable.

    Table 4. Comparison of Theoretical Variances of Fitted Variogram Parameters, With Variances Observed From Multiple Simulated Fields and Multiple Simulated Experimental Variograms, for the Set 2 Data Set

            Theoretical                     Simulated field                 Simulated variogram
    Size    c0        c1        r           c0        c1        r           c0        c1        a
    25      7.03e-01  7.23e-01  5.15e+03    4.82e-02  4.02e-01  8.75e+02    1.79e-01  4.63e-01  4.96e+02
    49      9.12e-02  1.85e-01  1.14e+03    3.71e-02  1.34e-01  4.14e+02    1.21e-01  2.40e-01  3.63e+02
    100     1.66e-02  8.86e-02  3.21e+02    1.70e-02  8.91e-02  3.57e+02    3.88e-02  8.48e-02  4.08e+02
    144     8.72e-03  6.93e-02  2.90e+02    9.34e-03  9.45e-02  3.84e+02    1.85e-02  6.44e-02  3.13e+02
    225     4.76e-03  4.81e-02  1.60e+02    6.42e-03  6.69e-02  3.31e+02    8.67e-03  4.52e-02  2.81e+02
    400     2.05e-03  3.44e-02  1.07e+02    1.98e-03  4.64e-02  2.14e+02    2.18e-03  3.95e-02  1.77e+02


    Table 5. Comparison of Theoretical Variances of Fitted Variogram Parameters, With Variances Observed From Multiple Simulated Fields and Multiple Simulated Experimental Variograms, for the Set 3 Data Set

            Theoretical                     Simulated field                 Simulated variogram
    Size    c0        c1        a           c0        c1        a           c0        c1        a
    25      1.28e+01  1.07e+01  3.42e+03    1.39e-01  3.80e-01  9.01e+02    1.48e-01  4.73e-01  1.39e+03
    49      8.79e-01  7.59e-01  4.45e+02    1.04e-01  2.27e-01  6.48e+02    1.07e-01  2.46e-01  5.96e+02
    100     7.29e-02  1.21e-01  1.25e+02    4.51e-02  1.66e-01  3.18e+02    5.04e-02  1.42e-01  3.64e+02
    144     2.35e-02  9.31e-02  8.12e+01    2.37e-02  1.35e-01  1.78e+02    3.09e-02  1.01e-01  1.82e+02
    225     9.52e-03  7.15e-02  5.14e+01    1.19e-02  9.96e-02  1.02e+02    1.72e-02  6.67e-02  9.20e+01
    400     2.04e-03  7.38e-02  3.68e+01    3.76e-03  9.99e-02  5.72e+01    2.73e-03  8.11e-02  5.29e+01

    Figures 5-7 show the histograms of parameter estimates fitted to the Set 4 data, sampled with 400 points. The expected distribution of these estimates from Equation (13) is also shown. The distribution of estimates does not appear to be Gaussian and there are more extreme values than expected.

    Experiment 2

    The second experiment considered Munoz-Pardo's (1987) expression for the nonergodic error [Eq. (23)]. In a similar manner to Experiment 1, the expected values for the nonergodic error were calculated using the ergodic variogram of the simulated variable. To calculate Equation (23), it was necessary to consider the covariance between pairs of points within a dense, N point, sampling scheme. For each sampling scheme N was chosen to be 3600. These N points were positioned to ensure that N(hi) was large for each relevant lag bin and that the points were spread evenly over the region.

    Table 6. Comparison of Theoretical Variances of Fitted Variogram Parameters, With Variances Observed From Multiple Simulated Fields and Multiple Simulated Experimental Variograms, for the Set 4 Data Set

            Theoretical                     Simulated field                 Simulated variogram
    Size    c0        c1        r           c0        c1        r           c0        c1        r
    25      3.86e-01  4.56e-01  2.78e+03    4.65e-02  2.44e-01  4.16e+02    9.24e-02  2.02e-01  4.28e+02
    49      5.56e-02  1.51e-01  6.39e+02    3.60e-02  1.49e-01  3.23e+02    8.35e-02  1.88e-01  3.14e+02
    100     1.20e-02  9.42e-02  1.98e+02    1.46e-02  1.11e-01  3.31e+02    3.38e-02  9.27e-02  3.60e+02
    144     6.52e-03  8.38e-02  2.15e+02    8.12e-03  9.98e-02  3.32e+02    1.57e-02  7.25e-02  3.16e+02
    225     3.76e-03  5.95e-02  1.29e+02    4.67e-03  7.74e-02  2.86e+02    3.92e-03  5.64e-02  2.60e+02
    400     1.60e-03  4.44e-02  8.42e+01    2.50e-03  7.26e-02  3.03e+02    1.92e-03  6.20e-02  2.48e+02


    Figure 5. A histogram showing a distribution of the fitted values of c0, for the Set 4 data set sampled with a 400 point square grid. The continuous line shows the distribution predicted by Equation (13). The expected value of c0 = 1/3.

    The nonergodic variogram of each realization is equal to the realization's exhaustive variogram. This was calculated by using all of the simulated points. The resulting experimental variogram had 80 bins, separated by distance 1 and with zero tolerance. For each realization, the differences between the exhaustive experimental variogram values and the estimated experimental variogram values (from Experiment 1) were recorded. These differences were compared with the expected nonergodic errors.

    Figures 8-11 compare the expected nonergodic errors with the nonergodic errors observed from the simulated data sets. In each case there is good agreement. The difference between the expected nonergodic errors [Eq. (23)] and the expected ergodic errors [Eq. (13)] is explored in Figures 12-15. For the larger simulated regions (Set 1 and Set 2), the nonergodic errors are slightly less than the ergodic errors. This difference increases with both lag distance and sample size. For the smaller fields (Set 3 and Set 4) the difference between the nonergodic and ergodic errors is much larger, particularly for large lag distances.


    Figure 6. A histogram showing a distribution of the fitted values of c1, for the Set 4 data set sampled with a 400 point square grid. The continuous line shows the distribution predicted by Equation (13). The expected value of c1 = 2/3.

    Experiment 3

    In Experiments 1 and 2, the ergodic variogram of the simulated variable is used to calculate the expected variogram errors. In practice, this would be unknown. Instead it would have to be approximated by the model fitted to the experimental variogram. The third experiment investigates the effect that this has on the accuracy of the confidence limits.

    For each variogram estimated in Experiment 1, the value of

    $$(\hat{\gamma}(\mathbf{h}) - \gamma(\mathbf{h}))^{T} \, \Sigma^{-1} \, (\hat{\gamma}(\mathbf{h}) - \gamma(\mathbf{h})), \qquad (30)$$

    was calculated. Here, $\gamma(\mathbf{h})$ is the ergodic variogram of the simulated variable calculated at the vector of lag distances $\mathbf{h}$, $\hat{\gamma}(\mathbf{h})$ is the vector of estimated experimental variogram values and $\Sigma$ is the covariance matrix of the experimental variogram estimates, calculated from Equation (13), using $\gamma(\mathbf{h})$. The covariance matrix $\Sigma$ is then recalculated, using the variogram fitted to $\hat{\gamma}(\mathbf{h})$. Then Equation (30) is


    Figure 7. A histogram showing a distribution of the fitted values of a, for the Set 4 data set sampled with a 400 point square grid. The continuous line shows the distribution predicted by Equation (13). The expected value of a = 50.

    recalculated using this new matrix. For each test set and sampling scheme combination, the distributions of the two sets of values of Equation (30) should form chi-squared distributions of order k, where k is the number of experimental lag bins, and the confidence limits may be calculated from Equation (15).
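    The coverage check built on Equation (30) can be illustrated under simplifying assumptions. The sketch below assumes a diagonal covariance matrix and an even number of lag bins, so that the chi-squared distribution function has a closed form. The semivariances and variances are invented for the example, whereas the tests in this paper use the full covariance matrix from Equation (13).

```python
import math
import random

def chi2_cdf_even(x, k):
    """Distribution function of a chi-squared variable with even k degrees of freedom."""
    if k % 2 != 0:
        raise ValueError("this closed form needs an even k")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, k // 2):
        term *= half / j          # (x/2)^j / j!
        total += term
    return 1.0 - math.exp(-half) * total

def quadratic_form_diag(estimate, truth, variances):
    """The quadratic form of Equation (30) when the covariance matrix is diagonal."""
    return sum((e - t) ** 2 / v for e, t, v in zip(estimate, truth, variances))

rng = random.Random(42)
truth = [0.40, 0.55, 0.70, 0.80, 0.90, 0.95]        # hypothetical ergodic semivariances
variances = [0.010, 0.015, 0.020, 0.030, 0.040, 0.050]  # hypothetical estimation variances
k = len(truth)
level = 0.90
trials = 5000
inside = 0
for _ in range(trials):
    # One simulated experimental variogram with independent Gaussian errors.
    est = [rng.gauss(t, math.sqrt(v)) for t, v in zip(truth, variances)]
    if chi2_cdf_even(quadratic_form_diag(est, truth, variances), k) <= level:
        inside += 1
coverage = inside / trials
print(f"observed coverage at the {level:.0%} limit: {coverage:.3f}")
```

    When the variances used in the quadratic form match those that generated the errors, the observed coverage should be close to the nominal level; a mismatch, such as variances taken from a poorly fitted model, shows up as a departure from it.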

    In Tables 7-10, the observed percentages of ergodic experimental variogram estimates lying within each theoretical confidence limit are given. The theoretical confidence limits appear to be reasonable for covariance matrices calculated with fitted variogram estimates and for covariance matrices calculated with the actual ergodic variogram. In general, the confidence limits resulting from the actual ergodic variogram are slightly more accurate. This is particularly noticeable for sample schemes with fewer than 100 points.

    DISCUSSION

    Extensive simulation tests have shown that, for an isotropic Gaussian random variable with known ergodic variogram, the covariance matrix of experimental


    Figure 8. Comparison between expected nonergodic errors and those observed for the Set 1 data set. The continuous lines show the expected nonergodic errors [calculated from Eq. (23)] for the marked sample size. The nonergodic errors from 400 sample points are denoted by , from 100 sample points by +, and from 25 sample points by .

    Table 7. Percentage of Estimates to the Ergodic Variogram Lying Within the Theoretical Confidence Limits for the Set 1 Data Set

    Sample size  Variogram     99     98     95     90     80     70     50     30     10
    25           Fitted      92.0   91.0   85.0   77.0   68.0   58.0   45.0   27.0    9.0
    25           Ergodic     99.0   98.0   94.0   87.0   79.0   72.0   56.0   36.0   12.0
    49           Fitted      94.0   92.0   92.0   87.0   77.0   67.0   49.0   31.0   13.0
    49           Ergodic     97.0   96.0   93.0   92.0   84.0   74.0   61.0   39.0   17.0
    100          Fitted      97.0   95.0   93.0   86.0   73.0   65.0   52.0   30.0   10.0
    100          Ergodic     95.0   95.0   93.0   90.0   83.0   80.0   57.0   31.0   10.0
    144          Fitted      98.0   97.0   94.0   86.0   81.0   68.0   52.0   33.0   12.0
    144          Ergodic     98.0   98.0   95.0   93.0   88.0   82.0   62.0   44.0   15.0
    225          Fitted      97.0   96.0   93.0   87.0   78.0   64.0   56.0   37.0   11.0
    225          Ergodic     98.0   96.0   94.0   91.0   75.0   67.0   54.0   32.0   12.0
    400          Fitted     100.0   96.0   93.0   89.0   74.0   64.0   44.0   28.0   11.0
    400          Ergodic     97.0   94.0   89.0   86.0   78.0   73.0   58.0   36.0   16.0

    Note. Theoretical confidence limits calculated with the fitted and ergodic variograms are treated separately.


    Figure 9. Comparison between expected nonergodic errors and those observed for the Set 2 data set. The continuous lines show the expected nonergodic errors [calculated from Eq. (23)] for the marked sample size. The nonergodic errors from 400 sample points are denoted by , from 100 sample points by +, and from 25 sample points by .

    Table 8. Percentage of Estimates to the Ergodic Variogram Lying Within the Theoretical Confidence Limits for the Set 2 Data Set

    Sample size  Variogram     99     98     95     90     80     70     50     30     10
    25           Fitted      93.0   91.0   88.0   85.0   72.0   58.0   44.0   27.0    7.0
    25           Ergodic     97.0   97.0   89.0   87.0   78.0   73.0   56.0   38.0    8.0
    49           Fitted      97.0   93.0   91.0   83.0   72.0   63.0   51.0   27.0   11.0
    49           Ergodic     99.0   98.0   95.0   91.0   83.0   73.0   53.0   36.0   14.0
    100          Fitted      95.0   93.0   87.0   79.0   74.0   64.0   50.0   29.0   18.0
    100          Ergodic     98.0   96.0   92.0   89.0   83.0   73.0   58.0   37.0   16.0
    144          Fitted      97.0   96.0   92.0   88.0   74.0   69.0   52.0   33.0   10.0
    144          Ergodic     97.0   97.0   95.0   93.0   87.0   80.0   62.0   46.0   13.0
    225          Fitted      95.0   91.0   89.0   82.0   74.0   62.0   45.0   28.0   12.0
    225          Ergodic     97.0   96.0   92.0   87.0   76.0   69.0   60.0   35.0   11.0
    400          Fitted      97.0   96.0   92.0   84.0   78.0   61.0   47.0   31.0   13.0
    400          Ergodic     98.0   96.0   94.0   85.0   79.0   73.0   58.0   39.0   18.0

    Note. Theoretical confidence limits calculated with the fitted and ergodic variograms are treated separately.


    Figure 10. Comparison between expected nonergodic errors and those observed for the Set 3 data set. The continuous lines show the expected nonergodic errors [calculated from Eq. (23)] for the marked sample size. The nonergodic errors from 400 sample points are denoted by , from 100 sample points by +, and from 25 sample points by .

    Table 9. Percentage of Estimates to the Ergodic Variogram Lying Within the Theoretical Confidence Limits for the Set 3 Data Set

    Sample size  Variogram     99     98     95     90     80     70     50     30     10
    25           Fitted      93.4   91.3   87.0   82.2   72.4   66.0   50.5   32.6   11.5
    25           Ergodic     97.0   95.2   93.0   90.6   83.9   78.3   60.4   38.1   11.9
    49           Fitted      93.3   91.5   88.8   81.9   74.1   65.0   49.7   30.6   10.3
    49           Ergodic     96.5   95.0   91.9   88.4   82.8   75.5   59.4   40.5   14.1
    100          Fitted      95.1   92.8   89.4   83.9   75.1   66.3   47.9   30.4   11.1
    100          Ergodic     96.8   95.1   92.5   89.4   80.9   73.3   58.4   39.2   14.8
    144          Fitted      95.2   92.0   88.7   84.4   76.6   66.4   46.8   29.9   10.8
    144          Ergodic     96.5   95.0   92.4   89.4   81.5   74.6   60.3   41.0   15.4
    225          Fitted      95.5   93.7   90.1   85.9   75.9   67.2   49.3   31.5   10.3
    225          Ergodic     96.0   93.9   90.4   85.4   78.7   71.6   60.8   42.3   17.1
    400          Fitted      96.8   96.2   91.4   86.6   80.1   72.0   47.8   28.5   11.8
    400          Ergodic     98.0   96.0   94.0   85.0   79.0   73.0   58.0   39.0   18.0

    Note. Theoretical confidence limits calculated with the fitted and ergodic variograms are treated separately.


    Figure 11. Comparison between expected nonergodic errors and those observed for the Set 4 data set. The continuous lines show the expected nonergodic error [calculated from Eq. (23)] for the marked sample size. The nonergodic errors from 400 sample points are denoted by , from 100 sample points by +, and from 25 sample points by .

    Table 10. Percentage of Estimates to the Ergodic Variogram Lying Within the Theoretical Confidence Limits for the Set 4 Data Set

    Sample size  Variogram     99     98     95     90     80     70     50     30     10
    25           Fitted      92.3   90.2   87.0   80.7   72.4   63.9   47.4   28.5   11.6
    25           Ergodic     94.8   93.3   90.3   86.9   81.1   74.3   57.5   36.9   11.1
    49           Fitted      92.3   90.3   86.1   80.7   71.2   62.0   46.5   31.1   11.0
    49           Ergodic     95.2   93.5   90.5   86.6   81.5   75.2   60.1   40.8   14.0
    100          Fitted      90.2   88.3   82.8   78.2   67.6   61.3   43.0   26.5   10.0
    100          Ergodic     95.5   94.7   90.4   85.7   78.5   71.9   56.7   38.7   12.7
    144          Fitted      90.0   87.6   83.1   76.1   66.6   59.5   43.3   26.6    8.8
    144          Ergodic     95.9   94.4   91.5   88.1   82.1   74.2   59.4   38.9   15.1
    225          Fitted      89.1   87.7   83.1   77.6   68.1   60.8   43.8   28.7   11.3
    225          Ergodic     94.6   93.0   89.6   85.6   79.1   73.6   60.0   39.8   16.8
    400          Fitted      91.3   88.9   83.4   78.7   69.4   58.5   42.9   24.9    8.7
    400          Ergodic     96.8   94.5   90.7   86.0   76.9   71.9   55.7   40.9   18.2

    Note. Theoretical confidence limits calculated with the fitted and ergodic variograms are treated separately.


    Figure 12. Comparison of the expected ergodic (continuous line) and nonergodic (dotted line) errors for the Set 1 data set.

    variogram estimates is approximated accurately by Equation (13). Pardo-Iguzquiza and Dowd (2001a) approximated the covariance matrix of the parameters of fitted variogram models by calculating the inverse of the information matrix [Eq. (24)]. This method is seen to overestimate the precision of the parameter estimates. This might be due to a number of factors. First, this expression for parameter uncertainty is based on a leading order Taylor series expansion centered on the actual ergodic variogram. Thus it is accurate only when the uncertainty is small. In Tables 3-6, the uncertainty estimates are seen to improve as the sample size, and therefore the precision of estimates, increases. Secondly, this method assumes that the distribution of fitted variogram parameters is Gaussian. The distributions of parameters fitted to the simulated data sets were seen to deviate from Gaussian. The method also assumes that the parameters may take any value. In the simulation tests it was necessary to place constraints on the parameter values for practical reasons, as would be the case in a real survey. Finally, for some realizations an inappropriate choice of variogram model might have caused larger deviations from the expected parameter values.


    Figure 13. Comparison of the expected ergodic (continuous line) and nonergodic (dotted line) errors for the Set 2 data set.

    As an alternative to calculating the inverse of the information matrix, the uncertainty of fitted parameter values may be assessed by simulating multiple experimental variograms using Equation (13) and then fitting variogram models to these. This process is computationally more expensive, but for a small number of lag bins it is practical. The results in Tables 3-6 show that this simulation method is more accurate than using the information matrix. Also, there is no need to assume a particular distribution of parameter estimates, and constraints on the parameter values can be accounted for.
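    A minimal sketch of this simulation approach is given below, under several loudly stated assumptions. The exponential model, the "true" parameter values, and the diagonal covariance (a 15% coefficient of variation at every lag) are all hypothetical stand-ins: the paper draws from the full covariance matrix of Equation (13) and fits by GLS, whereas here the errors are independent and the fit is a weighted least-squares search over a bounded grid. The bounds follow the Set 1 and Set 3 rows of Table 2.

```python
import math
import random
import statistics

def exponential_model(h, c0, c1, r):
    """Nugget c0 plus exponential structure with sill c1 and distance parameter r."""
    return c0 + c1 * (1.0 - math.exp(-h / r))

lags = [2.0, 4.0, 8.0, 16.0, 24.0, 32.0]
true_params = (0.3, 0.7, 15.0)                     # hypothetical c0, c1, r
gamma_true = [exponential_model(h, *true_params) for h in lags]
variances = [(0.15 * g) ** 2 for g in gamma_true]  # hypothetical diagonal stand-in for Eq. (13)

# Bounded parameter grid: the constraints are respected by construction.
grid = []
for c0 in [i / 10 for i in range(0, 8)]:           # 0.0 .. 0.7
    for c1 in [i / 10 for i in range(1, 17)]:      # 0.1 .. 1.6
        for r in range(7, 31, 2):                  # 7 .. 29
            vals = [exponential_model(h, c0, c1, float(r)) for h in lags]
            grid.append(((c0, c1, float(r)), vals))

def fit_by_grid(gamma_hat):
    """Return the grid parameters minimizing the weighted sum of squares."""
    best, best_cost = None, float("inf")
    for params, vals in grid:
        cost = sum((g - m) ** 2 / v for g, m, v in zip(gamma_hat, vals, variances))
        if cost < best_cost:
            best, best_cost = params, cost
    return best

rng = random.Random(7)
fits = []
for _ in range(100):
    # One simulated experimental variogram, then one constrained fit.
    sim = [rng.gauss(g, math.sqrt(v)) for g, v in zip(gamma_true, variances)]
    fits.append(fit_by_grid(sim))

# Simulated variances of the fitted c0, c1 and r.
param_var = [statistics.variance(column) for column in zip(*fits)]
print(param_var)
```

    Because the fitted values are collected directly, their empirical distribution can be inspected without assuming it is Gaussian, and estimates that pile up on a bound are visible in `fits`.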

    Simulation tests have also shown that the expected nonergodic errors are approximated accurately by Munoz-Pardo's (1987) expression [Eq. (23)]. These nonergodic errors are due purely to sampling. Estimates of the ergodic variogram have a component of uncertainty due to the fluctuations of the random variable, in addition to this sampling error. When the large fields studied in Set 1 and Set 2 were sampled with a 25 point scheme, the ergodic and nonergodic errors were almost identical. When more sampling points were used, the nonergodic error became less than the ergodic error. The difference between the nonergodic and ergodic


    Figure 14. Comparison of the expected ergodic (continuous line) and nonergodic (dotted line) errors for the Set 3 data set.

    errors was more pronounced for the smaller fields considered in Set 3 and Set 4, particularly over large lag distances.

    These results reflect that for the small fields, the sample grids covered the entire field effectively. Thus most of the variation of the variable within the region, particularly over large lag distances, was accounted for. Therefore the nonergodic error was much smaller than the ergodic error, which also had to account for fluctuations of the random variable over other realizations. For the larger fields, the sample points were more sparse. Therefore there were parts of the field that were unsampled and the variation in these was not accounted for. Thus the nonergodic error was more similar to the ergodic error since both estimators have to account for behavior and fluctuations of the variable in unsampled regions. In the case of the ergodic estimator this unsampled region consisted of all other realizations of the variable.

    These simulation tests were computationally very expensive. Each realization was sampled at every point to calculate a definitive nonergodic variogram. Webster and Oliver's (1992) study of nonergodic variogram uncertainty used a


    Figure 15. Comparison of the expected ergodic (continuous line) and nonergodic (dotted line) errors for the Set 4 data set.

    method requiring far fewer computations. However, we feel that our method was worthwhile since it was more accurate. To illustrate this, Figure 16 shows the expected nonergodic error for the Set 4 data set calculated using Webster and Oliver's method. The values seen here agree with those found in 1992. They are, however, significantly less than both the nonergodic errors observed in our simulations and the expected nonergodic errors from Munoz-Pardo's (1987) expression [Eq. (23)].

    Webster and Oliver (1992) estimated the nonergodic error for a particular sampling grid design and separation distance $h$ by the standard deviation of $\hat{\gamma}(h)$ values derived from translations of the grid over a single realization. Since they came from the same realization, these estimates of $\gamma_{NE}(h)$ were not independent. The covariance between $\hat{\gamma}(h; g_p)$, the semivariance estimated from translated grid $g_p$, and $\hat{\gamma}(h; g_q)$, the semivariance estimated from translated grid $g_q$, is given by

    $$\mathrm{Cov}\left(\hat{\gamma}(h; g_p), \hat{\gamma}(h; g_q)\right) = \frac{1}{2n^{2}(h)} \sum_{i=1}^{n(h)} \sum_{j=1}^{n(h)} C_{ij}(h; g_p, g_q), \qquad (31)$$


    Figure 16. The expected nonergodic error calculated using Webster and Oliver's (1992) method. The are for 400 sample points, + are for 100 sample points, and the are for 25 sample points. The lines show the expected nonergodic error using Munoz-Pardo's (1987) method [Eq. (23)].

    where $C_{ij}(h; g_p, g_q)$ describes the covariance between $[z_{i1}(h) - z_{i2}(h)]$ sampled from grid $g_p$ and $[z_{j1}(h) - z_{j2}(h)]$ sampled from grid $g_q$. Values of $C_{ij}(h; g_p, g_q)$ may be calculated from Equation (12). If the translated grids are well separated, or $h$ is small, then the covariances between semivariances estimated from different grids are small and Webster and Oliver's method gives a good approximation of the nonergodic error. However there are only a small number of translations of large sample grids over small regions which do not have a point in common. The positions of sample points within some of these grids are close enough to cause significant correlation between the estimated semivariances. This leads to the nonergodic error being underestimated as illustrated in Figure 16. Our method did not contain such a bias since the nonergodic error estimates came from different realizations of the random process and were therefore uncorrelated.
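    The direction of this bias can be seen from a short worked identity. It is offered as an illustration under simplifying assumptions that are not part of the paper: the $m$ translated-grid estimates $\hat{\gamma}_1, \ldots, \hat{\gamma}_m$ are taken to share a common mean and a common variance $\sigma^{2}$, and $\bar{c}$ (a symbol introduced here) denotes their average pairwise covariance.

```latex
\[
\mathrm{E}\!\left[\frac{1}{m-1}\sum_{p=1}^{m}\bigl(\hat{\gamma}_p-\bar{\gamma}\bigr)^{2}\right]
  = \sigma^{2}-\bar{c},
\qquad
\bar{\gamma}=\frac{1}{m}\sum_{p=1}^{m}\hat{\gamma}_p,
\qquad
\bar{c}=\frac{1}{m(m-1)}\sum_{p\neq q}\mathrm{Cov}\bigl(\hat{\gamma}_p,\hat{\gamma}_q\bigr).
\]
```

    Under these assumptions the sample variance of the translated-grid estimates recovers $\sigma^{2}$ only when $\bar{c}=0$. Any positive average covariance between grids lowers its expectation, which is consistent with the underestimation described above.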

    The difference between the nonergodic and ergodic errors can have implications for both the design of efficient sampling schemes and variogram model fitting. The ergodic covariance matrix has been used previously to optimize sample


    Figure 17. Plot of the experimental variogram (denoted by ) and two fitted variogram models for a realization from the Set 4 data set sampled with 225 points. The continuous line denotes the model fitted using the ergodic uncertainty matrix. The dotted line denotes the model fitted using the nonergodic uncertainty matrix.

    schemes for variogram estimation (Bogaert and Russo, 1999; Muller and Zimmerman, 1999). This can lead to too much sampling effort being directed towards calculating the variogram for the larger lag distances. Figure 17 shows the GLS fitted variogram for one realization of the Set 4 data set, sampled with 225 points. There is a good match between the model and the experimental variogram for the two smallest lags, but not for the larger lags. This is because the expected ergodic error is much less for the smallest lags and therefore the fitted model is biased towards these. The dotted line shows the fitted model that results from using the nonergodic covariance matrix in Equation (7). This model matches the experimental variogram for all but the largest lags and seems a much more reasonable fit. This sort of behavior was seen regularly when fitting models to the Set 3 and Set 4 data sets. Thus it appears that the GLS model fitting procedure may be improved by using the nonergodic covariance matrix [Eq. (23)] rather than the ergodic one [Eq. (13)]. In practice, since Equation (23) is a function of the ergodic variogram, it would be necessary to first fit a model of the ergodic variogram using


    Equation (13). Then Equation (23) could be calculated from this model, and used for one final iteration of the fitting procedure [Eq. (7)]. More theoretical work is required to ensure that such an approach is consistent.

    A major disadvantage of calculating the nonergodic covariance matrix is the extra computational work required. The method described in this paper requires all the covariances between pairs of pairs within a very concentrated sample scheme to be calculated. For each entry of the covariance matrix it is only the average of the covariance between pairs from each bin that is needed. Therefore the computational load may be reduced by subsampling of these pairs.
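    The subsampling idea can be sketched on a one-dimensional example. Everything named here is hypothetical: the exponential covariance function, the 200-point dense transect, the two lags, and the subsample size. The squared covariance is averaged because, for a Gaussian variable, the covariance between two squared differences is twice the squared covariance of the differences; Equations (12) and (23) are not reproduced in this section, so the averaged quantity should be read as a stand-in for the terms they require.

```python
import math
import random

def cov_fn(d, sill=1.0, r=10.0):
    """Hypothetical exponential covariance function of separation d."""
    return sill * math.exp(-d / r)

def diff_cov(pair_a, pair_b):
    """Covariance between z(a1) - z(a2) and z(b1) - z(b2) for points on a line."""
    a1, a2 = pair_a
    b1, b2 = pair_b
    return (cov_fn(abs(a1 - b1)) - cov_fn(abs(a1 - b2))
            - cov_fn(abs(a2 - b1)) + cov_fn(abs(a2 - b2)))

def pairs_at_lag(xs, h):
    """All point pairs separated by exactly h along the transect."""
    present = set(xs)
    return [(x, x + h) for x in xs if x + h in present]

xs = list(range(200))                 # dense transect with unit spacing
pairs_a = pairs_at_lag(xs, 5)         # pairs in the first lag bin
pairs_b = pairs_at_lag(xs, 20)        # pairs in the second lag bin

# Exhaustive average over every pair of pairs.
full = sum(diff_cov(pa, pb) ** 2 for pa in pairs_a for pb in pairs_b)
full /= len(pairs_a) * len(pairs_b)

# Monte Carlo average over a random subsample of pairs of pairs.
rng = random.Random(1)
m = 5000
sub = sum(diff_cov(rng.choice(pairs_a), rng.choice(pairs_b)) ** 2 for _ in range(m)) / m

print(f"exhaustive: {full:.5f}  subsampled: {sub:.5f}")
```

    The subsample here is about a seventh of the exhaustive set; for a dense two-dimensional scheme the number of pairs of pairs grows much faster than the number of points, so the saving would be correspondingly larger.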

    In Experiment 3, the effect upon the experimental variogram confidence limits from using the fitted variogram rather than the correct ergodic variogram, was seen to be small for sample schemes of 100 or more points. It therefore appears that the circular approach in calculating variogram uncertainty is valid.

    CONCLUSIONS

    This study has demonstrated that for a known ergodic variogram, it is possible to accurately determine the expected difference between the experimental semivariances calculated from a particular sampling scheme and the corresponding ergodic and nonergodic variogram values. Ergodic errors may be estimated by Pardo-Iguzquiza and Dowd's (2001a) method [Eq. (13)] and nonergodic errors by Munoz-Pardo's (1987) expression [Eq. (23)]. The ergodic error is significantly less demanding to compute than the nonergodic error. For large fields the difference between the two error expressions is negligible. However for small regions, say with length around twice the range of spatial correlation of the variable, the nonergodic error is significantly less than the ergodic error.

    Previously Muller and Zimmerman (1999) and Bogaert and Russo (1999) have used the ergodic error expressions to compute optimal sampling schemes for variogram estimation. If the aim of these schemes is to approximate the variogram of the single region being sampled with maximum precision, then a nonergodic expression of variogram uncertainty would be more appropriate. On smaller fields use of the ergodic expression leads to more intensive sampling than is required. Our results have also suggested that the GLS variogram fitting procedure may be improved (in the sense that the fitted variogram better matches the nonergodic variogram) if the nonergodic error is incorporated into the final iteration of the procedure.

    It should be noted that if these expressions are used to determine variogram uncertainty in a real survey, there will be additional uncertainty because the ergodic variogram is unknown. Further simulation tests suggested that the additional uncertainty from using the estimated variogram rather than the true ergodic variogram is small for sample schemes of more than 100 points.


    ACKNOWLEDGMENTS

    This work was supported by the Biotechnology and Biological Sciences Research Council of the U.K. through Grant 204/D1 5335 and by the Home-Grown Cereals Authority of the U.K. through grant 2453.

    REFERENCES

    Bogaert, P., and Russo, D., 1999, Optimal spatial sampling design for the estimation of the variogrambased on a least squares approach: Water Resour. Res., v. 35, no. 4, p. 12751289.

    Brooker, P. I., 1986, A parametric study of robustness of Kriging variance as a function of range andrelative nugget effect for a spherical semivariogram: Math. Geology, v. 18, no. 5, p. 477488.

    Brus, D. J., and de Gruijter, J. J., 1994, Estimation of nonergodic variograms and their sampling varianceby design-based sampling strategies: Math. Geology, v. 26, no. 4, p. 437453.

    Cressie, N., 1985, Fitting variogram models by weighted least squares: Math. Geology, v. 17, no. 5,p. 563586.

    Deutsch, C. V., and Journel, A. G., 1998, GSLIB: Geostatistical software library and users guide, 2nded.: Oxford University Press, New York, 369 p.

    Garthwaite, P. H., Jolliffe, I. T., and Jones, B., 1995, Statistical inference: Prentice Hall, London, 290 p.

    Journel, A. G., and Huijbregts, C. J., 1978, Mining geostatistics: Academic Press, London, 600 p.

    McBratney, A. B., and Webster, R., 1986, Choosing functions for semi-variograms of soil properties and fitting them to sampling estimates: J. Soil Sci., v. 37, no. 4, p. 617–639.

    Menke, W., 1984, Geophysical data analysis: Discrete inversion theory: Academic Press, San Diego, CA, 285 p.

    Muller, W. G., and Zimmerman, D. L., 1999, Optimal designs for variogram estimation: Environmetrics, v. 10, no. 1, p. 23–37.

    Munoz-Pardo, J. F., 1987, Approche Géostatistique de la variabilité spatiale des Milieux Géophysiques: Thèse Docteur-Ingénieur, Université de Grenoble et l'Institut National Polytechnique de Grenoble, 254 p.

    Ortiz, C. J., and Deutsch, C. V., 2002, Calculation of uncertainty in the variogram: Math. Geology, v. 34, no. 2, p. 169–183.

    Pardo-Iguzquiza, E., and Dowd, P. A., 2001a, Variance–covariance matrix of the experimental variogram: Assessing variogram uncertainty: Math. Geology, v. 33, no. 4, p. 397–419.

    Pardo-Iguzquiza, E., and Dowd, P. A., 2001b, VARIOG2D: A computer program for estimating the semi-variogram and its uncertainty: Comput. Geosciences, v. 27, no. 5, p. 549–561.

    Todini, E., 2001, Influence of parameter estimation uncertainty in Kriging: Part 1 - Theoretical development: Hydrol. Earth Syst. Sci., v. 5, no. 2, p. 215–223.

    Todini, E., Pellegrini, F., and Mazzetti, C., 2001, Influence of parameter estimation uncertainty in Kriging: Part 2 - Test and case study applications: Hydrol. Earth Syst. Sci., v. 5, no. 2, p. 225–232.

    Webster, R., and Oliver, M. A., 1992, Sample adequately to estimate variograms of soil properties: J. Soil Sci., v. 43, no. 1, p. 177–192.

    Webster, R., and Oliver, M. A., 2001, Geostatistics for environmental scientists: John Wiley & Sons, Chichester, 271 p.
