63 statistical properties of random co2 flux

Upload: aurelia-dinca

Post on 04-Jun-2018

224 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/13/2019 63 Statistical properties of random CO2 flux

    1/13

    Statistical properties of random CO2flux measurement

    uncertainty inferred from model residuals

    Andrew D. Richardson a,*, Miguel D. Mahecha b, Eva Falge c, Jens Kattge b,Antje M. Moffatb, Dario Papale d, Markus Reichstein b, Vanessa J. Stauch e,Bobby H. Braswell a, Galina Churkina b, Bart Kruijtf, David Y. Hollingerg

    a University of New Hampshire, Complex Systems Research Center, Morse Hall, 39 College Road, Durham, NH 03824, USAb Max Planck Institute for Biogeochemistry, Hans-Knoll-Strasse 10, 07745 Jena, Germany

    c Max Planck Institute for Chemistry, Biogeochemistry Department, J.J.v. Becherweg 27, 55128 Mainz, Germanyd DISAFRI, University of Tuscia, via C. de Lellis, 01100 Viterbo, Italye Computational Landscape Ecology, Centre for Environmental Research (UFZ), Leipzig, GermanyfAlterra, Wageningen University and Research Centre, 6700 AA Wageningen, The NetherlandsgUSDA Forest Service, Northern Research Station, Durham, NH 03824, USA

    a g r i c u l t u r a l a n d f o r e s t m e t e o r o l o g y 1 4 8 ( 2 0 0 8 ) 3 8 5 0

    a r t i c l e i n f o

    Article history:

    Received 26 March 2007

    Received in revised form

    25 August 2007

    Accepted 3 September 2007

    Keywords:

    Data-model fusion

    Eddy covariance

    FLUXNET

    Measurement error

    Pink noise

    Spectral analysis

    Uncertainty

    a b s t r a c t

    Information about the uncertainties associated with eddy covariance measurements of sur-

    faceatmosphere CO2 exchange is needed for data assimilation and inverse analyses to

    estimate model parameters, validation of ecosystem models against flux data, as well as

    multi-site synthesis activities (e.g., regional to continental integration) and policy decision-

    making. While model residuals (mismatch between fitted model predictions and measuredfluxes) can potentially be analyzed to infer data uncertainties, the resulting uncertainty

    estimates may be sensitive to the particular model chosen. Here we use 10 site-years of data

    fromthe CarboEurope program, and compare the statistical propertiesof the inferred random

    flux measurement error calculated first using residuals from five different models, and

    secondly using paired observations made under similar environmental conditions. Spectral

    analysis of the model predictions indicated greater persistence (i.e., autocorrelation or

    memory) compared to the measured values. Model residuals exhibited weaker temporal

    correlation, but were not uncorrelated white noise. Random flux measurement uncertainty,

    expressed as a standard deviation, was found to vary predictably in relation to the expected

    magnitude of the flux, in a manner that was nearly identical (for negative, but not positive,

    fluxes) to that reported previously for forested sites. Uncertainty estimates were generally

    comparable whether the uncertainty was inferred from model residuals or paired observa-

    tions, although the latter approach resulted in somewhat smaller estimates. Higher ordermoments (e.g., skewness and kurtosis) suggested that for fluxes close to zero, the measure-

    ment error is commonly skewed and leptokurtic. Skewness could not be evaluated using the

    paired observation approach, because differencing of paired measurements resulted in a

    symmetricdistributionof the inferred error. Patterns were robust and not especially sensitive

    to the model used, although more flexible models, which did not impose a particular func-

    tional form on relationships between environmental drivers and modeled fluxes, appeared to

    give the best results. We conclude that evaluation of flux measurement errors from model

    residuals is a viable alternative to the standard paired observation approach.

    # 2007 Elsevier B.V. All rights reserved.

    * Corresponding author. Tel.: +1 603 868 7654; fax: +1 603 868 7604.E-mail address:[email protected](A.D. Richardson).

    a v a i l a b l e a t w w w . s c i e n c e d i r e c t . c o m

    j o u r n a l h o m e p a g e : w w w . e l s e v i e r . c o m / l o c a t e / a g r f o r m e t

    0168-1923/$ see front matter # 2007 Elsevier B.V. All rights reserved.

    doi:10.1016/j.agrformet.2007.09.001

    mailto:[email protected]://dx.doi.org/10.1016/j.agrformet.2007.09.001http://dx.doi.org/10.1016/j.agrformet.2007.09.001mailto:[email protected]
  • 8/13/2019 63 Statistical properties of random CO2 flux

    2/13

    1. Introduction

    The eddy covariance method is used at research sites around

    the world to measure surfaceatmosphere exchanges of

    carbon, water and energy (Baldocchi et al., 2001b). Increas-

    ingly, these flux data are being used to constrain large-scale

    carbon cycle models (e.g.,Papale and Valentini, 2003; Friend

    et al., 2006). In the context of this so-called model-datasynthesis,Raupach et al. (2005)have argued the importance

    of accurately characterizing data uncertainties, suggesting

    that uncertainties are as important as the data values

    themselves. This is due to the fundamental role of uncertain-

    ties in specifying an objective criterion (commonly referred to

    as the cost function; see Raupach et al., 2005), quantifying

    the mismatch or distance between model and data, used as

    the basis for optimization or synthesis (Trudinger et al., 2007).

    If the true flux is F0i, and the measurement error is a random

    variable di (the measured flux is thus Fi F0i di), then a

    convenient way to characterize the uncertainty is s(di), i.e., as

    thestandard deviationof the randomerror.Accordingly, 1/s(di)

    is a quantitative measure of our confidence in the measureddata (see Raupach et al., 2005). In conducting model-data

    syntheses, data values are typically weighted in the cost

    function by the reciprocal of their uncertainties, i.e., as 1/s(di)

    (possiblyraisedto a higher power). An important consequence

    of this is that estimated model parameters or states depend on

    both measurements and uncertainties.

    There is a range of other applications in which knowledge

    of flux data uncertainties is also critical. These include, for

    example, ecosystem model validation against flux data (e.g.,

    Thornton et al., 2002; Churkina et al., 2003), because

    uncertainty estimates are required to make statistically

    rigorous comparisons between measurements and model

    predictions. Additionally, flux data uncertainties are neededfor multi-site syntheses (e.g.,Law et al., 2002; Churkina et al.,

    2005) and regional-to-contintental integration efforts, espe-

    cially with regard to carbon accounting and policy decision-

    making (Wofsy and Harriss, 2002). In both of these latter

    examples, sites with smaller data uncertainties can be

    accorded more influence than sites with larger uncertainties.

    Flux measurement errors can be broadly classified into

    random and systematic sources of uncertainty; whereas

    statistical analyses can be used to estimate the size of random

    errors, systematic errors can be difficult to detect or quantify

    in data-based analyses (Moncrieff et al., 1996; but see Lee,

    1998). Random flux measurement uncertainties (e.g.,Hollinger

    and Richardson, 2005), which are stochastic, represent theaggregate uncertainty due to surface heterogeneity and a

    time-varying flux footprint, turbulence sampling errors

    (especially problematic over tall vegetation), and errors

    associated with the measurement equipment (e.g., gas

    analyzer and sonic anemometer). Systematic errors, on the

    other hand, operate at varying time scales (e.g., fully

    systematic vs. selectively systematic, see Moncrieff et al.,

    1996), and can influence measurements in a variety of ways

    (e.g., fixed offsets vs. relative offsets and positive vs. negative

    biases). Systematic errors due to the imperfect spectral

    response of the measurement system, nocturnal biases

    resulting from inadequate mixing, issues related to advection

    and non-flat terrain, and problems of energy balance closure,

    are discussed elsewhere (Goulden et al., 1996; Moncrieff et al.,

    1996; Lee, 1998; Massman and Lee, 2002; Wilson et al., 2002;

    Aubinet et al., 2005; Loescher et al., 2006).

    While any attempt to quantifying the total flux measure-

    ment uncertainty must account for both random and

    systematic errors, our goal here is to build on previous

    statistically-based analytical efforts to quantify the random

    component. For example, Hollinger et al. (2004) differencedpaired measurements from two flux towers in an effort to

    estimate the random flux measurement uncertainty, which

    was then characterized, as s(d), by the standard deviation of

    the differenced observations (divided by the square root of

    two). The two towers were located in similar old-growth Picea

    stands and instrumented identically, but were separated by

    775 m and had non-overlapping footprints and largely

    independent turbulence (cf. the more recent study byRannik

    et al., 2006, where the30 m separation of two towers resulted

    in paired measurements that were not fully independent).

    Results of the above studies were in close agreement with (but

    somewhat higher than) predictions of the Lenschow et al.

    (1994)and Mann and Lenschow (1994)error model based onturbulence statistics.

    In a subsequent paper, Hollinger and Richardson (2005)

    proposed that paired measurements at one tower, separated

    by 24 h and made under similar environmental conditions

    could also be used to estimate the random flux measurement

    uncertainty; this method was applied by Richardson et al.

    (2006a) to data from seven sites (one grassland, an agricultural

    site, and five forested sites) in the first multi-site synthesis of

    random measurement uncertainty. The one-tower approach

    may result in lower uncertainty estimates compared to the

    two-tower approach because spatial heterogeneity across an

    ecosystem is likely to be larger than that across the time-

    varying footprint of a single tower (seeKatul et al., 1999; Orenet al., 2006). Conversely, however, imperfect matching

    (between days) in environmental and turbulent conditions

    may result in slight over-estimation of uncertainty using the

    one-tower approach.

    We consider here a third method to quantify eddy flux

    random measurement uncertainty. In the hypothetical

    instance of a perfect model and no systematic measurement

    errors, then the concordance between model predictions and

    measured data is ultimately limited by the random flux

    measurement uncertainty. Thus an alternative means by

    which the random flux measurement uncertainty might be

    evaluated is through posterior analysis of residuals from a

    fitted model (Stauch et al., in press). However, models are animperfect representation of reality, andthus a disadvantage of

    this approach is that the resulting uncertainty estimates will

    be a combination of both measurement error and model error.

    With a particularly poor model (inappropriate functional form,

    mis-representation of processes and driving variables, or

    suboptimal parameter values), model error could potentially

    overwhelm the random measurement error. While estimates

    of measurement error inferred from model residuals will

    depend somewhat on the model itself, there is some evidence

    that this may be a reasonable approach to quantifying

    uncertainties. For example, Richardson and Hollinger (2005)

    reported that the standard deviationof model residuals forthe

    Lloyd and Taylor (1994) respiration model, fit to nocturnal

    a g r i c u l t u r a l a n d f o r e s t m e t e o r o l o g y 1 4 8 ( 2 0 0 8 ) 3 8 5 0 39

  • 8/13/2019 63 Statistical properties of random CO2 flux

    3/13

    data, was only 14% larger than the random flux measurement

    uncertainty estimated using paired measurements, suggest-

    ing that model error is potentially relatively small. In that

    study, the statistical properties of the residuals were other-

    wise similar to the uncertainty estimated using paired

    measurements, i.e., non-Gaussian and leptokurtic (see also

    Hagen et al., 2006), with comparable patterns in relation to

    wind speed and flux magnitude.A number of other studies have also conducted analyses of

    model residuals for the purpose of evaluating flux measure-

    ment uncertainty. Using data from a range of CO2 flux

    measurement sites,Chevallier et al. (2006) noted that at the

    daily time step, ORCHIDEE model residuals were non-

    Gaussian. At least two related studies have purposely avoided

    making assumptions about the specific probability distribu-

    tion function for the random flux measurement uncertainty.

    For example,Hagen et al. (2006)used bootstrap resampling of

    model residuals to estimate uncertainties in gross productiv-

    ity at different time scales (from 30 min. to annual), whereas

    Stauch et al. (in press) used a stochastic model to partition

    signal and noise in measured net fluxes, and character-ized the noise (i.e., model residuals) using a non-parametric

    statistical model based on time-varying cubic splines (As

    noted by both of these authors, annual flux sums are

    approximately Gaussian, despite the non-Gaussian uncer-

    tainties in the 30 min. data; this result follows directly from

    the Central Limit Theorem).

    In the present study, we use a wide array of data and

    models to investigate whether an approach based on model

    residuals results in uncertaintyestimates that are comparable

    to those derived from paired observations. We optimize model

    parameters (on a site-by-site basis) to maximize the agree-

    ment between model and measurements, and thus obtain

    model predictions that are informed by the data, rather thanindependent of the data. In this manner the models can be

    used as tools to predict the true flux, F 0i, conditional on the

    measured fluxes, environmental drivers, and implicit (in the

    model) relationships between these two.

    To evaluate the degree to which model-based estimates of

    uncertainty are influenced by the particular model chosen

    (i.e., the degree to which model error confounds random

    measurement error), we compare results from five different

    models, ranging from simple nonlinear regressions to more

    highly parameterized artificialneuralnetworks and a process-

    based ecosystem model. Some of the models (e.g., neural

    network and look-up table) contain few, if any, assumptions

    about the underlying functional form of relationships between

    CO2 flux and environmental drivers. Parameterization of some

    of the models (look-up table and nonlinear regression) was

    time-varying to account for seasonal changes in the flux-

    driver relationships.

    We begin by evaluating, first in the temporal domain and

    then in the frequency domain, model predictions and modelresiduals in relation to the measured CO2flux time series (i.e.,

    the net ecosystem exchange, NEE). By using new tools

    (improvements over traditional Fourier power analysis) to

    probe the spectral properties of both the measured fluxes and

    the inferred random error, we are able to investigate and

    characterize the correlation structure inherent in each of

    these time series. This analysis complements the more

    traditional analysis which follows, where we focus on the

    moments (standard deviation, skewness and kurtosis) of the

    probability distribution of the inferred random error. We

    compare inferred uncertainties calculated from paired obser-

    vations and model residuals, and look for similarities across

    sites. Finally, relationships between the standard deviation ofthe random flux measurement uncertainty and flux magni-

    tude developed here for six forested European sites are

    compared to those presented previously byRichardson et al.

    (2006a)for four forested AmeriFlux sites.

    2. Data and method

    The present analysis is based on 10 site-years of data from the

    CarboEurope database. The sites (seeTable 1) span a range of

    forest types, including deciduous (FR1, DE3), coniferous (FI1),

    mixed deciduous/coniferous (BE1), and Mediterranean (FR4,

    IT3). The data were assembled as part of a data set used forevaluation of a standardized data processing algorithm for

    half-hourly data (Papale et al., 2006), a comprehensive gap

    filling comparison (Moffat et al., 2007), and an evaluation of

    different algorithms for partitioning NEE to its component

    fluxes, gross productivity and ecosystem respiration (Desai

    et al., unpublished). Data were corrected for canopy storage

    but not energy balance closure; quality control procedures, u*

    filtering, and gap filling of meteorological data are discussed in

    the companion paper byMoffat et al. (2007). Measured CO2fluxes are here denoted Fc, with units of mmol m

    2 s1. The

    sign convention is that a negative flux indicates carbon uptake

    Table 1 Sites and years of data used for the uncertainty analysis. MAT = mean annual temperature

    Site Species Climate (MAT) Years Latitude Longitude Reference

    BE1 (Vielsalm, Belgium) Fagus sylvatica,

    Pseudotsuga menziesii

    Temperate-continental

    (7.5 8C)

    2000, 2001 50.308N 5.988E Aubinet et al. (2001)

    DE3 (Hainich, Germany) Fagus sylvatica Temperate-continental

    (6.8 8C)

    2000, 2001 51.078N 10.458E Knohl et al. (2003)

    FI1 (Hyytiala, Finland) Pinus sylvestris Boreal (3.8 8C) 2001, 2002 61.838N 24.288E Suni et al. (2003)

    FR1 (Hesse, France) Fagus sylvatica Temperate-suboceanic

    (9.9 8C)

    2001, 2002 48.678N 7.058E Granier et al. (2000)

    FR4 (Puechabon, France) Quercus ilex Mediterranean (13.5 8C) 2002 43.738N 3.588E Rambal et al. (2004)

    IT3 (Roccarespampani, Italy) Quercus cerris Mediterranean (15.28

    C) 2002 42.408

    N 11.928

    E Tedeschi et al. (2006)

    a g r i c u l t u r a l a n d f o r e s t m e t e o r o l o g y 1 4 8 ( 2 0 0 8 ) 3 8 5 040

  • 8/13/2019 63 Statistical properties of random CO2 flux

    4/13

    by the ecosystem, whereas a positive flux indicates carbon

    release.

    For each site-year, fitted model predictions from five

    different models selected from among those submitted to

    the gap filling comparison were used. The models (described

    fully byMoffat et al., 2007) were as follows: an artificial neural

    network model (ANN), a non-linear regression model (NLR), a

    process-based ecosystem model (BETHY), a semi-parametricmodel (SPM), and a marginal distribution sampling method

    (MDS). The ANN (Papale and Valentini, 2003) is essentially a

    multi-stage, empirical, non-linear regression model. An

    advantage of neural networks is that no a priori decisions

    about the functional form of relationships between data

    inputs (e.g., meteorological drivers and time of day/year as a

    fuzzy variable) and outputs (measured fluxes) are required.

    The particularNLR used here (Falge et al., 2001)wasbasedona

    MichaelisMenten light response function for gross photo-

    synthesis and aLloyd and Taylor (1994)temperature function

    for ecosystem respiration; model parameters were fit twice-

    monthly. BETHY (Knorr and Kattge, 2005) is a terrestrial

    biosphere model that predicts CO2, waterand energy fluxes. Inaddition to meteorological drivers, required input data include

    site-specific information about soil and canopy properties

    (e.g., leaf area index, estimated from remote sensing pro-

    ducts). SPM (Stauch and Jarvis, 2006) is analogous to a multi-

    dimensional look-up table, with an underlying semi-para-

    metric interpolation defined by three-dimensional cubic

    splines (time-varying functions of PPFD and temperature).

    For MDS (Reichstein et al., 2005), which is essentially a moving

    look-up table, fluxes are averaged from similar meteorological

    conditions (global radiation, air temperature and vapor

    pressure deficit within pre-specified criteria) across time-

    windows as small as possible. Thus the models ranged from

    data based (e.g., ANN) to process based (e.g., BETHY), and theadaptation to measurements ranged from time constant (e.g.,

    BETHY) to time varying parameterization (e.g., MDS). While

    our goal here is not to identify any one of these as the best

    model, the application of objective model selection criteria

    (e.g., Akaikes Information Criterion) to similar models of CO2exchange data has been described previously (Richardson

    et al., 2006b).

    2.1. Evaluation of spectral properties

    The spectral properties (i.e., correlation structure) of a time

    series can be summarizedby calculating the scaling exponent,

    a, of the power spectrum P(f) =fa (Mandelbrot, 1999). Here frefersto the frequencies contained in a given timeseries, anda

    is conventionally estimated as the slope of the log-log

    transformed relationship between frequency and power

    (i.e., the Fourier power spectrum). Uncorrelated white noise

    has a= 0 (equal power across all frequencies), whereas a

    strong autocorrelation structure, such as characterizes Brow-

    nian motion, has a 2. The power spectrum of this red

    noise is dominated by the lower frequencies, which imparts

    the signal with substantial memory. Intermediate between

    white and red noise is pink noise, for which the correlation

    structure is moderate and a 1.

    With the goal of evaluating whether the different models

    were able to reproduce the temporal correlation structure of

    the measured NEE time series, and to characterize the spectral

    properties of the model residuals, we used the Lomb-Scargle

    periodogram (L-S, Lomb, 1976; Scargle, 1982) and the

    multiple segmentation method (MSM, Miramontes and

    Rohani, 2002; Rohani et al., 2004) to estimate a for the

    measured fluxes, model predictions, and model residuals. The

    L-S method (following Press et al., 2002) estimates the

    equivalent of a Fourier power spectrum for unevenly-spacedtime series data, and is thus suited to analysis of fragmented

    time series with up to 20% missing observations; a is

    calculated from the resulting L-S power spectrum using the

    conventional method.

    Since a reasonable estimate of the scaling exponent a is

    only possible forvery long time series when standard methods

    are applied, new methods (e.g., MSM) have been developed to

    yield statistically robust estimates ofa for short time series.

    The MSMmethod subdivides a timeseries into segments, with

    segment lengths defined by powers of two. The scaling

    exponent for each segment is calculated as the slope of the

    power spectrum of the sub-series. An overall estimate ofa is

    obtained by fitting a linear model to the relationship betweensegment length and the mean of the scaling exponents found

    at each segment size. In the presentstudythe powerspectra of

    the sub-series in MSM were determined using theL-S method.

    With this hybrid approach that blends MSM and L-S, we could

    estimate a notonly forshort,but also forshort andfragmented

    time series. Thus we were able to estimate the scaling

    exponent for measured fluxes, without having to resort to

    gap-filled time series.

    2.2. Inferred random flux measurement error

    For each site-year of data, the paired observation approach to

    estimating uncertainty was implemented as described inRichardson et al. (2006a). Briefly, this involved considering as a

    pair anytwo fluxmeasurements made exactly 24 h apart, with

    mean half-hourly photosynthetically active photon flux

    densities within 75 mmol m2 s1, air temperatures within

    3 8C, and vapor pressure deficits within 0.2 kPa. Across sites,

    between 19% and 27% (overall mean, 23%) of half-hourly

    measurements met these criteria. For this method, the

    inferred error, di, equals the difference between the two

    measurements divided by the square root of two (see also

    Hollinger and Richardson, 2005).

    The model-based approach to estimating uncertainty was

    implemented using the five models named above. The

    inferred random error was taken to be the residual, ei,between the predicted value of the fitted model and the

    measured flux. With data coverage averaging 69% across all

    sites, this method provided roughly three-fold as many

    estimates of the inferred random error, compared to the

    paired observation approach.

    The statistical properties (i.e., moments of the distribution)

    of the inferred random error were analyzed with respect to

    flux magnitude. Based on predictions of the ANN model (the

    standard deviation of the residuals was smallest for this

    model), observations were grouped into bins of comparable

    flux magnitude (bin width = 5 mmol m2 s1) and the mean,

    standard deviation, skewness, and kurtosis of the inferred

    random error calculated for each bin; as in the past, we

    a g r i c u l t u r a l a n d f o r e s t m e t e o r o l o g y 1 4 8 ( 2 0 0 8 ) 3 8 5 0 41

  • 8/13/2019 63 Statistical properties of random CO2 flux

    5/13

    characterize therandom fluxmeasurement uncertainty by the

    second moment of the inferred random error, i.e., as the

    standard deviation, s(d). Moments were only calculated for

    bins containing at least 10 observations.

    3. Results

    3.1. Time series of model residuals

    Model residuals were well-correlated across models, with

    pairwise linear correlations generally 0.80, suggesting that

    the dominant source of variability in residualsis relatedto flux

    measurement errors (which would affect residual time series

    from all models similarly), rather than model error. To use the

    FR1 2001 data as an example, MDS and BETHY residuals were

    more weakly correlated (r= 0.80) than, for example, ANN and

    SPM residuals (r= 0.94). Visual inspection of the raw residual

    data (Fig. 1) suggested only modest differences among models,

    but when the mean bias was calculated over weekly periods, it

    became apparent that predictions of some models (e.g.,BETHY) were consistently biased at certain times of the year,

    whereas other models (e.g., MDS) exhibited little or no

    persistent bias.

    3.2. Spectral properties of measured fluxes, model

    predictions, and model residuals

    The Lomb-Scargle periodogram suggested that all models

    did a reasonable job at describing the higher-frequency

    components of the data, but due to time constant para-

    meterization, some models (e.g. BETHY) were less successful

    at describing the lower-frequency components than other

    models (e.g., MDS) that used time-varying parameterization.

    All residual time series exhibited a spectral peak at

    frequency of 1/day. This does not necessarily mean that

    the models fail to adequately capture the diurnal cycle;

    rather, it just means that there is an underlying structure to

    the residuals, which could arise from diurnal flux patterns

    (e.g., day vs. night) and corresponding patterns to the

    uncertainty.

    Similar estimates of the scaling exponent, a, wereobtained with both the L-S and hybrid MSM/L-S methods,

    except that the latter methodtendedto result in more clearly

    defined differences ina amongthedifferentmodels.Wefocus

    here on the hybrid MSM/L-S results. For the measured NEE

    fluxes, a varied among sites between 0.6 and 1.1 (Fig. 2),

    indicating pink noise (an explanation for this variation

    across sites is given below). On the other hand, the scaling

    exponent for each model was more typically 1.25,

    indicatingthatthemodeledtimeserieshavealongermemory

    effect (i.e., tending towards red noise) than the measured

    NEEtime series. The site-to-site variationin aofthemeasured

    fluxes was not captured by any model. The MDS model

    consistently came closest to reproducing the spectral proper-ties of the measured fluxes, probably because the narrow

    sliding window at which the algorithm operates minimizes

    persistent biases that could result from more slowly varying

    parameters.

    The a of model residuals was generally between 0.2 and

    0.8 (although for one site, BE1 2000, the values ranged from

    0.1 to 0.2), rather than zero as would be expected if

    residuals were uncorrelated white noise (Fig. 2). The fact that

    the a of theresiduals varied among sites, and in a manner that

    was similar to that for the NEE series (but not captured by any

    model), suggests that the residuals maintain a certain

    temporal correlation structure that is characteristic to each

    site.

    Fig. 1 Comparison of model residuals for MDS and BETHY

    models. (A) Half hourly residuals; (B) weekly mean of

    model residuals. From (B) it is clear that BETHY results in

    stronger and more persistent biases (e.g., negative bias at

    days 120 and 240, positive biases at days 150, 180, and

    210) than MDS.

    Fig. 2 fa scaling exponents estimated from time series of

    measured (+) and modeled (filled circles) net ecosystem

    exchange of CO2 (NEE), and the residual (open circles)

    between these two. Models are as follows: artificial neural

    network model (ANN), non-linear regression model (NLR),

    process-based ecosystem model (BETHY), semi-

    parametric model (SPM), and a look-up table based on

    marginal distribution sampling (MDS); sites (y-axis) are as

    identified inTable 1.

    a g r i c u l t u r a l a n d f o r e s t m e t e o r o l o g y 1 4 8 ( 2 0 0 8 ) 3 8 5 042

  • 8/13/2019 63 Statistical properties of random CO2 flux

    6/13

    3.3. General properties of the uncertainty

    The statistical properties of the aggregated (i.e., calculated

    across all site-years) data set of inferred random errors for

    the different methods are shown in Table 2A. In all cases,both the mean model residual, and the mean difference

    between paired observations, was relatively close to zero.

    The standard deviation of the inferred random error, s(d),

    was 20% smaller for the paired observation approach

    (2.19 mmol m2 s1) than for the best of the model-based

    approaches (2.733.10 mmol m2 s1), presumably because

    uncertainties estimated via the paired observation approach

    do not incorporate additional model error (conversely, these

    results also suggest that ANN and MDS have the lowest

    model error, while BETHY and NLR have the highest model

    error). The skewness was consistently close to zero,

    suggesting a relatively symmetric error distribution. How-

    ever, the strong positive kurtosis indicated by all methods,approximately10, is characteristic of a leptokurtic error

    distribution, i.e., one characterized by a more strongly

    defined central peak and heavier tails than a normal

    distribution.

    When a single model was used, and moments compared

    across sites (results shown for ANN model inTable 2B), the

    most obvious differences tended to be in terms of the standard

    deviation of the inferred random error, s(d), which was

    smallest for site-years FI1 2001 and 2002, and largest for

    site-years BE1 2001and 2000, and FR1 2001and 2002. Skewness

    was not especially prominent at any particular site. All sites

    had kurtosis of3 or more. For individual sites, skewness and

    kurtosis estimates were similar for all models.

    3.4. Moments in relation to flux magnitude

    Results were somewhat different when the data were broken

    down into bins of different flux magnitude (Fig. 3). Within a

    site, patterns were quite similar regardless ofthe model used,and the model-based approach was largely consistent with

    the paired observation approach, as illustrated by results

    from FR1 2001 (seeFig. 3AC; this site-year was chosen for

    illustration because it had the fewest missing observations,

    21%). Thus, the standard deviation of the inferred random

    error consistently increased with the absolute magnitude of

    theflux,and there were only small differences among models

    in the nature of this relationship (Fig. 3A). For this particular

    site-year, the skewness was generally negligible, i.e., 1, and

    didnot show anyclearpatternsof variation in relation to flux

    magnitude(Fig.3B).Forthe0mmol m2 s1 Fc bin,skewnessof

    random error inferred from NLR model residuals was much

    higher (1.6) than any of the other models, which wereotherwise in the range 0.00.6 (note that while a similar

    pattern was seen for the other year of data for this site, FR1

    2002, the pattern was not repeated across sites). The kurtosis

    was most pronounced, i.e., consistently 5, for the

    0 mmol m2 s1 Fcbin, and appeared to become smaller, i.e.,

    closer to zero, with increasing absolute flux magnitude

    (Fig. 3C).

    Across sites, patterns were consistent when a single

    model was used, as illustrated for the ANN model (Fig. 3DF).

    The rate at which the standard deviation of the flux

    uncertainty increased with the magnitude of the flux (slope)

    was quite similar across sites, but the base level of

    uncertainty (y-axis intercept), i.e., at Fc= 0 mmol m2 s1,

    Table 2 Statistical properties of inferred random flux measurement error (CO2fluxes measured in mmol mS2 sS1)

    Method N Mean Standard deviation Skewness Kurtosis

    (A) All sites, by method

    ANN 121217 0.004 2.73 0.22 11.61

    NLR 121217 0.001 3.10 0.05 9.22

    BETHY 121217 0.048 3.10 0.16 8.37

    MDS 121217 0.067 2.80 0.04 10.23

    SPM 121217 0.000 2.93 0.09 9.66

    Paired 40563 0.029 2.19 0.18 10.52

    Site N Mean Standard deviation Skewness Kurtosis

    (B) ANN method, by site-year

    BE1 2000 12519 0.043 3.21 0.03 3.55

    BE1 2001 12287 0.000 2.89 0.32 7.49

    DE3 2000 11452 0.014 2.46 0.16 7.01

    DE3 2001 11731 0.005 2.17 0.20 6.29

    FI1 2001 11236 0.002 1.28 0.21 7.31

    FI1 2002 11427 0.015 1.32 0.32 8.46

    FR1 2001 13744 0.010 3.60 0.20 8.84

    FR1 2002 13651 0.029 4.12 0.26 7.73

    FR4 2002 11190 0.010 1.81 0.31 2.99

    IT3 2002 11980 0.005 2.29 0.50 6.76

    (A) Two different approaches were used, one based on model residuals and the other on paired observations. For the former, a total of five

    different models were used (ANN, artificial neural network; NLR, nonlinear regression; BETHY, a process model; MDS, marginal distribution

    sampling; SPM, semi-parametric model). For the latter, the random error was inferred from the difference between paired observations made

    24 h apart under similar environmental conditions (Paired). Results based on 10 site-years of data, as described in Methods section. (B) ANN

    model only, with results broken down by site-year.

    a g r i c u l t u r a l a n d f o r e s t m e t e o r o l o g y 1 4 8 ( 2 0 0 8 ) 3 8 5 0 43

  • 8/13/2019 63 Statistical properties of random CO2 flux

    7/13

    varied among sites (Fig. 3D; note the consistency with

    patterns reported byRichardson et al., 2006a, as indicated

    by the heavy blue shaded line; see alsoTable 3). Many siteshad skewness 0 regardless of the magnitude of the flux, but

    for some site-years (in particular, FI1 2002 and IT3 2002,

    although these are not specifically identified in the figure),

    there was pronounced skewness (resulting from long,

    asymmetric, tails of the inferred random error), especially

    for the 0 mmol m2 s1 Fc bin (Fig. 3E). The kurtosis pattern

    reported above for FR1 2001 appeared to be a more general

    result, as distributions became more mesokurtic (i.e.,

    kurtosis approached zero) as the absolute magnitude ofFcincreased, and were in fact only strongly leptokurtic for the

    0 mmol m2 s1 Fc bin. The strong kurtosis (as well as the

    skewness) associated with the 0 mmol m2 s1 Fcbin was due

    almost entirely to the extreme tails of the inferred random

    error. Trimming the top and bottom 1% of residuals resulted

    in a nearly symmetric distribution with less pronounced

    kurtosis.These results were generally comparable (compareFig. 3G

    and D, or I and F) to those obtained using the paired

    observation approach. However, although estimates of both

    the standard deviation and the kurtosis of the random

    measurement error (calculated for each flux magnitude bin

    within each site-year) were strongly correlated for model-

    based (ANN) and paired observation approaches (Fig. 4A and

    C), the skewness estimates for these two approaches were not

    at all correlated (Fig. 4B). This discrepancy is attributed to the

    inability of the paired observation approach to characterize

    odd-order moments of the distribution, because it is impos-

    sible to differentiate between a positive error in one

    measurement versus a negative error in the other measure-

    Fig. 3 Comparison of statistical properties (standard deviation, skewness and kurtosis) of inferred flux measurement error

    in relation to CO2flux magnitude. Two different methods, described more fully in the text, were used to estimate the error:

    (1) residuals between measured values and fitted models used for gap-filling (Residual), and (2) differences between

    paired observations made exactly 24 h apart under similar environmental conditions (Paired). Five different models were

    used for (1). In the left column (AC), results are shown for all five models (thin lines), and the paired method (thick red line),

    for a single site-year (FR1_2001; Hesse, France). In the middle column (DF), results are shown across all 10 site-years for a

    single model (ANN, artificial neural network). In the right column (GI), results are shown across all 10 site-years for the

    paired observations method. In panels (D) and (G), the thick blue line shows the uncertainty estimates reported by

    Richardson et al. (2006a)for four forested AmeriFlux sites, estimated using the paired observation method. (For

    interpretation of the references to colour in this figure legend, the reader is referred to the web version of the article.)

    a g r i c u l t u r a l a n d f o r e s t m e t e o r o l o g y 1 4 8 ( 2 0 0 8 ) 3 8 5 044

  • 8/13/2019 63 Statistical properties of random CO2 flux

    8/13

    ment. Thus, the paired observation approach should not be

    used to evaluate bias or skewness.

    Two strong conclusions emerge from the above results.

    First, flux measurement errors become more Gaussian asfluxes increase in magnitude. For fluxes close to

    0 mmol m2 s1, skewness, and especially pronounced kur-

    tosis, are observed. Second, flux measurement errors are

    clearly heteroscedastic, meaning that the error variance is

    not constant but rather varies, particularly in relation to flux

    magnitude. We calculated the relationship between flux

    magnitude and flux measurement uncertainty for each site

    using the model-based approach (see Table 3A for ANN

    results). Across the 10 site-years, they-axis intercept of each

    regression was strongly correlated (r= 0.95, P < 0.001) with

    the overall standard deviation of the inferred random

    measurement errors given inTable 2B, but the differences

    in slope among sites was not significant (P= 0.10). When datawere pooled for all site-years, and relationships calculated

    separately for model-based and paired observation

    approaches (Table 3B), the regression lines could not be

    statistically distinguished (P > 0.10). Furthermore, neither of

    theserelationships couldbe distinguished(P > 0.10)fromthat

    reported by Richardson et al. (2006a) for Fc 0. However,

    whereasRichardson et al. (2006a)found that the uncertainty

    increased more rapidly with flux magnitude for positive

    fluxes (nocturnal release) than negative fluxes (daytime

    uptake), the present results did not support this, as the

    difference in slopes for Fc 0 and Fc 0 was not significant

    (P > 0.30) for either model-based or paired observation

    approaches (see alsoStauch et al., in press). This difference

    may be a result of more stringent nocturnal rejection criteria

    (u*filtering, quality flags and other data cleaning, see Papale

    etal.,2006)attheEuropeansitescomparedtothesitesusedby

    Richardson et al. (2006a).

    Table 3 Relationship between CO2flux magnitude (Fc, inmmol mS2 sS1) and the standard deviation of the inferredrandom flux measurement error, s(d)

    Method Relationship

    (A) By site

    BE1 2000 2.91(0.30) + 0.09(0.03) jFcj

    BE1 2001 2.27(0.17) + 0.14(0.02) jFcj

    DE3 2000 1.31(0.33) + 0.18(0.02) jFcjDE3 2001 1.13(0.21) + 0.16(0.01) jFcj

    FI1 2001 0.87(0.24) + 0.12(0.03) jFcj

    FI1 2002 0.91(0.17) + 0.12(0.02) jFcj

    FR1 2001 2.62(0.34) + 0.18(0.02) jFcj

    FR1 2002 3.49(0.38) + 0.13(0.02) jFcj

    FR4 2002 1.28(0.41) + 0.09(0.05) jFcj

    IT3 2002 1.79(0.07) + 0.11(0.01) jFcj

    (B) All sites

    Model residuals (ANN) 1.69(0.20) + 0.16(0.02) jFcj

    Paired observations 1.47(0.22) + 0.17(0.02) jFcj

    (C)Richardson et al. (2006a)

    Fc 0 1.42(0.31) + 0.19(0.02) jFcj

    Fc 0 0.62(0.73) + 0.63(0.09) jFcj

    Results are based on 10 site-years of data from a range of forested

    sites, as described in Methods. In (A), results are calculated

    separately for each site-year, based on ANN model residuals. In

    (B), results are calculated for residual and paired observation

    approaches, using data for all sites together. In (C), previously

    published results for forested sites are shown for comparison.

    Standard errors on estimated coefficients are given in parentheses.

    Fig. 4 Comparison of statistical properties (standard

    deviation, skewness and kurtosis) of inferred flux

    measurement error calculated using two different

    approaches: the residual between measured values and a

    fitted artificial neural network model (ANN), and the

    difference between paired observations made exactly 24 h

    apart under similar environmental conditions (Paired).

    The 1:1 line is shown in all three panels. Reported

    correlation coefficients are for all site-years (each site-year

    shown by a different color) of data.

    a g r i c u l t u r a l a n d f o r e s t m e t e o r o l o g y 1 4 8 ( 2 0 0 8 ) 3 8 5 0 45

  • 8/13/2019 63 Statistical properties of random CO2 flux

    9/13

    3.5. Linking flux measurement uncertainty and spectral

    properties

    Additional analysis of the above results suggested some

    interesting relationships between the scaling exponent a, and

    the estimated random flux measurement uncertainty as

    characterized by the overall standard deviation of the inferred

    random measurement error, s(d). First, the site-specificvariation in a of the measured NEE time series (+ symbols

    inFig. 2) was positively correlated with the standard deviation

    of model residuals for each site. Results are shown inFig. 5A

    for the ANN model (r= 0.87), but similar results were obtained

    for all other models (rbetween 0.80 and 0.86). Thus, it should

    be feasible to estimate site-specific average uncertainties

    purely on the basis of the calculated scaling exponent, i.e., as:

    sd 6:

    590:

    72 4:

    750:

    82 a(Coefficients were calculated using reduced major axis regres-

    sion, with standard errors on coefficients given in parenth-

    eses). In thesame vein, a was also strongly correlated(r= 0.84)

    with the y-intercept values reported in Table 3A, and given

    that results above indicated that the slope of the relationship

    between flux magnitude and flux uncertainty did not vary

    among sites, then this also provides a mechanism for devel-

    oping site-specific estimates of thisrelationship, purely on the

    basis ofa, i.e., as:

    sd 5:820:79 4:610:90 a 0:17 jFcj

    Second, the variation in a of model residuals (open circles

    in Fig. 2) was negativelycorrelated with the standard deviationof residuals for each model. Results shown in Fig. 5B are based

    on averages (across all sites) calculated for each model

    (r= 0.89), but similar results were obtained for individual

    sites as well (rbetween 0.72 and 0.98).

    4. Discussion

    4.1. General characteristics of the measurement error

    Random flux measurement uncertainties are characterized by

    non-stationary probability distributions, in that the statistical

    properties of the error vary over time and in relation to themagnitude ofthe flux.Fromthisstudywe find thatthe manner

    in which theuncertainty scales with the magnitude of theflux

    (Table 3) is similar to that presented previously (e.g.,Hollinger

    and Richardson, 2005; Richardson et al., 2006a; Stauch et al., in

    press). This provides a solid foundation for implementing a

    weighted optimization scheme in conducting model-data

    syntheses, whereby observations measured with greater

    confidence (lower s(di)) receive more weight during the

    optimization (i.e., in the cost function) than observations

    measured with less confidence (higher s(di)). Note that the

    non-zero intercept in the relationship between s(di) and jFcj

    means that large-magnitude fluxes have a better signal-to-

    noise ratio, and will still exert a greater influence during theoptimization, than small fluxes.

    The results presented in Fig. 4 indicate more complex

    patterns of variation in the statistical properties of the flux

    uncertainty compared to what has been previously reported

    (cf. Hollinger and Richardson, 2005). In particular, it would

    appear that while random measurement errors tend to be

    approximately Gaussian (symmetric and mesokurtic) for

    jFcj 0 mmol m2 s1, they are frequently non-Gaussian for

    jFcj 0 mmol m2 s1. In this regard, it may be considerably

    more challenging than has been previously acknowledged to

    properly implement a maximum likelihood optimizations

    scheme when inverting ecosystem models against flux data,

    because the error model needs to account forthese differences

    Fig. 5 Relationship between fa scaling exponents and

    estimated random flux measurement uncertainty,

    characterized by the standard deviation of model residuals

    (models are described fully in text). In (A), the scaling

    exponent of raw NEE time series is plotted against the

    standard deviation of ANN model residuals, with each

    data point representing one site-year of data. In (B), the

    scaling exponent of model residual time series is plotted

    against the standard deviation of the model residuals,

    with each data point representing the average across all

    site-years for an individual model. In both panels, the

    fitted line is based on reduced major axis regression.

    a g r i c u l t u r a l a n d f o r e s t m e t e o r o l o g y 1 4 8 ( 2 0 0 8 ) 3 8 5 046

  • 8/13/2019 63 Statistical properties of random CO2 flux

    10/13

    in the probability distribution function of the measurement

    errors.

    If the random fluxmeasurement uncertainty doesnot follow

    a Gaussian distribution, then what distribution does it follow?

    Hollinger and Richardson (2005) suggested that the distribution

    of inferred flux measurement errors was better approximated

    by a double exponential (Laplace) distribution, which has botha

    prominent central peak and heavy tails, and kurtosis of 3. Bycomparison, Chevallier et al. (2006) suggested that residuals

    between the ORCHIDEE model and flux data from a range of

    sites followed a Cauchy distribution. (From the perspective of

    characterizing uncertainty for the purposes of data-model

    synthesis,the Cauchy distribution is a poorchoice,since its first

    fourmoments are undefined). In a recent data-model synthesis

    Owen et al. (2007)implicitly assumed an error model where a

    large part of the data is Gaussian and high kurtosis is only

    introduced by a few high-leverage points (outliers that strongly

    influence the fourth moment). In the present study, our results

    indicated pronounced error kurtosis for Fc 0 mmol m2 s1,

    although trimming outliers that accounted for the top and

    bottom 1% of theerror distribution resulted in kurtosis that wasintermediate between a Gaussian (kurtosis = 0) and double-

    exponential (kurtosis = 3) distribution.

    Hollinger and Richardson (2005)andStauch et al. (in press)

    suggested that heteroscedasticity, combined with the varying

    frequency of different fluxmagnitudes, could result in an error

    distribution that appeared non-Gaussian, even if each error

    was drawn from a Gaussian distribution. As an example, take

    8000 draws fromX1 N(0,1), and 2000 draws from X2 N(0,5):

    the distribution of these 10,000 draws is characterized by a

    strong peak (predominantly because of X1) and heavy tails

    (predominantly because of X2), and thus appears non-

    Gaussian despite the normality of bothX1and X2. While this

    addition of distributions no doubt contributes to fact that ateach site the overall kurtosis is always3 or larger, it does not

    explain why when the analysis is restricted to relatively

    narrow Fc bins, the kurtosis is consistently >5 for

    jFcj 0 mmol m2 s1. The present analysis gives additional

    support for the idea that, at least for near-zero fluxes, the

    uncertainty is better characterized by a double exponential

    than a Gaussian distribution.

    4.2. Patterns across sites

    Random flux measurement uncertainties have been shown to

    vary among sites, especially in relation to vegetation type. For

    example,Richardson et al. (2006a)reported that uncertaintiesin both carbon and energy fluxes were smaller at a grassland

    site compared to forested sites. However, even among forested

    sites there may be substantial variation in uncertainty; we

    found, forexample, that uncertainties were roughlythree-fold

    larger at FR1 than FI1 (Table 3A). The results presented in

    Table 3A areconsistent with those shown previously by Moffat

    et al. (2007) and Stauch et al. (in press), i.e., for a given Fc, there

    is greater uncertainty at FR1 than either DE3 or FR4.

    Differences across sites in s(d) can be attributed to a range

    of factors, such as measurement height, surface roughness,

    wind speed, surface heterogeneity, and measurement system

    errors. These appear to affect the base level of uncertainty, but

    not the manner in which s(d) scales with Fc, as only the

    intercept, but not the slope, of regressions presented in

    Table 3A were found to vary significantly among these

    forested sites.

    4.3. Uncertainty and the color of flux data time series

    Spectral analysis methods enable researchers to view time

    series in the frequency domain rather than the standardtemporal domain; in this manner, the time scales (from

    minutes to years) at which relevant variation occurs can be

    identified. Previous frequency domain analyses have made

    use of orthonormal wavelet transformations (e.g.,Katul et al.,

    2001) and Fourier analysis (e.g.,Baldocchi et al., 2001a). Here

    we used an alternative approach, based on the Lomb-Scargle

    periodogram (Lomb, 1976; Scargle, 1982) and the multiple

    segmentation method of Miramontes and Rohani (2002),

    which is naturally suited to investigating the correlation

    structure of short time series with missing data (i.e., gaps).

    This analysis showed that while the fa scaling exponent for

    most sites indicated a moderate correlation structure (pink

    noise), there were some sites that tended more towards rednoise (e.g., FR4) and other sites that tended towards white

    noise (FR1, BE1). An interesting finding was that there was a

    robust connection between the estimated random flux

    measurement uncertainty and the scaling exponent, i.e., a

    strong correlation between a and s(d) (Fig. 5A). A likely

    explanation for this result is that at sites with larger

    measurement uncertainties, the respective white noise

    components of the NEE measurements tends to sufficiently

    obscure the underlying signal such that persistence is much

    less pronouncedcomparedto sites with smaller measurement

    uncertainties. Thus, larger random flux measurement errors

    lead to whiter NEE time series.

    Similarly, variation among models in the aof residual timeseries was found to be correlated with the standard deviation

    of residuals for each model (Fig. 5B). A possible interpretation

    for this pattern is that models that fit the NEE time series less

    well tend to have not only larger residuals (and hence higher

    s(d)), but also largerand more long-livedsystematic biases (i.e.,

    greater persistence and hence more negative values of a).

    Thus, increasing model error (compare BETHY and MDS)

    results in reddened model residuals.

    The most flexible models used here, ANN and MDS, yielded

    inferred random errors that were closest to white noise and

    also had statistical properties that were most similar to those

    of differenced paired measurements. An explanation for this

    is that predictions of ANN and MDS models for time periodieffectively represent the mean of all fluxes measured under

    similar environmental conditions and at a similar time of day

    and time of the year as those that prevail at time i. Residuals

    from these models therefore can be seen as analogues of the

    paired measurement approach; the difference being that the

    model residual is the difference between a measurement (Fi)

    and the mean of a population of measurements (mi), rather

    than the difference between two pure measurements.

    4.4. Systematic errors and biases

    Until now, ouranalysisand discussion has focused on random

    flux measurement errors. Systematic flux measurement

    a g r i c u l t u r a l a n d f o r e s t m e t e o r o l o g y 1 4 8 ( 2 0 0 8 ) 3 8 5 0 47

  • 8/13/2019 63 Statistical properties of random CO2 flux

    11/13

    errors, or biases, are often dealt with by flagging according

    to data quality, after applying pre-defined filters (Foken and

    Wichura, 1996). Poor quality data often occur when low-

    frequency turbulent motions or non-stationary conditions

    occur. This is because there is substantial uncertainty about

    the extent to which low-frequency covariances contribute to

    the actual exchange between surface and atmosphere (von

    Randow et al., 2002). Apart from these, a wide range of errorsources contributes to bias, ranging from poor instrument

    maintenance and uncertain delays in sampling tubes to

    uncertainty on the height of the zero-plane displacement

    (which determines the magnitude of several frequency

    corrections to the data) (Kruijt et al., 2004). A number of

    systematic errors are uncertain even in sign, while for others

    the corrections to be made are themselves highly uncertain.

    This suggests that systematic errors (and attempts to correct

    for them) have a random component to them, although

    usually the error source varies on a (much) larger time scale

    (compared to the purely random measurement errors we have

    quantified here). Kruijt et al. (2004) assessed the importance of

    such errors by assuming ignorance on their magnitude and re-calculating fluxes from rawdata using a range of configuration

    settings (note also that Kruijt et al. expressed uncertainties in

    terms of percentages, rather than absolute amounts, which

    implies that systematic errors are heteroscedastic as well,

    similar to results reported here for random errors). The

    resulting standard errors can be used to assess random

    uncertainty, but, as they actat longertime scales, error will not

    reducewith the numberof data taken as would be expected for

    independent data points. Rather, for each error source the

    appropriate number of degrees of freedom in the data should

    be assessed, by assessing the periodicity of, for example,

    maintenance, wind directions, and other factors that affect

    slow errors. The methodology to assess such ranges oferrors at different time scales still needs to be developed, but

    in general this suggests that, instead of distinguishing

    random and systematic error, there is a range of time

    scales at which errors should be assessed.

    5. Summary and conclusions

    We evaluated the statistical properties of random flux

    measurement errors inferred first from model residuals, and

    second, using a standard paired measurement approach. The

    five models used in the present analysis were quite different

    with respect to structure and parameterization, but alladequately captured the patterns of variation in the flux data

    time series across a range of temporal scales. However, the

    power spectra of the model predictions indicated too little

    power at the highest frequencies (stochastic variation that is

    due to random measurement errors) and too much power at

    lower frequencies. There was a tendency for the residuals of

    some models to exhibit greater persistence than other models,

    and this memory effect (reddened model residuals) was

    attributed to model error which resulted in larger and more

    long-lived systematic biases.

    Two models in particular, ANN and MDS, made no prior

    assumptions about the functional form of relationships

    between environmental drivers and CO2 fluxes, and the

    resulting uncertainty estimates were in closest agreement

    with the paired observation approach. Compared to the paired

    observation approach, an advantage of inferring uncertainties

    from model residuals is that many more data points are

    available(an average of three times as many for ourdata) from

    which the statistics of the distributioncan be estimated;this is

    especially relevant with short flux time series, where there

    may be an insufficient number of paired observations meetingthe criteria for environmental similarity. While the model

    residual method does not require two towers, or arbitrary

    limits for what constitutes similar environmental condi-

    tions on successive days, care must be taken to ensure that a

    good model is used, so that model error does not confound

    uncertainty estimates.

    In all cases, the flux measurement uncertainty increased

    with flux magnitude. The base level of uncertainty differed

    markedly among sites, and patterns of variation were

    consistent regardless of whether residuals or paired measure-

    ments were used to infer the error, suggesting that both

    methods yield robust uncertainty estimates that are sensitive

    to real underlying differences in uncertainty (caused by,among other things, site heterogeneity and measurement

    system characteristics). We were able to link the site-specific

    uncertainties to the spectral properties (fa scaling) of the

    original NEE time series: these results indicated the potential

    for estimatings(d) on the basis of the scaling coefficient a .

    The results presented here offer a basis by which the

    random flux measurement errors can be specified for a variety

    of applications, including data-model fusion efforts, statisti-

    cally rigorous model evaluation, and regional-to-continental

    integration and synthesis projects. A logical next step would

    be to reconcile randomand systematicerrorsundera common

    framework, so that the total uncertainty in measured fluxes

    (and annual integrals) might be more accurately specified.

    Acknowledgements

    A.D.R. was supported by the Office of Science (BER), U.S.

    Department of Energy, through Interagency Agreement No.

    DE-AI02-07ER64355. D.P. was supported by the CarboEurope-IP

    research program, funded by the European Commission. The

    Max Planck Institute for Biogeochemistry provided funding

    and support for the gap filling comparison workshop held in

    Jena in September 2006. Site PIs Marc Aubinet (Vielsalm),

    WernerKutsch (Hainich), Andre Granier (Hesse), Serge Rambal

    (Puechabon), Riccardo Valentini (Roccarespampani) and TimoVesala (Hyytiala) are thanked for making their data available.

    Ankur Desai and three anonymous reviewers are all thanked

    for their constructive criticism.

    r e f e r e n c e s

    Aubinet, M., Berbigier, P., Bernhofer, C.H., Cescatti, A.,Feigenwinter, C., Granier, A., Grunwald, T.H., Havrankova,K., Heinesch, B., Longdoz, B., Marcolla, B., Montagnani, L.,Sedlak, P., 2005. Comparing CO2storage and advectionconditions at night at different CARBOEUROFLUX sites.

    Boundary-Layer Meteorol. 116, 6394.

    a g r i c u l t u r a l a n d f o r e s t m e t e o r o l o g y 1 4 8 ( 2 0 0 8 ) 3 8 5 048

  • 8/13/2019 63 Statistical properties of random CO2 flux

    12/13

    Aubinet, M., Chermanne, B., Vandenhaute, M., Longdoz, B.,Yernaux, M., Laitat, E., 2001. Long term carbon dioxideexchange above a mixed forest in the Belgian Ardennes.Agric. Forest Meteorol. 108, 293315.

    Baldocchi, D., Falge, E., Wilson, K., 2001a. A spectral analysis ofbiosphere-atmosphere trace gas flux densities andmeteorological variables across hour to multi-year timescales. Agric. Forest Meteorol. 107, 127.

    Baldocchi, D., Falge, E., Gu, L., Olson, R., Hollinger, D., Running,S., Anthoni, P., Bernhofer, C., Davis, K., Evans, R., Fuentes, J.,Goldstein, A., Katul, G., Law, B., Lee, X., Malhi, Y., Meyers, T.,Munger, W., Oechel, W., Paw U, K.T., Pilegaard, K., Schmid,H.P., Valentini, R., Verma, S., Vesala, T., Wilson, K., Wofsy,S., 2001b. FLUXNET: a new tool to study the temporal andspatial variability of ecosystem-scale carbon dioxide, watervapor, and energy flux densities. Bull. Am. Meteorol. Soc. 82,24152434.

    Chevallier, F., Viovy, N., Reichstein, M., Ciais, P., 2006. On theassignment of prior errors in Bayesian inversions of CO2surface fluxes. Geophys. Res. Lett. 33, Art. no. L13802.

    Churkina, G., Schimel, D., Braswell, B.H., Xiao, X.M., 2005.Spatial analysis of growing season length control over netecosystem exchange. Global Change Biol. 11, 17771787.

    Churkina, G., Tenhunen, J., Thornton, P., Falge, E.M., Elbers, J.A.,Erhard, M., Grunwald, T., Kowalski, A.S., Rannik, U., Sprinz,D., 2003. Analyzing the ecosystem carbon dynamics of fourEuropean coniferous forests using a biogeochemistrymodel. Ecosystems 6, 168184.

    Falge, E., Baldocchi, D., Olson, R., Anthoni, P., Aubinet, M.,Bernhofer, C., Burba, G., Ceulemans, R., Clement, R.,Dolman, H., Granier, A., Gross, P., Grunwald, T., Hollinger,D., Jensen, N.O., Katul, G., Keronen, P., Kowalski, A., Lai,C.T., Law, B.E., Meyers, T., Moncrieff, J., Moors, E., Munger,

    J.W., Pilegaard, K., Rannik, U., Rebmann, C., Suyker, A.,Tenhunen, J., Tu, K., Verma, S., Vesala, T., Wilson, K.,Wofsy, S., 2001. Gap filling strategies for defensible annualsums of net ecosystem exchange. Agric. Forest Meteorol.107, 4369.

    Foken, T., Wichura, B., 1996. Tools for quality assessment ofsurface-based flux measurements. Agric. Forest Meteorol.78, 83105.

    Friend, A.D., Arneth, A., Kiang, N.Y., Lomas, M., Ogee, J.,Rodenbeck, C., Running, S.W., Santaren, J.-D., Sitch, S.,Viovy, N., Woodward, F.I., Zaehle, S., 2006. FLUXNET andmodelling the global carbon cycle. Global Change Biol. 13,610633.

    Goulden, M.L., Munger, J.W., Fan, S.-M., Daube, B.C., Wofsy, S.C.,1996. Measurements of carbon sequestration by long-termeddy covariance: methods and critical evaluation ofaccuracy. Global Change Biol. 2, 169182.

    Granier, A., Biron, P., Lemoine, D., 2000. Water balance,transpiration and canopy conductance in two beech stands.Agric. Forest Meteorol. 100, 291308.

    Hagen, S.C., Braswell, B.H., Linder, E., Frolking, S., Richardson,A.D., Hollinger, D.Y., 2006. Statistical uncertainty of eddy-flux based estimates of gross ecosystem carbon exchange atHowland Forest, Maine. J. Geophys. Res.Atmos. 111, Art.no. D08S03.

    Hollinger, D.Y., Aber, J., Dail, B., Davidson, E.A., Goltz, S.M.,Hughes, H., Leclerc, M.Y., Lee, J.T., Richardson, A.D.,Rodrigues, C., Scott, N.A., Achuatavarier, D., Walsh, J., 2004.Spatial and temporal variability in forest-atmosphere CO2exchange. Global Change Biol. 10, 16891706.

    Hollinger, D.Y., Richardson, A.D., 2005. Uncertainty in eddycovariance measurements and its application tophysiological models. Tree Physiol. 25, 873885.

    Katul, G., Hsieh, C.I., Bowling, D., Clark, K., Shurpali, N.,Turnipseed, A., Albertson, J., Tu, K., Hollinger, D., Evans, B.,

    Offerle, B., Anderson, D., Ellsworth, D., Vogel, C., Oren, R.,

    1999. Spatial variability of turbulent fluxes in the roughnesssublayer of an even-aged pine forest. Boundary-LayerMeteorol. 93, 128.

    Katul, G., Lai, C.-T., Schafer, K., Vidakovic, B., Albertson, J.,Ellsworth, D., Oren, R., 2001. Multiscale analysis ofvegetation surface fluxes: from seconds to years. Adv.Water Resour. 24, 11191132.

    Knohl, A., Schulze, E.D., Kolle, O., Buchmann, N., 2003. Large

    carbon uptake by an unmanaged 250-year-old deciduousforest in Central Germany. Agric. Forest Meteorol. 118, 151167.

    Knorr, W., Kattge, J., 2005. Inversion of terrestrial ecosystemmodel parameter values against eddy covariancemeasurements by Monte Carlo sampling. Global ChangeBiol. 11, 13331351.

    Kruijt, B., Elbers, J.A., von Randow, C., Araujo, A.C., Oliveira, P.J.,Culf, A., Manzi, A.O., Nobre, A.D., Kabat, P., Moors, E.J., 2004.The robustness of eddy correlation fluxes for Amazon rainforest conditions. Ecol. Appl. 14, S101S113.

    Law, B.E., Falge, E., Gu, L., Baldocchi, D.D., Bakwin, P., Berbigier,P., Davis, K., Dolman, A.J., Falk, M., Fuentes, J.D., Goldstein,A., Granier, A., Grelle, A., Hollinger, D., Janssens, I.A., Jarvis,P., Jensen, N.O., Katul, G., Mahli, Y., Matteucci, G., Meyers,

    T., Monson, R., Munger, W., Oechel, W., Olson, R., Pilegaard,K., Paw U, K.T., Thorgeirsson, H., Valentini, R., Verma, S.,Vesala, T., Wilson, K., Wofsy, S., 2002. Environmentalcontrols over carbon dioxide and water vapor exchange ofterrestrial vegetation. Agric. Forest Meteorol. 113, 97120.

    Lee, X., 1998. On micrometeorological observations of surfaceair exchange over tall vegetation. Agric. Forest Meteorol. 91,3949.

    Lenschow, D.H., Mann, J., Kristensen, L., 1994. How long is longenough when measuring fluxes and other turbulencestatistics? J. Atmos. Oceanic Technol. 11, 661672.

    Lloyd, J., Taylor, J.A., 1994. On the temperature dependence ofsoil respiration. Funct. Ecol. 8, 315323.

    Loescher, H.W., Law, B.E., Mahrt, L., Hollinger, D.Y., Campbell, J.,Wofsy, S.C., 2006. Uncertainties in, and interpretation of,

    carbon flux estimates using the eddy covariance technique.J. Geophys. Res.Atmos. 111, Art. no. D21S90.

    Lomb, N.R., 1976. Least-squares frequency analysis of unequallyspaced data. Astrophys. Space Sci. 39, 447462.

    Mandelbrot, B.B., 1999. Multifractals and 1/fnoise. Springer,New York.

    Mann, J., Lenschow, D.H., 1994. Errors in airborne fluxmeasurements. J. Geophys. Res.Atmos. 99, 1451914526.

    Massman, W.J., Lee, X., 2002. Eddy covariance flux calculationsand uncertainties in long-term studies of carbon and energyexchanges. Agric. Forest Meteorol. 113, 121144.

    Miramontes, O., Rohani, P., 2002. Estimating 1/fa scalingexponents from short time-series. Phys. D-NonlinearPhenomena 166, 147154.

    Moffat, A.M., Papale, D., Reichstein, M., Hollinger, D.Y.,

    Richardson, A.D., Barr, A.G., Beckstein, C., Braswell, B.H.,Churkina, G., Desai, A.R., Falge, E., Gove, J.H., Heimann, M.,Hui, D., Jarvis, A.J., Kattge, J., Noormets, A., Stauch, V.J.,2007. Comprehensive comparison of gap filling techniquesfor net carbon fluxes. Agric. Forest Meteorol. 147,209232.

    Moncrieff, J.B., Malhi, Y., Leuning, R., 1996. The propagation oferrors in long-term measurements of land-atmospherefluxes of carbon and water. Global Change Biol. 2, 231240.

    Oren, R., Hsieh, C.I., Stoy, P., Albertson, J., McCarthy, H.R.,Harrell, P., Katul, G.G., 2006. Estimating the uncertainty inannual net ecosystem carbon exchange: spatial variation inturbulent fluxes and sampling errors in eddy-covariancemeasurements. Global Change Biol. 12, 883896.

    Owen, K.E., Tenhunen, J., Reichstein, M., Wang, Q., Falge, E.,

    Geyer, R., Xiao, X., Stoy, P., Amman, C., Arain, A., Aubinet,

    a g r i c u l t u r a l a n d f o r e s t m e t e o r o l o g y 1 4 8 ( 2 0 0 8 ) 3 8 5 0 49

  • 8/13/2019 63 Statistical properties of random CO2 flux

    13/13

    M., Aurela, M., Bernhofer, C., Chojnicki, B.H., Granier, A.,Gruenwald, T., Hadley, J., Heinesch, B., Hollinger, D., Knohl,A., Kutsch, W., Lohila, A., Meyers, T., Moors, E., Moureaux,C., Pilegaard, K., Saigusa, N., Verma, S., Vesala, T., Vogel, T.,2007. Linking flux network measurements to continentalscale simulations: ecosystem carbon dioxide exchangecapacity under non-water-stressed conditions. GlobalChange Biol. 13, 734760.

    Papale, D., Reichstein, M., Aubinet, M., Canfora, E., Bernhofer,C., Kutsch, W., Longdoz, B., Rambal, S., Valentini, R., Vesala,T., Yakir, D., 2006. Towards a standardized processing of netecosystem exchange measured with eddy covariancetechnique: algorithms and uncertainty estimation.Biogeosciences 3, 571583.

    Papale, D., Valentini, R., 2003. A new assessment of Europeanforests carbon exchanges by eddy fluxes and artificialneural network spatialization. Global Change Biol. 9,525535.

    Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P.,2002. Numerical Recipes in C++, second ed. Cambridge UP,Cambridge.

    Rambal, S., Joffre, R., Ourcival, J.M., Cavender-Bares, J.,Rocheteau, A., 2004. The growth respiration component in

    eddy CO2flux from aQuercus ilex mediterranean forest.Global Change Biol. 10, 14601469.

    Rannik, U., Kolari, P., Vesala, T., Hari, P., 2006. Uncertainties inmeasurement and modelling of net ecosystem exchange ofa forest. Agric. Forest Meteorol. 138, 244257.

    Raupach, M.R., Rayner, P.J., Barrett, D.J., Defries, R.S., Heimann,M., Ojima, D.S., Quegan, S., Schmullius, C.C., 2005. Model-data synthesis in terrestrial carbon observation: methods,data requirements and data uncertainty specifications.Global Change Biol. 11, 378397.

    Reichstein, M., Falge, E., Baldocchi, D., Papale, D., Aubinet, M.,Berbigier, P., Bernhofer, C., Buchmann, N., Gilmanov, T.,Granier, A., Grunwald, T., Havrankova, K., Ilvesniemi, H.,

    Janous, D., Knohl, A., Laurila, T., Lohila, A., Loustau, D.,Matteucci, G., Meyers, T., Miglietta, F., Ourcival, J.M.,

    Pumpanen, J., Rambal, S., Rotenberg, E., Sanz, M.,Tenhunen, J., Seufert, G., Vaccari, F., Vesala, T., Yakir, D.,Valentini, R., 2005. On the separation of net ecosystemexchange into assimilation and ecosystem respiration:review and improved algorithm. Global Change Biol. 11,14241439.

    Richardson, A.D., Braswell, B.H., Hollinger, D.Y., Burman, P.,Davidson, E.A., Evans, R.S., Flanagan, L.B., Munger, J.W.,Savage, K., Urbanski, S.P., Wofsy, S.C., 2006b.Comparing simple respiration models for eddy flux anddynamic chamber data. Agric. Forest Meteorol. 141,219234.

    Richardson, A.D., Hollinger, D.Y., 2005. Statistical modeling ofecosystem respiration using eddy covariance data:maximum likelihood parameter estimation, and Monte

    Carlo simulation of model and parameter uncertainty,applied to three simple models. Agric. Forest Meteorol. 131,191208.

    Richardson, A.D., Hollinger, D.Y., Burba, G.G., Davis, K.J.,Flanagan, L.B., Katul, G.G., Munger, J.W., Ricciuto, D.M.,Stoy, P.C., Suyker, A.E., Verma, S.B., Wofsy, S.C., 2006a. Amulti-site analysis of random error in tower-basedmeasurements of carbon and energy fluxes. Agric. ForestMeteorol. 136, 118.

    Rohani, P., Miramontes, O., Keeling, M.J., 2004. The colour ofnoise in short ecological time series data. Mathematical

    Med. Biol. 21, 6372.Scargle, J.D., 1982. Studies in astronomical time series analysis.

    2. Statistical aspects of spectral analysis of unevenly spaceddata. Astrophys. J. 263, 835853.

    Stauch, V.J., Jarvis, A.J., Schulz, K., 2007. Estimation of netcarbon exchange using eddy covariance CO2fluxobservations and a stochastic model. J. Geophys. Res.Atmos., in press.

    Stauch, V.J., Jarvis, A.J., 2006. A semi-parametric model for eddycovariance CO2flux time series data. Global Change Biol. 12,17071716.

    Suni, T., Rinne, J., Reissell, A., Altimir, N., Keronen, P., Rannik,U., Dal Maso, M., Kulmala, M., Vesala, T., 2003. Long-termmeasurements of surface fluxes above a Scots pine forest inHyytiala, southern Finland, 19962001. Boreal Environ. Res.

    8, 287301.Tedeschi, V., Rey, A., Manca, G., Valentini, R., Jarvis, P.G.,

    Borghetti, M., 2006. Soil respiration in a Mediterranean oakforest at different developmental stages after coppicing.Global Change Biol. 12, 110121.

    Thornton, P.E., Law, B.E., Gholz, H.L., Clark, K.L., Falge, E.,Ellsworth, D.S., Golstein, A.H., Monson, R.K., Hollinger, D.,Falk, M., Chen, J., Sparks, J.P., 2002. Modeling and measuringthe effects of disturbance history and climate on carbonand water budgets in evergreen needleleaf forests. Agric.Forest Meteorol. 113, 185222.

    Trudinger, C.M., Raupach, M.R., Rayner, P.J., Kattge, J., Liu, Q.,Pak, B., Reichstein, M. Renzullo, L., Richardson, A.D.,Roxburgh, S., Styles, J., Wang, Y.P., Briggs, P., Barrett, D.,Nikolova, S., 2007. OptIC project: An intercomparison of

    optimization techniques for parameter estimation interrestrial biogeochemical models. J. Geophys. Res.Biogeosci., Art. no. G02027.

    von Randow, C., Sa, L.D.A., Gannabathula, P.S.S.D., Manzi, A.O.,Arlino, P.R.A., Kruijt, B., 2002. Scale variability ofatmospheric surface layer fluxes of energy and carbon overa tropical rain forest in southwest Amazonia 1. Diurnalconditions. J. Geophys. Res.Atmos. 107, Art. no. 8062.

    Wilson, K., Goldstein, A., Falge, E., Aubinet, M., Baldocchi, D.,Berbigier, P., Bernhofer, C., Ceulemans, R., Dolman, H.,Field, C., Grelle, A., Ibrom, A., Law, B.E., Kowalski, A.,Meyers, T., Moncrieff, J., Monson, R., Oechel, W., Tenhunen,

    J., Valentini, R., Verma, S., 2002. Energy balance closure atFLUXNET sites. Agric. Forest Meteorol. 113, 223243.

    Wofsy, S.C., Harriss, R.C. (Eds.), 2002. The North American

    Carbon Program (NACP). Report of the NACP Committee ofthe U.S. Interagency Carbon Cycle Science Program. U.S.Global Change Research Program, Washington, D.C.

    a g r i c u l t u r a l a n d f o r e s t m e t e o r o l o g y 1 4 8 ( 2 0 0 8 ) 3 8 5 050