INFERRING PAST ENVIRONMENTS FROM BIOLOGICAL DATA - PROGRESS, PROBLEMS, AND PITFALLS

H.J.B. Birks

University of Bergen & University College London

BASIC IDEA OF BIOINDICATION OR ENVIRONMENTAL RECONSTRUCTION

Fossil biological data ('proxy data', e.g., pollen, chironomids): Y0, a matrix of t samples by m species.

Environmental variable (e.g., temperature): X0, a vector of t samples by 1 variable. Unknown. To be estimated or reconstructed.

To solve for X0, need modern data or 'training data' or 'calibration set':

Modern biology (e.g., pollen, chironomids): Y, a matrix of n samples by m species.

Modern environment (e.g., temperature): X, a vector of n samples by 1 variable.

BASIC BIOLOGICAL ASSUMPTIONS

Marine planktonic foraminifera - Imbrie & Kipp 1971
Foraminifera are a function of sea-surface temperature
Hence foraminifera can be used to reconstruct past sea-surface temperature

Pollen
Pollen is a function of vegetation
Vegetation is a function of climate
Hence pollen is an indirect function of climate and can be used to reconstruct past climate

Chironomids (aquatic non-biting midges)
Chironomids are a function of lake-water temperature
Lake-water temperature is a function of climate
Hence chironomids are an indirect function of climate and can be used to reconstruct past climate

Freshwater diatoms (microscopic algae)
Diatoms are a function of lake-water chemistry
Hence diatoms can be used to reconstruct past lake-water chemistry
Lake-water chemistry may be some weak function of climate
Hence diatoms may be a weak function of climate

BIOLOGICAL 'PROXY' DATA PROPERTIES

May have 200-300 species, expressed as proportions or percentages in 200-500 samples

Multicollinearity

Biological data contain many zero values (absences)

Species invariably show non-linear unimodal responses to their environment, not simple linear responses

'PROXIES'

[Figure: modern pollen of Betula (birch), Alnus (alder), Quercus (oak), Pinus (pine), Empetrum nigrum (crowberry), and Agropyron repens (Gramineae) (grass). Identical treatment, all at same magnification, all stained with safranin.]

Pollen - good indicators of vegetation and hence indirect indicators of climate.

Chironomids - good indicators of past lake-water temperatures and hence past climate

Common late-glacial chironomid taxa. a: Tanytarsina; b: Sergentia; c: Heterotrissocladius; d: Hydrobaenus/Oliveridia; e: Chironomus; f: Dicrotendipes; g: Microtendipes; h: Polypedilum; i: Cladopelma. Scale bar represents 50 µm.

Freshwater diatoms - excellent indicators of lake-water chemistry (e.g. pH, total P). Not reliable climate indicators.

BASIC NUMERICAL MODELS

CLASSICAL APPROACH

(1) Y = f(X) + error, where Y is the biology and X is the environment

(2) Estimate f by some mathematical procedure and 'invert' our estimated f to find the unknown past environment X0 from fossil data Y0

X0 ≈ f^-1(Y0)

INVERSE APPROACH

(3) X = g(Y) + error

(4) X0 = g(Y0)

Obtain 'plug-in' estimate of past environment X0 from fossil data Y0

f or g are 'transfer functions'

In practice, for various mathematical reasons, do an inverse regression or calibration
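The difference between the classical and inverse approaches can be illustrated with a minimal sketch for a single response variable and a straight-line model. The data values and the helper names (`fit_line`, `classical`, `inverse`) are hypothetical and are not from the talk; real transfer functions are multivariate and usually non-linear.

```python
def fit_line(u, v):
    """Ordinary least-squares fit of v = a + b*u; returns (a, b)."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uv = sum((ui - mean_u) * (vi - mean_v) for ui, vi in zip(u, v))
    s_uu = sum((ui - mean_u) ** 2 for ui in u)
    b = s_uv / s_uu
    return mean_v - b * mean_u, b


# Hypothetical training set: environment x and one biological response y
x = [8.0, 10.0, 12.0, 14.0, 16.0]
y = [21.0, 24.5, 30.5, 34.0, 40.0]

# Classical approach: fit Y = f(X) + error, then invert the fitted f
a, b = fit_line(x, y)


def classical(y0):
    return (y0 - a) / b


# Inverse approach: fit X = g(Y) + error and plug the fossil value in directly
c, d = fit_line(y, x)


def inverse(y0):
    return c + d * y0


print(classical(40.0), inverse(40.0))
```

In this linear case the inverse estimate is pulled slightly towards the training-set mean compared with the classical estimate, which is one reason deshrinking appears later in weighted averaging.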

'INVERSE' PROCEDURES

1. Principal components regression. Imbrie & Kipp (1971)

Multiple linear regression or polynomial regression of X on PC1, PC2, PC3, etc.

PCA components maximise variance within Y

Selection of components done visually until very recently. Now cross-validation is used to select the model with fewest components and low RMSEP and maximum bias.

Y -> PC1, PC2, PC3 -> X
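A minimal sketch of principal components regression with a single component, assuming a tiny hypothetical percentage data-set. The first component is found by power iteration to keep the example dependency-free; the function names and data are illustrative only, and published applications use several components and often polynomial terms.

```python
def fit_line(u, v):
    """Ordinary least-squares fit of v = a + b*u; returns (a, b)."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uv = sum((ui - mean_u) * (vi - mean_v) for ui, vi in zip(u, v))
    s_uu = sum((ui - mean_u) ** 2 for ui in u)
    b = s_uv / s_uu
    return mean_v - b * mean_u, b


def fit_pcr1(Y, x, iters=100):
    """Regress x on the first principal component of Y (one-component PCR)."""
    n, m = len(Y), len(Y[0])
    means = [sum(row[j] for row in Y) / n for j in range(m)]
    Yc = [[row[j] - means[j] for j in range(m)] for row in Y]
    cov = [[sum(r[j] * r[k] for r in Yc) / (n - 1) for k in range(m)]
           for j in range(m)]
    # Power iteration for the leading eigenvector. The start vector is not
    # constant because centred percentage rows are orthogonal to (1, ..., 1).
    v = [float(j + 1) for j in range(m)]
    for _ in range(iters):
        w = [sum(cov[j][k] * v[k] for k in range(m)) for j in range(m)]
        norm = sum(wj * wj for wj in w) ** 0.5
        v = [wj / norm for wj in w]
    scores = [sum(r[j] * v[j] for j in range(m)) for r in Yc]
    a, b = fit_line(scores, x)  # inverse regression of x on PC1 scores
    return means, v, a, b


def predict_pcr1(sample, model):
    means, v, a, b = model
    score = sum((sample[j] - means[j]) * v[j] for j in range(len(v)))
    return a + b * score


# Hypothetical modern data: 5 samples x 3 species (percentages) and temperature
Y = [[60.0, 30.0, 10.0], [50.0, 36.0, 14.0], [40.0, 42.0, 18.0],
     [30.0, 48.0, 22.0], [20.0, 54.0, 26.0]]
x = [8.0, 10.0, 12.0, 14.0, 16.0]

model = fit_pcr1(Y, x)
fossil = [45.0, 39.0, 16.0]
print(predict_pcr1(fossil, model))
```

The hypothetical species here respond linearly to temperature, which is the situation PCR handles best; it is not a claim about real species responses.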

2. Two-way weighted averaging. ter Braak & van Dam (1989) and Birks et al. (1990)

(i) Estimate species optima (û) by weighted averaging of the environmental variable (x) of the sites. Species abundant at a site will tend to have their ecological optima close to the environmental variable at that site. (WA regression.)

(ii) Estimate the environmental values (x̂) at the sites by weighted averaging of the species optima (û). (WA calibration.)

(iii) Because averages are taken twice, the range of estimated x-values is shrunken, and a simple 'inverse' or 'classical' deshrinking is required. For inverse deshrinking, regress x on the preliminary estimates (x̂) and take the fitted values as final estimates of x.

Can downweight species in step (ii) by their estimated WA tolerances (niche breadths) so that species with wide tolerances have less weight than species with narrow tolerances.
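Steps (i) to (iii) of two-way weighted averaging can be sketched as follows, using inverse deshrinking and no tolerance downweighting. The data and function names are hypothetical, and the sketch assumes every species occurs in at least one training sample and every sample has a non-zero total.

```python
def fit_line(u, v):
    """Ordinary least-squares fit of v = a + b*u; returns (a, b)."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uv = sum((ui - mean_u) * (vi - mean_v) for ui, vi in zip(u, v))
    s_uu = sum((ui - mean_u) ** 2 for ui in u)
    b = s_uv / s_uu
    return mean_v - b * mean_u, b


def wa_optima(Y, x):
    """Step (i), WA regression: abundance-weighted mean of x per species."""
    m = len(Y[0])
    return [sum(row[k] * xi for row, xi in zip(Y, x)) / sum(row[k] for row in Y)
            for k in range(m)]


def wa_estimate(sample, optima):
    """Step (ii), WA calibration: abundance-weighted mean of the optima."""
    return sum(y * u for y, u in zip(sample, optima)) / sum(sample)


def fit_wa(Y, x):
    optima = wa_optima(Y, x)
    initial = [wa_estimate(row, optima) for row in Y]
    # Step (iii), inverse deshrinking: regress x on the initial estimates
    a, b = fit_line(initial, x)
    return optima, a, b


def predict_wa(sample, model):
    optima, a, b = model
    return a + b * wa_estimate(sample, optima)


# Hypothetical training set: 5 lakes x 3 species (percentages) and lake pH
Y = [[70.0, 30.0, 0.0], [50.0, 45.0, 5.0], [20.0, 60.0, 20.0],
     [5.0, 45.0, 50.0], [0.0, 30.0, 70.0]]
x = [4.5, 5.0, 5.5, 6.0, 6.5]

model = fit_wa(Y, x)
initial = [wa_estimate(row, model[0]) for row in Y]
print(min(initial), max(initial))          # shrunken range before deshrinking
print(predict_wa([30.0, 55.0, 15.0], model))
```

The initial estimates span a narrower range than the observed pH values, and the deshrinking slope is greater than 1, which is the shrinkage effect described in step (iii).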

3. Weighted averaging partial least squares regression (WA-PLS). ter Braak & Juggins (1993) and ter Braak et al. (1993)

Y -> PLS1, PLS2, PLS3 -> X

Components selected to maximise covariance between species weighted averages and environmental variable x

Selection of number of PLS components to include based on cross-validation. Model selected should have fewest components possible and low RMSEP and maximum bias.

4. Modern analog technique (MAT) = k-nearest neighbours (k-NN). Hutson (1980), Prell (1985), ter Braak (1995), et al.

(i) Compare fossil sample t with modern sample i and calculate the dissimilarity coefficient (DC) between t and i. Repeat for all modern samples.

(ii) Select the k closest analogues for fossil sample t.

(iii) Estimate past environment for sample t as the (weighted) mean of the environment of the k analogues.

(iv) Repeat for all fossil samples.

Value of k estimated by visual inspection, arbitrary rules (e.g., 10, 20, etc.), or cross-validation
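The MAT procedure can be sketched in a few lines. The squared chord distance is used here as the dissimilarity coefficient and inverse-distance weights as the weighting; both are assumptions made for illustration, since the slide does not name a particular coefficient or weighting, and the data are hypothetical.

```python
import math


def squared_chord(p, q):
    """Squared chord distance between two samples, computed on proportions."""
    sp, sq = sum(p), sum(q)
    return sum((math.sqrt(a / sp) - math.sqrt(b / sq)) ** 2
               for a, b in zip(p, q))


def mat_reconstruct(fossil, Y, x, k=3, weighted=True):
    """Estimate the environment of one fossil sample from its k closest analogues."""
    # Dissimilarity between the fossil sample and every modern sample
    nearest = sorted((squared_chord(fossil, row), xi)
                     for row, xi in zip(Y, x))[:k]
    if weighted:
        # Inverse-distance weights; small constant avoids division by zero
        weights = [1.0 / (dist + 1e-12) for dist, _ in nearest]
    else:
        weights = [1.0] * len(nearest)
    return sum(w * xi for w, (_, xi) in zip(weights, nearest)) / sum(weights)


# Hypothetical modern samples (percentages) and their environment
Y = [[70.0, 30.0, 0.0], [50.0, 45.0, 5.0], [20.0, 60.0, 20.0],
     [5.0, 45.0, 50.0], [0.0, 30.0, 70.0]]
x = [4.5, 5.0, 5.5, 6.0, 6.5]

# Repeat for all fossil samples
fossils = [[50.0, 45.0, 5.0], [10.0, 50.0, 40.0]]
print([mat_reconstruct(f, Y, x, k=2) for f in fossils])
```

Because the estimate is always a mean of modern values, MAT cannot extrapolate beyond the sampled environmental range.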

USE OF METHODS

Marine studies (foraminifera, diatoms) - PCR, some MAT plus variants, very few WA or WA-PLS uses

Freshwater studies (diatoms, chironomids) - WA or WA-PLS, very few MAT uses

Terrestrial studies (pollen) - MAT plus variants, some WA-PLS uses

In comparisons using simulated and real data, WA and WA-PLS usually outperform PCR and MAT but not always.

Classical methods of Gaussian logit or multinomial logit regression and calibration rarely used (freshwater, terrestrial). Some applications of artificial neural networks and few studies within a Bayesian framework.

Bayesian framework may be an important future research direction.

HIDDEN BASIC ASSUMPTIONS

1. Species in training set (Y) are systematically related to the physical environment (X) in which they live.

2. Environmental variable (X0, e.g. summer temperature) to be reconstructed is, or is linearly related to, an ecologically important variable in the system.

3. Species in the training set (Y) are the same as in the fossil data (Y0) and their ecological responses (Gm) have not changed significantly over the timespan represented by the fossil assemblage.

4. Mathematical methods used in regression and calibration adequately model the non-linear biological responses (Gm) to the environmental variable (X).

5. Other environmental variables than, say, summer temperature have negligible influence, or their joint distribution with summer temperature in the past is the same as in the modern training set.

MODEL PERFORMANCE AND SELECTION

1. Root mean square error of prediction (RMSEP) as low as possible.

2. Maximum bias as low as possible.

3. Smallest number of components to avoid 'overfitting'.

Based on leave-one-out cross-validation, n-fold cross-validation, or boot-strapping. Very rare to have an independent test set.
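A minimal sketch of how RMSEP and maximum bias might be obtained by leave-one-out cross-validation. The model here is simple weighted averaging without deshrinking, and maximum bias is taken as the largest mean residual over 10 equal intervals of the gradient; both choices, the function names, and the data are assumptions for illustration. The sketch assumes the observed values are not all identical.

```python
def fit_wa(Y, x):
    """WA optima; None marks a species absent from the training samples."""
    optima = []
    for k in range(len(Y[0])):
        total = sum(row[k] for row in Y)
        optima.append(sum(row[k] * xi for row, xi in zip(Y, x)) / total
                      if total > 0 else None)
    return optima


def predict_wa(sample, optima):
    pairs = [(y, u) for y, u in zip(sample, optima) if u is not None and y > 0]
    return sum(y * u for y, u in pairs) / sum(y for y, _ in pairs)


def loo_predictions(Y, x, fit, predict):
    """Predict each sample from a model fitted to all the other samples."""
    preds = []
    for i in range(len(Y)):
        model = fit(Y[:i] + Y[i + 1:], x[:i] + x[i + 1:])
        preds.append(predict(Y[i], model))
    return preds


def rmsep(observed, predicted):
    n = len(observed)
    return (sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n) ** 0.5


def max_bias(observed, predicted, n_bins=10):
    """Largest (in absolute value) mean residual over equal gradient intervals."""
    lo, hi = min(observed), max(observed)
    width = (hi - lo) / n_bins
    bins = {}
    for o, p in zip(observed, predicted):
        b = min(int((o - lo) / width), n_bins - 1)
        bins.setdefault(b, []).append(p - o)
    return max((sum(r) / len(r) for r in bins.values()), key=abs)


# Hypothetical training set
Y = [[70.0, 30.0, 0.0], [50.0, 45.0, 5.0], [20.0, 60.0, 20.0],
     [5.0, 45.0, 50.0], [0.0, 30.0, 70.0]]
x = [4.5, 5.0, 5.5, 6.0, 6.5]

preds = loo_predictions(Y, x, fit_wa, predict_wa)
apparent = [predict_wa(row, fit_wa(Y, x)) for row in Y]
# Apparent RMSE (no cross-validation), cross-validated RMSEP, and maximum bias
print(rmsep(x, apparent), rmsep(x, preds), max_bias(x, preds))
```

Comparing the apparent RMSE with the cross-validated RMSEP gives a rough indication of how optimistic the apparent fit is.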

MODEL VALIDATION

Compare reconstructed values with historical data. Rarely possible as few historical data exist.

But when done, sometimes the model that gives the closest correspondence is not the model with lowest RMSEP or maximum bias!

Conflict between model performance and selection based on cross-validation and validation results using independent historical test-sets.

Renberg & Hultberg (1992)

AN EXAMPLE OF RECONSTRUCTING PAST CLIMATE FROM POLLEN DATA

304 modern pollen samples Norway, northern Sweden, Finland (Sylvia Peglar, Heikki Seppä, John Birks, Arvid Odland)

Seppä & Birks (2001)

Performance statistics - WA-PLS - leave-one-out cross-validation

Variable (modern range)                 RMSEP     R2      Max. bias
July temperature (7.7 - 17.1ºC)         1.0ºC     0.73    3.64ºC
Annual precipitation (300 - 3234 mm)    341 mm    0.71    960 mm


Summary pollen diagram from Tsuolbmajavri, northern Finland. The age scale in modelled calibrated years BP is shown along with four phases. The total pollen- and spore-accumulation rate (grains cm-2 yr-1) is also shown. The hollow silhouette curves denote the 10 x exaggeration of the percentages.



RECONSTRUCTIONS

RECONSTRUCTION VALIDATION

[Figure: Tibetanus, Abisko Valley, Sweden. Values inferred from pollen compared with isotopes and theory. Hammarlund et al. (2002)]

BROAD-SCALE PATTERNS

[Figure: changes in July summer temperature relative to present-day reconstructed temperature on a south-north transect west of the Scandes mountains. 16 sites covering all or much of the Holocene, arranged from south to north. Anne Bjune et al.]

FINE-RESOLUTION CHANGES

[Figure: oxygen isotope ratios in Greenland ice-core compared with inferred mean July air temperature. Brooks & Birks (2000)]

STATISTICAL AND BIOLOGICAL PROBLEMS AND PITFALLS

1. Sample specific errors of reconstruction for fossil samples.

Estimate by boot-strapping.

Mean square error of prediction (MSEP) = s1 + s2, where

s1 = error due to variability in estimates of species parameters in the training set (s.e. of boot-strap estimates)

s2 = error due to variation in species abundances at a given environmental value (actual prediction error between observed and mean boot-strap estimate)

$$
\mathrm{MSEP} = \underbrace{\frac{\sum_{boot} \left(\hat{x}_{i,boot} - \bar{x}_{i,boot}\right)^2}{n}}_{s_1} + \underbrace{\frac{\sum_{boot} \left(x_i - \bar{x}_{i,boot}\right)^2}{n}}_{s_2}
$$

where $\bar{x}_{i,boot}$ is the mean of $\hat{x}_{i,boot}$ for all cycles when i is in the test set.

RMSEP = (s1 + s2)^½ (s1 usually ca. 25% of RMSEP, s2 ca. 75%)
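One way the boot-strap estimate of a sample-specific error might be computed is sketched below. It treats s1 and s2 as mean-square terms so that RMSEP = (s1 + s2)^½, uses simple weighted averaging without deshrinking as the model, and takes s1 from the spread of boot-strap estimates for the fossil sample and s2 from the training samples left out of each cycle. The function names, data, and number of cycles are hypothetical.

```python
import random


def fit_wa(Y, x):
    """WA optima; None marks a species absent from the boot-strap sample."""
    optima = []
    for k in range(len(Y[0])):
        total = sum(row[k] for row in Y)
        optima.append(sum(row[k] * xi for row, xi in zip(Y, x)) / total
                      if total > 0 else None)
    return optima


def predict_wa(sample, optima):
    pairs = [(y, u) for y, u in zip(sample, optima) if u is not None and y > 0]
    return sum(y * u for y, u in pairs) / sum(y for y, _ in pairs)


def bootstrap_errors(Y, x, fossil, n_boot=200, seed=0):
    rng = random.Random(seed)
    n = len(Y)
    out_of_bag = [[] for _ in range(n)]   # predictions when i is in the test set
    fossil_preds = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]   # resample with replacement
        model = fit_wa([Y[i] for i in idx], [x[i] for i in idx])
        fossil_preds.append(predict_wa(fossil, model))
        chosen = set(idx)
        for i in range(n):
            if i not in chosen:
                out_of_bag[i].append(predict_wa(Y[i], model))
    mean_fossil = sum(fossil_preds) / n_boot
    # s1: variability of the boot-strap estimates for the fossil sample
    s1 = sum((p - mean_fossil) ** 2 for p in fossil_preds) / n_boot
    # s2: observed value minus mean boot-strap estimate, over training samples
    used = [(x[i], sum(p) / len(p)) for i, p in enumerate(out_of_bag) if p]
    s2 = sum((xi - mb) ** 2 for xi, mb in used) / len(used)
    return mean_fossil, s1, s2, (s1 + s2) ** 0.5


# Hypothetical training set and fossil sample
Y = [[70.0, 30.0, 0.0], [50.0, 45.0, 5.0], [20.0, 60.0, 20.0],
     [5.0, 45.0, 50.0], [0.0, 30.0, 70.0]]
x = [4.5, 5.0, 5.5, 6.0, 6.5]

estimate, s1, s2, rmsep_sample = bootstrap_errors(Y, x, [30.0, 55.0, 15.0])
print(estimate, s1, s2, rmsep_sample)
```

Only s1 differs between fossil samples in this scheme; s2 is a property of the training set and is shared by all of them.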

For temperature RMSEP usually 1-1.5ºC (about 10% of the modern range sampled)

pH RMSEP usually 0.3-0.5 pH units (about 10%)

Components of RMSEP

(i) Within-lake variability - Heiri et al. (2002) Maximum of 15% of total RMSEP.

(ii) Variability in modern environmental data - Nilsson et al. (1996). Can be 30-40% (even 70%) of total RMSEP. Major problem. Cannot take account of natural variability of environmental data.

(iii) Variance in the model (model error or lack of fit).

What to do with sample specific errors?

There is a consistent temporal trend but also continuous overlap in RMSEP!

2. How do we identify signal from noise in reconstructions? LOESS smoothers are a help.
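A LOESS-type smoother can be sketched as a locally weighted straight-line fit with tricube weights. This is a simplified version without robustness iterations, written for illustration; the span `frac` and the series below are hypothetical.

```python
def loess(t, y, frac=0.5):
    """Local linear smoother with tricube weights, evaluated at each t."""
    n = len(t)
    k = max(2, int(round(frac * n)))      # number of points in each window
    smoothed = []
    for t0 in t:
        dists = sorted(abs(ti - t0) for ti in t)
        h = max(dists[k - 1], 1e-12)      # window half-width
        w = [(1.0 - min(abs(ti - t0) / h, 1.0) ** 3) ** 3 for ti in t]
        sw = sum(w)
        mt = sum(wi * ti for wi, ti in zip(w, t)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        s_tt = sum(wi * (ti - mt) ** 2 for wi, ti in zip(w, t))
        s_ty = sum(wi * (ti - mt) * (yi - my) for wi, ti, yi in zip(w, t, y))
        slope = s_ty / s_tt if s_tt > 0 else 0.0
        smoothed.append(my + slope * (t0 - mt))
    return smoothed


# Hypothetical reconstruction: sample ages and inferred July temperatures
age = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
temp = [11.2, 11.9, 11.4, 12.3, 12.0, 12.9, 12.4, 13.1, 12.8, 13.5]
print(loess(age, temp, frac=0.5))
```

A wider span gives a smoother curve but may remove real short-lived events, so the span itself is a judgment about what counts as signal.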


Brooks & Birks (2001)

Trends or RMSEP?

3. Different methods, although they have similar modern model performances, can give very different reconstruction results.

Birks (2003)

4. Some indication of consistent model bias when applied to fossil data.

MAT - low variability, insensitive

WA - some variability, overestimates at low values

WA-PLS - more variability

PCR - considerable variability

but in terms of modern model performance, all seem good in terms of RMSEP and maximum bias.

Extensive experiments using simulated independent test data-sets currently underway by Richard Telford are showing important model differences and biases.

5. Biological data, when sampled over natural environmental gradients, show a mixture of symmetric unimodal (40%) and monotonic responses (40%), some skewed unimodal responses (5%), and no statistically significant responses (ca. 15%); great variation in species tolerances or niche breadths; and a compositional turnover gradient of 3-4 standard deviations.

Perhaps too many monotonic responses to feel comfortable with a unimodal-based model like WA or WA-PLS but too many unimodal responses for linear-based models like PLS or PCR.

Classical approach based on Gaussian or multinomial logit regression and calibration (tried but dropped because of computational limitations in the 1990s) should be re-investigated, possibly within a Bayesian framework (e.g. Toivonen et al. (2001); Korhola et al. (2002)) but incorporating a priori ecological information about the species concerned (depth preferences, lake-chemical preferences, sediment preferences) as priors or conditionals.

6. Incorporation of species tolerances (niche widths) into WA-PLS is needed so that species with narrow tolerances ('good' indicators) have greater weight in the model.

7. Use non-linear deshrinking equations (e.g. smoothing spline) in WA or WA-PLS because the pattern of initially estimated x in relation to observed x is often non-linear, especially at the gradient ends ('edge effects').

8. Some species may show great dominance and abundance in some ecological settings ('weeds') but then occur with lower abundance in other settings. Great dominance can bias estimates of species parameters, not only of the few dominants, but also of the other species because of the percentage compositional constraint.

9. Do we really need all 200-300 species in a calibration set? Would a model based on only those species that are necessary for the model to perform well be more robust as it is not so 'overfitted' as a model based on 200-300 species?

ANN with a backward-elimination pruning algorithm, Racca et al. (2003)

SWAP diatom-pH data-set: 167 samples, 267 species, 18.5% +ve data entries, pH 4.3-7.3, species N2 1-120.9, sample N2 5.1-57.2

Could eliminate 85% of species with little change in model performance

                        RMSEP    Maximum bias
All 267 species         0.32     -0.44
37 species remaining    0.33     -0.46

Use difference between RMSE (apparent) and RMSEP (jack-knife) as a guide to possible model 'overfitting'.

Racca et al. (2003)

In general we have many species and few lakes in our modern calibration sets. 'Curse of dimensionality' and hence model overfitting.

Ideally ratio of species number to lake number should be as close to 1 as possible to minimise 'curse of dimensionality'.

Racca et al. (2003)

How to find minimal set of 'driving' species in WA or WA-PLS?

10. Covarying environmental variables e.g. temperature and lake trophic status (e.g. total N or P) or temperature and lake depth

Brodersen & Anderson (2002)

Anderson (2000)

pH and climate

Validate using another proxy - macrofossils of tree birch

Importance of independent validation

11. Use of different proxies - different proxies may give different reconstructions, e.g. mean July temperature at Bjørnfjell, northern Norway.

12. One large modern calibration data-set or several regional data-sets?

Merging data-sets increases the floristic diversity and environmental range of the resulting transfer function but can introduce further noise due to secondary environmental gradients.

Dynamic or local calibration data-set. Use MAT to find 10-20 closest modern analogues for each fossil sample in a core, and use these selected samples as a local calibration data-set for that site.

Current evidence suggests a modest improvement only in RMSEP and maximum bias of about 2-5%.

13. Hidden assumption number 5. 'Other environmental variables than, say, summer temperature have negligible influence, or their joint distribution with summer temperature in the past is the same as in the modern training set.'

Climate model and glaciological results suggest that the joint distribution between summer temperature and winter accumulation has not been the same in the past 11,000 years.

Good evidence to suggest that lake-water pH has decreased naturally (soil deterioration) whilst summer temperature rose and then fell in the last 11,000 years.

In Norway today, lake-water pH is negatively correlated with summer temperature because lakes of pH 6-7.5 are on basic rock, which in Norway happens to occur mainly at high altitudes and hence at low temperatures. In the past after deglaciation, almost all lakes had a higher pH than today, so the pH-temperature relationship in the past was different from today's.

PROJECTS THAT HAVE STIMULATED OUR TRANSFER FUNCTION WORK

SWAP (Surface Water Acidification Project) 1987-1990

NORPAST 1998-2002

NORPAST-2 2003-

1995-2000

2000-2004

NFR KILO 1993-1996

NFR SETESDAL 1996-1999

EU CHILL 1998-2001

PERSON WHO HAS STIMULATED OUR TRANSFER FUNCTION WORK

Cajo ter Braak, April 1992

Major attributes of Cajo:

1. Wonderful person and loyal friend

2. Exceptional scientist with over 7400 citations

3. Revolutionised numerical ecology and quantitative palaeoecology with his creative ideas, remarkable powers of synthesis, and genius at working at the interface between practical ecology and statistical theory.

Thank you Cajo, for all your contributions in the last 20 years.
