John Birks University of Bergen, University College London, and University of Oxford

Quantitative Environmental Reconstructions in Palaeoecology: Progress, current status, & future needs


Tage Nilsson Lecture, Centre for GeoBiosphere Science, University of Lund, 7 March 2013

INTRODUCTION

Early attempts at quantitative environmental reconstructions used the presence of one or more ‘indicator species’ (e.g. Andersson, Samuelsson, Iversen, Grichuk, Coope) or species groups (e.g. Hustedt, Nygaard). A major development in Quaternary science occurred in 1971 with the publication of the classic paper by Imbrie & Kipp. The paper laid the foundation of calibration functions (transfer functions) as a tool for the quantitative reconstruction of past environments using the whole fossil assemblage, not just a few indicator species. Paradigm shift, not only in palaeoceanography but also in quantitative palaeoecology.

Quickly followed by Webb & Bryson (1972), who used pollen data from the Midwest, USA, to reconstruct climate. Used linear-based canonical correlation analysis. Palaeoclimatology

Basic Biological Assumptions

Marine planktonic foraminifera – Imbrie & Kipp 1971

Foraminifera are a function of sea-surface temperature (SST). Foraminifera can be used to reconstruct past SST.

Pollen is a function of regional vegetation – Webb & Bryson 1972

Regional vegetation is a function of climate. Pollen is an indirect function of climate and can be used to reconstruct past regional climate at a broad spatial scale.

Chironomids (aquatic non-biting midges) are a function of lake-water temperature – Walker et al. 1991

Lake-water temperature is a function of climate. Chironomids are an indirect function of climate and can be used to reconstruct past climate, but problems may arise.

Freshwater diatoms are a function of lake-water chemistry – Renberg & Hellberg 1982

Diatoms can be used to reconstruct past lake-water chemistry.

Basic Approach to Quantitative Environmental Reconstruction – Calibration-in-Space

Yf: fossil data (e.g. diatoms), the ‘proxy data’; t samples × m taxa

Xf: environmental variable (e.g. pH); t samples × 1 variable. Unknown, to be estimated or reconstructed

To solve for Xf, need modern data about species and pH from n samples

Model Ym in relation to Xm to derive modern calibration function Ûm

Apply Ûm to Yf to estimate past environment Xf

Imbrie & Kipp provided the basic theory and assumptions, a robust method, and modern and fossil data

Ym: modern biology (e.g. diatoms); n samples × m taxa

Xm: modern environment (e.g. pH); n samples × 1 variable

Juggins & Birks (2012)

[Figure: calibration-in-space schematic relating Xm, Ym, Yf, and Xf through the transfer function Ûm]

Alternative Approach – Calibration-in-Time

Yf: fossil data (e.g. diatoms); t samples × m taxa

Xf: environmental variable (e.g. pH); t samples × 1 variable. Unknown, to be reconstructed

Y0: p samples × m taxa

X0: p observations × 1 variable. Known from historical data

All done at one site

To solve for Xf, model Y0 in relation to X0, then derive the calibration function F̂0 and apply it to Yf to estimate Xf

Potential problems

1. Temporal autocorrelation in Y0 and X0. How many independent samples are there? What is n?

2. Chronological – sample correlation between Y0 and X0

3. Applicability – can the model be applied to sites other than the site where the calibration is made? Similar problem of applicability with intra-lake approach (Ym and Xm to derive Ûm from one lake applied to other non-training set lakes).

Only consider Calibration-in-Space

In palaeolimnology, after Nygaard’s (1956) α, ω, and ε indices and Meriläinen’s (1967) calibration, the first major step towards robust environmental reconstructions was made in 1982 by Renberg & Hellberg with their Index B

ind = indifferent species (either side of pH 7)
acp = acidophilous (pH < 7)
acb = acidobiontic (pH < 7, optimum 5.5 or less)
alk = alkaliphilous (pH 7 or more)
alb = alkalibiontic (pH > 7)

Represented a great breakthrough, only 30 years ago

Renberg & Hellberg (1982)

State of the subject in palaeolimnology prior to 1989

1986

Discussed diatom-pH calibration functions in Lund with Rick Battarbee in 1986. Suggested how they might be improved

Major breakthrough occurred in 1989 as result of work of Cajo ter Braak with his 1987 doctoral thesis

Advances in Ecological Research 1988

Several important papers that have been very influential on quantitative palaeolimnology

Through his work at the Research Institute for Nature Management at Leersum, ter Braak advised ecologists about data analysis and developed many new techniques to help answer particular ecological questions.

One such ecologist was the diatomist Herman van Dam who was working on the impact of acidification on diatoms and water chemistry of Dutch moorland ponds (this work led ter Braak to publish his first paper on multivariate data analysis (principal component biplots) in 1982).

This collaboration led to ter Braak & van Dam (1989)

Changed the approaches to quantitative environmental reconstruction in palaeolimnology (and in much of palaeoecology)

Fortunately coincided with Surface Water Acidification Project’s (SWAP) Palaeolimnology Programme led by Rick Battarbee and Ingemar Renberg 1987-1990.

ter Braak & van Dam (1989)

99 training-set diatom-pH samples; 61 independent test-set diatom samples. RMSEP is root mean squared error of prediction (‘standard error’). Generally want it as low as possible
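RMSEP can be computed directly from paired observed and predicted values. A minimal sketch, assuming plain Python lists; the function names and the pH values are hypothetical and are not taken from ter Braak & van Dam (1989):

```python
import math


def rmsep(observed, predicted):
    """Root mean squared error of prediction for paired values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - o) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mean_bias(observed, predicted):
    """Mean of (predicted - observed); a systematic over- or under-estimate."""
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical test-set values (illustration only)
observed_ph = [4.8, 5.2, 5.9, 6.4, 7.1]
predicted_ph = [5.0, 5.1, 6.3, 6.2, 6.8]
print("RMSEP:", round(rmsep(observed_ph, predicted_ph), 2))
print("mean bias:", round(mean_bias(observed_ph, predicted_ph), 2))
```

RMSEP is in the units of the environmental variable (here pH units), so values are only comparable between models fitted to the same variable.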

Test-set RMSEP

Maximum-likelihood
  Gaussian logit                          0.63
  Multinomial logit (equal tolerances)    0.70
  Multinomial logit                       0.67

Weighted averaging
  No tolerance downweighting              0.71
  Tolerance downweighting                 0.74
  pH groups                               0.75

Multiple regression
  26 taxa                                 2.24
  7 step-wise selected taxa               2.74
  pH groups                               0.71

Index B                                   0.83

Correspondence analysis regression        0.71

Set the scene for weighted-averaging based methods – computationally simple, heuristic equivalents to the theoretically more rigorous maximum-likelihood methods.

Biological Proxy-Data Properties

• Contain many taxa (200-300)

• Contain many zero values (absences)

• Commonly expressed as proportions or percentages - "closed" compositional data

• Multicollinearity between variables

• Quantitative data are highly variable, invariably show a skewed distribution. Few common taxa, many rare taxa

• Can show spatial autocorrelation e.g. forams, dinocysts, pollen

• Taxa generally have non-linear relationship with their environment, and the relationship is often a unimodal function of the environmental variables

Species Response Models

UNIMODAL

A unimodal relation between the abundance value (y) of a species and an environmental variable (x) (u = optimum or mode; t = tolerance; c = maximum). Modelled by Gaussian logit regression (GLR).

LINEAR

A straight line displays the linear relation between the abundance value (y) of a species and an environmental variable (x). Modelled by linear regression.
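The unimodal parameters can be recovered from fitted regression coefficients. A minimal sketch, assuming the usual GLR parameterisation logit(p) = b0 + b1·x + b2·x² with b2 < 0; the function names are hypothetical and the fitting step itself (a logistic regression per taxon) is not shown:

```python
import math


def gaussian_response(x, u, t, c):
    """Gaussian (unimodal) response curve: expected abundance at x."""
    return c * math.exp(-0.5 * ((x - u) / t) ** 2)


def glr_to_optimum_tolerance(b0, b1, b2):
    """Convert coefficients of logit(p) = b0 + b1*x + b2*x**2 to (u, t, c)."""
    if b2 >= 0:
        raise ValueError("b2 must be negative for a unimodal response")
    u = -b1 / (2.0 * b2)               # optimum: where the quadratic peaks
    t = 1.0 / math.sqrt(-2.0 * b2)     # tolerance: width of the curve
    peak = b0 + b1 * u + b2 * u * u    # value of the quadratic at the optimum
    c = 1.0 / (1.0 + math.exp(-peak))  # maximum probability or proportion
    return u, t, c


# Hypothetical coefficients for one taxon along a pH gradient
u, t, c = glr_to_optimum_tolerance(b0=-15.125, b1=5.5, b2=-0.5)
print("optimum:", u, "tolerance:", t, "maximum:", c)
```

A taxon with b2 ≥ 0 has no interior maximum along the gradient, which is why the conversion refuses such coefficients rather than returning an optimum.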

Environmental Data Properties

• Generally few variables, often show a skewed distribution

• Strong multicollinearity (e.g. July mean temperature, growing season duration, annual mean temperature)

• Often difficult to obtain (few modern climate stations, corrections for altitude of sampling sites, etc.)

• Strong spatial autocorrelation (tendency of values at sites close to each other to resemble one another more than randomly selected sites). Values at one site can be partially predicted from its values at neighbouring sites.

• Problem of nearly all data in real world. Recognised by Francis Galton in 1889. First methods to eliminate spurious correlation due to spatial position developed by ‘Student’ in 1914.

PROGRESS

Since 1971, calibration functions widely used in palaeoceanography, terrestrial palaeoecology, and palaeolimnology

Used with wide range of biological proxies

• foraminifera, radiolaria, marine diatoms, coccolithophores

• pollen, testate amoebae, mollusca, bryophytes, plant macrofossils

• diatoms, chrysophytes, chironomids, ostracods, cladocerans

Now many different numerical reconstruction methods – at least 26 methods published, many minor variants of established methods

Reconstruction methods can be divided into three main types (Birks et al. 2010)

1. Indicator-species approach – one or many taxa considered as presence/absence

2. Similarity-based assemblage methods involving a quantitative comparison between past assemblages Yf and modern assemblages Ym (e.g. MAT, smooth response surfaces)

3. Multivariate calibration methods involving a quantitative calibration function Ûm estimated from Xm and Ym, modern calibration or training data-set (e.g. weighted averaging regression and calibration)

Concentrate on calibration-function approach

Approaches to Estimating Calibration Functions

1. Basic Numerical Models

• Classical Approach

Y = f(X) + error   (Y = biology, X = environment)

Estimate f by some mathematical procedure and 'invert' the estimated f to find the unknown past environment Xf from fossil data Yf

Xf ≈ f⁻¹(Yf)

Can be difficult computationally

• Inverse Approach

In practice, for various mathematical reasons, do an inverse regression or calibration

X = g(Y) + error

Obtain 'plug-in' estimate of past environment Xf from fossil data Yf

Xf = g(Yf)

Easier to compute g and nearly always performs as well as classical approach

f and g are calibration functions
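The difference between the two approaches can be seen with a single taxon and a straight-line model. A minimal sketch, assuming ordinary least squares and hypothetical data; real calibration functions use the whole assemblage, not one taxon:

```python
def fit_line(x, y):
    """Ordinary least-squares intercept and slope for y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def classical_estimate(x_modern, y_modern, y_fossil):
    """Classical: fit Y = f(X), then invert f to estimate X from a fossil Y."""
    a, b = fit_line(x_modern, y_modern)
    return (y_fossil - a) / b


def inverse_estimate(x_modern, y_modern, y_fossil):
    """Inverse: fit X = g(Y) directly and plug in the fossil Y."""
    c, d = fit_line(y_modern, x_modern)
    return c + d * y_fossil


# Hypothetical training data: lake pH and % abundance of one acid-loving taxon
ph = [4.5, 5.0, 5.5, 6.0, 6.5, 7.0]
abundance = [62.0, 55.0, 41.0, 38.0, 24.0, 20.0]
print("classical:", round(classical_estimate(ph, abundance, 10.0), 2))
print("inverse:  ", round(inverse_estimate(ph, abundance, 10.0), 2))
```

With noisy data the inverse estimate lies closer to the training-set mean than the classical one, because the two slopes differ by a factor of r². This is the shrinkage that deshrinking steps in weighted averaging are designed to correct.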

2. Assumed Species Response Model

• Linear or unimodal

• No response model assumed (linear or non-linear)

3. Dimensionality of Model

• Full (all species considered)

• Reduced (selected components of species used)

4. Estimation Procedure for Model

• Global (estimate parametric functions, extrapolation possible)

• Local (estimate non-parametric functions, extrapolation not possible)

Birks et al. (2010)

Commonly Used Methods

Method                                   Approach  Response  Dimensionality  Estimation  Type
Principal components regression (PCR)    I         L (U)     R               G           CF
Index B                                  I         L         R               G           CF
Inverse multiple regression              I         L         R               G           CF
Partial least squares (PLS)              I         L         R               G           CF
Gaussian logit regression (GLR)          C         U         F               G           CF
Two-way weighted averaging (WA)          I         U         F               G           CF
WA-PLS                                   I         U         R               G           CF
Artificial neural networks (ANN)         I         NA        F               Ln          CF
Modern analogue technique (MAT)          I         NA        F               Ln          S
Smooth response surfaces                 C         NA        F               Ln          S

I = inverse; C = classical
L = linear; U = unimodal; NA = not assumed
R = reduced dimensionality; F = full dimensionality
G = global parametric estimation; Ln = local non-parametric estimation
CF = calibration-function based; S = similarity-based

Good reasons for preferring methods with assumed biological response model, full dimensionality, and global parametric estimation (ter Braak (1995), ter Braak et al. (1993), etc.)

1. Can test statistically if taxon A has a statistically significant relation to particular environmental variables

2. Can develop ‘artificial’ simulated data with realistic assumptions for numerical ‘experiments’

3. Such methods have clear and testable assumptions – less of a ‘black box’ than e.g. artificial neural networks

4. Can develop model evaluation or diagnostic procedures analogous to regression diagnostics in statistical modelling

5. Having a statistical basis, can adopt well-established principles of statistical model selection and testing. Minimises ‘ad hoc’ aspects of MAT

“To make sense of an observation, everyone needs a model … whether he or she knows it or not” Marc Kéry (2010)

Basic Requirements in Quantitative Palaeoenvironmental Reconstructions

1. Need biological system with abundant fossils that is responsive and sensitive to environmental variables of interest.

2. Need a large, high-quality training set of modern samples. Should be representative of the likely range of variables, be of consistent taxonomy and nomenclature, be of highest possible taxonomic detail, be of comparable quality (methodology, count size, etc.), and be from the same sedimentary environment.

3. Need fossil set of comparable taxonomy, nomenclature, quality, and sedimentary environment.

4. Need robust statistical methods for regression and calibration that can adequately model taxa and their environment with the lowest possible error of prediction and the lowest bias possible and sound methods for model selection.

5. Need means of establishing if reconstruction is statistically significant.

6. Need statistical estimation of standard errors of prediction for each reconstructed value.

7. Need statistical and ecological evaluation and validation of the reconstruction and of each reconstructed value.

Birks et al. (1990)

Principal components regression (PCR) = Imbrie & Kipp (1971) approach

[Diagram: Ym reduced to principal components PC1, PC2, PC3, which are then related to Xm]

Multiple linear regression or quadratic regression of Xm on PC1, PC2, PC3, etc, to derive Ûm. Express Yf as principal components and apply Ûm to estimate Xf

Principal components maximise variance within Ym only

Selection of PCA components done visually until recently. Now cross-validation is used to select model with fewest components, lowest root mean squared error of prediction (RMSEP), & lowest maximum bias. ‘Minimal adequate model’ in statistical modelling

Inverse, linear, reduced dimensionality, global estimation. Linear response model is assumed, although non-linear responses are possible.

Early Methods Used

Index B approach

Inverse, linear, reduced dimensionality, global parametric estimation. Needs a priori taxon groupings

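[Diagram: Ym grouped a priori into ind, acp, acb, alk, and alb taxa; together with Xm these give Index B (Ûm), which is applied to Yf (fossil data) to give the pH reconstruction Xf]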

Related inverse multiple linear regression approach (Davis & Berge 1980, Charles 1982, Davis et al. 1983, Davis & Anderson 1984, Flower 1986)

Inverse, linear, reduced dimensionality, global parametric estimation. Linear model is assumed, although non-linear responses are possible. Can be done with a priori species groups or individual taxa (forward selection).

[Diagram: Ym (ind, acp, acb, alk, and alb groups) and Xm give Ûm, which is applied to Yf (fossil data) to give the pH reconstruction Xf]

Gaussian logit regression (GLR) and maximum likelihood (ML) calibration

[Diagram: modern data (Ym + Xm) give the GLR regression coefficients b0, b1, b2 for each taxon (Ûm); ML calibration applies them to the fossil data Yf to give the environmental reconstruction Xf]

Classical, unimodal, full dimensionality, global estimation. Robust to spatial autocorrelation. Can be computationally difficult. ML calibration finds the value of Xf that maximises the likelihood function given Yf and Ûm

ter Braak & van Dam (1989)

Major Methods Used

Two-way weighted averaging regression and calibration (WA)

[Diagram: WA regression on the modern data (Ym + Xm) gives the taxa WA optima U1, U2, ..., Ut (the ‘calibration function’ Ûm); WA calibration applies them to the fossil data Yf to give the environmental reconstruction Xf]

Inverse, unimodal, full dimensionality, global parametric estimation. Robust to spatial autocorrelation. First used in Quaternary science by Lynts and Judd (1971) Science 171: 1143-1144

ter Braak & van Dam (1989); Birks et al. (1990)
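The two WA steps are simple enough to write out in full. A minimal sketch, assuming percentage data held as lists of rows, no tolerance downweighting, and inverse deshrinking (a regression of observed values on the initial WA estimates); the function names and data are hypothetical:

```python
def wa_optima(Y, x):
    """WA regression: each taxon's optimum is the abundance-weighted mean of x."""
    optima = []
    for k in range(len(Y[0])):
        total = sum(row[k] for row in Y)
        if total == 0:
            optima.append(None)  # taxon absent from the training set
        else:
            optima.append(sum(row[k] * xi for row, xi in zip(Y, x)) / total)
    return optima


def wa_initial(row, optima):
    """WA calibration: abundance-weighted mean of the optima of taxa present."""
    pairs = [(y, u) for y, u in zip(row, optima) if u is not None and y > 0]
    total = sum(y for y, _ in pairs)
    if total == 0:
        raise ValueError("sample has no taxa with an estimated optimum")
    return sum(y * u for y, u in pairs) / total


def fit_wa(Y, x):
    """Estimate optima, then fit the inverse deshrinking regression."""
    optima = wa_optima(Y, x)
    initial = [wa_initial(row, optima) for row in Y]
    mean_x = sum(x) / len(x)
    mean_i = sum(initial) / len(initial)
    sii = sum((i - mean_i) ** 2 for i in initial)
    six = sum((i - mean_i) * (xi - mean_x) for i, xi in zip(initial, x))
    slope = six / sii
    return {"optima": optima, "intercept": mean_x - slope * mean_i, "slope": slope}


def predict_wa(model, Y_fossil):
    """Deshrunk WA estimates for each fossil sample."""
    return [model["intercept"] + model["slope"] * wa_initial(row, model["optima"])
            for row in Y_fossil]


# Hypothetical training set: 5 lakes, 3 diatom taxa (%), and lake pH
Y_modern = [[80, 20, 0], [60, 35, 5], [30, 50, 20], [10, 45, 45], [0, 25, 75]]
x_modern = [4.5, 5.0, 5.8, 6.5, 7.2]
model = fit_wa(Y_modern, x_modern)
print([round(u, 2) for u in model["optima"]])
print([round(v, 2) for v in predict_wa(model, [[70, 30, 0], [5, 40, 55]])])
```

Averages are taken twice (once for optima, once for samples), which compresses the range of the initial estimates; the deshrinking regression stretches them back out. Absent taxa contribute nothing to either step, which is the sense in which WA ignores absences.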

1. Ecologically plausible – based on unimodal species response model.

2. Mathematically simple but has a rigorous mathematical theory. Properties fairly well known now.

3. Empirically powerful:
   a. does not assume linear responses
   b. not hindered by too many taxa, in fact helped by many taxa! Full dimensionality
   c. relatively insensitive to outliers

4. Tests with simulated and real data – at its best with noisy, taxon-rich compositional percentage data with many zero values over long environmental gradients.

5. Because of its computational simplicity, can derive error estimates for predicted inferred values by bootstrapping.

6. Does well in ‘non-analogue’ situations as it is not based on the assemblage as a whole but on INDIVIDUAL taxa optima and/or tolerances. Robust to spatial autocorrelation. Global parametric estimation.

7. Ignores absences of taxa.

Weaknesses

1. Sensitive to distribution of environmental variable in training set, leading to ‘edge effects’ where responses are truncated.

2. Disregards residual correlations in biological data.

[Figure: WA and GLR along a pH gradient; J. Oksanen (2002)]

Can extend WA to WA-partial least squares to include residual correlations in biological data in an attempt to improve estimates of taxon optima

Weighted averaging partial least squares regression and calibration (WA-PLS)

ter Braak & Juggins (1993) and ter Braak et al. (1993)

[Diagram: Ym and Xm give components PLS1, PLS2, PLS3]

Components selected to maximise covariance between taxon weighted averages and environmental variable X

Selection of number of PLS components to include based on cross-validation. Model selected should have fewest components possible and low RMSEP and maximum bias – minimal adequate model. Inverse, unimodal, reduced dimensionality, global parametric estimation. Can be sensitive to spatial autocorrelation.

[Diagram: WA-PLS regression gives the coefficients βm (Ûm); WA-PLS calibration applies them to Yf to estimate Xf]

Comparison of different methods

Imbrie & Kipp (1971) data

Model performance statistic is root mean squared error of prediction (RMSEP) based on leave-one-out cross-validation

RMSEP                                  Summer SST   Winter SST

Linear
  PC regression                        2.55°C       2.57°C
  PC regression with quadratic terms   2.15°C       1.54°C

Unimodal
  CA regression                        1.72°C       1.37°C
  GLR (ML)                             1.63°C       1.20°C
  WA                                   2.02°C       1.07°C
  WA-PLS                               1.53°C       1.17°C

Shows importance of using a unimodal-based method (ter Braak et al. (1993))

Other Areas of Progress

Besides the development of new methods for deriving calibration functions and of modern calibration data-sets, there have been major developments in model evaluation and selection and in reconstruction assessment (the statistics of calibration functions), and in understanding the strengths and weaknesses of different methods and their underlying theory

See Juggins (2013 QSR)

1. Model evaluation and selection

Tendency to use several different methods and to select so-called ‘best’ method. Resulted in a shift from an obsession with the model with lowest RMSEP or, even worse, the highest r2.

More concern with model performance statistics including estimates of bias and number of components fitted (e.g. in WA-PLS).

Model performance usually based on some form of internal cross-validation (leave-one-out, n-fold cross-validation, or bootstrapping) or external cross-validation with independent test-set.
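Leave-one-out cross-validation refits the model once per sample, each time predicting the sample that was left out. A minimal sketch, assuming a generic fit/predict pair and using a bare WA model (no deshrinking) as the example; all names and data are hypothetical:

```python
import math


def loo_rmsep(Y, x, fit, predict):
    """Leave-one-out RMSEP: each sample is predicted by a model fitted without it."""
    squared_errors = []
    for i in range(len(x)):
        model = fit(Y[:i] + Y[i + 1:], x[:i] + x[i + 1:])
        squared_errors.append((predict(model, Y[i]) - x[i]) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def fit_wa(Y, x):
    """Taxon optima as abundance-weighted means of x (None if taxon absent)."""
    optima = []
    for k in range(len(Y[0])):
        total = sum(row[k] for row in Y)
        weighted = sum(row[k] * xi for row, xi in zip(Y, x))
        optima.append(weighted / total if total else None)
    return optima


def predict_wa(optima, row):
    """Abundance-weighted mean of the optima of the taxa present in one sample."""
    pairs = [(y, u) for y, u in zip(row, optima) if u is not None and y > 0]
    return sum(y * u for y, u in pairs) / sum(y for y, _ in pairs)


# Hypothetical training set: 5 lakes, 3 taxa (%), and lake pH
Y = [[80, 20, 0], [60, 35, 5], [30, 50, 20], [10, 45, 45], [0, 25, 75]]
x = [4.5, 5.0, 5.8, 6.5, 7.2]

full_model = fit_wa(Y, x)
apparent = math.sqrt(
    sum((predict_wa(full_model, row) - xi) ** 2 for row, xi in zip(Y, x)) / len(x)
)
print("apparent RMSE:", round(apparent, 2))
print("LOO RMSEP:    ", round(loo_rmsep(Y, x, fit_wa, predict_wa), 2))
```

The apparent error reuses the training samples for testing and so tends to be optimistic; the cross-validated value is the one to report. The same loop extends to n-fold cross-validation by leaving out groups of samples instead of single samples.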

Juggins & Birks (2012)

Birks & Simpson (2013) revisited the classical SWAP 167-sample diatom-pH calibration-set using modern methods (WA, WAPLS, GLR, MAT, etc.)

1. Internal cross-validation, done 50 times

167 samples = 110 training-set samples + 20 optimisation-samples (no. of WAPLS components, etc.) + 37 test-samples

2. External cross-validation, done 50 times

167 samples = 167 training-set samples, plus 23 external optimisation-samples + 50 external test-samples

[Figure: internal cross-validation, 37 test-samples, 50 randomisations; Birks & Simpson (2013)]

[Figure: external cross-validation, 50 test-samples, 50 randomisations; Birks & Simpson (2013)]

RMSEP values (I = inverse; C = classical; M = monotonic; T = tolerance downweighting)

Internal cross-validation

WAI = WAC = WAM = WTM < WATI = WATC = MAT < WAPLS < GLR

External cross-validation

GLR < WAM = WTM < WAI = WAPLS < WAC < WATI < MAT < WATC

Which to use as a guide to model selection?

External cross-validation involving independent test-set samples is ‘the appropriate benchmark to compare methods’ because all sources of error are considered (ter Braak & van Dam 1989)

van der Voet (1994) randomisation test of models helps find ‘minimal adequate model’ (MAM).

Model with good performance statistics and fewest number of fitted parameters. May be more than one MAM.

More work needed on model selection using criteria like Akaike Information Criterion (AIC) where unnecessary parameters are penalised. Active research area in ecology and evolutionary biology today.

Of course, performance of modern model is being assessed with other modern data, not with fossil data! Major problem. External cross-validation provides as rigorous a test as possible of performance.

2. Effects of spatial autocorrelation

Estimating model performance in terms of RMSEP, r2, maximum bias, etc, assumes that the test-set is statistically independent of the training-set. Cross-validation in presence of spatial autocorrelation violates this assumption as test samples are not spatially and statistically independent.

Spatial autocorrelation property of almost all environmental data and much ecological and biological data.

Telford & Birks (2005) Quat. Sci. Rev. 24: 2173-2179
Telford (2006) Quat. Sci. Rev. 25: 1375-1382
Telford & Birks (2009) Quat. Sci. Rev. 28: 1309-1316
Telford & Birks (2011) Quat. Sci. Rev. 30: 3210-3213

Results show the apparent performance of some models is enhanced as a result of spatial autocorrelation in oceans and on land

Problems in finding spatially independent test-sets to test inference models

Telford & Birks (2009) have developed methods for cross-validating a calibration function in presence of spatial autocorrelation, h-block cross-validation
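The idea behind h-block cross-validation is to drop every training sample within a distance h of the test sample before refitting. A minimal sketch, assuming planar coordinates with Euclidean distance (real applications would use geographic distances) and a generic fit/predict pair; the names are hypothetical and this is not the implementation of Telford & Birks (2009):

```python
import math


def h_block_training_indices(coords, test_index, h):
    """Indices of samples farther than h from the test sample."""
    x0, y0 = coords[test_index]
    return [
        i for i, (xi, yi) in enumerate(coords)
        if i != test_index and math.hypot(xi - x0, yi - y0) > h
    ]


def h_block_rmsep(Y, x, coords, h, fit, predict):
    """RMSEP where each test sample's neighbours within h are excluded from fitting."""
    squared_errors = []
    for i in range(len(x)):
        keep = h_block_training_indices(coords, i, h)
        if not keep:
            continue  # no sufficiently distant samples remain for this test sample
        model = fit([Y[j] for j in keep], [x[j] for j in keep])
        squared_errors.append((predict(model, Y[i]) - x[i]) ** 2)
    if not squared_errors:
        raise ValueError("h is too large: no test sample has any training samples left")
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical site coordinates (arbitrary units)
sites = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (10.0, 0.0)]
print(h_block_training_indices(sites, test_index=0, h=2.0))
```

With h = 0 this reduces to ordinary leave-one-out cross-validation. Comparing RMSEP across increasing h shows how much of the apparent performance depends on near neighbours.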

Spatial autocorrelation does not appear to be a problem in many palaeolimnological calibration-sets. May be a problem in within-lake calibration-sets developed for water-level reconstructions (Velle et al. 2012)

Method      Effect of spatial autocorrelation   Estimation
MAT, ANN    High                                Local, non-parametric estimation
WA-PLS      Some                                Global, parametric + potentially some local estimation
GLR, WA     Low                                 Global parametric estimation

3. Partitioning Root Mean Squared Error of Prediction

Model uncertainty commonly expressed as RMSEP

s1: Error due to variability in estimates of taxon parameters in training-set (model error or lack of fit): 20-25%

s2: Error due to variation in taxon abundances at a given environmental value: 75-80%

1. Within-lake variability (Heiri et al. 2002): c. 15-20%

2. Variability in modern environmental data (Nilsson et al. 1996): 25-40% (up to 60%). Models cannot, at present, take account of variation in environmental data

3. Variability in assemblages at a given environmental value due to unknown historical, ecological, stochastic, taphonomic, etc, processes. Unexplained variation: 10-35%

Can only hope to reduce RMSEP by 20-25%

4. Testing the statistical significance of a quantitative palaeoenvironmental reconstruction

Telford & Birks 2011 Quat. Sci. Rev. 30: 1272-1278

H.H. Birks et al. 2012 Quat. Sci. Rev. 33: 100-120

All calibration-function programs will produce output or ‘reconstruction’

Does the resulting reconstruction explain more of the variance in the fossil data than most (say 95%) reconstructions derived from calibration functions trained on random environmental data?

If it does, then it is statistically significant.

Global test of significance

Stages

• PCA of fossil core data to determine the maximum amount of variance explicable by one axis or latent variable, say 30%

• Do a reconstruction and use the reconstruction as an ‘environmental’ variable in a RDA to see how much variance the reconstruction explains, say 20%

• Do 999 reconstructions using the same biological data, modern and fossil, but with environmental data drawn from a uniform distribution

• Derive an empirical distribution of variance explained based on 999 randomisations and calculate the p-value of the actual reconstruction as

p = (Number of reconstructions ≥ 20%, including the actual one) / (Number of reconstructions + 1, the actual one)

Telford & Birks (2011)
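The last two stages can be sketched in a few lines. A minimal sketch, assuming the variance explained by a reconstruction is measured as in an RDA with one explanatory variable on column-centred fossil data; the function names are hypothetical, and producing the random-environment reconstructions themselves is not shown:

```python
def variance_explained(Y, reconstruction):
    """Proportion of total variance in Y explained by a single variable."""
    n = len(reconstruction)
    mean_r = sum(reconstruction) / n
    r = [v - mean_r for v in reconstruction]
    srr = sum(v * v for v in r)
    explained = 0.0
    total = 0.0
    for k in range(len(Y[0])):
        column = [row[k] for row in Y]
        mean_c = sum(column) / n
        c = [v - mean_c for v in column]
        total += sum(v * v for v in c)
        # sum of squares of this taxon's fitted values from regressing on r
        explained += sum(a * b for a, b in zip(r, c)) ** 2 / srr
    return explained / total


def randomisation_p_value(observed, null_values):
    """Share of reconstructions (random ones plus the actual one) that explain
    at least as much variance as the actual reconstruction."""
    at_least = sum(1 for v in null_values if v >= observed)
    return (at_least + 1) / (len(null_values) + 1)


# Hypothetical proportions of variance explained (illustration only)
null = [0.05, 0.10, 0.25]
print(randomisation_p_value(0.20, null))
```

Counting the actual reconstruction in both numerator and denominator means the smallest attainable p-value with 999 randomisations is 1/1000.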

Round Loch of Glenhead, p = 0.006

Can test if more than one reconstruction made from one biological data-set is statistically significant.

Chukchi Sea dinoflagellates – summer sea-surface temperature; sea-ice duration; summer salinity

Summer salinity not significant (p = 0.146)

What about ice duration and SST?


Partial out SST first as it explains marginally more of the variance (p = 0.003). Ice no longer significant when SST is allowed first. No significant independent information.

Applicable to almost all reconstruction methods, not just WA or WA-PLS


5. Evaluation of individual reconstructed estimates

Assuming overall reconstruction is statistically significant, some individual estimates may be less reliable than others (poor preservation, unusual composition or peak, etc). Need to evaluate individual reconstructed values. Local evaluation

• Goodness-of-fit measures for each individual fossil sample, as in regression modelling (Birks et al. 1990)

• Analogue statistics (Birks et al. 1990; Simpson 2007)

• Proportions of taxa in fossil assemblage absent or rare in modern training data with no or poorly estimated taxon parameters (Birks 1998)

• Sample-specific errors for reconstructed values estimated by bootstrapping, bagging (aggregated bootstrapping) or Monte Carlo simulation (Birks et al. 1990)

What to do with sample-specific errors?

Birks & Peglar (unpub.)

Has a statistically significant (p=0.009) reconstruction but there is also a continuous overlap in RMSEP. Problems of temporal autocorrelation in assessing RMSEP for samples.

Unresolved

6. Highlighting ‘signal’ from ‘noise’ in reconstructions

Use of LOESS smoother a great help

Sample-specific errors or LOESS smoother

Brooks & Birks (2001)

Seppä & Birks (2002)

7. Ecological validation

Compare reconstructed values with historical data. Rarely possible as few historical data exist.

But when done, sometimes the model that gives the closest correspondence is not the model with lowest RMSEP or maximum bias!

Conflict between model performance and selection based on cross-validation of modern data and validation results using independent historical test-sets

Renberg & Hultberg (1992)

8. Palaeoecological validation by multi-proxy data

Similar trends, different absolute values. Not surprising, given different biology of different groups of organisms

Birks & Ammann (2000)

CURRENT STATUS AND PROBLEMS

1. Assumptions in quantitative palaeoenvironmental reconstructions

The biggest set of problems is that the calibration-function approach, like any other quantitative procedure, makes assumptions, as originally stated by Imbrie & Kipp (1971), Imbrie & Webb (1981), and Birks et al. (1990).

These assumptions are being increasingly violated, especially in the last 5-10 years.

What are these assumptions?

1. Taxa in training set (Ym) are systematically related to the physical environment (Xm) in which they live

2. Environmental variable (Xf, e.g. summer temperature) to be reconstructed is, or is linearly related to, an ecologically important variable in the system over the time period of interest

3. Taxa in the training set (Ym) are the same as in the fossil data (Yf) and their ecological responses (Ûm) have not changed significantly over the timespan represented by the fossil assemblage

4. Mathematical methods used in regression and calibration adequately model the biological responses (Ûm) to the environmental variable (Xm)

5. Other environmental variables than, say, summer temperature have negligible influence, or their joint distribution with summer temperature in the fossil set is the same as in the training set

6. In model evaluation by cross-validation, the test-data are independent of the training data

Imbrie & Kipp (1971), Imbrie & Webb (1981), Birks et al. (1990), Telford & Birks (2005), Juggins & Birks (2012), Juggins (2013)

2. Multiple-variable reconstructions – what variables can be reconstructed?

Assumption 2 requires that the environmental variable (Xm, Xf) should explain a significant (statistically and ecologically) and independent portion of the variation in the biological data (Ym, Yf)

Increasing tendency to reconstruct 2 or 3, even 7-8, environmental variables that on the basis of current ecological knowledge of, e.g., vegetation, chironomids, or diatoms, cannot all be ‘ecologically important’ (assumption 2)

e.g. mean January, mean July, and mean annual temperature, growing degree days above 0°C and above 5°C, annual precipitation, and the evaporation : potential evaporation ratio.

Ecological data are not usually influenced by 8 independent ‘ecologically important’ variables. Usually only 1-3 significant ordination axes.

All variables may be statistically significant in a RDA or CCA when considered individually (‘marginal’ effects) but almost certainly not significant when considered together (‘conditional’ effects, high multicollinearity, variance inflation factors). Many reconstructions of, for example, ‘distance to littoral vegetation’ suspect.

Basic statistical error (Juggins 2013)

Another potentially powerful approach is hierarchical partitioning (HP)

HP is designed to overcome multicollinearity problems by using a mathematical theorem by which the explanatory capacities of a set of predictor environmental variables can be estimated. Uses goodness-of-fit measures for each of the 2^k possible models for k independent variables. In HP, the variances are partitioned so that the independent contribution (I) of a given environmental variable is estimated. Furthermore, the variation shared with another environmental variable (conjoint contribution J) can be computed.

HP allows differentiation between those environmental variables whose independent, as distinct from partial, correlation with the response variable may be important from those variables that have little or no independent effect on the responses (hier.part in R).
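For two predictors the partitioning can be written in closed form, which makes the idea easy to inspect. A minimal sketch, assuming R² as the goodness-of-fit measure and k = 2 only (so 2² = 4 models: none, x1, x2, both); the function names and data are hypothetical, and this is not the hier.part implementation:

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    sab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    saa = sum((ai - mean_a) ** 2 for ai in a)
    sbb = sum((bi - mean_b) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)


def hierarchical_partition_two(y, x1, x2):
    """Independent (I) and joint (J) contributions of two predictors to R^2."""
    r1, r2, r12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)
    if abs(r12) >= 1.0:
        raise ValueError("predictors are perfectly collinear")
    fit_1 = r1 ** 2  # R^2 of the model with x1 only
    fit_2 = r2 ** 2  # R^2 of the model with x2 only
    fit_both = (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)
    # I = mean gain in fit from adding the variable, over both orders of entry
    i1 = 0.5 * (fit_1 + (fit_both - fit_2))
    i2 = 0.5 * (fit_2 + (fit_both - fit_1))
    return {
        "R2": fit_both,
        "x1": {"I": i1, "J": fit_1 - i1},
        "x2": {"I": i2, "J": fit_2 - i2},
    }


# Hypothetical data: response y and two correlated predictors
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9]
july_temp = [8.0, 9.0, 10.5, 11.0, 12.5, 13.0]
annual_temp = [1.0, 1.5, 3.0, 2.5, 4.5, 4.0]
print(hierarchical_partition_two(y, july_temp, annual_temp))
```

The independent contributions always sum to the R² of the full model, so a variable with a large marginal R² but a small I is contributing mostly through what it shares with the other predictor.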

Used by Steve Juggins (2013) with diatom data, with encouraging results

3. Confounding effects of correlated environmental variables (assumptions 2 and 5)

Present in all studies, starting with Imbrie & Kipp (1971) with reconstructions of summer and winter sea-surface temperature and salinity.

Covarying environmental variables e.g. temperature and lake trophic status (e.g. total N or P) or temperature and lake depth and chironomids. Is the fossil chironomid signal temperature or trophic status?

Brodersen & Anderson (2002)

In almost all ecological systems, assemblages are a complex function of multiple climatic, edaphic, land-use, biotic, and historical factors.

First part of assumption 5 (environmental variables other than the variable being reconstructed have negligible influence) is therefore almost never met. Need very careful design of modern training-set and rigorous statistical analysis to establish what can reliably and significantly be reconstructed.

Second part of assumption 5 (the joint distribution of additional variables with the variable of interest does not change with time) is also violated in many cases.

Climate model and glaciological results suggest that the joint distribution between summer temperature and winter accumulation has not been the same in the past 11,000 years.

Good evidence to suggest that lake-water pH has decreased naturally (soil deterioration) whilst summer temperature rose and then fell in the last 11,000 years. pH-climate relationship changed with time.

In Norway today, lake-water pH is negatively correlated with summer temperature because lakes of pH 6-7.5 are on basic rock and this happens in Norway to occur mainly at high altitudes and hence at low temperatures. In the past after deglaciation, almost all lakes had a higher pH than today, so the pH-temperature relationship in the past was different than today.

4. Assumption 3 “Taxa in the training-set are the same as in the fossil data and their ecological responses have not changed significantly over the timespan represented by the fossil assemblage”

Assumption not unique to calibration functions. Basic assumption of all Quaternary palaeoecology, namely uniformitarianism.

Considerable interest in niche-conservatism amongst biogeographers and conservation and evolutionary biologists. Increasing evidence for conservatism of ecological niche characteristics in the timespan of last 20,000 years.

Problems of ‘cryptic’ species and of taxa like Saxifraga oppositifolia-type in environmental reconstructions currently unresolved.

5. Use of different proxies can give different reconstructions

Validate using another proxy – e.g. macrofossils of tree birch

Validate using second proxy – e.g. chironomidsImportance of independent validation and establishing what is statistically significant

Figure: mean July temperature, Bjørnfjell. Significance values shown on the panels: p = 0.001; p = 0.183 (ns)

FUTURE NEEDS

Quantitative palaeoenvironmental reconstructions in the context of Quaternary palaeoecology are not really an end in themselves (in contrast to Quaternary palaeoclimatology) but they are a means to an end.

Use the reconstructions based on one proxy (e.g. chironomids) to provide an environmental history against which observed biological changes in another, independent proxy (e.g. pollen) can be viewed and interpreted as biological responses to environmental change.

Minden Bog, Michigan – Booth & Jackson (2003)

Major change 1000 years ago towards drier conditions, decline in Fagus and rise in Pinus in charcoal

Climate vegetation fire frequency

Black portions = wet periods, grey = dry periods

Env. predictor

Biotic responses

These approaches involving environmental reconstructions independent of the main fossil record can be used as a long-term ecological observatory or laboratory to study long-term ecological dynamics under a range of environmental conditions, not all of which exist on Earth today (e.g. lowered CO2 concentrations, low human impact).

Can begin to study the Ecology of the Past.

Exciting prospect, many potentialities in future research, as outlined by Flessa and Jackson (2005) and discussed by Birks et al. (2010 Open Ecol J 3: 68-110)

Other important future needs

1. Increased rigour in model evaluation and selection with greater use of external cross-validation, development of ‘minimal adequate model’

2. Testing significance of reconstructions

3. Greater rigour in deciding what environmental variables can be reconstructed (critical use of RDA/CCA, hierarchical partitioning, and ecological knowledge!)

4. Consider the likelihood of confounding and ‘surrogate’ environmental variables

Diatoms, pH, and climate

Is the reconstruction a reconstruction of pH or climate?

Anderson (2000)

5. Different methods can give very different reconstructions, even though they have similar modern model performances

Birks (2003)

6. There are increasing numbers of calibration data-sets (e.g. Norwegian, Swiss, Norwegian + Swiss, N Sweden, Finland 1 & 2 chironomid data-sets). How to select the ‘appropriate’ one?

RMSEP (°C) Max bias (°C) 0.85 0.98 0.75 1.09 0.91 1.56 0.85 1.14

Four training-sets: three with the same July T range but different continentality, and one with a lower July T range. Similar but not identical RMSEP and maximum bias; all two-way WA

Salonen et al. (in press)


All reconstructions statistically significant (p<0.05). Likely explanation is that WA optima are different in areas of different continentality. Higher in areas of high continentality (e.g. Ulmus, Tilia, Quercus)

Basic problem in palaeoecology – really interested in the fundamental niche but can only study the realised niche as a result of confounding environmental variables. Realised niche may be different in different areas. Conflicts with assumptions 3 and 5.

7. Do not ignore inconsistent results – Velle et al. (2012)

8. Try to understand why results are seemingly inconsistent

9. Remember what the six basic assumptions of calibration functions are and try not to violate them or, even better, try to test them (e.g. niche conservatism)
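The RMSEP and maximum-bias figures quoted under point 6, and the call for more rigorous cross-validation under point 1, can be illustrated with a small Python/NumPy sketch. It is not the procedure used in the studies cited. It assumes a simple two-way WA model with linear inverse deshrinking, leave-one-out (internal) cross-validation rather than the external cross-validation recommended above, and one common definition of maximum bias (largest absolute mean residual over 10 equal segments of the gradient). The training-set is simulated, and all function and variable names are hypothetical.

```python
import numpy as np

def wa_fit_predict(Y_train, x_train, Y_new):
    """Fit two-way WA on (Y_train, x_train) and predict x for Y_new."""
    # Taxon optima: abundance-weighted means of the environmental variable.
    optima = (Y_train * x_train[:, None]).sum(axis=0) / Y_train.sum(axis=0)
    # Initial sample estimates: abundance-weighted means of the optima.
    init_train = (Y_train * optima).sum(axis=1) / Y_train.sum(axis=1)
    # Linear inverse deshrinking (assumption: not the monotonic variant).
    slope, intercept = np.polyfit(init_train, x_train, 1)
    init_new = (Y_new * optima).sum(axis=1) / Y_new.sum(axis=1)
    return intercept + slope * init_new

def loo_performance(Y, x, n_segments=10):
    """Leave-one-out RMSEP and maximum bias along the gradient."""
    n = len(x)
    pred = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i          # drop sample i from the training-set
        pred[i] = wa_fit_predict(Y[keep], x[keep], Y[i:i + 1])[0]
    resid = pred - x
    rmsep = float(np.sqrt(np.mean(resid ** 2)))
    # Maximum bias: split the gradient into equal segments, take the mean
    # residual in each non-empty segment, report the largest absolute value.
    edges = np.linspace(x.min(), x.max(), n_segments + 1)
    seg = np.clip(np.digitize(x, edges) - 1, 0, n_segments - 1)
    seg_means = [resid[seg == s].mean() for s in range(n_segments) if np.any(seg == s)]
    max_bias = float(np.max(np.abs(seg_means)))
    return rmsep, max_bias, pred

# Simulated training-set (purely illustrative): 60 lakes, 15 taxa with
# Gaussian responses to July temperature plus multiplicative noise.
rng = np.random.default_rng(1)
x = np.sort(rng.uniform(8.0, 16.0, 60))
true_optima = np.linspace(7.0, 17.0, 15)
Y = np.exp(-((x[:, None] - true_optima) ** 2) / (2 * 1.5 ** 2))
Y = Y * rng.lognormal(0.0, 0.3, Y.shape)
Y = 100.0 * Y / Y.sum(axis=1, keepdims=True)   # express as percentages

rmsep, max_bias, pred = loo_performance(Y, x)
print(f"LOO RMSEP = {rmsep:.2f}, max bias = {max_bias:.2f}")
```

Two models with similar RMSEP and maximum bias on their own training-sets can still give different reconstructions for the same fossil sequence, which is why point 5 stresses that modern performance statistics alone are not enough to choose between them.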

Effective use of calibration functions needs

• good understanding of underlying ecology, mathematics, and principles of statistical modelling and cross-validation

• good quality modern and fossil data

Importance of continued research collaboration between palaeoecologists and applied statisticians

Bayesian framework is an important future research direction but it presents very difficult and time-consuming computational problems. No available software (cf. DECORANA, CANOCO, WACALIB, CALIB, etc. philosophy)

CONCLUSIONS

To paraphrase the statistician G.E.P. Box

“All reconstructions are wrong, but some reconstructions may be useful”

The challenge is to identify the useful and reliable ones

It is a difficult task and one that has received surprisingly little attention until recently. Major challenge for the future.

Simple two-way weighted-averaging appears hard to beat

Takes account of % data, ignores zero values, assumes unimodal responses, can handle several hundred species, and gives calibration functions of high precision (0.8°C), low bias, and high robustness.

Xm = g(Y1, Y2, Y3, ..., Yp)      Modern data: WA regression

Xf = g(Yf1, Yf2, Yf3, ..., Yfp)      Fossil data: WA calibration

g is our calibration function for Xm and Ym

Simple, ecologically realistic, and robust
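The two steps above (WA regression on the modern data, then WA calibration on the fossil data) can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the implementation in any particular package. It uses linear inverse deshrinking rather than monotonic deshrinking, the data are simulated with Gaussian taxon responses, and every name (`wa_fit`, `wa_predict`, `gaussian_percentages`) is hypothetical.

```python
import numpy as np

def wa_fit(Y, x):
    """WA regression on modern data: taxon optima plus a deshrinking line."""
    Y = np.asarray(Y, dtype=float)
    x = np.asarray(x, dtype=float)
    # Optimum of each taxon = abundance-weighted mean of x over all samples.
    # Samples where the taxon is absent have weight zero and are ignored.
    optima = (Y * x[:, None]).sum(axis=0) / Y.sum(axis=0)
    # Initial estimate for each sample = abundance-weighted mean of the optima.
    x_init = (Y * optima).sum(axis=1) / Y.sum(axis=1)
    # Averaging twice shrinks the range of the estimates, so regress the
    # observed x on x_init (inverse deshrinking; an assumption of this sketch).
    slope, intercept = np.polyfit(x_init, x, 1)
    return optima, intercept, slope

def wa_predict(Y, optima, intercept, slope):
    """WA calibration: apply the fitted optima and deshrinking to new samples."""
    Y = np.asarray(Y, dtype=float)
    x_init = (Y * optima).sum(axis=1) / Y.sum(axis=1)
    return intercept + slope * x_init

def gaussian_percentages(x, true_optima, tolerance=1.5):
    """Simulate unimodal (Gaussian) taxon responses, expressed as percentages."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    raw = np.exp(-((x[:, None] - true_optima) ** 2) / (2 * tolerance ** 2))
    return 100.0 * raw / raw.sum(axis=1, keepdims=True)

# Simulated modern training-set: 40 samples along a July temperature gradient.
x_modern = np.linspace(8.0, 16.0, 40)
true_optima = np.linspace(7.0, 17.0, 12)
Y_modern = gaussian_percentages(x_modern, true_optima)

optima, intercept, slope = wa_fit(Y_modern, x_modern)
x_apparent = wa_predict(Y_modern, optima, intercept, slope)
rmse_apparent = float(np.sqrt(np.mean((x_apparent - x_modern) ** 2)))

# One simulated fossil assemblage formed at 12 degrees.
Y_fossil = gaussian_percentages([12.0], true_optima)
x_fossil = wa_predict(Y_fossil, optima, intercept, slope)
print(f"apparent RMSE = {rmse_apparent:.2f}, fossil estimate = {x_fossil[0]:.2f}")
```

The apparent RMSE printed here is computed on the training data itself and is therefore optimistic; the precision figures quoted above refer to cross-validated prediction errors.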

WA is robust to spatial autocorrelation, as are Gaussian logit regression and ML calibration. WA (with monotonic deshrinking) and GLR are, to me, the preferred methods

Lynts and Judd 1971 Science 171: 1143-1144

Late Pleistocene Paleotemperatures at Tongue of the Ocean, Bahamas

It too is 42 years old! Has 20 citations (cf. 652)

A major problem in all reconstructions is the effect of secondary variables, confounding variables, and non-causal environmental variables on the resulting reconstructions.

Only recently beginning to receive attention – Juggins & Birks (2012) and Juggins (2013).

We must all give greater attention to what can and cannot be reconstructed and explicitly address the dangers of reconstructing surrogate variables (e.g. water depth) and confounding variables (e.g. climate and nutrients)

Juggins (2013)

Key Figures in Calibration-Function Research

John Imbrie Cajo ter Braak Tom Webb

Svante Wold Steve Juggins Richard Telford

One cannot do calibration-function research without high quality data and these need skilled palaeoecologists. Many colleagues have contributed to the development of calibration functions by creating superb modern-environmental data sets

Heikki Seppä Andy Lotter Oliver Heiri

Steve Brooks Viv Jones Ulrike Herzschuh

Nilva Kipp

Sylvia Peglar

Tage Nilsson 1905-1986

1935 1961 1948

1972 1967 1964 1964

Only met Tage Nilsson once, in 1969, when we came to Lund to go to Blekinge with Björn and to Öland with Knix Königsson. The Lund lab then was very small with Tage Nilsson as Professor (1969-1971) and Björn Berglund and Gunnar Digerfeldt.

My next visit was in 1975. Great expansion with tree-ring lab, radiocarbon-dating lab, faunal research, and palaeomagnetism, as well as pollen and macrofossil analyses.

Tage Nilsson had laid the foundations for something great, namely the University of Lund Quaternary research centre. Proud to have been associated with Lund for over 40 years.
