1
Presentation to the National Academy of Sciences Expert Panel, “Surface Temperature Reconstructions for the Past 1,000-2,000 Years.”
Stephen McIntyre, Toronto, Ontario
Ross McKitrick, Guelph, Ontario
Washington DC, March 2, 2006.
2
Guiding Questions (Boehlert)
2)
(b) What are the principal scientific criticisms of their [MBH] work and how significant are they?
(c) Has the information needed to replicate their work been available?
(d) Have other scientists been able to replicate their work?
3)
(b) How central is the work of Drs. Mann, Bradley and Hughes to the consensus on the temperature record?
3
Our Answers
2b) Principal Criticisms
“New” statistical methods mined for hockey stick-shaped series. These methods were not accurately described.
The reconstruction failed an important verification test said to have been used in the study; this failure was unreported, and the statistical skill was misrepresented both in the original article and by the IPCC.
Dominant weight was placed on proxies known to be inappropriate temperature proxies, and a false claim of robustness was made.
The method of confidence interval calculation leads to unrealistically narrow confidence intervals.
4
Our Answers
2c) Information available for replication?
No.
Many obstructions encountered in replication attempts.
The underlying data as used has not always been available.
The methodology was not accurately described in the paper and requests for source code were refused.
5
Our Answers
2d) Have others replicated MBH?
No.
Ammann and Wahl have not confirmed MBH claims of statistical skill and robustness.
In fact, their code confirms our claims that MBH verification statistics are insignificant.
Their emulation of MBH is almost identical to ours. Differences between us pertain entirely to interpretation.
6
Our Answers
3b) How central is hockey stick?
Origin of claims of “warmest decade of millennium”
Relied upon by IPCC and governments
Became standard for subsequent studies
Results and methods continue to affect papers published today
7
Hockey stick is still iconic
8
Why was the hockey stick so influential?
“New” Statistical Approach
Skill
Robustness
Confidence Intervals
9
New statistical method
PC methods “Climate fields”
PC methods on tree ring networks
Novel multivariate method for regression
10
Changes the picture
Figure 2: Top – Average of 450 series in MBH98 “dataall” dataset archived in July 2004.
Bottom – MBH98 reconstruction.
11
Bias in Tree Ring PC method
Data decentered against post-1902 mean
Preferentially adds weight to hockey stick-shaped series in PC1
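The effect of centering on a short recent window can be sketched with a small simulation. This is an illustration under stated assumptions, not the MBH98 code: the AR(1) noise model, the persistence value rho = 0.9, the 581-year length, the 79-year centering window, the helper names and the "hockey stick index" definition are all choices made for this example, and the rescaling steps of the original method are omitted.

```python
import numpy as np

def ar1_noise(n_years, n_series, rho, rng):
    """Generate independent AR(1) series, one per column (assumed noise model)."""
    x = np.zeros((n_years, n_series))
    eps = rng.standard_normal((n_years, n_series))
    for t in range(1, n_years):
        x[t] = rho * x[t - 1] + eps[t]
    return x

def pc1(data, center_rows):
    """First principal component after subtracting the mean of the chosen rows."""
    centered = data - data[center_rows].mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, 0] * s[0]

def hockey_stick_index(series, n_recent):
    """Absolute gap between the recent mean and the full mean, in std units."""
    return abs(series[-n_recent:].mean() - series.mean()) / series.std()

def mean_hsi(n_trials=20, n_years=581, n_series=70, n_recent=79, rho=0.9, seed=0):
    """Average hockey stick index of PC1 under full and short centering."""
    rng = np.random.default_rng(seed)
    full, short = [], []
    for _ in range(n_trials):
        data = ar1_noise(n_years, n_series, rho, rng)
        # Conventional PCA: subtract the mean over the whole period.
        full.append(hockey_stick_index(pc1(data, slice(None)), n_recent))
        # Short centering: subtract the mean of the last n_recent years only.
        short.append(hockey_stick_index(pc1(data, slice(-n_recent, None)), n_recent))
    return float(np.mean(full)), float(np.mean(short))

full_hsi, short_hsi = mean_hsi()
print(f"mean hockey stick index, full centering:  {full_hsi:.2f}")
print(f"mean hockey stick index, short centering: {short_hsi:.2f}")
```

The input here is trendless noise, so any systematic difference between the two indices comes from the centering choice alone.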
12
Questions about multivariate methodology
Sui generis method with unknown statistical properties
No literature on estimating confidence intervals
13
Skill claims
The reconstruction failed an important verification test (r2) said to have been used in the study.
This failure was not reported
The statistical skill was misrepresented both in the original article and by the IPCC.
14
Claim in MBH98
• β [or RE] is a quite rigorous measure of the similarity between two variables, measuring their correspondence not only in terms of the relative departures from mean values (as does the correlation coefficient r) but also in terms of the means and absolute variance of the two series. For comparison, correlation (r) and squared-correlation (r2) statistics are also determined.
15
Claim in IPCC
IPCC TAR WG1, Pages 133, 136
…they [MBH] estimated the Northern Hemisphere mean temperature back to AD 1400, a reconstruction which had significant skill in independent cross-validation tests. Self-consistent estimates were also made of the uncertainties…. Taking into account these substantial uncertainties, Mann et al. (1999) concluded that the 1990s were likely to have been the warmest decade, and 1998 the warmest year, of the past millennium for at least the Northern Hemisphere.
16
Failed r2 test value
                    Verification RE   Verification R2   CE
MBH98               0.48              n.r.              n.r.
MM05a Emulation     0.46              0.02              -0.26
Ammann&Wahl Code    0.47              0.02              -0.24
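For readers unfamiliar with the three statistics, the following sketch uses their commonly cited definitions: RE measures error against the calibration-period mean, CE against the verification-period mean, and R2 is the squared correlation. The function name and the numbers are hypothetical and are not taken from any of the reconstructions in the table.

```python
import numpy as np

def verification_stats(observed, estimated, calibration_mean):
    """Return (RE, R2, CE) for a verification period.

    RE benchmarks squared error against the calibration-period mean,
    CE against the verification-period mean; R2 is squared Pearson r.
    """
    obs = np.asarray(observed, dtype=float)
    est = np.asarray(estimated, dtype=float)
    sse = np.sum((obs - est) ** 2)
    re = 1.0 - sse / np.sum((obs - calibration_mean) ** 2)
    ce = 1.0 - sse / np.sum((obs - obs.mean()) ** 2)
    r2 = np.corrcoef(obs, est)[0, 1] ** 2
    return float(re), float(r2), float(ce)

# Hypothetical verification-period values: the estimate gets the shift in
# mean right (about -0.3 versus a calibration mean of 0.0) but does not
# track the year-to-year variation at all.
obs = [-0.5, -0.1, -0.4, -0.2, -0.3, -0.3]
est = [-0.25, -0.25, -0.35, -0.35, -0.3, -0.3]
re, r2, ce = verification_stats(obs, est, calibration_mean=0.0)
print(f"RE = {re:.2f}, R2 = {r2:.2f}, CE = {ce:.2f}")
```

In this made-up case RE is strongly positive while R2 is essentially zero and CE is negative, which is the same qualitative pattern as the emulation rows in the table.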
17
Robustness – warnings about bristlecones
IPCC 1996
• “the extent to which multidecadal, century and longer time-scale variability is expressed can vary, depending on the length of individual …series that make up the chronologies and the way in which these series have been processed to remove non-climatic trends. In addition, the possible confounding effects of carbon dioxide fertilization needs to be taken into account when calibrating tree ring data against climate variations”
Biondi et al. (1999)
• “[Bristlecone] are not a reliable temperature proxy for the last 150 years as it shows an increasing trend in about 1850 that has been attributed to atmospheric CO2 fertilization (Graybill and Idso 1993)”
18
Claims about robustness
MBH98:
• the long-term trend in NH is relatively robust to the inclusion of dendroclimatic indicators in the network, suggesting that potential tree growth trend biases are not influential in the multiproxy climate reconstructions. (p. 783, emphasis added.)
Mann et al. 2000:
• possible low-frequency bias due to non-climatic influences on dendroclimatic (tree-ring) indicators is not problematic in our temperature reconstructions…Whether we use all data, exclude tree rings, or base a reconstruction only on tree rings, has no significant effect on the form of the reconstruction for the period in question. (http://www.ngdc.noaa.gov/paleo/ei/ei_nodendro.html, emphasis added.)
19
Dominant PC weight on Bristlecones
70 series in the North American AD1400 network
15 are Graybill-Idso bristlecone/foxtail pines
In MBH method they account for 93% of PC1 weights; PC1 assigned 38% explained variance
Under conventional (covariance) PC method they fall to PC4 and get < 8% explained variance
20
Role of Bristlecones
Removing them eliminates the hockey stick shape from PCs and final reconstruction
Contradicts robustness claim
21
Role of bristlecones
From Mann’s FTP site (CENSORED folder)
22
Robustness
Burger and Cubasch (2005)
23
Confidence Intervals
Confidence intervals based on calibration-period residuals not verification-period residuals, thus understating actual uncertainty
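Why calibration-period residuals tend to understate uncertainty can be shown with a generic overfitting sketch. It is not the MBH procedure: the ordinary least squares fit, the pure-noise "proxies", the sample sizes and the helper name are assumptions made purely for illustration.

```python
import numpy as np

def two_sigma_halfwidth(observed, estimated):
    """Half-width of a +/- 2 sigma interval built from residuals."""
    resid = np.asarray(observed, dtype=float) - np.asarray(estimated, dtype=float)
    return float(2.0 * resid.std(ddof=1))

rng = np.random.default_rng(1)
n_cal, n_ver, n_proxies = 40, 40, 20

# Hypothetical data: proxies and temperature are unrelated noise, so any
# apparent calibration fit is overfitting.
proxies = rng.standard_normal((n_cal + n_ver, n_proxies))
temperature = rng.standard_normal(n_cal + n_ver)

# Fit on the calibration rows only, then estimate every row.
coef, *_ = np.linalg.lstsq(proxies[:n_cal], temperature[:n_cal], rcond=None)
fitted = proxies @ coef

cal_hw = two_sigma_halfwidth(temperature[:n_cal], fitted[:n_cal])
ver_hw = two_sigma_halfwidth(temperature[n_cal:], fitted[n_cal:])
print(f"half-width from calibration residuals:  {cal_hw:.2f}")
print(f"half-width from verification residuals: {ver_hw:.2f}")
```

The calibration residuals are shrunk by the fit itself, so an interval built from them is narrower than the out-of-sample errors warrant.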
24
Red Zone autocorrelation properties …
Modeled as ARMA(1,1), AR1 ρ > 0.9 in all cases and sometimes >0.97. Above figure: left- AR1 coefficient ρ; right – MA1 coefficient.
[Figure: ARMA(1,1) coefficients for CRU, J98, MBH99, MJ03, CL00, BJ00, BJ01, Esp02 and Mob05; vertical axis from -1.0 to 1.0.]
25
Durbin-Watson Statistic
All multiproxy reconstructions as archived, except MBH99, fail Durbin-Watson statistic (minimum 1.5).
Passing a DW test is a necessary but not sufficient test of model validity.
[Figure: Durbin-Watson statistic for J98, MBH99, MJ03, CL00, BJ00, BJ01, Esp02 and Mob05; vertical axis from 0.0 to 2.0.]
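For reference, the Durbin-Watson statistic is the sum of squared successive differences of the residuals divided by their sum of squares. It is close to 2 for uncorrelated residuals and falls toward 0 as positive autocorrelation increases. The sketch below uses simulated residuals, not the archived reconstructions; the AR(1) coefficient of 0.9 and the series length are assumptions.

```python
import numpy as np

def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences / sum of squares."""
    e = np.asarray(residuals, dtype=float)
    return float(np.sum(np.diff(e) ** 2) / np.sum(e ** 2))

rng = np.random.default_rng(0)
white = rng.standard_normal(500)   # uncorrelated residuals
red = np.zeros(500)                # strongly autocorrelated residuals
for t in range(1, 500):
    red[t] = 0.9 * red[t - 1] + white[t]

dw_white = durbin_watson(white)
dw_red = durbin_watson(red)
print(f"DW, white noise residuals: {dw_white:.2f}")
print(f"DW, AR(1) residuals:       {dw_red:.2f}")
```

Against the 1.5 minimum quoted on the slide, the white noise residuals would pass and the AR(1) residuals would fail.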
26
Cross-Validation R2 Statistic
Multiproxy reconstructions show decreased and insignificant cross-validation R2 statistics. BJ01 does not have MWP results and is reported not to verify after 1960. Left – calibration; right – verification R2.
[Figure: cross-validation R2 for J98, MBH99, MJ03, CL00, BJ00, BJ01, Esp02 and Mob05; vertical axis from 0.0 to 0.8.]
27
Replication
MBH results are not limited to a squiggly line: they include claims of skill and robustness
Ammann and Wahl claim to have replicated MBH results but they have only confirmed ours
28
Replication: Statistical Skill
Ammann and Wahl code yields the same conclusion as ours
There is no support for MBH skill claims
                    Verification RE   Verification R2   CE
MBH98               0.48              n.r.              n.r.
MM05a Emulation     0.46              0.02              -0.26
Ammann&Wahl Code    0.47              0.02              -0.24
29
SKILL: The dot.com reconstruction
30
Replication: Robustness
Ammann and Wahl code shows results hinge on bristlecones
There is no support for MBH claims of robustness
31
Problems with other studies
Replication
Robustness
Proxy selection
Proxy validity
Lack of independence of authorship and proxies
32
MBH outputs used in other studies
PC methods used in Mann and Jones (2003), Jones and Mann (2004)
PC series used in Rutherford et al. (2005)
NOAMER PC1 used a few weeks ago in Osborn and Briffa (2006, Science)
33
Greenland
Dahl-Jensen boreholes
34
Why do VZ differ?
Did PC on decentered correlation matrix and not on decentered data matrix: only the same if centered
Assumed >0.3 proxy correlation to temperature
35
The “Divergence” Problem: large population of 387 temperature-sensitive sites
36
The “Explanation”
Briffa et al. (1998b) discuss various causes for this decline in tree growth parameters, and Vaganov et al. (1999) suggest a role for increasing winter snowfall.… In the absence of a substantiated explanation for the decline, we make the assumption that it is likely to be a response to some kind of recent anthropogenic forcing. On the basis of this assumption, the pre-twentieth century part of the reconstructions can be considered to be free from similar events and thus accurately represent past temperature variability. [Briffa et al. 2002]
37
“Divergence” in action
38
Site Spaghetti: Polar Urals
39
Robustness?
Little changes make big differences
Crowley & Lowery (2000) without bristlecones and Dunde
[Figure: horizontal axis years 1000 to 2000; vertical axis from 0.3 to 0.6.]
40
End of Presentation
Medieval foxtail pine, California
Thank you.
41
Variance is already rescaled
[Figure: two panels labeled “CRU J98” and “CRU MBH 1902-80”; vertical axes from 0.0 to 0.4.]
42
CO2 Adjustment
MBH claim to have fixed bristlecone problem with CO2 adjustment.
What is CO2 adjustment in MBH99?