HOMOGENITY AND HOMOGENIZATION METHODS. A review.

Date post: 02-Jan-2016
SUMMER SCHOOL ON THE PREPARATION OF CLIMATE ATLAS Sitke (Hungary); 10 - 14 September 2007 HOMOGENITY AND HOMOGENIZATION METHODS. A review. Enric Aguilar Climate Change Research Group Geography Department Universitat Rovira i Virgili de Tarragona (Spain) [email protected]
Transcript
  • HOMOGENITY AND HOMOGENIZATION METHODS. A review. Enric Aguilar, Climate Change Research Group, Geography Department, Universitat Rovira i Virgili de Tarragona (Spain). [email protected]

  • OUTLINE

    A few questions to start:
    What does homogeneous (and inhomogeneous) mean?
    Why may a time series become inhomogeneous?
    What does homogeneity assessment mean?
    What does homogenization mean?
    How does the lack of homogeneity compromise climate analysis?

    Some general procedures for homogeneity assessment and homogenization

    A brief review of techniques. Will briefly introduce these methods (not necessarily in this order):
    Craddock + Metadata
    Likelihood ratio: SNHT
    Regression models
    Two-phase regression
    Caussinus-Mestre
    Vincent's interpolation of daily factors
    Della-Marta and Wanner

    The HOME-COST Action

  • A FEW QUESTIONS TO START: What does homogeneous mean?

    From Latin homogeneus, and from Greek: "of the same nature".

    Translating the term to climate time series: a homogeneous climate time series is defined as one where variations are caused only by variations in climate. If a long-term time series is homogeneous, then all variability and change is due to the behavior of the climate system (WMO TD-1186).

    Conversely, inhomogeneous time series are those presenting any kind of bias which affects the recorded values and is not strictly caused by true climatic variability and change.

  • A FEW QUESTIONS TO START: Why may a time series become inhomogeneous?

    Because a change has been applied to, or an error has been introduced into, the conditions under which the data are measured, recorded, transmitted, stored and/or analyzed, resulting in a systematic bias in a particular segment of the time series.

  • A FEW QUESTIONS TO START: Why may a time series become inhomogeneous? Example I: errors in temperature units, leading to an artificial change in variance

  • A FEW QUESTIONS TO START: Why may a time series become inhomogeneous? Example II: change of rain gauge exposure, leading to an artificial bias in precipitation amount

  • A FEW QUESTIONS TO START: Why may a time series become inhomogeneous? Example III: changes in the way the daily mean temperature is computed, leading to important biases when compared to the WMO standard (max+min)/2

  • A FEW QUESTIONS TO START: Why may a time series become inhomogeneous? Example IV: impact of urbanization on temperature series, leading to an artificial enhancement of trends

  • A FEW QUESTIONS TO START: Why may a time series become inhomogeneous? Example V: instrument replacement, leading to an artificial bias (jump) in radiation series

  • A FEW QUESTIONS TO START: Why may a time series become inhomogeneous? Example VI: impact of relocations and changes in environment on wind speed time series

  • A FEW QUESTIONS TO START: Why may a time series become inhomogeneous? Example VII: changes in screen type, leading to an artificial bias in temperature

    Differences between simultaneous temperatures measured in Murcia (Spain). Lines show Stevenson minus Montsouris screen. Red is Tmax (much higher temperatures in the Montsouris screen lead to negative differences); blue is Tmin (slightly higher values in the Stevenson screen lead to positive differences); green is Tmean, which combines the effects on Tmax and Tmin into negative differences (i.e. higher temperatures registered in the old screens).

  • A FEW QUESTIONS TO START: What does homogeneity assessment mean? To learn whether a time series is or is not homogeneous (I)

    Peterson et al, 2002; Aguilar et al, 2005

  • A FEW QUESTIONS TO START: What does homogeneity assessment mean? To learn whether a time series is or is not homogeneous (II)

    Source: Lucie Vincent. Quebec reference series

  • A FEW QUESTIONS TO START: What does homogenization mean? To apply statistical techniques to transform a) into b), eliminating as far as possible the non-climatic influences biasing the time series

    a) Quebec City, 1895-2002 (non-adjusted). b) Quebec City, 1895-2002 (adjusted). Source: Lucie Vincent.

  • A FEW QUESTIONS TO START: How does the lack of homogeneity compromise climate analysis?

    Trend before homogenization: -0.7 °C in 106 years. Quebec City, 1895-2002. Source: Lucie Vincent.

  • A FEW QUESTIONS TO START: How does the lack of homogeneity compromise climate analysis?

    Trend after homogenization: +2.1 °C in 106 years. Quebec City, 1895-2002. Source: Lucie Vincent.

  • SOME GENERAL QUESTIONS

    Use quality-controlled data
    Detect inhomogeneities (in other words, identify homogeneous subperiods)
    Adjust to the last homogeneous subperiod
    Validate results

  • A GENERAL PROCEDURE FOR HOMOGENEITY ASSESSMENT AND HOMOGENIZATION

    Detect inhomogeneities (in other words, identify homogeneous subperiods):
    Metadata
    Test
    Visual inspection

  • SOME GENERAL QUESTIONS

    Test: over the data? Using reference series?

    Identify whether period A is different from period B (good when you have reliable metadata)
    Identify at which data point the time series is most likely to have a breakpoint
    Iterate over the series, or use a model that allows detection of multiple breaks

    Climate fluctuations may be confused with inhomogeneities. Using a model based on a reference series, or running the test with the help of a reference series, should help to distinguish climate effects from true inhomogeneities.

  • ON THE USE OF REFERENCE SERIES

    Using reference series?
    Decorrelation (network density)
    Homogeneity of the reference series
    Especially problematic at early stages (e.g. the 19th century), due to the lack of data

    Auer et al, 2005

  • THERE IS A MULTIPLICITY OF TESTS AVAILABLE

    Simple (but elaborated!) formulations: Craddock Test + Metadata
    Caussinus-Mestre
    Likelihood ratio tests: SNHT and variants
    Regression model tests: Vincent
    Two-phase regression: Wang
    MASH will be discussed later on by T. Szentimrey

    For introductory explanations of these and other methods, see Aguilar et al (2003)*: Guidance on Metadata and Homogenization, WMO-TD-1186.

    *: This is almost 5 years old; see the later slides on the COST-HOME action!

  • ON THE APPLICATION OF THE CRADDOCK TEST

    Auer et al (2007) and many others apply the Craddock test to climatological data.
    The test has a simple formulation, and HISTALP relies heavily on metadata and expertise to identify, confirm or reject potential breaks.
    It accumulates the normalized differences between two series (a and b) according to one of the following formulas:

    Where: s is the sum at the current observation; s-1 is the sum at the previous observation; an is the observation at the candidate station; am is the mean of the candidate station; bn is the observation at the reference station; bm is the mean of the reference station
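The two formulas themselves did not survive in this transcript. As a minimal sketch, assuming the commonly cited additive form (typically used for temperature) and ratio form (typically used for precipitation), the accumulation can be written with the slide's symbols as follows. The function names, the sign convention and the choice of which series is rescaled are assumptions made here for illustration, not taken from the slide.

```python
import numpy as np

def craddock_additive(a, b):
    """Cumulative sum s_n = s_(n-1) + a_n - (b_n + a_m - b_m).

    a: candidate series, b: reference series (same length).
    a_m and b_m are the means of a and b. Sign convention is assumed.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.cumsum(a - (b + a.mean() - b.mean()))

def craddock_ratio(a, b):
    """Cumulative sum s_n = s_(n-1) + a_n - b_n * (a_m / b_m)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.cumsum(a - b * (a.mean() / b.mean()))

# Toy example: the candidate jumps by +1 after the 10th value.
b = np.full(20, 10.0)
a = np.concatenate([np.full(10, 10.0), np.full(10, 11.0)])
s = craddock_additive(a, b)
```

In both forms the accumulated series returns to zero at the last observation, and a change of slope in `s` marks a candidate break, which in the approach described on the slide is then confirmed or rejected with metadata.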

  • From Maugheri, M.

  • Craddock test - Bologna precipitation record

    "At the beginning of 1857 this rain gauge, reduced to a poor state by long use, was replaced by another of better construction, made with great precision..."
    Introduction of a new rain gauge (Fuess recorder): "... was installed under the care of Prof. Bernardo Dessau in the period 1900-1903 ..."
    Change in data origin: from Osservatorio Astronomico to Istituto Idrografico
    News about damage to the rain gauge. When the damage was repaired, the cause of the underestimation of precipitation during the period 1900-1928 was removed.

    From Maugheri, M.

  • ON THE APPLICATION OF CRADDOCK TEST (generalizable to homogeneity work)

    Auer et al (2007) say:

    "For the nucleus of homogeneity testing (the comparison of two series) we use Craddock's normalised accumulated difference/ratio series (Craddock, 1979), although HOCLIS would allow any method of relative homogeneity testing to be used. The practical experience in our group with a number of such methods tells us that the rejection of break signals due to statistical non-significance (as provided by higher developed methods) is often misleading. Strong breaks may remain in the series simply owing to the fact that the typical length of a homogeneous subinterval (Table I) is short in relation to interannual variability. We try to compensate for the deficits of our method in pure statistical terms by investing much work into metadata analysis, which we regard as the ultimate measure to decide whether a break can be accepted or not."

  • ON THE APPLICATION OF CRADDOCK TEST

    1. Ignore any previous homogeneity work undertaken for any of the series (i.e. start from the beginning, assuming all series contain potential breaks).
    2. Test in small, well-correlated subregions (a maximum of 10 series tested against each other results in a 10 × 10 decision matrix, which enables most breaks detected to be assigned to a most likely candidate series).
    3. Choose the most appropriate reference series with a non-affected subinterval for the adjustment of each break detected (i.e. different reference series can be used for each break detected in a candidate series).
    4. Avoid erratic monthly precipitation adjustments by smoothing the annual course of adjustment factors.
    5. Detect outliers and overshooting adjustments using spatial comparisons (by mapping precipitation values both in absolute and relative units) for each month of the study period.
    6. Attempt to determine support for homogeneity adjustments when few metadata are available (i.e. contact data providers for more information in difficult cases).

    Auer et al (2005)

  • LIKELIHOOD RATIO TESTS: THE STANDARD NORMAL HOMOGENEITY TEST AND VARIANTS

    Formulated by Alexandersson (1986) and Alexandersson et al (1997)
    Critical values derived from Monte Carlo simulations (MCS)
    Recently, Khaliq and Ouarda have recalculated the critical values using improved MCS
    Widely re-formulated (for example, Reeves et al, 2007)
    Widely applied (will see an example by CCRG)

  • SNHT

  • SNHT

    Most probable breakpoint: max of
    Correction factor:
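The formulas on the SNHT slides are missing from this transcript. A minimal sketch of the single-shift SNHT statistic in its usual textbook form is given below: the difference (or ratio) series `q` between candidate and reference is standardized into a z-series, and T(k) = k·z̄₁² + (n−k)·z̄₂² is evaluated for every possible split k. The function name, the use of the sample standard deviation and the simple "difference of segment means" as correction are assumptions made here; critical values for the maximum of T are not included and would have to come from the cited literature.

```python
import numpy as np

def snht(q):
    """Single-shift SNHT on a candidate-minus-reference series q.

    Returns (k_best, t_max, shift):
      k_best  number of values before the most probable breakpoint
      t_max   maximum of the test statistic T(k)
      shift   mean after the break minus mean before it (in units of q)
    """
    q = np.asarray(q, dtype=float)
    n = len(q)
    z = (q - q.mean()) / q.std(ddof=1)      # standardized z-series
    t_stat = np.empty(n - 1)
    for k in range(1, n):                   # split into z[:k] and z[k:]
        z1 = z[:k].mean()
        z2 = z[k:].mean()
        t_stat[k - 1] = k * z1 ** 2 + (n - k) * z2 ** 2
    k_best = int(np.argmax(t_stat)) + 1
    shift = q[k_best:].mean() - q[:k_best].mean()
    return k_best, float(t_stat.max()), float(shift)

# Toy example: a step of +1 after 30 of 60 values.
q = np.concatenate([np.zeros(30), np.ones(30)])
k_best, t_max, shift = snht(q)
```

The break is only accepted if `t_max` exceeds the critical value for the series length and chosen significance level.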

  • SNHT MODIFICATIONS BY REEVES et al (2007)

    The standardization into z-series proposed by Alexandersson uses s to estimate the standard deviation of the series, which might be inefficient if the candidate series is inhomogeneous.
    They propose to avoid standardization by using:

  • A HOMOGENIZATION PROCEDURE BASED ON THE SNHT TEST (AND OTHER METHODS)

    Aguilar, E., Brunet, M., Saladié, O., Sigró, J.: Homogenization of the Spanish Daily Temperature Series. A step forward.

    QC'd daily data of TMax and TMin
    Screen bias minimisation over monthly series of TMax and TMin
    SDTS calculation of monthly values of TMax and TMin
    Blind breakpoint detection over annual and seasonal TMax, TMin, TMean with automated SNHT (1997)
    Breakpoint validation (metadata, plot checks, ...)
    Generation of correction pattern
    Application to monthly TMax and TMin (as described in Aguilar et al, 2002)
    Monthly, seasonal, annual TMax, TMin, DTR, TMean series (STS)
    Interpolation to daily data (Vincent et al., 2002)
    Validation of daily corrected values

  • SCREEN BIAS MINIMIZATION

  • SCREEN BIAS MINIMIZATION

    CCRG's SCREEN project (CICYT): 2 replicas of the Montsouris screen, in operation since 2003
    Large effect on TMax
    Much smaller effect on TMin

  • SCREEN BIAS MINIMIZATION

    Murcia: TMaxStev = -0.508 + TMaxMont*0.975
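As a small worked example, the Murcia regression above can be applied directly to a Tmax value recorded in the Montsouris screen to estimate the Stevenson-screen equivalent. The function name is hypothetical, and the units are assumed here to be °C, which the slide does not state.

```python
def montsouris_to_stevenson_tmax(tmax_mont, intercept=-0.508, slope=0.975):
    """Estimate Stevenson-screen Tmax from Montsouris-screen Tmax.

    Coefficients are the Murcia values quoted on the slide; units are
    assumed to be degrees Celsius.
    """
    return intercept + slope * tmax_mont

# A Montsouris reading of 40.0 maps to -0.508 + 0.975 * 40.0 = 38.492.
corrected = montsouris_to_stevenson_tmax(40.0)
```

Because the slope is below 1, the correction grows with temperature, which is consistent with the large screen effect on TMax noted on the previous slide.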

  • SCREEN BIAS MINIMIZATION

    Tmax data for Murcia (August). Red: Murcia original. Green: Murcia screen-corrected

  • BREAK POINT DETECTION

    BLIND RUN OF AUTOMATED SNHT (see ALEXANDERSSON ET AL., 1997) OVER ANNUAL AND SEASONAL VALUES OF TMAX, TMIN, TMEAN AND DTR

  • BREAK POINT DETECTION

    BLIND RUN OF AUTOMATED SNHT (see ALEXANDERSSON ET AL., 1997) OVER ANNUAL AND SEASONAL VALUES OF TMAX, TMIN, TMEAN AND DTR

    Most probable breakpoint: max of
    Correction factor:

  • BREAK POINT DETECTION

  • BREAK POINT VALIDATION

    Tn Detection
    03 17 1914  5.51
    03 17 1954 -2.58
    03 17 1935  2.16

    Tm Detection
    03 17 1955 -1.89

    Tx Detection
    03 17 1970  2.54

    Green: number of references. Red: z-series

  • CORRECTION PATTERN

    Green: number of references. Red: z-series. Period: 1880-2006. In 1954 the station was relocated from the city center to the airport

  • CORRECTION RESULTS OVER ANNUAL TMean (BADAJOZ)

    Red: original; green: corrected (Screen + SNHT)

  • APPLICATION TO MONTHLY SERIES: BADAJOZ, TMax. All values in tenths of °C

    Factors for 1954 Breakpoint. Tmax, Badajoz

    Station  Month  StartYear  BreakPoint  EndYear   Factor
    3         1     1876       1954        2005      -2.382
    3         2     1876       1954        2005      -1.826
    3         3     1876       1954        2005       6.681
    3         4     1876       1954        2005       7.811
    3         5     1876       1954        2005      11.461
    3         6     1876       1954        2005      10.319
    3         7     1876       1954        2005       7.4
    3         8     1876       1954        2005       8.285
    3         9     1876       1954        2005      12.457
    3        10     1876       1954        2005       4.099
    3        11     1876       1954        2005       3.498
    3        12     1876       1954        2005       0.996

    Factors for Trend Removal

    Station  Month  StartYear  EndYear   Factor
    3         1     1909       1954       0.059
    3         2     1909       1954      -0.017
    3         3     1909       1954      -0.22
    3         4     1909       1954      -0.238
    3         5     1909       1954      -0.308
    3         6     1909       1954      -0.311
    3         7     1909       1954      -0.266
    3         8     1909       1954      -0.355
    3         9     1909       1954      -0.438
    3        10     1909       1954      -0.166
    3        11     1909       1954      -0.113
    3        12     1910       1954       0.016

  • ADJUSTMENT OF MONTHLY SERIES

    August TX; January TX. Red: original; green: adjusted

  • CONVERTING MONTHLY FACTORS TO DAILY FACTORS

    Following Vincent et al. (2002), monthly factors are assigned to the 15th of each month and interpolated to daily values, to avoid abrupt discontinuities at the end of the month
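A minimal sketch of the idea, assuming plain linear interpolation between mid-month anchor points in a 365-day year that wraps around from December to January. This is a simplification made here for illustration: the slide only states that factors are assigned to the 15th of each month, and the full procedure of Vincent et al. (2002) may differ in how the anchor values are derived. The function name is hypothetical.

```python
import numpy as np

MONTH_LENGTHS = np.array([31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31])

def daily_factors(monthly_factors):
    """Interpolate 12 monthly factors to 365 daily factors.

    Each monthly factor is placed on the 15th of its month; days in
    between are filled by linear interpolation, wrapping around the
    end of the year.
    """
    monthly = np.asarray(monthly_factors, dtype=float)
    month_start = np.concatenate(([0], np.cumsum(MONTH_LENGTHS)[:-1]))
    mid_month = month_start + 15            # day of year of each 15th
    days = np.arange(1, 366)                # day of year 1..365
    return np.interp(days, mid_month, monthly, period=365)

# Badajoz Tmax factors for the 1954 breakpoint (tenths of a degree C).
badajoz = [-2.382, -1.826, 6.681, 7.811, 11.461, 10.319,
           7.4, 8.285, 12.457, 4.099, 3.498, 0.996]
daily = daily_factors(badajoz)
```

With this scheme the factor applied on 31 January is already close to the one applied on 1 February, instead of jumping from the January value to the February value overnight.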

  • CLIMATE CHANGE INDICES DERIVED FROM DAILY TIME SERIES

    Badajoz, TX90p index. Red: original. Green: corrected

  • CLIMATE CHANGE INDICES DERIVED FROM DAILY TIME SERIES

    Badajoz, TX10p index. Red: original. Green: corrected

  • CLIMATE CHANGE INDICES DERIVED FROM DAILY TIME SERIES

    Badajoz, TN90p index. Red: original. Green: corrected

  • CLIMATE CHANGE INDICES DERIVED FROM DAILY TIME SERIES

    Badajoz, TN10p index. Red: original. Green: corrected

  • CAUSSINUS and MESTRE (2004)

    They use a considerably more complicated penalized log-likelihood procedure to correct groups of stations sharing the same climate signal.

    C-M assume that each series is the sum of a climate effect, a station effect and random white noise. The station effect is constant if the series is reliable (homogeneous). If not, the station effect is piecewise constant between two shifts (except for outliers).
    The maximization of the statistic, and thus the model selection, implies testing far too large a number of combinations of breakpoint and outlier positions.

  • CAUSSINUS and MESTRE (2004)

    For practical application, pairwise comparisons across the neighbours are performed by calculating difference series, and the resulting breaks/outliers are selected using a decision table.
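The additive model and the reason why difference series are useful can be written out as follows. The notation is an assumption made here for illustration and is not taken from the slides: $X_{ij}$ is the value of station $j$ at time $i$, $\mu_i$ is the climate effect shared by all stations, $\nu_{j,h}$ is the station effect of station $j$ during its $h$-th homogeneous sub-period, and $\varepsilon_{ij}$ is white noise.

```latex
\begin{align}
X_{ij} &= \mu_i + \nu_{j,h(i)} + \varepsilon_{ij} \\
X_{ij} - X_{ik} &= \nu_{j,h(i)} - \nu_{k,h'(i)} + \left(\varepsilon_{ij} - \varepsilon_{ik}\right)
\end{align}
```

In the difference between two neighbouring stations $j$ and $k$ the common climate term $\mu_i$ cancels, so any step in the difference series reflects a change in the station effect of one of the two stations.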

  • CAUSSINUS and MESTRE (2004)

    Expertise and metadata records are used again to decide which breakpoints are retained to be preliminarily corrected with a simplified two-factor model.
    Preliminarily corrected data are submitted again to pairwise comparison, to ensure that no important breakpoints were left untreated and to benefit from the corrections applied to other stations.
    Once all breakpoints and outliers are known for all the stations, the full model is run.

  • A MORE SOPHISTICATED MODEL FOR ADJUSTING DAILY TEMPERATURE DATA: DELLA-MARTA and WANNER (2006)

    They defined their method in 10 steps:

    1. Define homogeneous sub-periods (HSPs) for the candidate and for as many reference stations as possible (this method does not provide its own detection tool)
    2. Starting from the most recent inhomogeneity, find a reference station which is highly correlated and has an HSP that adequately overlaps both HSP1 and HSP2 of the candidate station

  • A MORE SOPHISTICATED MODEL FOR ADJUSTING DAILY TEMPERATURE DATA: DELLA-MARTA and WANNER (2006)

    3. Model the relationship between the paired candidate and reference observations before the inhomogeneity (i.e. in the period of common overlap within HSP1 of the candidate; 1988-2003 for Graz-Uni, using Wien)

  • A MORE SOPHISTICATED MODEL FOR ADJUSTING DAILY TEMPERATURE DATA: DELLA-MARTA and WANNER (2006)

    4. Predict the temperature at the candidate station after the inhomogeneity, using observations from the reference and the previously obtained model
    5. Create a paired difference series between the predicted and the observed values within HSP2
    6. Find the probability distribution of the candidate station in HSP1 and HSP2

  • A MORE SOPHISTICATED MODEL FOR ADJUSTING DAILY TEMPERATURE DATA: DELLA-MARTA and WANNER (2006)

    7. Bin each temperature difference in the difference series (step 5), according to its associated predicted temperature, in a decile of the probability distribution of the candidate station in HSP1
    8. Fit a smoothly varying function between the binned decile differences (step 7) to obtain an estimated adjustment for each percentile
    9. Using the probability distribution of the candidate in HSP2 (step 6), determine the percentile of each observation in HSP2 and adjust it by the amount calculated in step 8
    10. Proceed to the remaining HSPs
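Steps 3 to 9 can be sketched in a few lines for a single break. This is a heavily simplified illustration, not the authors' implementation: a straight-line fit stands in for the candidate/reference model of step 3, and linear interpolation between decile means stands in for the smoothly varying function of step 8, since the slides do not specify either. All function and variable names are hypothetical.

```python
import numpy as np

def hom_adjust(cand_hsp1, ref_hsp1, cand_hsp2, ref_hsp2, n_bins=10):
    """Adjust cand_hsp2 so that it is comparable with HSP1 (simplified)."""
    cand_hsp1 = np.asarray(cand_hsp1, dtype=float)
    ref_hsp1 = np.asarray(ref_hsp1, dtype=float)
    cand_hsp2 = np.asarray(cand_hsp2, dtype=float)
    ref_hsp2 = np.asarray(ref_hsp2, dtype=float)

    # Steps 3-4: model candidate ~ reference in HSP1, predict in HSP2.
    slope, intercept = np.polyfit(ref_hsp1, cand_hsp1, 1)
    predicted = intercept + slope * ref_hsp2

    # Step 5: paired differences, predicted minus observed, in HSP2.
    diff = predicted - cand_hsp2

    # Steps 6-7: bin differences by the HSP1 decile of the prediction.
    edges = np.quantile(cand_hsp1, np.linspace(0.0, 1.0, n_bins + 1))
    idx = np.searchsorted(edges, predicted, side="right") - 1
    idx = np.clip(idx, 0, n_bins - 1)
    bin_mean = np.array([diff[idx == b].mean() if np.any(idx == b) else np.nan
                         for b in range(n_bins)])

    # Step 8: adjustment as a function of percentile (linear stand-in).
    centres = (np.arange(n_bins) + 0.5) / n_bins
    ok = ~np.isnan(bin_mean)

    # Step 9: percentile of each HSP2 observation within HSP2, then adjust.
    pct = np.searchsorted(np.sort(cand_hsp2), cand_hsp2, side="right") / len(cand_hsp2)
    return cand_hsp2 + np.interp(pct, centres[ok], bin_mean[ok])

# Synthetic check: HSP2 of the candidate reads 1.0 too warm.
rng = np.random.default_rng(0)
ref1, ref2 = rng.normal(15, 5, 2000), rng.normal(15, 5, 2000)
cand1 = ref1 + rng.normal(0, 0.3, 2000)
cand2 = ref2 + rng.normal(0, 0.3, 2000) + 1.0
adjusted = hom_adjust(cand1, ref1, cand2, ref2)
```

Because the adjustment depends on the percentile, this kind of method can correct the warm and cold ends of the distribution by different amounts, which a single monthly factor cannot do.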

  • A MORE SOPHISTICATED MODEL FOR ADJUSTING DAILY TEMPERATURE DATA: DELLA-MARTA and WANNER (2006). Results

  • A MORE SOPHISTICATED MODEL FOR ADJUSTING DAILY TEMPERATURE DATA: DELLA-MARTA and WANNER (2006). Comparisons

  • REGRESSION MODEL TESTS: VINCENT'S TEST

    Vincent, in 1998, described a new approach based on fitting a hierarchy of regression-based models to a time series and analysing the residuals.
    The model has been, like SNHT, widely applied, and many variants have appeared.

  • VINCENT'S TEST

    A visual inspection of the plots may tell enough to infer whether the series is homogeneous (top, model accepted, process finished), has an artificial trend (left) or an artificial jump (closer plot). The Durbin-Watson statistic and the analysis of the correlogram are used to make the final decision.
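The two residual diagnostics mentioned above can be computed as follows. This is a generic sketch of the Durbin-Watson statistic and the lag-1 autocorrelation for any residual series; how Vincent's procedure combines them with the hierarchy of models is not shown in this transcript and is not reproduced here. The function names are chosen for illustration.

```python
import numpy as np

def durbin_watson(residuals):
    """Durbin-Watson statistic: sum((e_t - e_(t-1))^2) / sum(e_t^2)."""
    e = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

def lag1_autocorrelation(residuals):
    """First value of the correlogram (lag-1 autocorrelation)."""
    e = np.asarray(residuals, dtype=float)
    e = e - e.mean()
    return np.sum(e[1:] * e[:-1]) / np.sum(e ** 2)

# Strongly alternating residuals give a Durbin-Watson value above 2.
dw = durbin_watson([1.0, -1.0, 1.0, -1.0])
r1 = lag1_autocorrelation([1.0, -1.0, 1.0, -1.0])
```

Values of the statistic near 2 indicate no lag-1 autocorrelation in the residuals; values approaching 0 indicate positive autocorrelation, which is what an unmodelled step or trend in the series tends to leave behind.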

  • VINCENT'S TEST

    Iteration through all possible p changepoints. When

  • VINCENT'S TEST

  • REGRESSION MODEL TESTS: VINCENT AND VARIANTS

    Reeves et al (2007) describe Vincent's procedure as follows:

  • TWO PHASE REGRESSION

    Another widely used family of methods are those based on two-phase regression, with pioneering applications by Solow (1987) and Peterson and Easterling (1995), and reformulated by Lund and Reeves (2002).

  • RH-TEST

    Wang (2003) defines the following model with common trend:
    RH-Test is a software package based on an improved version of the former model, widely used at WMO-CCl-ETCCDI workshops and in many other publications.
    It includes an iterative procedure to detect multiple breakpoints, and the new statistics account for important aspects such as serial autocorrelation.
    Available at: http://cccma.seos.uvic.ca/ETCCDMI/software.shtml
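The model formula is missing from this transcript. A minimal sketch of a common-trend two-phase regression is given below, assuming the usual form in which the series has one linear trend throughout and a possible step in the mean at an unknown time c, with an F-type statistic comparing the fit with and without the step. This is an illustration written for these notes, not the RH-Test code; the function name is hypothetical, and no critical values or autocorrelation corrections are included.

```python
import numpy as np

def common_trend_tpr(x):
    """Search for a single mean shift under a common linear trend.

    Null model:        x_t = mu + beta * t + e_t
    Alternative model: x_t = mu + delta * [t > c] + beta * t + e_t
    Returns (c_best, f_max), where c_best is the last time index
    (1-based) before the most likely shift.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    t = np.arange(1, n + 1, dtype=float)

    a0 = np.column_stack([np.ones(n), t])
    sse0 = np.sum((x - a0 @ np.linalg.lstsq(a0, x, rcond=None)[0]) ** 2)

    c_best, f_max = None, -np.inf
    for c in range(2, n - 1):               # keep at least 2 points per side
        step = (t > c).astype(float)
        a1 = np.column_stack([np.ones(n), step, t])
        sse1 = np.sum((x - a1 @ np.linalg.lstsq(a1, x, rcond=None)[0]) ** 2)
        f = (sse0 - sse1) / (sse1 / (n - 3))
        if f > f_max:
            c_best, f_max = c, f
    return c_best, float(f_max)

# Synthetic series: trend of 0.01 per step plus a +1.5 shift after t = 40.
rng = np.random.default_rng(1)
t = np.arange(1, 81)
x = 0.01 * t + rng.normal(0, 0.2, 80) + 1.5 * (t > 40)
c_best, f_max = common_trend_tpr(x)
```

Because the maximum is taken over all candidate positions, `f_max` cannot be compared with ordinary F tables; its significance has to be judged against percentiles derived for the maximum statistic.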

  • WHICH IS THE BEST METHOD?

    There is no agreement.
    There is a sense that MANY methods work well (especially for detection at annual to monthly scales) if they are applied with care and expertise.

    Many authors have performed statistical comparisons:
    Ducré-Robitaille et al (2003)
    De Gaetano (2005)
    Reeves et al (2007)
    ...

    Comparisons (as well as all homogeneity work) depend on several things:
    The tested data
    The test application procedures
    The software used
    Which quality we prefer (false detections, false negatives, position, magnitude of the break)

  • WHICH IS THE BEST METHOD?

    From Ducré-Robitaille et al

  • WHICH IS THE BEST METHOD?

    From Ducré-Robitaille et al (2003). False detection magnitude and positions over 1000 simulated series

  • COST ACTION HOME. Scientific Activities

  • COST ACTION HOME. Working Groups

  • FINAL MESSAGE

    The method is important; its application, even more so. That is, the best method in bad hands will do less than a worse method in good hands.
    BEST SCENARIO: good method, good hands.

  • Thank you!

