A Review of Benchmarking Methods
G Brown, N Parkin, and N Stuttard, ONS
Overview
• Introduction
• What is benchmarking?
• What we did and why
• Some methods for benchmarking
• Some quality measures
• Comparison of methods
• Summary
Introduction
• Purpose: to recommend a method for benchmarking to ONS and wider GSS
• Benchmarking combines two time series of same phenomenon, measured at different frequencies
• Result: benchmarked series is higher quality
• Work funded from Quality Improvement Fund
What we did and why
• Identified appropriate benchmarking methods
• Tested using several hundred ONS time series
• Used range of quality measures to rank methods
• Made judgment to combine results from different quality measures
• Recommended a benchmarking method
• Update of ONS computer systems prompted examination of methods
Benchmarking
• Want good estimates of levels and growth
• Have two series measuring same phenomenon
• Different frequencies
• Higher frequency more timely, accurate growths
o Indicator series
• Lower frequency delayed, more accurate levels
o Benchmark series
Benchmarking
• Resulting high frequency series
o Benchmarked series
• Has good estimates of growth combined with good estimates of level
Benchmarking
• Two types of relation between indicator and benchmark:
o Point in time
o Average
Benchmarking, point in time
• Example: unemployment monthly and quarterly
• Benchmarks apply to the third month in each quarter
• Third monthly estimate in each quarter is forced to equal benchmark
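
The point-in-time constraint can be written as a simple check. This is a minimal sketch, assuming a monthly benchmarked series held in a list and one benchmark per quarter; the function names and the numbers are made up for illustration and do not come from the ONS systems.

```python
def point_in_time_values(series, periods_per_benchmark=3):
    """Pick the last observation of each complete period,
    e.g. the third month of each quarter."""
    step = periods_per_benchmark
    return [series[start + step - 1]
            for start in range(0, len(series) - step + 1, step)]


def meets_point_in_time_constraint(benchmarked, benchmarks, tol=1e-9):
    """True if each available benchmark equals the matching end-of-period value."""
    ends = point_in_time_values(benchmarked)
    return all(abs(e - b) <= tol for e, b in zip(ends, benchmarks))


# Made-up monthly values covering two quarters
benchmarked = [100.0, 102.0, 105.0, 106.0, 108.0, 111.0]
print(point_in_time_values(benchmarked))
print(meets_point_in_time_constraint(benchmarked, [105.0, 111.0]))
```

Only the third month of each quarter is tied down; the first two months are free to follow the movements of the indicator.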
Benchmarking, average
• Example: turnover monthly and quarterly
• Benchmarks apply to each month in each quarter
• Average turnover of three months in each quarter is forced to equal benchmark
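
The average constraint can be checked in the same spirit. A minimal sketch, assuming monthly values in a list and quarterly benchmarks expressed as monthly averages; the names and numbers are illustrative only.

```python
def period_averages(series, periods_per_benchmark=3):
    """Average of each complete period, e.g. each quarter of monthly data."""
    step = periods_per_benchmark
    n_complete = len(series) // step
    return [sum(series[k * step:(k + 1) * step]) / step
            for k in range(n_complete)]


def meets_average_constraint(benchmarked, benchmarks, tol=1e-9):
    """True if each available benchmark equals the matching period average."""
    averages = period_averages(benchmarked)
    return all(abs(a - b) <= tol for a, b in zip(averages, benchmarks))


# Made-up monthly turnover: two full quarters plus one extra month
benchmarked = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0]
print(period_averages(benchmarked))
print(meets_average_constraint(benchmarked, [11.0, 14.0]))
```

Here every month in the quarter contributes to the constraint, and the seventh month has no benchmark yet because its quarter is incomplete.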
Non-negativity
• Most indicator series must be non-negative
• In those cases the benchmarked series must be non-negative too
• Process of benchmarking can produce negative benchmarked series
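
A small made-up example shows one way negatives can arise. It assumes a purely additive adjustment that shifts every month in a quarter by the same amount; this is an illustrative choice, not one of the methods reviewed here.

```python
def shift_period_to_benchmark(indicator, benchmark):
    """Add one constant to every value so the period average equals the benchmark."""
    shift = benchmark - sum(indicator) / len(indicator)
    return [value + shift for value in indicator]


indicator = [0.5, 8.0, 0.5]      # non-negative, average 3.0
adjusted = shift_period_to_benchmark(indicator, benchmark=2.0)
print(adjusted)
print(any(value < 0 for value in adjusted))
```

The indicator and the benchmark are both non-negative, yet the small months are pushed below zero once the downward correction is applied.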
Benchmarking methods
• Methods suggested by ONS, variants with different splines
o proc Expand (in SAS)
o INTER
o Kruger
• Denton
• Cholette-Dagum
• Constrained versions of the above for non-negativity
ONS methods (and variants)
• Summary: fits smooth curve through knots
1. Aggregate indicator series
2. Calculate ratio of aggregated to benchmark
3. Augment with fore/backcasts using X-12-ARIMA
4. Interpolate to frequency of indicator
5. Multiply indicator by interpolated series
6. Iterate 1 to 5
• Variants use different ways to interpolate
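
The numbered steps can be sketched roughly as follows. This is not the ONS implementation: it assumes the average relation, replaces the cubic splines with straight-line interpolation between quarter midpoints, and omits step 3 (the X-12-ARIMA fore/backcasts) by holding the end ratios flat. It also takes the ratio as benchmark divided by aggregated indicator, so that multiplying in step 5 moves the series towards the benchmark. All names and data are hypothetical.

```python
def period_averages(series, step=3):
    """Step 1: aggregate to the benchmark frequency (average relation assumed)."""
    return [sum(series[k * step:(k + 1) * step]) / step
            for k in range(len(series) // step)]


def interpolate(ratios, step, length):
    """Step 4 stand-in: straight lines between ratios placed at period
    midpoints, held flat before the first and after the last midpoint."""
    knots = [k * step + (step - 1) / 2 for k in range(len(ratios))]
    factors = []
    for t in range(length):
        if t <= knots[0]:
            factors.append(ratios[0])
        elif t >= knots[-1]:
            factors.append(ratios[-1])
        else:
            j = max(k for k, knot in enumerate(knots) if knot <= t)
            weight = (t - knots[j]) / (knots[j + 1] - knots[j])
            factors.append((1 - weight) * ratios[j] + weight * ratios[j + 1])
    return factors


def ratio_benchmark(indicator, benchmarks, step=3, iterations=50):
    series = list(indicator)
    for _ in range(iterations):                                    # step 6
        aggregated = period_averages(series, step)                 # step 1
        ratios = [b / a for b, a in zip(benchmarks, aggregated)]   # step 2
        factors = interpolate(ratios, step, len(series))           # step 4
        series = [x * f for x, f in zip(series, factors)]          # step 5
    return series


# Made-up data: eleven months of indicator, three quarterly benchmarks
indicator = [100.0, 102.0, 104.0, 106.0, 108.0, 110.0,
             112.0, 114.0, 116.0, 118.0, 120.0]
benchmarks = [105.0, 110.0, 118.0]
result = ratio_benchmark(indicator, benchmarks)
print([round(a, 4) for a in period_averages(result)])
```

A single pass does not generally hit the benchmarks exactly, because the interpolated factors vary within each quarter; repeating the pass shrinks the remaining discrepancy, which is the role of step 6.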
Interpolation
• Three types of cubic spline
1. Proc Expand (point in time/average)
2. INTER (average)
3. Kruger (point in time)
• Progressively less prone to produce negative values
Denton type
• Summary: try to preserve movements in indicator
• Minimise a penalty function of differences or relative differences between indicator and benchmark
• Minimisation using either special methods or off-the-shelf methods for quadratic minimisation
• Denton usually set up to minimise first differences or proportionate first differences
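
As an illustration of the quadratic minimisation, here is a minimal sketch of an additive first-difference Denton adjustment under the average relation. It assumes the variant with no penalty on the first adjustment, and it solves the optimality conditions with a small hand-written Gaussian elimination rather than an off-the-shelf optimiser. The function names and data are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def denton_additive(indicator, benchmarks, step=3):
    """Minimise the sum of squared first differences of d = x - indicator,
    subject to each period average of x equalling its benchmark."""
    n, k = len(indicator), len(benchmarks)
    size = n + k
    kkt = [[0.0] * size for _ in range(size)]
    # Top-left block: 2 * D'D, where D takes first differences of d
    for t in range(n - 1):
        kkt[t][t] += 2.0
        kkt[t + 1][t + 1] += 2.0
        kkt[t][t + 1] -= 2.0
        kkt[t + 1][t] -= 2.0
    # Averaging constraints and their transpose (Lagrange multipliers)
    for q in range(k):
        for t in range(q * step, (q + 1) * step):
            kkt[n + q][t] = 1.0 / step
            kkt[t][n + q] = 1.0 / step
    # Right-hand side: zero gradient, then benchmark minus indicator average
    gaps = [b - sum(indicator[q * step:(q + 1) * step]) / step
            for q, b in enumerate(benchmarks)]
    solution = solve_linear(kkt, [0.0] * n + gaps)
    return [i + d for i, d in zip(indicator, solution[:n])]


# Made-up data: seven months of indicator, two quarterly benchmarks
indicator = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0]
benchmarked = denton_additive(indicator, [12.0, 16.0])
print([round(v, 4) for v in benchmarked])
```

The proportionate version would penalise changes in the ratio of benchmarked to indicator values instead of changes in their difference.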
Denton and Cholette-Dagum
• For indicator points with no benchmark:
• Denton carries forward the most recent difference between benchmark and indicator
• Cholette-Dagum assumes the difference decays to zero in a defined way
• Flexible in the way this is modelled
• We assume:
o Decay is geometric
o Rate of decay fixed in advance for all series
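
The contrast between the two treatments of unbenchmarked points can be sketched as below. This is a simplified illustration, not the full Cholette-Dagum regression model: it only shows how a final difference might be extended beyond the last benchmark. The function name is hypothetical and the rate of 0.8 in the example is a placeholder value.

```python
def extend_adjustment(last_difference, steps, decay=None):
    """Adjustments for indicator points after the last benchmark.

    decay=None carries the last difference forward unchanged (Denton-style);
    a value between 0 and 1 shrinks it geometrically towards zero.
    """
    if decay is None:
        return [last_difference] * steps
    return [last_difference * decay ** h for h in range(1, steps + 1)]


print(extend_adjustment(2.0, 3))             # carried forward
print(extend_adjustment(2.0, 3, decay=0.8))  # geometric decay
```

With carry-forward, the latest unbenchmarked points keep the full correction indefinitely; with decay, they drift back towards the indicator.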
Non-negativity
• ONS suggestion:
o Benchmark on log scale
o Exponentiate
o Distribute residual differences
• Optimisation approach for Denton type:
o Set up basic method as a matrix problem
o Add constraints as part of matrix setup
o Solve using off-the-shelf optimiser in SAS
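
The log-scale suggestion can be sketched in three steps. This is a sketch under several assumptions: strictly positive indicator and benchmark values, the average relation, a deliberately crude stand-in for the benchmarking method applied on the log scale, and proportional scaling as the way to distribute the residual differences. The slides do not specify how the residuals are distributed, and all names are hypothetical.

```python
import math


def period_means(series, step=3):
    return [sum(series[k * step:(k + 1) * step]) / step
            for k in range(len(series) // step)]


def shift_to_benchmarks(series, benchmarks, step=3):
    """Stand-in additive method: shift each period so its mean hits the benchmark."""
    out = list(series)
    for k, (mean, b) in enumerate(zip(period_means(series, step), benchmarks)):
        for t in range(k * step, (k + 1) * step):
            out[t] += b - mean
    return out


def benchmark_on_log_scale(indicator, benchmarks, step=3,
                           method=shift_to_benchmarks):
    # 1. Benchmark on the log scale (any additive method could be plugged in)
    logged = method([math.log(v) for v in indicator],
                    [math.log(b) for b in benchmarks], step)
    # 2. Exponentiate, which guarantees positive values
    positive = [math.exp(v) for v in logged]
    # 3. Distribute the residual: the period means no longer match exactly,
    #    so rescale each period proportionally (assumed rule, keeps signs)
    out = list(positive)
    for k, (mean, b) in enumerate(zip(period_means(positive, step), benchmarks)):
        for t in range(k * step, (k + 1) * step):
            out[t] = positive[t] * b / mean
    return out


indicator = [0.5, 8.0, 0.5, 1.0, 2.0, 3.0]
result = benchmark_on_log_scale(indicator, [2.0, 2.5])
print([round(v, 4) for v in result])
```

The third step is needed because the average of the exponentiated values is not the exponential of the average log, so the constraint met on the log scale is only approximately met on the original scale.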
Time series used for testing
• Mixture of:
o Monthly to quarterly
o Quarterly to annual
o Average and point in time
• Different lengths
• Included some awkward series (to test non-negativity)
How the methods were compared
1. Failures – program fails to benchmark
2. Verification of benchmarking constraint – benchmarked not equal to benchmark
3. Preserving change – size and direction
4. Revisions – size & bias when perturbing or adding benchmark
5. Smoothness – relative variance of indicator and benchmarked
6. Closeness – between indicator and benchmarked
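
The slides do not give formulas for these measures, so the following is only one possible reading of three of them. It assumes direction of change is compared period by period, that "relative variance" means the ratio of the variances of period-to-period changes, and that closeness is a mean absolute relative difference. All three definitions and names are assumptions made for illustration.

```python
from statistics import pvariance


def changes(series):
    """Period-to-period changes."""
    return [b - a for a, b in zip(series, series[1:])]


def sign(x):
    return (x > 0) - (x < 0)


def direction_agreement(indicator, benchmarked):
    """Share of periods where both series move in the same direction."""
    pairs = list(zip(changes(indicator), changes(benchmarked)))
    return sum(sign(a) == sign(b) for a, b in pairs) / len(pairs)


def relative_variance(indicator, benchmarked):
    """Variance of benchmarked changes relative to indicator changes."""
    return pvariance(changes(benchmarked)) / pvariance(changes(indicator))


def mean_abs_relative_difference(indicator, benchmarked):
    """Average size of the adjustment relative to the indicator."""
    return sum(abs(b - a) / abs(a)
               for a, b in zip(indicator, benchmarked)) / len(indicator)


# Made-up example: benchmarked series is the indicator scaled up by 10%
indicator = [100.0, 102.0, 101.0, 105.0]
benchmarked = [v * 1.1 for v in indicator]
print(direction_agreement(indicator, benchmarked))
print(relative_variance(indicator, benchmarked))
print(mean_abs_relative_difference(indicator, benchmarked))
```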
How the methods were compared
• For each one of preserving change, revisions, smoothness and closeness, calculate:
o For each method, for each time series, for different lengths of the series
o Rank methods for each series and length
o Average the ranks over all series
o Plot and compare average ranks by length
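
The ranking procedure can be sketched as follows, assuming one quality measure at a time, scores stored in a dictionary keyed by series and length, and lower scores being better. The data layout, method labels and scores are hypothetical.

```python
from collections import defaultdict


def average_ranks(scores):
    """scores[(series_id, length)][method] holds one quality measure,
    with lower values assumed better.
    Returns {length: {method: mean rank over all series}}."""
    totals = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for (series_id, length), by_method in scores.items():
        # Rank 1 is the best method for this series and length;
        # ties are broken by the order in which methods were listed
        ordered = sorted(by_method, key=by_method.get)
        for rank, method in enumerate(ordered, start=1):
            totals[length][method] += rank
        counts[length] += 1
    return {length: {method: total / counts[length]
                     for method, total in methods.items()}
            for length, methods in totals.items()}


# Made-up scores for two methods, two series and two series lengths
scores = {
    ("A", 24): {"denton": 0.2, "cholette_dagum": 0.1},
    ("B", 24): {"denton": 0.1, "cholette_dagum": 0.3},
    ("A", 48): {"denton": 0.5, "cholette_dagum": 0.4},
}
print(average_ranks(scores))
```

The resulting average ranks per length are what would then be plotted and compared across methods.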
Recommended method
• Around 100 plots compared
• Judgment made on overall best performing method
• Based on good performance and lack of bad performance
• Recommended method: Cholette-Dagum (0.8)
Summary
• Aim: recommend method for benchmarking to ONS and wider GSS
• Update of ONS computer systems prompted examination of methods
• Used several quality measures to rank methods
• Made judgment to combine results from different quality measures
• Recommended: Cholette-Dagum (0.8)
Any questions?