Rob J Hyndman
Automatic time series forecasting
Outline
1 Motivation
2 Forecasting competitions
3 Forecasting the PBS
4 Time series forecasting
5 Evaluating forecast accuracy
6 Exponential smoothing
7 ARIMA modelling
8 Automatic nonlinear forecasting?
9 Time series with complex seasonality
10 Forecasts about automatic forecasting
Automatic time series forecasting Motivation 2
Motivation
1 It is common in business to have over 1000 products that need forecasting at least monthly.
2 Forecasts are often required by people who are untrained in time series analysis.

Specifications
Automatic forecasting algorithms must:
– determine an appropriate time series model;
– estimate the parameters;
– compute the forecasts with prediction intervals.
Example: Asian sheep

[Figure: Numbers of sheep in Asia, 1960–2010; y-axis: millions of sheep, 250–550.]
[Figure: Automatic ETS forecasts for the Asian sheep series.]
Example: Corticosteroid sales

[Figure: Monthly corticosteroid drug sales in Australia, 1995–2010; y-axis: total scripts (millions), 0.4–1.4.]
[Figure: Automatic ARIMA forecasts for the monthly corticosteroid sales series.]
Makridakis and Hibon (1979)

This was the first large-scale empirical evaluation of time series forecasting methods. It was highly controversial at the time.

Difficulties:
– How to measure forecast accuracy?
– How to apply methods consistently and objectively?
– How to explain unexpected results?

Common thinking was that the more sophisticated mathematical models (ARIMA models at the time) were necessarily better. If results showed ARIMA models were not best, it must be because the analyst was unskilled.
"I do not believe that it is very fruitful to attempt to classify series according to which forecasting techniques perform 'best'. The performance of any particular technique when applied to a particular series depends essentially on (a) the model which the series obeys; (b) our ability to identify and fit this model correctly and (c) the criterion chosen to measure the forecasting accuracy." — M.B. Priestley

". . . the paper suggests the application of normal scientific experimental design to forecasting, with measures of unbiased testing of forecasts against subsequent reality, for success or failure. A long overdue reform." — F.H. Hansford-Miller
"Modern man is fascinated with the subject of forecasting." — W.G. Gilchrist

"It is amazing to me, however, that after all this exercise in identifying models, transforming and so on, that the autoregressive moving averages come out so badly. I wonder whether it might be partly due to the authors not using the backwards forecasting approach to obtain the initial errors." — W.G. Gilchrist
"I find it hard to believe that Box-Jenkins, if properly applied, can actually be worse than so many of the simple methods." — C. Chatfield

"Why do empirical studies sometimes give different answers? It may depend on the selected sample of time series, but I suspect it is more likely to depend on the skill of the analyst and on their individual interpretations of what is meant by Method X." — C. Chatfield

". . . these authors are more at home with simple procedures than with Box-Jenkins." — C. Chatfield
"There is a fact that Professor Priestley must accept: empirical evidence is in disagreement with his theoretical arguments." — S. Makridakis & M. Hibon

"Dr Chatfield expresses some personal views about the first author . . . It might be useful for Dr Chatfield to read some of the psychological literature quoted in the main paper, and he can then learn a little more about biases and how they affect prior probabilities." — S. Makridakis & M. Hibon
Consequences of M&H (1979)

As a result of this paper, researchers started to:
– consider how to automate forecasting methods;
– study what methods give the best forecasts;
– be aware of the dangers of over-fitting;
– treat forecasting as a different problem from time series analysis.

Makridakis & Hibon followed up with a new competition in 1982:
– 1001 series.
– Anyone could submit forecasts (avoiding the charge of incompetence).
– Multiple forecast measures were used.
M-competition

Main findings (taken from Makridakis & Hibon, 2000):
1 Statistically sophisticated or complex methods do not necessarily provide more accurate forecasts than simpler ones.
2 The relative ranking of the performance of the various methods varies according to the accuracy measure being used.
3 The accuracy achieved when various methods are combined outperforms, on average, the individual methods being combined, and does very well in comparison to other methods.
4 The accuracy of the various methods depends upon the length of the forecasting horizon involved.
M3 competition

Makridakis and Hibon (2000):
"The M3-Competition is a final attempt by the authors to settle the accuracy issue of various time series methods . . . The extension involves the inclusion of more methods/researchers (in particular in the areas of neural networks and expert systems) and more series."

– 3003 series.
– All data from business, demography, finance and economics.
– Series lengths between 14 and 126 observations.
– Either non-seasonal, monthly or quarterly.
– All time series positive.

M&H claimed that the M3-competition supported the findings of their earlier work. However, the best-performing methods were far from "simple".
Makridakis and Hibon (2000): best methods

Theta
– A very confusing explanation.
– Shown by Hyndman and Billah (2003) to be the average of linear regression and simple exponential smoothing with drift, applied to seasonally adjusted data.
– Later, the original authors claimed that their explanation was incorrect.

Forecast Pro
– A commercial software package with an unknown algorithm.
– Known to fit either exponential smoothing or ARIMA models using BIC.
M3 results (recalculated)

Method          MAPE   sMAPE  MASE
Theta           17.42  12.76  1.39
ForecastPro     18.00  13.06  1.47
ForecastX       17.35  13.09  1.42
Automatic ANN   17.18  13.98  1.53
B-J automatic   19.13  13.72  1.54

– These calculations do not match the published paper.
– Some contestants apparently submitted multiple entries, but only the best ones were published.
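The three accuracy measures in the table can be computed as follows. This is a minimal sketch (the function names are mine): sMAPE is in the M3 form, and MASE scales the out-of-sample MAE by the in-sample MAE of the (seasonal) naive method.

```python
import numpy as np

def mape(y, f):
    """Mean absolute percentage error, in percent."""
    y, f = np.asarray(y, float), np.asarray(f, float)
    return 100 * np.mean(np.abs((y - f) / y))

def smape(y, f):
    """Symmetric MAPE as used in the M3 competition, in percent."""
    y, f = np.asarray(y, float), np.asarray(f, float)
    return 100 * np.mean(2 * np.abs(y - f) / (np.abs(y) + np.abs(f)))

def mase(y, f, y_train, m=1):
    """Mean absolute scaled error: test MAE divided by the in-sample
    MAE of the seasonal naive method (m = seasonal period; m=1 naive)."""
    y, f = np.asarray(y, float), np.asarray(f, float)
    y_train = np.asarray(y_train, float)
    scale = np.mean(np.abs(y_train[m:] - y_train[:-m]))
    return np.mean(np.abs(y - f)) / scale
```

A MASE below 1 means the method beat the in-sample naive benchmark on average, which is why all methods in the table score above 1 at the horizons considered.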
Forecasting the PBS

– The Pharmaceutical Benefits Scheme (PBS) is the Australian government drugs subsidy scheme.
– Many drugs bought from pharmacies are subsidised to allow more equitable access to modern drugs.
– The cost to government is determined by the number and types of drugs purchased. Currently nearly 1% of GDP ($14 billion).
– The total cost is budgeted based on forecasts of drug usage.
– In 2001: a $4.5 billion budget, under-forecast by $800 million.
– Thousands of products. Seasonal demand.
– Subject to covert marketing, volatile products, uncontrollable expenditure.
– Although monthly data are available for 10 years, the data are aggregated to annual values, and only the first three years are used in estimating the forecasts.
– All forecasts were being done with the FORECAST function in MS-Excel!
PBS data

[Figures: monthly total cost ($ thousands), 1995–2008, for five PBS groups:
– A03 concession safety net group
– A05 general copayments group
– D01 general copayments group
– S01 general copayments group
– R03 general copayments group]
Time series forecasting

Autoregression (AR) model: inputs yt−1, yt−2, yt−3 and error εt; output yt.

Autoregressive moving average (ARMA) model: inputs yt−1, yt−2, yt−3 and errors εt, εt−1, εt−2; output yt.

Estimation
Compute the likelihood L from ε1, ε2, . . . , εT. Use an optimization algorithm to maximize L.
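The estimation recipe — compute the errors ε1, . . . , εT recursively, then maximize the Gaussian likelihood — can be sketched for the AR case, where maximizing the conditional likelihood reduces to least squares. This is an illustrative sketch (function names are mine), not the exact-likelihood ARMA estimation used in practice:

```python
import numpy as np

def ar_loglik(phi, sigma, y):
    """Conditional Gaussian log-likelihood of an AR(p) model
    y_t = phi_1*y_{t-1} + ... + phi_p*y_{t-p} + eps_t."""
    p = len(phi)
    eps = y[p:] - sum(phi[j] * y[p - j - 1: len(y) - j - 1] for j in range(p))
    T = len(eps)
    return -0.5 * T * np.log(2 * np.pi * sigma**2) - np.sum(eps**2) / (2 * sigma**2)

def fit_ar(y, p):
    """Maximize the conditional likelihood. For an AR model this reduces
    to least squares in phi, with sigma^2 the mean squared residual."""
    X = np.column_stack([y[p - j - 1: len(y) - j - 1] for j in range(p)])
    phi, *_ = np.linalg.lstsq(X, y[p:], rcond=None)
    eps = y[p:] - X @ phi
    return phi, np.sqrt(np.mean(eps**2))
```

For ARMA models the errors depend recursively on earlier errors, so a numerical optimizer replaces the closed-form least-squares step; the likelihood computation is otherwise the same.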
State space model

The state xt is unobserved. Each observation yt is generated from the previous state xt−1 together with an error εt, after which the state is updated to xt; iterating the recursion produces yt+1 from xt and εt+1, yt+2 from xt+1 and εt+2, and so on.

Estimation
Compute the likelihood L from ε1, ε2, . . . , εT. Use an optimization algorithm to maximize L.
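A minimal instance of such an innovations state space model is the local level model underlying simple exponential smoothing (ETS(A,N,N)), where the unobserved state is the level. The sketch below (helper names are mine) computes the one-step errors by the state recursion, and the Gaussian likelihood to be maximized over the smoothing parameter and initial state:

```python
import numpy as np

def local_level_errors(y, alpha, l0):
    """One-step errors for the innovations state space model
    y_t = l_{t-1} + eps_t,   l_t = l_{t-1} + alpha * eps_t."""
    level = l0
    eps = np.empty(len(y))
    for t, yt in enumerate(y):
        eps[t] = yt - level            # innovation: observation minus forecast
        level = level + alpha * eps[t]
    return eps

def neg_log_likelihood(y, alpha, l0):
    """Gaussian -log L with sigma^2 concentrated out, so minimizing it
    is equivalent to minimizing the one-step training MSE."""
    eps = local_level_errors(y, alpha, l0)
    T = len(y)
    return 0.5 * T * (np.log(2 * np.pi * np.mean(eps**2)) + 1)
```

Minimizing `neg_log_likelihood` over (alpha, l0) with any optimizer gives the maximum likelihood estimates; the same pattern extends to the trend and seasonal states of the full ETS family discussed later.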
Cross-validation

Traditional evaluation: split the series in time into training data followed by test data; fit on the training data and measure accuracy on the test data.

Standard cross-validation: repeatedly hold out scattered observations as test points, training on the remainder.

Time series cross-validation: use a sequence of training sets, each extending one observation further in time, and evaluate forecasts on the observations immediately following each training set. Also known as "evaluation on a rolling forecast origin".
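The rolling-origin procedure takes only a few lines. An illustrative helper (names are mine), paired here with the naive forecast:

```python
import numpy as np

def ts_cross_validation(y, forecast_fn, min_train, h=1):
    """Evaluation on a rolling forecast origin: for each origin t, fit on
    y[:t], forecast h steps ahead, and collect the forecast errors."""
    errors = []
    for t in range(min_train, len(y) - h + 1):
        fc = forecast_fn(y[:t], h)      # forecast of y[t + h - 1]
        errors.append(y[t + h - 1] - fc)
    return np.array(errors)

def naive_forecast(train, h):
    """Naive method: the last observed value, at any horizon."""
    return train[-1]
```

The mean of `errors**2` is the CV-MSE referred to below; each error uses only data available before the forecast origin, so the ordering of the series is respected.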
Akaike's Information Criterion

AIC = −2 log(L) + 2k

where L is the likelihood and k is the number of estimated parameters in the model.

– This is a penalized likelihood approach.
– If L is Gaussian, then AIC ≈ c + T log(MSE) + 2k, where c is a constant, MSE is computed from one-step forecasts on the training set, and T is the length of the series.
– Minimizing the Gaussian AIC is asymptotically equivalent (as T → ∞) to minimizing the MSE of one-step forecasts on the test set via time series cross-validation.
Corrected AIC

For small T, the AIC tends to over-fit. A bias-corrected version is

AICc = AIC + 2(k + 1)(k + 2)/(T − k − 2)

Bayesian Information Criterion

BIC = AIC + k[log(T) − 2]

– BIC penalizes additional terms more heavily than AIC.
– Minimizing BIC is consistent if there is a true model.
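Under the Gaussian approximation, all three criteria can be computed directly from the one-step training MSE. A small helper (the function name and the dropped constant c are my choices; c does not affect model comparisons):

```python
import numpy as np

def information_criteria(mse, T, k):
    """Gaussian information criteria from the one-step training MSE,
    using AIC ~ T*log(MSE) + 2k with the additive constant dropped."""
    aic = T * np.log(mse) + 2 * k
    aicc = aic + 2 * (k + 1) * (k + 2) / (T - k - 2)
    bic = aic + k * (np.log(T) - 2)
    return aic, aicc, bic
```

Because only differences between models matter, any model with lower MSE at the same k has lower values of all three criteria; the criteria differ only in how strongly extra parameters are penalized.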
What to use?

Choice: AIC, AICc, BIC, CV-MSE.

– CV-MSE is too time-consuming for most automatic forecasting purposes, and requires large T.
– As T → ∞, BIC selects the true model if there is one. But that is never true!
– AICc focuses on forecasting performance, can be used on small samples, and is very fast to compute.
– Empirical studies in forecasting show the AIC is better than the BIC for forecast accuracy.
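To make AICc-based selection concrete, here is a sketch that chooses among least-squares AR fits of increasing order — a simple stand-in for the ETS and ARIMA model classes discussed elsewhere (helper names are mine):

```python
import numpy as np

def ar_training_mse(y, p):
    """One-step training MSE of a least-squares AR(p) fit with intercept."""
    Y = y[p:]
    X = np.column_stack([np.ones(len(Y))] +
                        [y[p - j - 1: len(y) - j - 1] for j in range(p)])
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return np.mean((Y - X @ beta) ** 2)

def select_ar_order(y, max_p=5):
    """Choose the AR order minimizing the Gaussian AICc."""
    T = len(y)
    scores = {}
    for p in range(1, max_p + 1):
        k = p + 1                          # AR coefficients plus intercept
        aic = T * np.log(ar_training_mse(y, p)) + 2 * k
        scores[p] = aic + 2 * (k + 1) * (k + 2) / (T - k - 2)
    return min(scores, key=scores.get)
```

Every candidate model is fitted once to the training data, so the selection costs far less than cross-validation while targeting the same one-step forecasting performance.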
Exponential smoothing

www.OTexts.com/fpp
Exponential smoothing methods

                               Seasonal Component
Trend Component                N (None)   A (Additive)   M (Multiplicative)
N  (None)                      N,N        N,A            N,M
A  (Additive)                  A,N        A,A            A,M
Ad (Additive damped)           Ad,N       Ad,A           Ad,M
M  (Multiplicative)            M,N        M,A            M,M
Md (Multiplicative damped)     Md,N       Md,A           Md,M

N,N: Simple exponential smoothing
A,N: Holt's linear method
Ad,N: Additive damped trend method
M,N: Exponential trend method
Md,N: Multiplicative damped trend method
A,A: Additive Holt-Winters' method
A,M: Multiplicative Holt-Winters' method

Automatic time series forecasting Exponential smoothing 36
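To give a flavour of how these methods work, the A,N cell (Holt's linear method) can be sketched directly. This is an illustrative Python sketch, not code from the forecast package; the crude initial states are an assumption for brevity.

```python
def holt_forecast(y, alpha, beta, h):
    """Holt's linear trend method (Trend = A, Seasonal = N).
    Level and trend are updated recursively; forecasts extrapolate both."""
    level, trend = y[0], y[1] - y[0]   # simple initial states
    for obs in y[1:]:
        prev_level = level
        level = alpha * obs + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + i * trend for i in range(1, h + 1)]
```

On a perfectly linear series the recursions lock onto the trend and the forecasts continue the line exactly.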
Exponential smoothing methods

There are 15 separate exponential smoothing methods.
Each can have an additive or multiplicative error, giving 30 separate models.
Only 19 models are numerically stable.

Automatic time series forecasting Exponential smoothing 36
Exponential smoothing methods

General notation E T S : ExponenTial Smoothing
(E = Error, T = Trend, S = Seasonal)

Examples:
A,N,N: Simple exponential smoothing with additive errors
A,A,N: Holt's linear method with additive errors
M,A,M: Multiplicative Holt-Winters' method with multiplicative errors

Automatic time series forecasting Exponential smoothing 37
Innovations state space models
å All ETS models can be written in innovations state space form (IJF, 2002).

å Additive and multiplicative versions give the same point forecasts but different prediction intervals.
ETS state space models
[Diagram: innovations state space recursion. Each state xt combines with the error εt+1 to produce the observation yt+1 and the next state xt+1, and the pattern repeats.]
Automatic time series forecasting Exponential smoothing 38
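As a concrete instance (a standard result, not specific to these slides), simple exponential smoothing with additive errors, ETS(A,N,N), has the innovations state space form

```latex
y_t = \ell_{t-1} + \varepsilon_t, \qquad
\ell_t = \ell_{t-1} + \alpha\,\varepsilon_t, \qquad
\varepsilon_t \sim \mathrm{NID}(0, \sigma^2)
```

The single error term drives both the observation and the state update; that "innovations" structure is what makes the likelihood easy to evaluate.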
State space model
xt = (ℓt, bt, st, st−1, …, st−m+1)

Estimation
Optimize L with respect to θ = (α, β, γ, φ) and initial states
x0 = (ℓ0, b0, s0, s−1, …, s−m+1).
ets algorithm in R
Automatic time series forecasting Exponential smoothing 39
Based on Hyndman & Khandakar (IJF 2008):

Apply each of the 19 models that are appropriate to the data. Optimize parameters and initial values using MLE.

Select best method using AICc.

Produce forecasts using best method.

Obtain prediction intervals using underlying state space model.
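In miniature, the fit-and-select loop looks like this. It is an illustrative Python sketch, not the forecast package's ets() code: a grid over α for ETS(A,N,N) stands in for the full 19-model search, and the initial level is fixed at y[0] rather than estimated.

```python
import math

def ses_sse(y, alpha):
    """Sum of squared one-step errors for simple exponential smoothing."""
    level, sse = y[0], 0.0
    for obs in y[1:]:
        err = obs - level          # one-step forecast error
        sse += err * err
        level += alpha * err       # state update
    return sse

def aicc(sse, n, k):
    """AICc under Gaussian errors (up to an additive constant)."""
    return n * math.log(sse / n) + 2 * k + 2 * k * (k + 1) / (n - k - 1)

def select_model(y):
    """Score every candidate, then keep the one with the lowest AICc."""
    n = len(y) - 1                                  # number of one-step errors
    alphas = [round(a / 10, 1) for a in range(1, 10)]
    return min((aicc(ses_sse(y, a), n, k=2), a) for a in alphas)
```

On a trending series, large α values track the trend with smaller one-step errors, so the selection favours α near 1.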
Exponential smoothing

fit <- ets(livestock)
fcast <- forecast(fit)
plot(fcast)

[Figure: Forecasts from ETS(M,A,N). x-axis: Year (1960–2010); y-axis: millions of sheep]

Automatic time series forecasting Exponential smoothing 41
Exponential smoothing

fit <- ets(h02)
fcast <- forecast(fit)
plot(fcast)

[Figure: Forecasts from ETS(M,Md,M). x-axis: Year (1995–2010); y-axis: Total scripts (millions)]

Automatic time series forecasting Exponential smoothing 43
M3 comparisons
Method MAPE sMAPE MASE
Theta 17.42 12.76 1.39
ForecastPro 18.00 13.06 1.47
ForecastX 17.35 13.09 1.42
Automatic ANN 17.18 13.98 1.53
B-J automatic 19.13 13.72 1.54
ETS 18.06 13.38 1.52
Automatic time series forecasting Exponential smoothing 44
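The MASE column in the table scales forecast errors by the in-sample mean absolute error of the naive (random walk) forecast, so values below 1 beat the naive method on average. A minimal sketch for one non-seasonal series (illustrative Python, not the M3 evaluation code):

```python
def mase(train, actual, forecast):
    """Mean absolute scaled error: out-of-sample MAE divided by the
    in-sample MAE of the one-step naive forecast."""
    scale = sum(abs(train[i] - train[i - 1])
                for i in range(1, len(train))) / (len(train) - 1)
    errors = [abs(a - f) for a, f in zip(actual, forecast)]
    return sum(errors) / len(errors) / scale
```
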
Outline
1 Motivation
2 Forecasting competitions
3 Forecasting the PBS
4 Time series forecasting
5 Evaluating forecast accuracy
6 Exponential smoothing
7 ARIMA modelling
8 Automatic nonlinear forecasting?
9 Time series with complex seasonality
10 Forecasts about automatic forecasting
Automatic time series forecasting ARIMA modelling 45
Time series forecasting
[Diagram: the observation yt modelled from lagged values yt−1, yt−2, yt−3 and errors εt, εt−1, εt−2]
Automatic time series forecasting ARIMA modelling 46
ARIMA model
Autoregressive moving average (ARMA) model applied to differences.

Estimation
Compute likelihood L from ε1, ε2, …, εT.
Use optimization algorithm to maximize L.
ARIMA modelling
Automatic time series forecasting ARIMA modelling 47
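A minimal sketch of that estimation step, assuming an ARMA(1,1) with known σ² and a crude zero start for the first innovation (illustrative Python; not the exact routine used by R, which handles the likelihood of the initial observations more carefully): the innovations ε_t are computed recursively from the data, and the Gaussian log-likelihood follows from their sum of squares.

```python
import math

def arma11_loglik(y, phi, theta, sigma2):
    """Conditional Gaussian log-likelihood of an ARMA(1,1) model.
    Innovations: eps_t = y_t - phi*y_{t-1} - theta*eps_{t-1}."""
    eps_prev, y_prev = 0.0, y[0]
    sse, n = 0.0, 0
    for obs in y[1:]:
        eps = obs - phi * y_prev - theta * eps_prev
        sse += eps * eps
        eps_prev, y_prev = eps, obs
        n += 1
    return -0.5 * n * math.log(2 * math.pi * sigma2) - sse / (2 * sigma2)
```

An optimizer then searches over (φ, θ, σ²) to maximize this function.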
Auto ARIMA

fit <- auto.arima(livestock)
fcast <- forecast(fit)
plot(fcast)

[Figure: Forecasts from ARIMA(0,1,0) with drift. x-axis: Year (1960–2010); y-axis: millions of sheep]

Automatic time series forecasting ARIMA modelling 49
Auto ARIMA

fit <- auto.arima(h02)
fcast <- forecast(fit)
plot(fcast)

[Figure: Forecasts from ARIMA(3,1,3)(0,1,1)[12]. x-axis: Year (1995–2010); y-axis: Total scripts (millions)]

Automatic time series forecasting ARIMA modelling 51
How does auto.arima() work?

A non-seasonal ARIMA process

φ(B)(1 − B)^d yt = c + θ(B)εt

Need to select appropriate orders p, q, d, and whether to include c.

Hyndman & Khandakar (JSS, 2008) algorithm:
Select no. differences d via KPSS unit root test.
Select p, q, c by minimising AICc.
Use stepwise search to traverse model space, starting with a simple model and considering nearby variants.

Algorithm choices driven by forecast accuracy.

Automatic time series forecasting ARIMA modelling 52
How does auto.arima() work?

A seasonal ARIMA process

Φ(B^m)φ(B)(1 − B)^d (1 − B^m)^D yt = c + Θ(B^m)θ(B)εt

Need to select appropriate orders p, q, d, P, Q, D, and whether to include c.

Hyndman & Khandakar (JSS, 2008) algorithm:
Select no. differences d via KPSS unit root test.
Select D using OCSB unit root test.
Select p, q, P, Q, c by minimising AICc.
Use stepwise search to traverse model space, starting with a simple model and considering nearby variants.

Automatic time series forecasting ARIMA modelling 53
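The stepwise traversal can be sketched generically (illustrative Python; here `score` is a stand-in for fitting an ARIMA(p,d,q) and returning its AICc, and the real algorithm also varies the constant and applies admissibility checks):

```python
def stepwise_search(score, start=(2, 2), max_order=5):
    """Greedy stepwise search over (p, q) orders, minimising `score`.
    Start from a simple model; repeatedly move to a better neighbour."""
    best, best_score = start, score(start)
    improved = True
    while improved:
        improved = False
        p, q = best
        for dp in (-1, 0, 1):
            for dq in (-1, 0, 1):
                cand = (p + dp, q + dq)
                if cand == best:
                    continue
                if not (0 <= cand[0] <= max_order and 0 <= cand[1] <= max_order):
                    continue
                s = score(cand)
                if s < best_score:
                    best, best_score, improved = cand, s, True
    return best, best_score
```

The search visits far fewer models than an exhaustive grid, which is what makes the algorithm fast enough for thousands of series.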
M3 comparisons
Method MAPE sMAPE MASE
Theta 17.42 12.76 1.39
ForecastPro 18.00 13.06 1.47
B-J automatic 19.13 13.72 1.54
ETS 18.06 13.38 1.52
AutoARIMA 19.04 13.86 1.47
ETS-ARIMA 17.92 13.02 1.44
Automatic time series forecasting ARIMA modelling 54
M3 conclusions

MYTHS
Simple methods do better.
Exponential smoothing is better than ARIMA.

FACTS
The best methods are hybrid approaches.
ETS-ARIMA (the simple average of ETS-additive and AutoARIMA) is the only fully documented method that is comparable to the M3 competition winners.

Automatic time series forecasting ARIMA modelling 55
Outline
1 Motivation
2 Forecasting competitions
3 Forecasting the PBS
4 Time series forecasting
5 Evaluating forecast accuracy
6 Exponential smoothing
7 ARIMA modelling
8 Automatic nonlinear forecasting?
9 Time series with complex seasonality
10 Forecasts about automatic forecasting
Automatic time series forecasting Automatic nonlinear forecasting? 56
Automatic nonlinear forecasting

Automatic ANN in M3 competition did poorly.

Linear methods did best in the NN3 competition!

Very few machine learning methods get published in the IJF because authors cannot demonstrate their methods give better forecasts than linear benchmark methods, even on supposedly nonlinear data.

Some good recent work by Kourentzes and Crone (Neurocomputing, 2010) on automated ANN for time series.
Watch this space!

Automatic time series forecasting Automatic nonlinear forecasting? 57
Outline
1 Motivation
2 Forecasting competitions
3 Forecasting the PBS
4 Time series forecasting
5 Evaluating forecast accuracy
6 Exponential smoothing
7 ARIMA modelling
8 Automatic nonlinear forecasting?
9 Time series with complex seasonality
10 Forecasts about automatic forecasting
Automatic time series forecasting Time series with complex seasonality 58
Examples

[Figure: US finished motor gasoline products. x-axis: Weeks (1992–2004); y-axis: Thousands of barrels per day]

[Figure: Number of calls to large American bank (7am–9pm). x-axis: 5 minute intervals (3 March to 12 May); y-axis: Number of call arrivals]

[Figure: Turkish electricity demand. x-axis: Days (2000–2008); y-axis: Electricity demand (GW)]

Automatic time series forecasting Time series with complex seasonality 59
TBATS model

TBATS
Trigonometric terms for seasonality
Box-Cox transformations for heterogeneity
ARMA errors for short-term dynamics
Trend (possibly damped)
Seasonal (including multiple and non-integer periods)

Automatic algorithm described in AM De Livera, RJ Hyndman, and RD Snyder (2011). "Forecasting time series with complex seasonal patterns using exponential smoothing". Journal of the American Statistical Association 106(496), 1513–1527.

Automatic time series forecasting Time series with complex seasonality 60
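The trigonometric seasonal terms are ordinary Fourier regressors, which is what lets TBATS handle non-integer periods such as 52.18 weeks. A minimal sketch of generating them (illustrative Python, not the tbats() internals):

```python
import math

def fourier_terms(t, period, K):
    """K sin/cos pairs at harmonics of the seasonal frequency;
    `period` need not be an integer (e.g. 52.18 weeks)."""
    return [f(2 * math.pi * k * t / period)
            for k in range(1, K + 1)
            for f in (math.sin, math.cos)]
```

Each seasonal pattern is then represented by a small number K of harmonics rather than one state per season, which keeps the model compact even for long periods.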
Examples

fit <- tbats(gasoline)
fcast <- forecast(fit)
plot(fcast)

[Figure: Forecasts from TBATS(0.999, {2,2}, 1, {<52.1785714285714,8>}). x-axis: Weeks (1995–2005); y-axis: Thousands of barrels per day]

Automatic time series forecasting Time series with complex seasonality 61
Examples

fit <- tbats(callcentre)
fcast <- forecast(fit)
plot(fcast)

[Figure: Forecasts from TBATS(1, {3,1}, 0.987, {<169,5>, <845,3>}). x-axis: 5 minute intervals (3 March to 9 June); y-axis: Number of call arrivals]

Automatic time series forecasting Time series with complex seasonality 62
Examples

fit <- tbats(turk)
fcast <- forecast(fit)
plot(fcast)

[Figure: Forecasts from TBATS(0, {5,3}, 0.997, {<7,3>, <354.37,12>, <365.25,4>}). x-axis: Days (2000–2010); y-axis: Electricity demand (GW)]

Automatic time series forecasting Time series with complex seasonality 63
Outline
1 Motivation
2 Forecasting competitions
3 Forecasting the PBS
4 Time series forecasting
5 Evaluating forecast accuracy
6 Exponential smoothing
7 ARIMA modelling
8 Automatic nonlinear forecasting?
9 Time series with complex seasonality
10 Forecasts about automatic forecasting
Automatic time series forecasting Forecasts about automatic forecasting 64
Forecasts about forecasting

1 Automatic algorithms will become more general — handling a wide variety of time series.

2 Model selection methods will take account of multi-step forecast accuracy as well as one-step forecast accuracy.

3 Automatic forecasting algorithms for multivariate time series will be developed.

4 Automatic forecasting algorithms that include covariate information will be developed.

Automatic time series forecasting Forecasts about automatic forecasting 65
For further information
robjhyndman.com
Slides and references for this talk.
Links to all papers and books.
Links to R packages.
A blog about forecasting research.
Automatic time series forecasting Forecasts about automatic forecasting 66