Decision 411: Class 4
• Non-seasonal averaging & smoothing models
  – Simple moving average (SMA) model
  – Simple exponential smoothing (SES) model
  – Linear exponential smoothing (LES) model
• Combining seasonal adjustment with non-seasonal smoothing
• Winters’ seasonal smoothing model
Guidelines for future HW writeups
• Presentation should stand on its own (SG files are mainly just for audit trail)
• What’s the bottom line? (forecast, trend, key drivers?)
• Clearly define the variables (units, dates, transformations, etc.) used in the analysis
• Use bullet points for key observations & findings
• Use tables to present key numbers (forecasts & CI’s)
• Embed the most important chart(s), with annotations
• Show where the numbers came from
• Explain your model’s assumptions in layman’s terms
Averaging & smoothing models
Today’s topics
Later: ARIMA models
We’ll meet ARIMA later in the course, but briefly, an “ARIMA(p,d,q)” model is like a regression model in which the dependent variable is a d-order difference of the input variable, and the independent variables are p lagged values of the dependent variable (AR terms) and/or q lagged values of the forecast errors (MA terms), plus an optional constant term. Many of the averaging & smoothing models are special cases, e.g., an ARIMA(0,1,1) model is an SES model.
• p = # AR terms (lags of dependent variable)
• d = order of differencing of input variable
• q = # MA terms (lags of errors)
Averaging & smoothing models
• The problem: sometimes nonseasonal (or seasonally adjusted) data appears to be “locally stationary,” with a time-varying mean
• The mean (constant) model doesn’t track changes in the mean and has positively autocorrelated errors
• The random walk model may not perform well either in this situation: it “oversteers,” picking up too much “noise” in the data, and yields negatively autocorrelated errors
[Figure: Time series plot of X with constant mean = 463.136, and residual autocorrelations for the constant-mean model (lags 0–25)]
Example: series “X”
• The mean (constant) model yields positively autocorrelated errors... it doesn’t react to changes in the local mean... RMSE = 121
• Strong positive autocorrelation at lag 1; no reaction to local changes in the data
[Figure: Time series plot of X with random walk forecasts, and residual autocorrelations for the random walk model (lags 0–25)]
Example, continued
• The random walk model for series X yields negatively autocorrelated errors... it overreacts to changes... RMSE = 122... not any better!
• Strong negative autocorrelation at lag 1; over-reaction to local changes in the data (always 1 period too late)
A solution:
• Use a model that averages or “smooths” the recent data to filter out some of the noise and estimate the local mean, such as the Simple Moving Average (SMA) model:
Ŷt = (Yt−1 + Yt−2 + ... + Yt−m) / m
…i.e., just average the last m observed values.
m=3 ⇒ avg. age = 2
m=5 ⇒ avg. age = 3
m=9 ⇒ avg. age = 5, etc.
…hence it lags behind turning points by (m+1)/2 periods
Properties of SMA model
• Average age of the data in the forecast is (m+1)/2, which is midway between 1 period old and m periods old:

Ŷt = (Yt−1 + Yt−2 + ... + Yt−m) / m
Properties of SMA, continued
• Long-term forecasts = horizontal straight line (= simple average of the last m values)
• Confidence limits??? No theory!!
• Works well on highly irregular data: no data point receives more weight than others, so it’s relatively robust against “outliers”
• Can also be “tapered” for even greater robustness
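As a quick illustration (not from the original slides), the SMA forecast is just the mean of the last m observations; the series and m below are made up:

```python
def sma_forecast(y, m):
    """Simple moving average forecast: the average of the last m values.
    The same value applies to all future horizons (a horizontal line)."""
    if len(y) < m:
        raise ValueError("need at least m observations")
    return sum(y[-m:]) / m

# Hypothetical example series
y = [1, 2, 3, 4, 5]
print(sma_forecast(y, m=3))  # average of 3, 4, 5 -> 4.0
```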
[Figure: Time series plot of X with 3-term simple moving average forecasts, and residual autocorrelations (lags 0–25)]
Example, continued
• SMA with m=3 (average age = 2) yields RMSE = 104 (significantly better!) and less negative autocorrelation; no autocorrelation at lag 1
• 50% confidence limits are shown here, but don’t trust them: they are based on the assumption of the mean remaining fixed at the latest value
• Forecasts lag behind the turning point by about 2 periods
[Figure: Time series plot of X with 5-term simple moving average forecasts, and residual autocorrelations (lags 0–25)]
Example, continued
• SMA with m=5 (average age = 3) yields RMSE = 102 (very slightly better), “smoother” forecasts, and slight positive autocorrelation in the errors at lag 1 (50% confidence limits shown)
• Forecasts lag behind the turning point by about 3 periods
[Figure: Time series plot of X with 9-term simple moving average forecasts, and residual autocorrelations (lags 0–25)]
Example, continued
• SMA with m=9 (average age = 5) yields RMSE = 104 (slightly worse) and more positive autocorrelation in the errors at lag 1
• Forecasts lag behind the turning point by about 5 periods
[Figure: Time series plot of X with 19-term simple moving average forecasts, and residual autocorrelations (lags 0–25)]
Example, continued
SMA with m=19 (average age=10) yields RMSE=118 (significantly worse), very smooth forecasts, much more positive autocorrelation
Forecasts lag behind turning point by about 10 periods
Strong positive autocorrelation at lag 1
Smoothness vs. responsiveness
• Note that the more we smooth the data, the more clearly we see the “signal” stand out.
• But...greater clarity comes at the expense of getting the news later.
• If we want our forecasting model to respond quickly to changes, it will also pick up “false alarms” due to noise in the data.
Conclusions
• For a time series with a randomly varying local mean, the SMA model may outperform both the mean model and the random walk model
• It allows us to “strike a balance” between averaging over too much and too little past data.
• However...
Shortcomings of SMA model
• It’s hard to optimize the number of terms (m), because it is a discrete parameter... you must use trial and error.
• Intuitively, you should not equally weight the last m observations when computing the average... it would be better to “discount” the older data in a gradual fashion.
• These observations motivate....
Brown’s Simple Exponential Smoothing
• Let: α = “smoothing constant”, St = smoothed series at period t
• Recursive smoothing formula:

St = αYt + (1−α)St−1

• Forecast for next period = current smoothed value:

Ŷt+1 = St
Mathematically equivalent formulas for SES forecasts
• Forecast = interpolation between previous forecast and previous observation:

Ŷt+1 = αYt + (1−α)Ŷt

• Forecast = previous forecast plus fraction α of previous error, where et = Yt − Ŷt:

Ŷt+1 = Ŷt + αet

• Forecast = previous observation minus fraction 1−α of previous error:

Ŷt+1 = Yt − (1−α)et
Mathematically equivalent formulas for SES forecasts, continued
Last but not least, the forecast is an exponentially weighted moving average of all past observations:

Ŷt+1 = α[Yt + (1−α)Yt−1 + (1−α)²Yt−2 + (1−α)³Yt−3 + ...]

...or in other words, a discounted moving average with a discount factor of 1−α per period.
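These equivalent forms are easy to check numerically. A minimal sketch in Python (the series and α are invented for illustration; the startup Ŷ1 = Y1 is a common convention):

```python
def ses_forecasts(y, alpha):
    """One-step-ahead SES forecasts via the interpolation form:
    Y-hat(t+1) = alpha*Y(t) + (1-alpha)*Y-hat(t), started with Y-hat(1) = Y(1)."""
    f = [y[0]]
    for t in range(len(y)):
        f.append(alpha * y[t] + (1 - alpha) * f[t])
    return f  # f[-1] is the out-of-sample forecast for the next period

def ses_forecasts_error_form(y, alpha):
    """The same forecasts via the error-correction form:
    Y-hat(t+1) = Y-hat(t) + alpha*e(t), where e(t) = Y(t) - Y-hat(t)."""
    f = [y[0]]
    for t in range(len(y)):
        e = y[t] - f[t]  # previous one-step-ahead error
        f.append(f[t] + alpha * e)
    return f

y = [10, 12, 11]  # made-up series
a = ses_forecasts(y, 0.3)
b = ses_forecasts_error_form(y, 0.3)
```

Both functions trace out identical forecast sequences, as the algebra promises.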
Properties of SES model
• SES uses a smoothing parameter (α) which is continuously variable, so it is easily optimized by least squares
• If α = 1, SES → random walk model
• If α = 0, SES → constant model
• Average age of data in the SES forecast is 1/α
  Examples: α = 0.5 ⇒ avg. age = 2
            α = 0.2 ⇒ avg. age = 5
            α = 0.1 ⇒ avg. age = 10, etc.
Properties of SES, continued
• For a given average age, SES is somewhat superior to SMA because it places relatively more weight on the most recent observation
• Hence it is slightly more "responsive" to changes occurring in the recent past
• Caveat: it is also more sensitive to recent “outliers” than the SMA model--not so good for messy data
SMA (m=9) vs. SES (α=0.2)

[Figure: Bar chart of forecast weights (0 to 0.25) on lags 1–14 for the two models]

• SMA weights are 1/9 on the first 9 lags of Y, zero afterward
• SES weights are larger than SMA weights at the first few lags, then gradually decline to zero
• Average age = 5 for both models
• Average age is the center of mass (“balancing point”) of the weight distribution
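The “center of mass” claim is easy to verify numerically. This sketch (not from the slides) computes the average age of the weight distributions for SMA with m = 9 and SES with α = 0.2:

```python
def average_age(weights):
    """Center of mass of a forecast-weight distribution, where weights[i]
    is the weight on the observation that is (i+1) periods old."""
    return sum((i + 1) * w for i, w in enumerate(weights)) / sum(weights)

m, alpha = 9, 0.2
sma_w = [1.0 / m] * m                                    # uniform weights on lags 1..m
ses_w = [alpha * (1 - alpha) ** i for i in range(1000)]  # geometric weights (truncated)

print(average_age(sma_w))  # (m+1)/2 = 5.0
print(average_age(ses_w))  # 1/alpha = 5.0 (to within truncation error)
```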
Properties of SES, continued
• Long-term forecasts from the basic SES model are a horizontal straight line (no trend, as in random walk and SMA)
• SES = ARIMA(0,1,1), i.e., a random walk model (without drift) plus an MA(1) term, which adds a multiple of the lag-1 forecast error:

Ŷt+1 = Yt − (1−α)et

(Yt is the random walk part; et is the lag-1 error)
Properties of SES, continued
• The exact k-step-ahead forecast standard error can be computed using ARIMA theory:

SEfcst(k) = SEfcst(1)·√(1 + (k−1)α²)

• Note that it increases with k more slowly than for the random walk model, which is the special case α = 1:

SEfcst(k) = SEfcst(1)·√k

• Hence the SES model assumes the series is “more predictable” than a random walk
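A quick numeric check of these two formulas (the one-step standard error, α, and horizons below are arbitrary made-up numbers):

```python
import math

def se_ses(k, se1, alpha):
    """k-step-ahead forecast standard error for SES, from ARIMA(0,1,1) theory."""
    return se1 * math.sqrt(1 + (k - 1) * alpha ** 2)

def se_rw(k, se1):
    """k-step-ahead forecast standard error for the random walk."""
    return se1 * math.sqrt(k)

# With alpha = 1, SES reduces to the random walk and the formulas agree:
for k in range(1, 6):
    assert abs(se_ses(k, 1.0, 1.0) - se_rw(k, 1.0)) < 1e-12

# With alpha < 1, SES errors grow more slowly:
print(se_ses(9, 1.0, 0.3))  # sqrt(1 + 8*0.09) = sqrt(1.72)
print(se_rw(9, 1.0))        # sqrt(9) = 3.0
```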
[Figure: Time series plot of X with SES forecasts (alpha = 0.2961), and residual autocorrelations (lags 0–25)]

Example, continued
• SES with optimal α = 0.3 (average age = 3.3) yields RMSE = 99 (best yet, by a small margin) and no significant residual autocorrelations (50% confidence limits shown)
• Don’t worry about an isolated spike at an oddball lag like lag 9--probably just due to a pair of large errors separated by 9 periods
SES with constant trend
• A constant linear trend can be added to an SES model by fitting it as an ARIMA(0,1,1) model with constant
• Alas, the ARIMA implementation of SES models can’t be combined with seasonal adjustment in the Forecasting procedure in Statgraphics (although you could seasonally adjust and then fit an ARIMA model in two steps)
SES with constant trend, continued
• A constant exponential trend can be added to SES by using the inflation adjustment option in Statgraphics
• The average percentage growth per period can be estimated from the slope coefficient of a linear trend model or ARIMA(0,1,0)+c model fitted with a natural log transformation
• See video clip #10 for examples
What if the series has a time-varying trend, as well as a time-varying mean?
• Evidently what is needed is an estimate of the local trend as well as the local mean
• This is the motivating idea behind Brown’s Linear Exponential Smoothing (LES) model
• It’s also sometimes called “double exponential smoothing,” because it involves a double application of exponential smoothing
How LES works
• Apply SES once to get a singly-smoothed series S′t that lags behind the current value by 1/α − 1 periods.*
• Smooth the smoothed series (using the same α) to get an even smoother series S″t that lags behind by 2(1/α − 1) periods
• To forecast the future, extrapolate a line between the two points (t − (1/α − 1), S′t) and (t − 2(1/α − 1), S″t)

*Average age relative to the next value is 1/α, so the age relative to the current value is 1/α − 1
[Figure: Plot of X, S′, and S″ over periods 0–100, illustrating LES forecasts from t = 90 with α = 0.1*]

1. Draw a horizontal line extending 9 periods back in time from the current value of the singly-smoothed series
2. Draw a horizontal line extending 18 periods back in time from the current value of the doubly-smoothed series
3. Extrapolate a line into the future through the left endpoints of the two horizontal lines

*1/α = 10, so 1/α − 1 = 9
How LES works
• There are two equivalent sets of mathematical formulas for implementing the logic of the LES model
• One set of formulas (I) explicitly computes the current estimates of level and trend in each period
• The other set of formulas (II) merely computes the next forecast from the observed data and forecast errors in the last two periods
LES formulas: I
1. Compute the singly smoothed series at period t:
   S′t = αYt + (1−α)S′t−1
2. Compute the doubly smoothed series:
   S″t = αS′t + (1−α)S″t−1
3. Compute the estimated level at period t:
   Lt = 2S′t − S″t
4. Compute the estimated trend at period t:
   Tt = (α/(1−α))(S′t − S″t)
5. Finally, the k-step-ahead forecast is given by:
   Ŷt+k = Lt + kTt

• Very important start-up values: S′1 = S″1 = Y1
  (If you don’t use these start-up values, the early forecasts will gyrate wildly!)
LES formulas: II
• Mathematically equivalent formula (requires fewer columns on a spreadsheet):

Ŷt+1 = 2Yt − Yt−1 − 2(1−α)et + (1−α)²et−1

• Startup: Ŷ1 = Ŷ2 = Y1, hence e1 = 0 and e2 = Y2 − Y1
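The two sets of formulas can be cross-checked numerically. A minimal sketch (the series and α are invented), using the start-up values given on the slides:

```python
def les_level_trend(y, alpha):
    """Formulas I: return (level, trend) at the last period via double
    smoothing, with startup S'_1 = S''_1 = Y_1."""
    s1 = s2 = y[0]
    for v in y[1:]:
        s1 = alpha * v + (1 - alpha) * s1    # singly smoothed
        s2 = alpha * s1 + (1 - alpha) * s2   # doubly smoothed
    return 2 * s1 - s2, (alpha / (1 - alpha)) * (s1 - s2)

def les_one_step_forecasts(y, alpha):
    """Formulas II: one-step forecasts from the single-equation form,
    with startup Y-hat_1 = Y-hat_2 = Y_1 (so e_1 = 0, e_2 = Y_2 - Y_1)."""
    f = [y[0], y[0]]
    e = [0.0, y[1] - y[0]]
    for t in range(2, len(y) + 1):           # 0-based: forecast for period t
        fc = (2 * y[t - 1] - y[t - 2]
              - 2 * (1 - alpha) * e[t - 1]
              + (1 - alpha) ** 2 * e[t - 2])
        f.append(fc)
        if t < len(y):
            e.append(y[t] - fc)              # last fc is the out-of-sample forecast
    return f

y, alpha = [10, 12, 11, 13], 0.5
level, trend = les_level_trend(y, alpha)
f = les_one_step_forecasts(y, alpha)
print(level + trend, f[-1])  # the two routes give the same next-period forecast
```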
Example, continued
• The LES model is optimized at α = 0.16, yielding RMSE = 102 (about the same as SES)
• ...but the forecast plot shows a decreasing trend, due to the local downward trend at the end of the series; the confidence intervals also widen more rapidly, due to the assumption that the trend may be varying (50% confidence limits shown)
[Figure: Time series plot of X with Brown’s linear exp. smoothing forecasts (alpha = 0.1608), and residual autocorrelations (lags 0–25)]
LES vs. SES
• SES assumes only a time-varying level (i.e., a local mean), while LES assumes a time-varying level and trend.
• SES assumes that the series is more predictable than a random walk, while LES assumes it is less predictable.
• The LES model is relatively unstable, hence it may be dangerous to extrapolate the local trend very far.
• There are fancier versions of LES that include a “trend-dampening” factor.
LES vs. SES, continued
• In both SES and LES, the smaller the value of α, the more smoothing (i.e., less response to the most recent observation)
• Remember that the “average age” is 1/α in the SES model (the amount of lag behind turning points)
• In the LES model, the forecast is based on what was happening between 1/α and 2/α periods ago
• When fitted to the same series, LES usually has a smaller optimal α than SES
LES vs. SES, continued
• SES is the most widely used non-seasonal forecasting model.
• It has a sounder underlying theory than the SMA model, and it is computationally convenient to use on hundreds or thousands of parallel time series (e.g., for SKU-level forecasting).
• Its assumption of no trend is often unrealistic, but it is surprisingly robust in practice for short-term forecasts--often better than LES even for series that have trends.
• You can add an exponential trend via the inflation adjustment option.
• You can add a linear trend to an SES model by fitting it as an ARIMA(0,1,1) model with constant--but you can’t combine ARIMA with seasonal adjustment in the Forecasting procedure.
Estimation issues
• Optimization of α is performed by nonlinear least squares (like Excel’s nonlinear solver).
• Nonlinear estimation requires a “search” process whose solution is inexact and may depend on the starting value.
• In Statgraphics, you may notice that the optimal α varies slightly when the model is revisited, because it restarts the estimation from the previous optimum.
Estimation issues, continued
• α is constrained to lie between 0.0001 and 0.9999 for SES and LES models.
• If the best SES model is actually a random walk model (α = 1), then the estimation algorithm will converge to 0.9999. This will often happen if the series has a significant trend.
• Once α hits its upper bound (0.9999), the estimation may get “stuck” there. Try manually changing the initial value to (say) 0.5 before re-fitting the model if the data sample is changed.
Estimation issues, continued
• Because LES and SES use “recursive” formulas in which each forecast depends on prior errors, their estimation also depends on how they are initialized (i.e., on the “prior errors” that are assumed at the very beginning).
• The usual approach is to just assume that the first error is zero.
• A more sophisticated approach, available as an estimation option in Statgraphics, is to use “backforecasting”* to start up the model.

*We’ll discuss this in more detail later in the course.
Holt’s linear exponential smoothing
• Holt’s model improves on LES by introducing separate smoothing constants for level and trend (“alpha” and “beta”)
• In theory, this allows it to perform more stable trend estimation while adapting to sudden jumps in level
Holt’s model formulas
1. The updated level Lt is an interpolation between the most recent data point and the previous forecast of the level:

Lt = αYt + (1−α)(Lt−1 + Tt−1)

(Yt is the most recent data point; Lt−1 + Tt−1 is the forecast of Lt made at period t−1)
Holt’s model formulas
2. The updated trend Tt is an interpolation between the just-observed change in the estimated level and the previous estimate of the trend:

Tt = β(Lt − Lt−1) + (1−β)Tt−1
Holt’s model formulas
3. The k-step-ahead forecast from period t is an extrapolation of the level and trend from period t:

Ŷt+k = Lt + kTt
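The three equations fit in a few lines of code. A sketch with made-up data, α, and β; the naïve initialization (level = first data point, trend = 0) is the simple scheme mentioned later under estimation issues, not part of the formulas themselves:

```python
def holt(y, alpha, beta):
    """Holt's linear exponential smoothing: returns the final (level, trend).
    Naive initialization (an assumption): level = first point, trend = 0."""
    level, trend = y[0], 0.0
    for v in y[1:]:
        prev_level = level
        level = alpha * v + (1 - alpha) * (level + trend)         # level update
        trend = beta * (level - prev_level) + (1 - beta) * trend  # trend update
    return level, trend

def holt_forecast(level, trend, k):
    """k-step-ahead forecast: extrapolate the level and trend."""
    return level + k * trend

level, trend = holt([10, 12], alpha=0.5, beta=0.5)
print(holt_forecast(level, trend, k=2))  # 11 + 2*0.5 = 12.0
```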
Example, continued
• Holt’s model is optimized at α = 0.306, β = 0.007, yielding RMSE = 100 (essentially the same as SES & LES)
• ...but the forecast plot shows a slightly increasing local trend at the end of the series, due to relatively heavy smoothing of the trend! (50% confidence limits shown)
[Figure: Time series plot of X with Holt’s linear exp. smoothing forecasts (alpha = 0.3061, beta = 0.0069), and residual autocorrelations (lags 0–25)]
Model comparisons
Models B-C-D-E hardly differ on error measures. Model choice should also depend on “theoretical” considerations, such as the reasonableness of the trend assumptions.
A cautionary word about trend extrapolation
• If you are forecasting more than one period ahead, it is especially important to estimate the trend correctly
• In general, trend assumptions and estimation should be based on everything you know about a time series, not just error statistics of one-period-ahead forecasts or t-stats of slope coefficients
A cautionary word about trend extrapolation
• Extrapolation of time-varying trends estimated by “double smoothing” can be dangerous
• Hence SES (perhaps with fixed trend) often works better in practice
• A trend-dampening factor φ (0 < φ < 1) is often used in conjunction with LES or Holt’s:

Ŷt+k = Lt + (φ + φ² + ... + φ^k)Tt
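The geometric sum is what makes damped forecasts level off (at Lt + Tt·φ/(1−φ)) instead of growing without bound. A quick sketch with invented level, trend, and φ:

```python
def damped_forecast(level, trend, phi, k):
    """k-step-ahead damped-trend forecast: level + (phi + ... + phi^k)*trend."""
    damp = sum(phi ** j for j in range(1, k + 1))
    return level + damp * trend

L, T, phi = 10.0, 2.0, 0.5
print(damped_forecast(L, T, phi, 1))   # 10 + 0.5*2 = 11.0
print(damped_forecast(L, T, phi, 3))   # 10 + (0.5+0.25+0.125)*2 = 11.75
# As k grows, the forecasts approach L + T*phi/(1-phi) = 12.0:
print(damped_forecast(L, T, phi, 50))
```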
Combining seasonal adjustment with a non-seasonal smoothing model
• Often a seasonally adjusted series looks like a good candidate for fitting with a smoothing or averaging model.
• Hence, you can forecast a seasonal series by a combination of seasonal adjustment and non-seasonal smoothing (or other non-seasonal model).
• This “hybrid” approach allows you to model the seasonal pattern explicitly, but it does not have a solid underlying statistical theory--confidence limits may be dubious.
• There is also some danger of overfitting the seasonal pattern if you don’t have enough seasons of data.
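A bare-bones sketch of the hybrid idea, with everything invented for illustration. The seasonal indices here are computed crudely as each season's mean divided by the overall mean (a simplification of the ratio-to-moving-average method), the non-seasonal model is SES, and the flat SES forecast is then reseasonalized:

```python
def seasonal_indices(y, s):
    """Crude multiplicative seasonal indices: each season's mean divided by
    the overall mean (a simplification of ratio-to-moving-average)."""
    overall = sum(y) / len(y)
    return [(sum(y[i::s]) / len(y[i::s])) / overall for i in range(s)]

def hybrid_forecast(y, s, alpha):
    """Seasonally adjust, smooth with SES, then reseasonalize the
    forecast for the next period."""
    idx = seasonal_indices(y, s)
    adjusted = [v / idx[i % s] for i, v in enumerate(y)]
    f = adjusted[0]                        # SES startup: first adjusted value
    for v in adjusted:
        f = alpha * v + (1 - alpha) * f    # smoothed level = flat forecast
    return f * idx[len(y) % s]             # reseasonalize for the next period

# Perfectly seasonal made-up series: level 100, seasonal pattern (0.8, 1.2)
y = [80, 120, 80, 120]
print(hybrid_forecast(y, s=2, alpha=0.3))  # next period is season 0 -> ~80
```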
Example of LES + seasonal adjustment on a spreadsheet
The single-equation form of the LES model is easily implemented on a spreadsheet, and Solver can be used to find the value of α that minimizes RMSE.
LES out-of-sample forecasts
The LES model, like any other one-step-ahead forecasting model, can extrapolate its forecasts into the future by “bootstrapping” itself, i.e., by substituting the one-step-ahead forecast for the next data point and then forecasting the next period from there, and so on.
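For Brown’s LES, this bootstrapping reproduces the straight-line extrapolation Lt + kTt: once forecasts are substituted for data, the future “errors” are zero and the single-equation recursion traces out a line. A numeric sketch (data and α invented, startup values as on the earlier slides):

```python
def les_state(y, alpha):
    """Brown's LES level and trend at the last period via double smoothing,
    with startup S'_1 = S''_1 = Y_1."""
    s1 = s2 = y[0]
    for v in y[1:]:
        s1 = alpha * v + (1 - alpha) * s1
        s2 = alpha * s1 + (1 - alpha) * s2
    return 2 * s1 - s2, (alpha / (1 - alpha)) * (s1 - s2)  # (level, trend)

def les_bootstrap(y, alpha, horizon):
    """Extend the single-equation LES recursion past the data by substituting
    each forecast for the next data point (so future errors are zero)."""
    f = [y[0], y[0]]                       # startup: Y-hat_1 = Y-hat_2 = Y_1
    e = [0.0, y[1] - y[0]]
    yy = list(y)
    for t in range(2, len(y) + horizon):
        fc = (2 * yy[t - 1] - yy[t - 2]
              - 2 * (1 - alpha) * e[t - 1]
              + (1 - alpha) ** 2 * e[t - 2])
        f.append(fc)
        if t < len(y):
            e.append(yy[t] - fc)           # in-sample: real error
        else:
            yy.append(fc)                  # out-of-sample: substitute the forecast...
            e.append(0.0)                  # ...so the "error" is zero
    return f[len(y):]                      # the out-of-sample forecasts

y, alpha = [10, 12, 11, 13], 0.5
level, trend = les_state(y, alpha)
boot = les_bootstrap(y, alpha, horizon=3)
print(boot)  # matches [level + k*trend for k = 1, 2, 3]
```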
LES forecasts for seasonally adjusted data

[Figure: Seasonally adjusted series with LES forecasts, Dec-83 to Dec-94, values roughly 0–500]

• Note that LES lags behind turning points, like all smoothing models...
• ...but it tracks the data pretty well during stretches where the trend is consistent...
• ...and its out-of-sample forecasts extrapolate the most recent trend
Re-seasonalized LES forecasts

[Figure: Re-seasonalized LES forecasts plotted against the original series, Dec-83 to Jun-95, values roughly 0–600]
Example: housing starts
The series displays strong seasonality as well as cyclicality.

[Figure: Time series plot of HousesNSA (original data, not seasonally adjusted -- new residential construction since 1983), 1/83 to 1/03]

Note the last observation…
Seasonally adjusted data
After seasonal adjustment, variations in level and trend are clearer.

[Figure: Time series plot of SADJUSTED, 1/83 to 1/03]

• In seasonally adjusted terms, the last observation is abnormally large! How will different models react to it?
• (This abnormality was not so apparent on the unadjusted graph!)
Nonseasonal forecasting model fitted to adjusted data: RW+drift

[Figure: Time sequence plot of SADJUSTED with random walk with drift = 0.139171; actual, forecast, and 50% limits, 1/83 to 1/08]

Depending on the kind of long-term trend assumptions we feel are appropriate, we could fit the seasonally adjusted series with a non-seasonal model such as a random walk with drift... This model extrapolates the long-term trend from the most recent (higher) level.
Nonseasonal forecasting model fitted to adjusted data: SES

[Figure: Time sequence plot of SADJUSTED with simple exponential smoothing, alpha = 0.4682; actual, forecast, and 50% limits]

…or a simple exponential smoothing model... This model extrapolates a flat trend from an exponentially-weighted average of recent levels.
Nonseasonal forecasting model fitted to adjusted data: Brown’s LES

[Figure: Time sequence plot of SADJUSTED with Brown’s linear exp. smoothing, alpha = 0.2352; actual, forecast, and 50% limits]

…or Brown’s linear exponential smoothing model... This model tries to extrapolate the recent trend, which is jerked upward by the last observation.
Nonseasonal forecasting model fitted to adjusted data: Holt’s LES

[Figure: Time sequence plot of SADJUSTED with Holt’s linear exp. smoothing, alpha = 0.4765, beta = 0.015; actual, forecast, and 50% limits]

…or Holt’s linear exponential smoothing model... This model also tries to extrapolate the recent trend, but the trend estimate is more conservative due to the small “beta” (heavy smoothing).
Hybrid seasonal models in SG
• You can fit hybrid models in the Forecasting procedure in Statgraphics by selecting “multiplicative seasonal adjustment” in conjunction with a RW or SES or LES model type.
• The forecasts are automatically “reseasonalized” in the plots and model comparison statistics.
• Be on guard against overfitting: seasonal adjustment adds many parameters to the model, and estimation-period statistics may not be fully adjusted to correct for the additional parameters.
Hybrid seasonal models: RW + seasonal adjustment

[Figure: Time sequence plot of HousesNSA with random walk with drift = 0.142988 and multiplicative seasonal adjustment; actual, forecast, and 50% limits]

Here’s the result of fitting the RW-with-drift model with multiplicative seasonal adjustment. Note the sharply raised forecasts, driven by the unusual seasonally adjusted value of the last data point.
SES + seasonal adjustment

[Figure: Time sequence plot of HousesNSA with simple exponential smoothing, alpha = 0.4617, and multiplicative seasonal adjustment; actual, forecast, and 50% limits]

Here’s the result of fitting the SES model with multiplicative seasonal adjustment: more conservative (though still raised) forecasts, and tighter confidence limits.
Brown’s LES + seasonal adjustment

[Figure: Time sequence plot of HousesNSA with Brown’s linear exp. smoothing, alpha = 0.2365, and multiplicative seasonal adjustment; actual, forecast, and 50% limits]

Here’s the result of fitting the LES model with multiplicative seasonal adjustment: the forecasts march steeply upward, and the confidence limits are rather wide.
Holt’s LES + seasonal adjustment

[Figure: Time sequence plot of HousesNSA with Holt’s linear exp. smoothing, alpha = 0.4667, beta = 0.0144, and multiplicative seasonal adjustment; actual, forecast, and 50% limits]

Here’s the result of fitting Holt’s model with multiplicative seasonal adjustment: the forecasts start from a higher level but with a flatter trend than LES, but the confidence limits are rather optimistic.
Linear trend + seasonal adjustment (?)

[Figure: Time sequence plot of HousesNSA with linear trend = 76.7875 + 0.0262053t and multiplicative seasonal adjustment; actual, forecast, and 50% limits]

Just for fun, here’s a linear trend model with multiplicative seasonal adjustment. Obviously not appropriate!
The model comparison report shows that SES and Holt’s do the best in the estimation period, although the RW model is slightly “luckier” in the validation period (the last 4 years of data were held out).
[Figure: Residual plot and residual autocorrelations for adjusted HousesNSA, simple exponential smoothing with alpha = 0.4594]

The residual plots for the SES model show stable variance and no significant autocorrelation... the model appears “OK”.
Even the (vertical) probability plot looks good.* (This is a “pane option” behind the “residual plots”.)

[Figure: Normal probability plot of residuals for adjusted HousesNSA, simple exponential smoothing with alpha = 0.4594]

*This result validates the use of normal distribution theory to compute the confidence intervals from the forecast standard errors.
What’s the best forecast?
• The main issue here is what to infer from the recent jump in seasonally adjusted housing starts.
• Our modeling results do not really answer this question for us--they merely show the consequences of different assumptions we may wish to make.
• Ideally, “domain knowledge” should shed additional light on the appropriateness of the assumptions.
• The SES model is clearly the most “conservative” choice, because its forecasts are less radically affected by one recent observation.
Winters’ Seasonal Smoothing
• The logic of Holt’s model can be extended to recursively estimate time-varying seasonal indices as well as level and trend.
• Let Lt, Tt, and St denote the estimated level, trend, and seasonal index at period t.
• Let s denote the number of periods in a season.
• Let α, β, and γ denote separate smoothing constants* for level, trend, and seasonality.

*numbers between 0 and 1: smaller values → more smoothing
Winters’ model formulas
1. The updated level Lt is an interpolation between the seasonally adjusted value of the most recent data point and the previous forecast of the level:

Lt = α(Yt / St−s) + (1−α)(Lt−1 + Tt−1)

(Yt / St−s is the seasonally adjusted value of Yt; Lt−1 + Tt−1 is the forecast of Lt made at period t−1)
Winters’ model formulas
2. The updated trend Tt is an interpolation between the just-observed change in the estimated level and the previous estimate of the trend:

Tt = β(Lt − Lt−1) + (1−β)Tt−1
Winters’ model formulas
3. The updated seasonal index St is an interpolation between the ratio of the data point to the estimated level (the “ratio to moving average” of the current data point) and the last estimate of the seasonal index in the same season:

St = γ(Yt / Lt) + (1−γ)St−s
Winters’ model formulas
4. The k-step-ahead forecast from period t is the extrapolation of the level and trend, multiplied by the most recent estimate of the seasonal index for the kth period in the future:

Ŷt+k = (Lt + kTt)St−s+k
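The four updates can be sketched in a few lines. The data, s, and smoothing constants below are invented, and the initialization is the naïve scheme mentioned under estimation issues (level = first point, trend = 0, all seasonal indices = 1.0):

```python
def winters(y, s, alpha, beta, gamma):
    """One pass of Winters' multiplicative smoothing over series y
    (y[0] is period 1 and initializes the level).
    Naive initialization (an assumption): trend = 0, seasonal indices = 1.0.
    Returns (level, trend, dict of seasonal indices by period)."""
    level, trend = y[0], 0.0
    seas = {t: 1.0 for t in range(2 - s, 2)}   # naive indices for periods <= 1
    for t in range(2, len(y) + 1):             # periods 2..n; y[t-1] is Y_t
        new_level = alpha * (y[t - 1] / seas[t - s]) + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        seas[t] = gamma * (y[t - 1] / new_level) + (1 - gamma) * seas[t - s]
        level = new_level
    return level, trend, seas

def winters_forecast(level, trend, seas, n, s, k):
    """k-step-ahead forecast from period n: (L + k*T) * S(n-s+k)."""
    return (level + k * trend) * seas[n - s + k]

# Made-up example: s = 2 periods per season, alpha = beta = gamma = 0.5
y = [10, 12, 11]
level, trend, seas = winters(y, s=2, alpha=0.5, beta=0.5, gamma=0.5)
print(winters_forecast(level, trend, seas, n=len(y), s=2, k=1))
```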
Estimation issues
• Estimation of Winters’ model is tricky, and not all software does it well: sometimes you get crazy results.
• There are three separate smoothing constants to be jointly estimated by nonlinear least squares (α, β, γ).
• Initialization is also tricky, especially for the seasonal indices.
Estimation issues
• Some common initialization schemes:
  – Naïve approach: set initial level = 1st data point, trend = 0, seasonal indices = 1.0
  – More sophisticated: perform a seasonal decomposition to obtain initial seasonal indices & fit a trend line to obtain the initial trend
  – Even more sophisticated: use backforecasting
• Calculation of confidence intervals is also complicated & not always done correctly.
Winters’ model fitted to housing starts

[Figure: Time sequence plot of HousesNSA with Winters’ exp. smoothing, alpha = 0.4454, beta = 0.0146, gamma = 0.2843; actual, forecast, and 50% limits]

In this case, the Winters’ forecasts & confidence intervals look similar to those of Holt’s model with seasonal adjustment (alpha and beta are very similar, as should be expected).
The model comparison report shows that Winters’ fits a little less well than SES or Holt’s model, but is otherwise “OK”.
Winters’ model in practice
• The Winters’ model is popular in “automatic forecasting” software, because it has a little of everything (level, trend, seasonality).
• Sometimes it works well, but difficulties in initialization & estimation can lead to strange results in other cases.
• In principle it is similar to linear exponential smoothing and can produce similarly unstable long-term trend projections.
What really happened in the last 5 years?

[Figure: Plot of RW+drift, SES, LES, Holt’s, and Winters’ forecasts vs. actual HousesNSA, 2002–2007]

All models overpredicted housing starts for the rest of 2002 and 2003, over-responding to the Feb. ’02 jump, but later values were in the middle range of the predictions until the recent plunge.
Class 4 recap
• Averaging and smoothing models enable you to estimate time-varying levels and trends.
• SMA, SES, and LES models can be combined with seasonal adjustment to forecast seasonal data (...but beware of changing seasonal patterns and the possibility of overfitting).
• Winters’ model estimates time-varying seasonal indices.
• You need to exercise judgment in model selection in order to make appropriate assumptions about changing levels and trends & unusual events.