Rob J Hyndman
State space models
2: Structural models
Outline
1 Simple structural models
2 Linear Gaussian state space models
3 Kalman filter
4 Kalman smoothing
5 Time varying parameter models
State space models

[Figure: graphical models linking the state vectors x_{t-1}, x_t, x_{t+1}, ... to the observations y_t, y_{t+1}, ... In the ETS graph each observation y_t is linked to the previous state x_{t-1}; in the structural-model graph each y_t is linked to the current state x_t.]

ETS state vector: x_t = (ℓ_t, b_t, s_t, s_{t-1}, ..., s_{t-m+1})

ETS models
- y_t depends on x_{t-1}.
- The same error process affects x_t | x_{t-1} and y_t | x_{t-1}.

Structural models
- y_t depends on x_t.
- A different error process affects x_t | x_{t-1} and y_t | x_t.
Local level model

Stochastically varying level (random walk) observed with noise:

y_t = ℓ_t + ε_t
ℓ_t = ℓ_{t-1} + ξ_t

- ε_t and ξ_t are independent Gaussian white noise processes.
- Compare ETS(A,N,N), where ξ_t = α ε_{t-1}.
- Parameters to estimate: σ²_ε and σ²_ξ.
- If σ²_ξ = 0, then y_t ~ NID(ℓ_0, σ²_ε).
Local linear trend model

Dynamic trend observed with noise:

y_t = ℓ_t + ε_t
ℓ_t = ℓ_{t-1} + b_{t-1} + ξ_t
b_t = b_{t-1} + ζ_t

- ε_t, ξ_t and ζ_t are independent Gaussian white noise processes.
- Compare ETS(A,A,N), where ξ_t = (α + β) ε_{t-1} and ζ_t = β ε_{t-1}.
- Parameters to estimate: σ²_ε, σ²_ξ and σ²_ζ.
- If σ²_ζ = σ²_ξ = 0, then y_t = ℓ_0 + t b_0 + ε_t.
- The model is a time-varying linear regression.
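The reduction to a deterministic trend can be made concrete with a small simulation sketch (pure Python; the function name `simulate_llt` and its arguments are illustrative, not from the slides). With all three noise variances set to zero, the simulated series collapses to the line ℓ_0 + t b_0.

```python
import random

def simulate_llt(n, l0, b0, sd_eps, sd_xi, sd_zeta, seed=1):
    """Simulate the local linear trend model:
    y_t = l_t + eps_t, l_t = l_{t-1} + b_{t-1} + xi_t, b_t = b_{t-1} + zeta_t."""
    rng = random.Random(seed)
    level, slope = l0, b0
    ys = []
    for _ in range(n):
        level = level + slope + rng.gauss(0, sd_xi)  # l_t = l_{t-1} + b_{t-1} + xi_t
        slope = slope + rng.gauss(0, sd_zeta)        # b_t = b_{t-1} + zeta_t
        ys.append(level + rng.gauss(0, sd_eps))      # y_t = l_t + eps_t
    return ys

# With sigma_xi = sigma_zeta = 0 the states are deterministic, so
# y_t = l_0 + t*b_0 + eps_t; with sd_eps = 0 as well, exactly the line.
y = simulate_llt(5, l0=10.0, b0=2.0, sd_eps=0.0, sd_xi=0.0, sd_zeta=0.0)
```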
Basic structural model

y_t = ℓ_t + s_{1,t} + ε_t
ℓ_t = ℓ_{t-1} + b_{t-1} + ξ_t
b_t = b_{t-1} + ζ_t
s_{1,t} = -∑_{j=1}^{m-1} s_{j,t-1} + η_t
s_{j,t} = s_{j-1,t-1},  j = 2, ..., m-1

- ε_t, ξ_t, ζ_t and η_t are independent Gaussian white noise processes.
- Compare ETS(A,A,A).
- Parameters to estimate: σ²_ε, σ²_ξ, σ²_ζ and σ²_η.
- Deterministic seasonality if σ²_η = 0.
Trigonometric models

y_t = ℓ_t + ∑_{j=1}^{J} s_{j,t} + ε_t
ℓ_t = ℓ_{t-1} + b_{t-1} + ξ_t
b_t = b_{t-1} + ζ_t
s_{j,t} = cos(λ_j) s_{j,t-1} + sin(λ_j) s*_{j,t-1} + ω_{j,t}
s*_{j,t} = -sin(λ_j) s_{j,t-1} + cos(λ_j) s*_{j,t-1} + ω*_{j,t}

where λ_j = 2πj/m.

- ε_t, ξ_t, ζ_t, ω_{j,t} and ω*_{j,t} are independent Gaussian white noise processes.
- ω_{j,t} and ω*_{j,t} have the same variance σ²_{ω,j}.
- Equivalent to the BSM when σ²_{ω,j} = σ²_ω and J = m/2.
- Choose J < m/2 for fewer degrees of freedom.
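The seasonal recursion rotates the pair (s_{j,t}, s*_{j,t}) by λ_j at each step. A minimal sketch (pure Python; `harmonic_step` is an illustrative name): with the noise switched off, starting from (1, 0) the recursion traces the deterministic cycle s_t = cos(λ_j t) and returns to its start after m steps.

```python
import math

def harmonic_step(s, s_star, lam, w=0.0, w_star=0.0):
    """One update of the j-th stochastic cycle:
    s_t = cos(lam)*s + sin(lam)*s* + w,  s*_t = -sin(lam)*s + cos(lam)*s* + w*."""
    return (math.cos(lam) * s + math.sin(lam) * s_star + w,
            -math.sin(lam) * s + math.cos(lam) * s_star + w_star)

# With no noise the recursion is a pure rotation: for monthly data (m = 12)
# and the first harmonic, one full period brings (s, s*) back to (1, 0).
m, j = 12, 1
lam = 2 * math.pi * j / m
s, s_star = 1.0, 0.0
for t in range(1, m + 1):
    s, s_star = harmonic_step(s, s_star, lam)
```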
ETS vs structural models

- ETS models are much more general, as they allow non-linear (multiplicative) components.
- ETS allows automatic forecasting due to its larger model space.
- Additive ETS models are almost equivalent to the corresponding structural models.
- ETS models have a larger parameter space. Structural model parameters are always non-negative (variances).
- Structural models are much easier to generalize (e.g., to add covariates).
- It is easier to handle missing values with structural models.
Structural models in R

StructTS(oil, type = "level")
StructTS(ausair, type = "trend")
StructTS(austourists, type = "BSM")

fit <- StructTS(austourists, type = "BSM")
decomp <- cbind(austourists, fitted(fit))
colnames(decomp) <- c("data", "level", "slope", "seasonal")
plot(decomp, main = "Decomposition of International visitor nights")
Structural models in R

[Figure: Decomposition of International visitor nights; panels: data, level, slope, seasonal; 2000 to 2010.]
ETS decomposition

[Figure: Decomposition by ETS(A,A,A) method; panels: observed, level, slope, season; 2000 to 2010.]
Linear Gaussian SS models

Observation equation:  y_t = f′ x_t + ε_t
State equation:        x_t = G x_{t-1} + w_t

- State vector x_t of length p.
- G is a p × p matrix; f is a vector of length p.
- ε_t ~ NID(0, σ²) and w_t ~ NID(0, W).

Local level model: f = G = 1, x_t = ℓ_t.

Local linear trend model:
f′ = [1 0],  x_t = (ℓ_t, b_t)′,  G = [1 1; 0 1],  W = diag(σ²_ξ, σ²_ζ)
Basic structural model

Linear Gaussian state space form:

y_t = f′ x_t + ε_t,      ε_t ~ N(0, σ²)
x_t = G x_{t-1} + w_t,   w_t ~ N(0, W)

f′ = [1 0 1 0 ⋯ 0],  W = diag(σ²_ξ, σ²_ζ, σ²_η, 0, ..., 0)

x_t = (ℓ_t, b_t, s_{1,t}, s_{2,t}, ..., s_{m-1,t})′

G is block diagonal: a trend block [1 1; 0 1] acting on (ℓ_t, b_t), and an
(m-1) × (m-1) seasonal block whose first row is (-1, -1, ..., -1), with a
shifted identity below it, so that s_{1,t} = -(s_{1,t-1} + ⋯ + s_{m-1,t-1}) + η_t
and s_{j,t} = s_{j-1,t-1} for j = 2, ..., m-1.
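As a sketch of this matrix form (pure Python; the helper names `bsm_matrices` and `matvec` are illustrative, and W is omitted since only the noise-free transition is checked), one can build f and G for a given m and verify that G x reproduces the component equations:

```python
def bsm_matrices(m):
    """Build f and G for the basic structural model with period m.
    State ordering: (l, b, s1, s2, ..., s_{m-1})."""
    p = 2 + (m - 1)
    f = [1, 0, 1] + [0] * (m - 2)
    G = [[0] * p for _ in range(p)]
    G[0][0] = G[0][1] = 1    # l_t = l_{t-1} + b_{t-1}
    G[1][1] = 1              # b_t = b_{t-1}
    for j in range(2, p):
        G[2][j] = -1         # s_{1,t} = -(s_{1,t-1} + ... + s_{m-1,t-1})
    for j in range(3, p):
        G[j][j - 1] = 1      # s_{j,t} = s_{j-1,t-1}
    return f, G

def matvec(G, x):
    return [sum(g * xi for g, xi in zip(row, x)) for row in G]

# Check against the component equations for quarterly data (m = 4):
f, G = bsm_matrices(4)
x = [10.0, 2.0, 1.5, -0.5, -1.0]   # (l, b, s1, s2, s3)
x_next = matvec(G, x)              # noise-free transition G x
# l -> l + b, b -> b, s1 -> -(s1 + s2 + s3), s2 -> old s1, s3 -> old s2
```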
Kalman filter

Notation:

x_{t|t} = E[x_t | y_1, ..., y_t]          P_{t|t} = V[x_t | y_1, ..., y_t]
x_{t|t-1} = E[x_t | y_1, ..., y_{t-1}]    P_{t|t-1} = V[x_t | y_1, ..., y_{t-1}]
y_{t|t-1} = E[y_t | y_1, ..., y_{t-1}]    v_{t|t-1} = V[y_t | y_1, ..., y_{t-1}]

Forecasting:
y_{t|t-1} = f′ x_{t|t-1}
v_{t|t-1} = f′ P_{t|t-1} f + σ²

Updating (state filtering):
x_{t|t} = x_{t|t-1} + P_{t|t-1} f v_{t|t-1}^{-1} (y_t - y_{t|t-1})
P_{t|t} = P_{t|t-1} - P_{t|t-1} f v_{t|t-1}^{-1} f′ P_{t|t-1}

State prediction:
x_{t+1|t} = G x_{t|t}
P_{t+1|t} = G P_{t|t} G′ + W

Iterate for t = 1, ..., T, assuming x_{1|0} and P_{1|0} are known.
These are just conditional expectations, so the filter gives minimum MSE estimates.
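The recursions above can be transcribed directly into code. This is a minimal pure-Python sketch (no linear algebra library; `kalman_filter` and its argument names are illustrative), applied to the local linear trend model on noise-free data, where the filtered state converges to the true level and slope:

```python
def kalman_filter(ys, f, G, W, sigma2, x0, P0):
    """Kalman filter for y_t = f'x_t + eps_t, x_t = G x_{t-1} + w_t.
    Matrices are lists of lists; f, x are lists; y_t is scalar."""
    p = len(x0)
    x, P = list(x0), [row[:] for row in P0]          # x_{1|0}, P_{1|0}
    for y in ys:
        # Forecasting: y_{t|t-1} = f'x, v = f'Pf + sigma2
        yhat = sum(f[i] * x[i] for i in range(p))
        Pf = [sum(P[i][j] * f[j] for j in range(p)) for i in range(p)]
        v = sum(f[i] * Pf[i] for i in range(p)) + sigma2
        # Updating: x_{t|t} = x + Pf v^{-1} (y - yhat), P_{t|t} = P - Pf v^{-1} (Pf)'
        K = [Pf[i] / v for i in range(p)]             # Kalman gain
        x = [x[i] + K[i] * (y - yhat) for i in range(p)]
        P = [[P[i][j] - K[i] * Pf[j] for j in range(p)] for i in range(p)]
        # State prediction: x_{t+1|t} = G x_{t|t}, P_{t+1|t} = G P G' + W
        x = [sum(G[i][j] * x[j] for j in range(p)) for i in range(p)]
        GP = [[sum(G[i][k] * P[k][j] for k in range(p)) for j in range(p)] for i in range(p)]
        P = [[sum(GP[i][k] * G[j][k] for k in range(p)) + W[i][j] for j in range(p)] for i in range(p)]
    return x, P   # x_{T+1|T}, P_{T+1|T}

# Noise-free data from a deterministic trend y_t = 2 + 3t: with W = 0 and a
# diffuse prior, the filter converges to the true level and slope.
G = [[1.0, 1.0], [0.0, 1.0]]
W = [[0.0, 0.0], [0.0, 0.0]]
ys = [2.0 + 3.0 * t for t in range(1, 21)]
x, P = kalman_filter(ys, f=[1.0, 0.0], G=G, W=W, sigma2=1.0,
                     x0=[0.0, 0.0], P0=[[1e7, 0.0], [0.0, 1e7]])
# x is x_{21|20}: level approx 2 + 3*21 = 65, slope approx 3
```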
Kalman recursions

[Figure: the recursion cycle: 1. state prediction (filtered state at time t-1 to predicted state at time t), 2. forecasting (forecast the observation at time t), 3. state filtering (filtered state at time t), then repeat.]
Initializing the Kalman filter

- Need x_{1|0} and P_{1|0} to get started.
- Common approach for structural models: set x_{1|0} = 0 and P_{1|0} = kI for a very large k.
- There are many research papers on optimal initialization choices for the Kalman recursions.
- The ETS approach was to estimate x_{1|0}, and to avoid P_{1|0} by assuming identical error processes.
- A random x_{1|0} could be used with ETS models; a form of Kalman filter would then be required for estimation and forecasting. This gives more realistic prediction intervals.
Local level model

y_t = ℓ_t + ε_t,       ε_t ~ NID(0, σ²)
ℓ_t = ℓ_{t-1} + u_t,   u_t ~ NID(0, q²)

Kalman recursions:

y_{t|t-1} = ℓ̂_{t-1|t-1}
v_{t|t-1} = p_{t|t-1} + σ²
ℓ̂_{t|t} = ℓ̂_{t-1|t-1} + p_{t|t-1} v_{t|t-1}^{-1} (y_t - y_{t|t-1})
p_{t+1|t} = p_{t|t-1} (1 - v_{t|t-1}^{-1} p_{t|t-1}) + q²
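A direct transcription of these scalar recursions (pure Python; the function name and defaults are illustrative). A known sanity check: with q² = 0 the level is constant, and under a diffuse prior the filtered level is just the running sample mean of the observations.

```python
def local_level_filter(ys, sigma2, q2, l0=0.0, p0=1e8):
    """Scalar Kalman recursions for the local level model."""
    l, p = l0, p0                # l_{1|0} and p_{1|0}
    for y in ys:
        v = p + sigma2           # v_{t|t-1} = p_{t|t-1} + sigma^2
        k = p / v                # gain p_{t|t-1} v^{-1}
        l = l + k * (y - l)      # filtered level l_{t|t}
        p = p * (1 - k) + q2     # p_{t+1|t}
    return l, p

# q^2 = 0 with a diffuse prior: the filtered level equals the sample mean.
l, p = local_level_filter([1.0, 2.0, 3.0, 4.0], sigma2=1.0, q2=0.0)
```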
Handling missing values

Forecasting:
y_{t|t-1} = f′ x_{t|t-1}
v_{t|t-1} = f′ P_{t|t-1} f + σ²

Updating (state filtering):
x_{t|t} = x_{t|t-1} + P_{t|t-1} f v_{t|t-1}^{-1} (y_t - y_{t|t-1})
P_{t|t} = P_{t|t-1} - P_{t|t-1} f v_{t|t-1}^{-1} f′ P_{t|t-1}

State prediction:
x_{t|t-1} = G x_{t-1|t-1}
P_{t|t-1} = G P_{t-1|t-1} G′ + W

Iterate for t = 1, ..., T, starting with x_{1|0} and P_{1|0}.
If y_t is missing, skip the updating step: set x_{t|t} = x_{t|t-1} and P_{t|t} = P_{t|t-1}.
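A sketch of the modified recursion for the local level model (pure Python; names are illustrative): when y_t is missing, the update is skipped and only the prediction step runs, so the state estimate is carried forward and its variance grows by q².

```python
def local_level_filter_na(ys, sigma2, q2, l0=0.0, p0=1e8):
    """Local level Kalman filter; a missing y_t is passed as None."""
    l, p = l0, p0
    for y in ys:
        if y is not None:        # updating step only when y_t is observed
            v = p + sigma2
            k = p / v
            l = l + k * (y - l)
            p = p * (1 - k)
        p = p + q2               # state prediction always runs
    return l, p

# With q^2 = 0, skipping a missing value changes nothing, so filtering
# [5, NA, 6] matches filtering [5, 6].
la, pa = local_level_filter_na([5.0, None, 6.0], sigma2=1.0, q2=0.0)
lb, pb = local_level_filter_na([5.0, 6.0], sigma2=1.0, q2=0.0)
```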
Multi-step forecasting

Forecasting:
y_{t|t-1} = f′ x_{t|t-1}
v_{t|t-1} = f′ P_{t|t-1} f + σ²

State prediction:
x_{t|t-1} = G x_{t-1|t-1}
P_{t|t-1} = G P_{t-1|t-1} G′ + W

Iterate for t = T+1, ..., T+h, starting with x_{T|T} and P_{T|T}.
Treat the future values as missing, so the updating step is skipped and x_{t|t} = x_{t|t-1}, P_{t|t} = P_{t|t-1}.
Kalman filter

What's so special about the Kalman filter?

- Very general equations for any model in state space form.
- Any model in state space form can easily be generalized.
- Optimal MSE forecasts.
- Easy to handle missing values.
- Easy to compute the likelihood.
Likelihood calculation

θ = all unknown parameters
f_θ(y_t | y_1, ..., y_{t-1}) = one-step forecast density

Likelihood:

L(y_1, ..., y_T; θ) = ∏_{t=1}^{T} f_θ(y_t | y_1, ..., y_{t-1})

Gaussian log-likelihood:

log L = -(T/2) log(2π) - (1/2) ∑_{t=1}^{T} log v_{t|t-1} - (1/2) ∑_{t=1}^{T} e_t² / v_{t|t-1}

where e_t = y_t - y_{t|t-1}.
All terms are obtained from the Kalman filter equations.
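The prediction error decomposition is easy to compute alongside the filter. A sketch for the local level model (pure Python; names are illustrative). As a check: with q² = 0 and p_{1|0} = 0 the model reduces to y_t ~ NID(ℓ_0, σ²), so the decomposition must equal the direct iid Gaussian log-likelihood.

```python
import math

def local_level_loglik(ys, sigma2, q2, l0, p0):
    """Gaussian log-likelihood via the prediction error decomposition,
    accumulating e_t and v_{t|t-1} from the local level Kalman recursions."""
    l, p = l0, p0
    loglik = 0.0
    for y in ys:
        v = p + sigma2                  # v_{t|t-1}
        e = y - l                       # e_t = y_t - y_{t|t-1}
        loglik += -0.5 * (math.log(2 * math.pi) + math.log(v) + e * e / v)
        k = p / v
        l = l + k * e
        p = p * (1 - k) + q2
    return loglik

# Sanity check: q^2 = 0 and p_{1|0} = 0 gives y_t ~ NID(l0, sigma2).
ys = [0.3, -1.2, 0.7, 2.0]
ll = local_level_loglik(ys, sigma2=2.0, q2=0.0, l0=0.5, p0=0.0)
direct = sum(-0.5 * (math.log(2 * math.pi * 2.0) + (y - 0.5) ** 2 / 2.0)
             for y in ys)
```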
Structural models in R

fit <- StructTS(austourists, type = "BSM")
fc <- forecast(fit)
plot(fc)

[Figure: Forecasts from Basic structural model, with prediction intervals, 2000 to 2012.]
Kalman smoothing

We want an estimate of x_t | y_1, ..., y_T where t < T; that is, x_{t|T}.

x_{t|T} = x_{t|t} + A_t (x_{t+1|T} - x_{t+1|t})
P_{t|T} = P_{t|t} + A_t (P_{t+1|T} - P_{t+1|t}) A′_t

where A_t = P_{t|t} G′ (P_{t+1|t})^{-1}.

- Uses all the data, not just previous data.
- Useful for estimating missing values: y_{t|T} = f′ x_{t|T}.
- Useful for seasonal adjustment when one of the states is a seasonal component.
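A scalar sketch of the smoother for the local level model (pure Python; names are illustrative): filter forwards, then apply the backward recursion with A_t = p_{t|t} / p_{t+1|t} (since G = 1). With q² = 0 every smoothed value equals the full-sample estimate, unlike the filtered values, which only use past data.

```python
def local_level_smooth(ys, sigma2, q2, l0=0.0, p0=1e8):
    """Forward Kalman filter, then the backward fixed-interval smoother."""
    n = len(ys)
    l_pred, p_pred = [l0], [p0]        # l_{t|t-1}, p_{t|t-1}
    l_filt, p_filt = [], []            # l_{t|t}, p_{t|t}
    for t, y in enumerate(ys):
        v = p_pred[t] + sigma2
        k = p_pred[t] / v
        l_filt.append(l_pred[t] + k * (y - l_pred[t]))
        p_filt.append(p_pred[t] * (1 - k))
        l_pred.append(l_filt[t])       # G = 1: l_{t+1|t} = l_{t|t}
        p_pred.append(p_filt[t] + q2)
    # Backward pass: x_{t|T} = x_{t|t} + A_t (x_{t+1|T} - x_{t+1|t})
    l_sm, p_sm = l_filt[:], p_filt[:]
    for t in range(n - 2, -1, -1):
        A = p_filt[t] / p_pred[t + 1]
        l_sm[t] = l_filt[t] + A * (l_sm[t + 1] - l_pred[t + 1])
        p_sm[t] = p_filt[t] + A * A * (p_sm[t + 1] - p_pred[t + 1])
    return l_sm, p_sm

# With q^2 = 0 the level is constant, so every smoothed value is the
# full-sample estimate (the mean, under a diffuse prior).
l_sm, p_sm = local_level_smooth([1.0, 2.0, 3.0, 4.0], sigma2=1.0, q2=0.0)
```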
Kalman smoothing in R

fit <- StructTS(austourists, type = "BSM")
sm <- tsSmooth(fit)

plot(austourists)
lines(sm[,1], col = 'blue')
lines(fitted(fit)[,1], col = 'red')
legend("topleft", col = c('blue', 'red'), lty = 1,
       legend = c("Smoothed level", "Filtered level"))
[Figure: austourists with the filtered level (red) and smoothed level (blue) overlaid, 2000 to 2010.]
Kalman smoothing in R

fit <- StructTS(austourists, type = "BSM")
sm <- tsSmooth(fit)

plot(austourists)

# Seasonally adjusted data
aus.sa <- austourists - sm[, 3]
lines(aus.sa, col = 'blue')
[Figure: austourists with the seasonally adjusted series overlaid in blue, 2000 to 2010.]
Kalman smoothing in R

x <- austourists
miss <- sample(1:length(x), 5)
x[miss] <- NA
fit <- StructTS(x, type = "BSM")
sm <- tsSmooth(fit)
estim <- sm[, 1] + sm[, 3]

plot(x, ylim = range(austourists))
points(time(x)[miss], estim[miss], col = 'red', pch = 1)
points(time(x)[miss], austourists[miss], col = 'black', pch = 1)
legend("topleft", pch = 1, col = c(2, 1), legend = c("Estimate", "Actual"))
[Figure: x with the estimated values (red) and actual values (black) marked at the five missing times, 2000 to 2010.]
Time varying parameter models

Linear Gaussian state space model:

y_t = f′_t x_t + ε_t,      ε_t ~ N(0, σ²_t)
x_t = G_t x_{t-1} + w_t,   w_t ~ N(0, W_t)

Kalman recursions:

y_{t|t-1} = f′_t x_{t|t-1}
v_{t|t-1} = f′_t P_{t|t-1} f_t + σ²_t
x_{t|t} = x_{t|t-1} + P_{t|t-1} f_t v_{t|t-1}^{-1} (y_t - y_{t|t-1})
P_{t|t} = P_{t|t-1} - P_{t|t-1} f_t v_{t|t-1}^{-1} f′_t P_{t|t-1}
x_{t|t-1} = G_t x_{t-1|t-1}
P_{t|t-1} = G_t P_{t-1|t-1} G′_t + W_t
Structural models with covariates

Local level with covariate:

y_t = ℓ_t + β z_t + ε_t
ℓ_t = ℓ_{t-1} + ξ_t

f′_t = [1 z_t],  x_t = (ℓ_t, β)′,  G = [1 0; 0 1],  W_t = diag(σ²_ξ, 0)

- Assumes z_t is fixed and known (as in regression).
- The estimate of β is given by x_{T|T}.
- Equivalent to simple linear regression with a time-varying intercept.
- Easy to extend to multiple regression with additional terms.
Time varying regression
Simple linear regression with time varying parameters
yt = `t + βtzt + εt
`t = `t−1 + ξt
βt = βt−1 + ζt
f′t = [1  zt],   xt = (`t, βt)′,   G = [1 0; 0 1],   Wt = [σ2ξ 0; 0 σ2ζ]
Allows for a linear regression with parameters that change slowly over time.
Parameters follow independent random walks.
Estimates of parameters given by xt|t or xt|T.
State space models 2: Structural models 38
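To illustrate this model, the NumPy sketch below lets βt drift as a random walk and shows the filtered estimates xt|t tracking it. The simulated data and variance values are assumptions of this sketch.

```python
import numpy as np

# Time varying regression: beta_t = beta_{t-1} + zeta_t, so the second
# diagonal element of W is sigma_zeta^2 rather than zero.
rng = np.random.default_rng(2)
n, sig_eps, sig_xi, sig_zeta = 300, 0.3, 0.05, 0.05
z = rng.normal(size=n)
level = 2 + np.cumsum(rng.normal(0, sig_xi, n))
beta = 1 + np.cumsum(rng.normal(0, sig_zeta, n))   # slowly drifting slope
y = level + beta * z + rng.normal(0, sig_eps, n)

G, W = np.eye(2), np.diag([sig_xi**2, sig_zeta**2])
x, P = np.zeros(2), np.eye(2) * 1e6                # diffuse initialisation
beta_filtered = np.empty(n)
for t in range(n):
    f = np.array([1.0, z[t]])                      # f_t' = [1  z_t]
    x, P = G @ x, G @ P @ G.T + W                  # predict
    v = f @ P @ f + sig_eps**2                     # forecast variance
    k = P @ f / v                                  # Kalman gain
    x, P = x + k * (y[t] - f @ x), P - np.outer(k, f @ P)
    beta_filtered[t] = x[1]                        # x_{t|t} estimate of beta_t
```

Running the smoother over the same output would give the xt|T estimates instead.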
Updating (“online”) regression
The same idea can be used to estimate a regression iteratively as new data arrive.
Simple linear regression with updating parameters
yt = `t + βtzt + εt
`t = `t−1 + ξt
βt = βt−1 + ζt
f′t = [1  zt],   xt = (`t, βt)′,   G = [1 0; 0 1],   Wt = [0 0; 0 0]
Updated parameter estimates given by xt|t.
Recursive residuals given by yt − yt|t−1.
State space models 2: Structural models 39
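Setting Wt = 0 makes the state constant, so each new observation simply refines the estimates: with a diffuse prior the recursions reproduce recursive least squares, and the final xt|t is numerically the ordinary least squares fit. A NumPy sketch (the simulated data are assumptions):

```python
import numpy as np

# Updating ("online") regression: Kalman recursions with W = 0 and G = I,
# so the predict step is trivial and each observation refines (l, beta).
rng = np.random.default_rng(3)
n = 100
z = rng.normal(size=n)
y = 1.5 + 0.8 * z + rng.normal(0, 0.2, n)

x, P = np.zeros(2), np.eye(2) * 1e8     # diffuse initialisation
recursive_resid = np.empty(n)
for t in range(n):
    f = np.array([1.0, z[t]])           # f_t' = [1  z_t]
    v = f @ P @ f + 0.2**2              # no predict step: G = I, W = 0
    k = P @ f / v                       # Kalman gain
    recursive_resid[t] = y[t] - f @ x   # recursive residual y_t - y_{t|t-1}
    x, P = x + k * recursive_resid[t], P - np.outer(k, f @ P)

# Final x agrees (numerically) with the least squares fit of y on [1, z]
ols = np.linalg.lstsq(np.column_stack([np.ones(n), z]), y, rcond=None)[0]
print(x, ols)
```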