Energy Economics 44 (2014) 143–159


Component estimation for electricity prices: Procedures and comparisons

Francesco Lisi a,⁎, Fany Nan b

a Department of Statistical Sciences, University of Padua, Via Cesare Battisti 241, 35121 Padua, Italy
b Department of Economics, University of Verona, Via dell'Artigliere 19, 37129 Verona, Italy

⁎ Corresponding author. Tel.: +39 049 8274182; fax: +39 049 8274170.
E-mail addresses: [email protected] (F. Lisi), [email protected] (F. Nan).

http://dx.doi.org/10.1016/j.eneco.2014.03.018
0140-9883/© 2014 Elsevier B.V. All rights reserved.

Article info

Article history:
Received 16 July 2012
Received in revised form 15 March 2014
Accepted 16 March 2014
Available online 5 April 2014

JEL classification: C01, C02, C14, C22, C53, Q4

Keywords: Component estimation; Filtering procedures; Electricity prices; Long-term dynamics; Nonparametric methods

Abstract

Electricity price time series usually exhibit some form of nonstationarity, corresponding to long-term behavior, one or more periodic components as well as dependence on calendar effects. As a result, modeling electricity prices requires accounting for both long-term and periodic components. In the literature, several filtering procedures have been proposed but a standard has not yet been found. Furthermore, since different procedures are applied in contexts that are not homogeneous with respect to data, periods and final goals, a fair comparison is difficult. This work considers several methods for component estimation in a homogeneous framework and compares them according to specific criteria. The final purpose is to find an estimation procedure that performs well, independently of the intended market, and that can be proposed as a reference for electricity price time series filtering.

© 2014 Elsevier B.V. All rights reserved.

1. Introduction

Modeling and forecasting power prices are important issues for trading and risk management in the liberalized electricity markets and, consequently, many studies in this field have appeared in the literature over the last decade (see Aggarwal et al., 2009; Bunn, 2004; Weron, 2006 for reviews).

Electricity price time series usually exhibit some form of nonstationarity, corresponding to long-term behavior, one or more periodic components, as well as dependence on calendar effects and spikes. One way to consider these components is to view them as stochastic processes. Stochastic trends are often modeled by Brownian motion or random walk, assuming the presence of unit roots (Bosco et al., 2007; Bosco et al., 2010) or referring to long memory (Koopman et al., 2007). Sometimes the seasonal component is also treated as stochastic (Koopman et al., 2007), allowing joint estimation of the components. Jumps are also often considered as stochastic and treated by using diffusion models with Poisson jump components (for example, see Fanone et al., 2013; Pirino and Renò, 2010), by Markov-switching models or by assuming that the jump size is governed by a normal distribution (Hellström et al., 2012). Only very few works with a focus on prediction also model spikes (Christensen et al., 2012).

A second way to model electricity prices requires a preliminary estimation of the long-term and periodic behavior to filter out these components in order to achieve stationarity. Good filtering is also important because it reduces distorting effects on forecasting and enables a better identification of spikes. Although with different focuses, a number of works on modeling and predicting electricity prices consider this issue. For example, Erlwein et al. (2012), de Jong (2006), Kosater and Mosler (2006), Misiorek et al. (2006), Pilipovic (1998) and Weron et al. (2004) estimate long-term behavior by means of polynomial (usually linear) trends together with sine or cosine functions. Bosco et al. (2007) use a linear trend and model periodicity with state space models. As a variant, Crespo Cuaresma et al. (2004), Escribano et al. (2011), Lucia and Schwartz (2002) and some of the aforementioned authors consider monthly dummy variables, sometimes together with a linear trend, to approximate long-run dynamics. Most of these authors describe the weekly periodic component and the daily periodicity through daily and (semi-)hourly dummy variables. Janczura and Weron (2009), Janczura and Weron (2010), Trück et al. (2007) and Weron (2009) use wavelet low-pass filters for the long-run component and moving average (or median) techniques to estimate the periodic component. Wavelets were also considered by Schlueter (2010) for modeling daily average prices. In the electricity market literature, component estimation based on empirical mode decomposition has been produced by Kurbatsky and Tomin (2010) and Qian et al. (2011). Dordonnat et al. (2010) use cubic splines and sinusoidal functions to approximate the trend and daily means, equivalent to dummy variables, for the weekly component. Spline functions have also been used by Bisaglia et al. (2010) to model the yearly periodic component. Approaches based on local linear regression or local linear trends for long-term and annual components have been considered by Bordignon et al. (2013), Trapero and Pedregal (2009) and Veraart and Veraart (2012). The last two authors use trimmed means to estimate the periodic daily coefficients. It is worth mentioning that in a few cases the long-term component has been implicitly considered by differencing prices or loads (Sigauke and Chikobvu, 2011; Weron, 2005 and implicitly in Bosco et al., 2010). Gianfreda and Grossi (2012) consider fractional differencing. Lastly, authors who consider calendar effects usually model them by means of dummy variables, accounting for national holidays or other specific calendar conditions.

1 The indicated margin is the available capacity margin and is defined as the difference between the demand forecast and the sum of the maximum export limits nominated by each generator prior to each trading period as its maximum available output capacity.

2 Lucas' PLR cointegration test, which is based on the Student-t density, can be used to test for a unit root in scalar time series; p-values are calculated through a bootstrap strategy based on Swensen's algorithms (Swensen, 2006).

Which method is best for filtering components in electricity markets has not yet been assessed in the literature. From a theoretical viewpoint, none of the existing methods is strictly preferable. Moreover, filtering procedures have been applied in a wide variety of markets and sample periods, and with different final goals. Thus, also from an empirical viewpoint, a fair comparison among them is almost impossible. This work aims to fill this gap and to compare, in a homogeneous framework, several procedures for component estimation with the goal of identifying a procedure that can be used as a standard. We approach component estimation with the topic of prediction in mind; therefore we will look specifically for methods that could lead to good predictive performance. However, a good estimation of components may also be useful for other issues such as spike identification and simulations. Of course, calling a procedure standard does not imply that it is the best in all situations, but only that it can be viewed as a benchmark. Indeed, other specific procedures, or filters, may work better for certain issues.

To estimate the long-term component, eleven filtering techniques will be applied. They belong to the following groups: the polynomial–sinusoidal approach, local polynomial regressions, spline functions, wavelets, empirical mode decomposition, singular spectrum analysis and the well-known Kolmogorov–Zurbenko, Hodrick–Prescott and Christiano–Fitzgerald filters. For the periodic component, three alternative estimators will be considered. These are based on dummy variables, trimmed means and centered moving medians. Combining the methods for long-term and periodic component estimation leads us to compare 33 different procedures.

Since the true components are unobservable, how to compare these filtering methods becomes an issue. In Section 4 we propose three criteria for procedure evaluation. They are based on three features that are expected of a good component estimation: after filtering there should not be any time-dependent pattern; there should not be residual periodicity; and the procedure should positively affect the prediction accuracy of the original prices.

All methods will be applied and compared using data from three important markets: the British market, the Pennsylvania–New Jersey–Maryland market and the Nord Pool market. These markets have been chosen because they differ substantially in generation modes, organizational structure and electricity demand. Indeed, the main fuels used for electricity generation are natural gas, coal and hydro, respectively. Since these factors influence price dynamics in different ways, they should guarantee a wide enough scope. Thus, even if there is no guarantee that the results of our study will extend to other markets, we think that the findings of our research apply beyond these specific markets and can be considered generic.

The paper is organized as follows. In Section 2 we present some preliminary analyses of our data, suggesting which components to consider and how to model them. This leads us to define the reference model for the components, which is based on a deterministic part and a stochastic term. Section 3 is devoted to the description of component estimation. Evaluation criteria for comparing estimation methods are given in Section 4. Section 5 presents empirical results and Section 6 concludes.

2. Preliminary analyses

In our analyses, we consider three main international electricity markets: the British market (APX Power UK, APX-PUK), the Pennsylvania–New Jersey–Maryland market (PJM) and the Nord Pool market (NP), which operates in Norway, Denmark, Sweden, Finland and Estonia.

The dataset related to the APX-PUK comprises the time series of prices ($P_t$), national day-ahead demand forecast ($D_t$) and indicated margin¹ ($M_t$) for the period 1 April 2005 to 31 December 2010 (100,848 data points, covering $N = 2101$ days). For the PJM and NP markets, only the time series of prices and actual demand were available (to us); these time series run from 1 January 2005 to 31 December 2010 (52,584 data points, covering $N = 2191$ days) for the PJM market and from 1 January 2008 to 31 December 2010 (26,304 data points, covering $N = 1096$ days) for the NP market.

The data have a half-hourly frequency for APX-PUK and an hourly frequency for PJM and NP; therefore each day comprises 48 (for APX-PUK) or 24 (for PJM and NP) load periods, with 00:00–00:30 am (00:00–01:00 am) defined as period 1. The spot price is denoted as $P_{tj}$, where $t$ indicates the day and $j$ indicates the load period ($t = 1, 2, \ldots, N$; $j = 1, 2, \ldots, 24$ or $48$). The same notation applies to $D_{tj}$ and $M_{tj}$.

In this study, following a widespread practice in the literature, each (half-)hourly time series is modeled separately, thereby eliminating the problem of modeling intra-daily periodicity.
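The per-load-period organization of the data can be illustrated with a minimal sketch. It assumes the intraday prices arrive as one flat, gap-free vector ordered by day and then by load period; the function name `split_by_load_period` and the synthetic numbers are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def split_by_load_period(prices, periods_per_day):
    """Reshape a flat intraday price vector into an (N days x load periods) matrix."""
    prices = np.asarray(prices, dtype=float)
    if prices.size % periods_per_day != 0:
        raise ValueError("series length must be a multiple of periods_per_day")
    return prices.reshape(-1, periods_per_day)

# Synthetic stand-in for a half-hourly series: 3 days x 48 load periods.
flat = np.arange(1.0, 3 * 48 + 1)
P = split_by_load_period(flat, periods_per_day=48)

log_P = np.log(P)            # log prices, one column per load period
series_j14 = log_P[:, 13]    # daily series for load period 14 (columns are zero-based)
```

With the APX-PUK dimensions quoted above, the same reshape would turn 100,848 observations into a 2101 x 48 matrix, and each column would then be filtered on its own.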

Differences in load periods and markets can cause significant variations in price time series. However, a first inspection, based on graphs, spectra and ACFs (see Figs. 1–4), indicates that the series show neither a well-defined long-run behavior nor clear annual dynamics. A common characteristic of the price time series is the weekly periodic component (of period 7), suggested by the spectra, which show three peaks at the frequencies 1/7, 2/7 and 3/7, and by a very persistent autocorrelation function. This indicates that other analyses should be considered to determine how the long-term components should be handled.
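To see why a weekly pattern produces peaks at exactly 1/7, 2/7 and 3/7, the following sketch computes a raw periodogram of a synthetic daily series with a weekend dip. The series, the noise level and the helper name `periodogram` are assumptions made for illustration only; the paper's own spectra are estimated from the actual price data.

```python
import numpy as np

def periodogram(x):
    """Raw periodogram of a demeaned series at the Fourier frequencies (cycles per day)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = x.size
    freqs = np.fft.rfftfreq(n, d=1.0)
    power = np.abs(np.fft.rfft(x)) ** 2 / n
    return freqs, power

rng = np.random.default_rng(0)
n = 700                                    # multiple of 7, so k/7 are Fourier frequencies
t = np.arange(n)
weekly = np.where(t % 7 >= 5, -0.3, 0.1)   # lower level on two days of each week
x = weekly + 0.05 * rng.standard_normal(n)

freqs, power = periodogram(x)
top3 = np.sort(freqs[np.argsort(power)[-3:]])   # frequencies of the three largest ordinates
```

A non-sinusoidal pattern of period 7 has harmonics at k/7, and only k = 1, 2, 3 fall below the Nyquist frequency 0.5, which is why three peaks appear.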

To investigate nonstationarity (i.e. deterministic vs. stochastic trends), we employed a robust unit root test based on Lucas' robust pseudo-likelihood ratio (PLR), as described in Bosco et al. (2010).² Overall, at the 5% significance level, the null hypothesis of a unit root is rejected 91 times over 96 load periods (48 + 24 + 24) and, more specifically, 47 times for APX-PUK, 20 times for NP and 24 times for PJM. In the context of electricity prices, which are characterized by outliers, multiple seasonal effects, volatilities and other messy features, such results should be interpreted cautiously and may not be completely reliable. Nevertheless, the use of robust tests leads us to focus on models that assume trend-stationarity and to model the long-term component deterministically. Thus, we assume that the dynamics of log prices can be represented by a nonstationary level component $L_{tj}$, accounting for level changes and/or long-term or (semi-)periodic behavior as well as for calendar effects, and a residual stationary stochastic component $p_{tj}$, formally:

$$\log P_{tj} = L_{tj} + p_{tj}. \tag{1}$$

Fig. 1. APX Power UK market, load period 14. $\log P_{tj}$ series, ACF, PACF and log-periodogram, with smoothed version superimposed. Panels: log price vs. days; ACF vs. lag; partial ACF vs. lag; spectrum vs. frequency.

This work aims at determining the best method for estimating $L_{tj}$ and, consequently, $p_{tj}$. In particular, the goal is to find a method that works effectively during all the load periods and across different markets.

For this purpose, following the literature (among others, Janczura and Weron, 2010; Lucia and Schwartz, 2002; Misiorek et al., 2006; Pilipovic, 1998), $L_{tj}$ is further decomposed into three additive components:

$$L_{tj} = T_{tj} + W_{tj} + C_{tj}, \tag{2}$$

where $T_{tj}$ describes the long-run/intra-annual dynamics, $W_{tj}$ represents the weekly periodic component and $C_{tj}$ accounts for possible calendar effects, such as bank holidays or specific periods of the year. It is assumed that there is no correlation among $T_t$, $W_t$ and $C_t$.

Fig. 2. APX Power UK market, load period 36. $\log P_{tj}$ series, ACF, PACF and log-periodogram, with smoothed version superimposed. Panels: log price vs. days; ACF vs. lag; partial ACF vs. lag; spectrum vs. frequency.

Fig. 3. PJM market, load period 18. $\log P_{tj}$ series, ACF, PACF and log-periodogram, with smoothed version superimposed. Panels: log price vs. days; ACF vs. lag; partial ACF vs. lag; spectrum vs. frequency.

Note that models (1) and (2) do not include a jump component. With respect to jumps, we believe that an accurate estimation of the three components mentioned above could indirectly improve the chances of identifying spikes. In any case, this approach has been conceived mainly for predictive purposes. $L_{tj}$ is estimated by first estimating $T_{tj}$ and subsequently estimating $W_{tj}$ and $C_{tj}$ on the residuals $\log P_{tj} - \hat{T}_{tj}$. When all these components have been estimated, we obtain $\hat{p}_{tj} = \log P_{tj} - (\hat{T}_{tj} + \hat{W}_{tj} + \hat{C}_{tj})$.

In the next two sections, we estimate the components and evaluate the results of the estimation procedure.

Fig. 4. Nord Pool market, load period 18. $\log P_{tj}$ series, ACF, PACF and log-periodogram, with smoothed version superimposed. Panels: log price vs. days; ACF vs. lag; partial ACF vs. lag; spectrum vs. frequency.

Fig. 5. APX Power UK, load period 26, and Nord Pool, load period 13. Histograms and kernel densities.

3. Component estimation

3.1. Long-term component estimation

In this section, we deal with the estimation of the long-term component $T_{tj}$, which includes long-run and annual dynamics. Given the behavior of the time series shown in Figs. 1–4, we expect that regression techniques in which the trend is described by a low-degree polynomial are not suitable for estimating the long-term component $T_{tj}$ and that more flexible methods are required for this purpose. Thus, we will consider 11 methods, most of which are nonparametric, for estimating the long-term component: global parametric polynomial and sinusoidal regression (PS), two variants of the local polynomial kernel regression, namely local linear regression (LLR) and loess (LO), regression splines (RS) and smoothing splines (SS), the Hodrick–Prescott filter (HP), wavelets (Wav), empirical mode decomposition (EMD), the Kolmogorov–Zurbenko (KZ) filter, the singular spectrum analysis (SSA) and the Christiano–Fitzgerald (CF) filter.

Now, we provide a brief description of these methods by omitting,for simplicity, the subscript j, which refers to the specific load period.

Polynomial and sinusoidal regression (PS): in this setting, we assume that long-term behavior can be described by a low-degree polynomial of time, say of degree $q$, and that annual periodicity can be described by $m$ sine–cosine functions. The general form is given by

$$T_t = \sum_{i=0}^{q} \alpha_i t^i + \sum_{i=1}^{m} \left(\beta_i \sin \omega_i t + \gamma_i \cos \omega_i t\right), \qquad \omega_i = 2\pi i/365.$$

Parameters $\alpha_i$, $\beta_i$ and $\gamma_i$ are typically estimated by ordinary least squares (OLS). Although this approach is not sufficiently flexible for estimating the long-term component, we have included it in the analyses because it has often been used in the literature (Escribano et al., 2011; de Jong, 2006; Kosater and Mosler, 2006; Lucia and Schwartz, 2002; Misiorek et al., 2006; Pilipovic, 1998; Weron et al., 2004, among others).

Fig. 6. Estimation of $T_t$ using PS for APX Power UK market (load period 36) and PJM (load period 18). Panels: APX-PUK and PJM, log price vs. days.
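The PS fit amounts to a single OLS regression on a polynomial and a set of annual harmonics. The sketch below is only an illustration under stated assumptions: the helper names `ps_design` and `fit_ps`, the choice q = 1 and m = 1, and the synthetic series are all hypothetical, and the paper does not report which software was used.

```python
import numpy as np

def ps_design(t, q, m, period=365.0):
    """Design matrix with columns t^0..t^q and m sine-cosine pairs of annual harmonics."""
    cols = [t ** i for i in range(q + 1)]
    for i in range(1, m + 1):
        w = 2.0 * np.pi * i / period          # omega_i = 2*pi*i/365
        cols += [np.sin(w * t), np.cos(w * t)]
    return np.column_stack(cols)

def fit_ps(log_p, q=1, m=2):
    """OLS estimate of the long-term component; returns fitted values and coefficients."""
    log_p = np.asarray(log_p, dtype=float)
    t = np.arange(1, log_p.size + 1, dtype=float)
    X = ps_design(t, q, m)
    coef, *_ = np.linalg.lstsq(X, log_p, rcond=None)
    return X @ coef, coef

# Synthetic log-price series: linear trend plus one annual harmonic plus noise.
t = np.arange(1, 1096, dtype=float)
true_trend = 3.5 + 2e-4 * t + 0.2 * np.sin(2.0 * np.pi * t / 365.0)
rng = np.random.default_rng(1)
y = true_trend + 0.05 * rng.standard_normal(t.size)

T_hat, coef = fit_ps(y, q=1, m=1)
```

For higher polynomial degrees it would be prudent to rescale time before forming powers, since raw powers of t become badly conditioned over several years of daily data.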

Local polynomial regression is a flexible nonparametric technique that approximates $T_t$ at a point $t_0$ by a local polynomial of order $q$, namely

$$T_t = \sum_{j=0}^{q} \beta_j (t - t_0)^j.$$

Parameters $\beta_j$ are estimated by weighted least squares, minimizing

$$\sum_{t=1}^{N} (\log P_t - T_t)^2 K_h(t - t_0),$$

where $K_h(t - t_0)$ is a kernel weighting function that depends on a smoothing parameter $h$, called the bandwidth (see Fan and Gijbels, 1996 for details). We consider only local linear approximations ($q = 1$) in this study, but the parameters are estimated following two alternative approaches. The first approach is the standard kernel local linear regression (LLR) with an Epanechnikov kernel (Fan et al., 1997). The second approach, a robust variant of the local polynomial kernel regression, is called loess (LO) and uses the tricube kernel as a weighting function (Cleveland, 1979; Cleveland and Devlin, 1988). Local polynomial regression has been used by Bordignon et al. (2013), Trapero and Pedregal (2009) and Veraart and Veraart (2012) for the estimation of components and by Chen et al. (2008), Mendes et al. (2008) and Troncoso Lora et al. (2007) for price prediction in electricity markets.
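A minimal sketch of the local linear case (q = 1) with an Epanechnikov kernel may help make the weighted least squares step concrete. It solves the 2 x 2 normal equations in closed form at every day t0 and keeps the intercept as the trend estimate. The function names and the bandwidth value are assumptions for illustration; bandwidth selection and the robustness iterations of loess are not covered.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u ** 2), 0.0)

def local_linear(y, h):
    """Local linear estimate of the trend at every t0 = 1..N with bandwidth h (in days)."""
    y = np.asarray(y, dtype=float)
    t = np.arange(1, y.size + 1, dtype=float)
    fit = np.empty_like(y)
    for k, t0 in enumerate(t):
        d = t - t0
        w = epanechnikov(d / h)
        # Weighted sums entering the normal equations for beta0 + beta1 * (t - t0).
        s0, s1, s2 = w.sum(), (w * d).sum(), (w * d * d).sum()
        r0, r1 = (w * y).sum(), (w * d * y).sum()
        fit[k] = (s2 * r0 - s1 * r1) / (s0 * s2 - s1 ** 2)   # beta0 = trend at t0
    return fit

# A purely linear series is reproduced exactly, including at the sample boundaries.
t = np.arange(1, 201, dtype=float)
line = 2.0 + 0.01 * t
fit_line = local_linear(line, h=30.0)
```

The exact reproduction of straight lines is the property that gives local linear fits their good behavior near the ends of the sample, compared with a plain kernel average.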

Fig. 7. APX Power UK market, load period 14. Log price series, with fitted long-run components for different methods superimposed. Panels: LLR, LO, RS, SS, Wav, KZ, EMD, SSA, HP, CF.

Fig. 8. APX Power UK market, load period 36. Log price series, with fitted long-run components for different methods superimposed. Panels: LLR, LO, RS, SS, Wav, KZ, EMD, SSA, HP, CF.

Fig. 9. PJM market, load period 18. Log price series, with fitted long-run components for different methods superimposed. Panels: LLR, LO, RS, SS, Wav, KZ, EMD, SSA, HP, CF.

Fig. 10. Nord Pool market, load period 18. Log price series, with fitted long-run components for different methods superimposed. Panels: LLR, LO, RS, SS, Wav, KZ, EMD, SSA, HP, CF.

Fig. 11. APX-PUK, load period 14. Filtered price series with SS, ACF, PACF and log periodogram; smoothed version superimposed.

Fig. 12. APX-PUK, load period 36. Filtered price series with SS, ACF, PACF and log periodogram; smoothed version superimposed.

Fig. 13. PJM market, load period 18. Filtered price series with SS, ACF, PACF and log periodogram; smoothed version superimposed.

Spline functions are a popular nonparametric regression technique that approximates $T_t$ by means of piecewise polynomials of order $q$, which are estimated in the sub-intervals of the sample delimited by a sequence of $K$ points called knots (for details, see Fan and Yao, 2003). Any spline function $s(t)$ of order $q$ can be represented as a linear combination of functions $S_j(t)$, called basis functions, and is expressed in the following manner:

$$s(t) = \sum_{j=1}^{K+q+1} \beta_j S_j(t)$$

(see de Boor, 1978 for an exact definition of the functions $S_j(t)$). The unknown parameters $\beta_j$ are estimated by minimizing $\sum_{t=1}^{N} (\log P_t - s(t))^2$. We refer to this approach as regression splines (RS). In this case, the most important choice is the number of knots and their position, which define the smoothness of the approximation. In line with popular practice, we consider cubic splines ($q = 3$) in this study, with knots positioned at equal spacing, so that only the number of knots needs to be determined.
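As an illustration of the regression-spline fit, the sketch below uses a truncated power basis, which for cubic splines with K interior knots has exactly K + q + 1 = K + 4 functions. This basis is an assumption chosen for brevity; the B-spline basis of de Boor spans the same function space and is numerically preferable when many knots are used. Function names and the rescaling of time to [0, 1] are likewise illustrative.

```python
import numpy as np

def cubic_spline_basis(t, knots):
    """Truncated power basis: 1, t, t^2, t^3 and (t - knot)_+^3 for every interior knot."""
    cols = [np.ones_like(t), t, t ** 2, t ** 3]
    cols += [np.clip(t - k, 0.0, None) ** 3 for k in knots]
    return np.column_stack(cols)

def fit_regression_spline(y, n_knots):
    """Least-squares cubic spline with n_knots equally spaced interior knots."""
    y = np.asarray(y, dtype=float)
    t = np.linspace(0.0, 1.0, y.size)                   # rescaled time axis
    knots = np.linspace(0.0, 1.0, n_knots + 2)[1:-1]    # drop the two boundary points
    S = cubic_spline_basis(t, knots)
    beta, *_ = np.linalg.lstsq(S, y, rcond=None)
    return S @ beta, beta

# A cubic polynomial lies in the spline space, so it is reproduced up to rounding.
t = np.linspace(0.0, 1.0, 200)
cubic = 1.0 + t - 2.0 * t ** 2 + 0.5 * t ** 3
fit_cubic, beta = fit_regression_spline(cubic, n_knots=5)
```

Because the knots are equally spaced, `n_knots` is the only tuning parameter, which mirrors the statement that only the number of knots needs to be determined.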

To overcome the requirement of fixing the number of knots, spline functions can alternatively be estimated by solving a penalized least squares problem, that is, by minimizing

$$\sum_{t=1}^{N} [\log P_t - s(t)]^2 + \lambda \int [s''(t)]^2 \, dt,$$

where $s''(t)$ is the second derivative of $s(t)$. The first term accounts for the degree of fitting and the second one penalizes the roughness of the function through the parameter $\lambda$ (see Green and Silverman, 1994). In this study, the smoothing spline approach is denoted by SS.

The long-run component for both regression and smoothing splines is given by $\hat{T}_t = \hat{s}(t)$.

Fig. 14. Nord Pool market, load period 18. Filtered price series with SS, ACF, PACF and log periodogram; smoothed version superimposed.

Table 1. Filtered series obtained with each method: number of times ANOVA and LAG7 tests accept the null hypothesis at the 5% significance level.

| Market | Test | PS | LO | SS | Wav | EMD | RS | KZ | LLR | SSA | HP | CF |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| APX-PUK (48 load periods) | ANOVA | 0 | 48 | 48 | 48 | 45 | 48 | 48 | 48 | 48 | 48 | 48 |
| APX-PUK (48 load periods) | LAG7 | 0 | 40 | 43 | 42 | 38 | 39 | 42 | 40 | 43 | 43 | 43 |
| PJM (24 load periods) | ANOVA | 0 | 24 | 24 | 24 | 24 | 24 | 24 | 24 | 24 | 24 | 24 |
| PJM (24 load periods) | LAG7 | 0 | 23 | 22 | 22 | 21 | 21 | 23 | 23 | 9 | 23 | 23 |
| NP (24 load periods) | ANOVA | 0 | 24 | 24 | 24 | 14 | 24 | 24 | 24 | 24 | 24 | 24 |
| NP (24 load periods) | LAG7 | 2 | 9 | 10 | 9 | 9 | 10 | 10 | 9 | 9 | 9 | 10 |

Spline functions have been used by Bisaglia et al. (2010) and Dordonnat et al. (2010) to model the annual semi-periodic component and the long-run component, respectively. Moreover, they have been considered for electricity market prediction by Andalib and Atry (2009) and Zareipour et al. (2006).

The Hodrick–Prescott (HP) filter identifies Tt by minimizing the expression $\sum_{t=1}^{N} [\log P_t - T_t]^2 + \lambda \sum_{t=2}^{N-1} [(T_{t+1} - T_t) - (T_t - T_{t-1})]^2$. Once again, the first term accounts for the degree of fitting, whereas the second one penalizes the roughness of the function Tt through the parameter λ (Hodrick and Prescott, 1997).
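As an illustration of the HP criterion, the minimizer can be obtained from its first-order condition, $(I + \lambda D'D)\,T = \log P$, where $D$ is the second-difference operator. The following is a minimal sketch under that standard result; the function names `solve_linear` and `hp_trend`, the synthetic series and the value λ = 1600 are assumptions made for the example and are not taken from the study.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented copy, A is not modified
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def hp_trend(y, lam):
    """Trend minimizing sum (y_t - T_t)^2 + lam * sum (second differences of T)^2."""
    n = len(y)
    # Build A = I + lam * D'D, where each row of D is (1, -2, 1) at positions r, r+1, r+2
    A = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    coef = (1.0, -2.0, 1.0)
    for r in range(n - 2):
        for a in range(3):
            for c in range(3):
                A[r + a][r + c] += lam * coef[a] * coef[c]
    return solve_linear(A, [float(v) for v in y])


# Synthetic log prices: slow drift plus a weekday/weekend pattern (illustrative only)
log_prices = [3.0 + 0.01 * t + (0.2 if t % 7 < 5 else -0.3) for t in range(60)]
trend = hp_trend(log_prices, lam=1600.0)
```

The dense solver keeps the sketch dependency-free; for long series a banded or sparse solver would be the natural choice, since the system matrix is pentadiagonal.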

Wavelets (Wav) represent another approach to estimate Tt. This involves the projection of the signal, the time series log Pt, on an orthonormal set of basis functions called wavelets (Chui, 1992; Meyer, 1993; Young, 1993; Härdle et al., 1998; Percival and Walden, 2000; Weron, 2006). In this approach, log Pt is described by a linear combination of wavelet functions ϕi,k (providing a time-scale representation of the time series when time location and scale are given by indices i and k, respectively) that are transformations of a given “mother” wavelet ϕ(·), and is expressed by the following equation: $\log P_t = \sum_{k=0}^{2^{m-1}-1} c_{m-1,k}\,\phi_{m-2,k}(t) + \sum_{i=0}^{m-2} \sum_{k=0}^{2^{i}-1} c_{i,k}\,\phi_{i,k}(t) = D_1(t) + A_1(t)$, where $\phi_{i,k}(t) = 2^{i/2}\phi(2^{i}t - k)$ and m is the level of approximation. D1(t) describes the first-level details, whereas A1(t) is the first-level approximation of the long-term dynamics. Similarly, one can determine details in the first-level approximation and so on. At the m-th level, we have $\log P_t = D_1(t) + D_2(t) + \dots + D_m(t) + A_m(t)$, and $\hat{T}_t = A_m(t)$. Wavelets have been used in many studies, including Janczura and Weron (2009, 2010), Trück et al. (2007) and Weron (2009).

Table 2
Summary of the one-tailed (Diebold and Mariano, 1995) test (window lag = 2). Given a method in the column, entries indicate the percentage of cases (load periods) where the accuracy of methods in the rows is significantly higher at the 5% level.

MSE error loss function

APX-PUK
         PS    LLR     LO     RS     SS    Wav     KZ    EMD    SSA     HP     CF
PS        –      0      0      0      0      0      0      0      0      0  25.00
LLR   93.75      –      0   2.08      0      0      0  33.33   6.25      0  54.17
LO    93.75   2.08      –      0      0      0   2.08  39.58   6.25      0  62.50
RS    93.75      0      0      –      0      0      0  31.25   4.17      0  58.33
SS    95.83   2.08  16.67      0      –      0   6.25  43.75  25.00      0  70.83
Wav   95.83      0      0      0      0      –  20.83  41.67  50.00      0  81.25
KZ    93.75      0      0      0      0   8.33      –  39.58  81.25      0  89.58
EMD   95.83      0      0      0      0      0      0      –      0      0  50.00
SSA   91.67      0      0      0      0      0      0  12.50      –      0  79.17
HP    95.83   4.17  18.75   4.17   4.17   2.08   6.25  50.00  33.33      –  75.00
CF    50.00      0      0      0      0      0      0      0   2.08      0      –

PJM
         PS    LLR     LO     RS     SS    Wav     KZ    EMD    SSA     HP     CF
PS        –      0      0      0      0      0      0      0      0      0      0
LLR   29.17      –      0      0      0      0      0   8.33      0      0      0
LO    58.33      0      –      0      0      0      0  25.00      0      0      0
RS    50.00      0      0      –      0      0      0  12.50      0      0      0
SS    66.67      0      0      0      –      0      0  45.83      0      0      0
Wav   58.33      0      0      0      0      –      0  20.83      0      0      0
KZ    25.00      0      0      0      0      0      –   4.17  37.50      0      0
EMD   33.33      0      0      0      0      0      0      –   4.17      0      0
SSA    8.33      0      0      0      0      0      0      0      –      0      0
HP    66.67   4.17      0      0      0      0      0  45.83      0      –      0
CF    50.00      0      0      0      0      0      0  20.83      0      0      –

NP
         PS    LLR     LO     RS     SS    Wav     KZ    EMD    SSA     HP     CF
PS        –      0      0      0      0      0      0      0      0      0  41.67
LLR   50.00      –      0   4.17      0      0      0  16.67   8.33      0  75.00
LO    50.00      0      –   4.17      0      0   4.17  16.67  16.67      0  75.00
RS    66.67   4.17      0      –      0      0      0  20.83  12.50      0  70.83
SS    54.17   8.33      0   8.33      –      0      0  33.33   8.33      0  87.50
Wav   45.83      0  16.67  16.67      0      –  16.67  12.50  70.83      0  91.67
KZ    20.83      0      0   4.17      0      0      –   8.33  37.50      0  87.50
EMD   41.67      0      0   4.17      0      0      0      –   8.33      0  66.67
SSA   16.67      0      0      0      0      0      0   4.17      –      0  79.17
HP    66.67   8.33  45.83   8.33      0      0   4.17  41.67  37.50      –  91.67
CF    16.67      0      0      0      0      0      0      0      0      0      –

MAE error loss function

APX-PUK
         PS    LLR     LO     RS     SS    Wav     KZ    EMD    SSA     HP     CF
PS        –      0      0      0      0      0      0      0      0      0  27.08
LLR     100      –   8.33  12.50      0   4.17   4.17  58.33  18.75      0  60.42
LO      100   4.17      –      0      0      0   4.17  56.25  20.83      0  64.58
RS      100   2.08      0      –      0      0   4.17  41.67   4.17      0  60.42
SS      100   6.25  43.75   4.17      –   8.33   8.33  70.83  33.33      0  64.58
Wav     100      0      0      0      0      –  14.58  52.08  58.33      0  60.42
KZ      100      0      0      0      0  27.08      –  52.08  85.42      0  68.75
EMD     100      0      0      0      0      0      0      –      0      0  56.25
SSA     100      0      0      0      0      0      0  22.92      –      0  60.42
HP      100  10.42  39.58  10.42  12.50  14.58  12.50  77.08  47.92      –  66.67
CF    60.42      0      0      0      0   4.17      0  14.58   4.17      0      –

PJM
         PS    LLR     LO     RS     SS    Wav     KZ    EMD    SSA     HP     CF
PS        –      0      0      0      0      0      0      0      0      0      0
LLR     100      –      0      0      0      0      0  20.83      0      0  25.00
LO      100   4.17      –  16.67      0      0      0  33.33      0   4.17  66.67
RS    95.83   4.17      0      –      0      0      0  16.67      0      0  16.67
SS      100   8.33      0  20.83      –      0      0  29.17      0  12.50  62.50
Wav     100   4.17   4.17   4.17      0      –      0  41.67      0      0  70.83
KZ      100   4.17      0      0      0      0      –  25.00  12.50      0  66.67
EMD   87.50      0      0      0      0      0      0      –      0      0  20.83
SSA   91.67      0      0      0      0      0      0   4.17      –      0  20.83
HP      100   8.33      0  16.67      0      0      0  33.33      0      –  54.17
CF    79.17      0      0   4.17      0   4.17   4.17   4.17   4.17   4.17      –

NP
         PS    LLR     LO     RS     SS    Wav     KZ    EMD    SSA     HP     CF
PS        –      0      0      0      0      0      0      0      0      0  25.00
LLR     100      –  16.67   8.33  12.50  20.83  33.33  25.00  50.00   4.17  83.33
LO    95.83      0      –   8.33      0  12.50  20.83  12.50  29.17      0  70.83
RS      100      0  12.50      –      0  16.67  25.00  16.67  41.67      0  75.00
SS      100      0  66.67  20.83      –  12.50  37.50  37.50  54.17   4.17  83.33
Wav   91.67      0  25.00  20.83      0      –  45.83  20.83  70.83      0  95.83
KZ    75.00      0   4.17   8.33      0      0      –  12.50  66.67      0  87.50
EMD   83.33      0      0      0      0      0   8.33      –  25.00      0  70.83
SSA   62.50      0      0   4.17      0      0      0      0      –      0  83.33
HP      100      0  79.17  50.00      0  33.33  37.50  37.50  58.33      –  83.33
CF    20.83      0      0      0      0      0      0      0      0      0      –

The Kolmogorov–Zurbenko (KZ) filter has been used in the environmental and meteorological literature (see, among others, Capilla, 2008; Eskridge et al., 1997; Rao and Zurbenko, 1994; Zurbenko et al., 1996). It is a low-pass filter implemented through K iterations of simple moving averages of length q of log Pt.

The degree of smoothness of the estimated long-term component depends on the length of the moving average and $\hat{T}_t = KZ_q(\log P_t)$.
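The iterated moving average behind the KZ filter can be sketched in a few lines. This is only an illustrative sketch: the names `moving_average` and `kz_filter` are hypothetical, the window is assumed to be centred with odd length q, and truncating the window at the sample edges is an assumption, since the text does not specify the boundary treatment.

```python
def moving_average(x, q):
    """Centred moving average of odd length q; windows are truncated at the edges."""
    half = q // 2
    out = []
    for t in range(len(x)):
        window = x[max(0, t - half):t + half + 1]
        out.append(sum(window) / len(window))
    return out


def kz_filter(x, q, K):
    """Apply K successive passes of the moving average of length q."""
    out = [float(v) for v in x]
    for _ in range(K):
        out = moving_average(out, q)
    return out


# Synthetic log prices with a weekly pattern (illustrative only)
log_prices = [3.0 + 0.01 * t + (0.2 if t % 7 < 5 else -0.3) for t in range(60)]
trend = kz_filter(log_prices, q=7, K=3)
```

Larger q or K gives a smoother estimate; with q = 7 each pass averages over exactly one week, so a purely weekly pattern is removed away from the edges.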

The empirical mode decomposition (EMD) is a method proposed by Huang et al. (1998), which decomposes a time series into a finite number of oscillatory components based on a local scale and called the intrinsic mode functions (IMF). More specifically, the EMD procedure decomposes the series log Pt according to an iterative scheme. As a result, the original time series is expressed as follows: $\log P_t = \sum_{k=1}^{K} m_t^{(k)} + r_t^{(K)}$, where $m_t^{(k)}$ is the k-th IMF and $r_t^{(k)}$ is the k-th residual (for details, see Huang et al., 1998). K is the maximum number of IMFs, usually $\log_2 N$. Increasing k reduces the local fluctuations in $m_t^{(k)}$ and the IMFs become smoother. As in Moghtaderi et al. (2013), the long-run component can be estimated as $\hat{T}_t = \sum_{k=k^*}^{K} m_t^{(k)} + r_t^{(K)}$, where $k^*$ is the best index for trend approximation. The EMD approach has been used in the field of energy by An et al. (2012) and Zhang et al. (2008).

Singular spectrum analysis (SSA) is another way to solve the trend filtering problem. Similar to wavelets, it works by decomposing the series into several additive components $f_t^{(j)}$, the first of which can be interpreted as the “long-term component” (Vautard and Ghil, 1989; Vautard et al., 1992; Golyandina et al., 2001). The basic version of SSA depends on a data window of length L and comprises four steps (Golyandina et al., 2001) that lead to the decomposition of log Pt into $\log P_t \cong \sum_{j=1}^{m} f_t^{(j)}$, where the series $f_t^{(j)}$ can be suitably interpreted. In our study, we consider $\hat{T}_t = f_t^{(1)}$.

The Christiano–Fitzgerald (CF) filter (Christiano and Fitzgerald, 2003) is a finite data approximation to an ideal, and not feasible, bandpass filter that would produce the trend component Tt. The approximation $\hat{T}_t$ is obtained by a linear combination of the observed data log Pt whose weights are chosen to minimize $E[T_t - \hat{T}_t]^2$.

3.2. Periodic and calendar component estimation

The estimate of the long-term component Tt allows us to obtain the adjusted time series $\tilde{p}_t = \log P_t - \hat{T}_t$, which includes periodic and calendar components, characterized by a strong weekly periodicity and some calendar effects.

The calendar component, Ct, is modeled by means of dummy variables accounting for bank holidays (BH) and the days between Christmas and New Year's Day (CNY); therefore Ct = γ1 BHt + γ2 CNYt.

For the weekly periodic component, Wt, we consider three different estimation methods.3 The first one is based on dummy variables (DV method) for each day of the week. Thus, $W_t = \sum_{i=1}^{7} w_i I_{it}$, with $I_{it} = 1$ if t refers to the i-th day of the week and 0 otherwise. The daily coefficients wi are estimated, jointly with the coefficients γ1 and γ2, using the OLS method. This approach has been widely used in the literature (Bisaglia et al., 2010; Crespo Cuaresma et al., 2004; de Jong, 2006; Misiorek et al., 2006, among others).

3 Classical techniques for seasonal adjustment, such as TRAMO-SEATS, ARIMA X11 and ARIMA X12, have not been considered because they are specifically designed for macroeconomic time series and their application is not suitable in our case. The existing software expects the user to provide time series that are annual, biannual, quarterly or monthly. Also, most of the features of these methods are not useful in this case: for example, there is no Easter effect and the number of trading days in a month is irrelevant. Moreover, methods based on SARIMA models consider the trend and/or the periodic components as stochastic, while we chose to consider them as “deterministic”.

Although the second method is a variant of the first, it is more flexible and robust to outliers. In this case, the daily coefficient, wi, is calculated as the α-trimmed mean (TM method), computed on a rolling window of the $\tilde{p}_t$ referring to k Mondays, Tuesdays, and so on, after having removed bank holidays and α% of the largest and the smallest observations. This permits the weekly component to change according to seasons and market evolution and makes the estimates more robust to possible outliers. Calendar effects are subsequently estimated through OLS regression on the series $\tilde{p}_t - \hat{w}_t$, adjusted for the weekly component.
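The trimmed-mean idea can be illustrated with a short sketch. Several details here are assumptions rather than the authors' choices: the fraction α is trimmed from each tail, the rolling window is taken to be the k most recent non-holiday observations of the same weekday (including the current day), and the names `trimmed_mean` and `rolling_weekday_tm` are hypothetical.

```python
import math


def trimmed_mean(values, alpha):
    """Mean after dropping a fraction alpha of the observations from each tail."""
    v = sorted(values)
    cut = math.floor(alpha * len(v))
    kept = v[cut:len(v) - cut]
    return sum(kept) / len(kept)


def rolling_weekday_tm(p_tilde, weekday, holiday, k=12, alpha=0.10):
    """For each day t, trimmed mean of the last k non-holiday values with t's weekday."""
    w = []
    for t in range(len(p_tilde)):
        same = [p_tilde[s] for s in range(t + 1)
                if weekday[s] == weekday[t] and not holiday[s]]
        w.append(trimmed_mean(same[-k:], alpha) if same else float("nan"))
    return w


# Synthetic detrended series with a fixed weekly profile and one outlier (illustrative only)
profile = [0.10, 0.12, 0.11, 0.09, 0.03, -0.20, -0.25]
weekday = [t % 7 for t in range(70)]
p_tilde = [profile[d] for d in weekday]
p_tilde[35] = 10.0  # a spike on a "Monday"
w_hat = rolling_weekday_tm(p_tilde, weekday, [False] * 70, k=4, alpha=0.25)
```

In this toy series the spike at position 35 is trimmed away, so the estimated coefficient for that day stays at the profile value, which is the robustness property described above.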

The third approach was used by Trück et al. (2007) and is a modified version of the procedure used in Janczura and Weron (2009, 2010), Weron (2006, 2009) and Weron et al. (2004). First, a possible residual trend is estimated by applying a seven-day centered moving median4 (MM method), $MM_t = \mathrm{median}(\tilde{p}_{t-3}, \dots, \tilde{p}_{t+3})$, for t = 4,…,N − 3, to eliminate the weekly component and to dampen the noise. Then, the daily coefficients are estimated as $\hat{w}_i = \mathrm{mean}(\tilde{p}_{i+7k} - MM_{i+7k}) - \bar{w}$, with $3 < i + 7k \le N - 3$ and $\bar{w} = \sum_{r=1}^{7} \mathrm{mean}(\tilde{p}_{r+7k} - MM_{r+7k}) / 7$. In this manner we can ensure that the sum of the average deviations is zero. Once again, calendar effects are estimated by the OLS method on the series adjusted for the weekly components (the daily coefficients wi).
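The moving-median procedure can be sketched as follows. The function name `mm_weekly_coefficients` is hypothetical, and the day of the week is identified with the position in the series modulo 7, which is an assumption made only to keep the example self-contained.

```python
from statistics import median


def mm_weekly_coefficients(p_tilde):
    """Daily coefficients from deviations around a seven-day centred moving median."""
    n = len(p_tilde)
    dev = {i: [] for i in range(7)}
    for t in range(3, n - 3):                    # positions with a full 7-day window
        mm_t = median(p_tilde[t - 3:t + 4])
        dev[t % 7].append(p_tilde[t] - mm_t)     # assumption: weekday = position mod 7
    day_means = [sum(dev[i]) / len(dev[i]) for i in range(7)]
    w_bar = sum(day_means) / 7                   # centring term
    return [m - w_bar for m in day_means]


# Synthetic detrended series with a fixed weekly profile (illustrative only)
profile = [0.10, 0.12, 0.11, 0.09, 0.03, -0.20, -0.25]
p_tilde = profile * 10
w_hat = mm_weekly_coefficients(p_tilde)
```

Subtracting the centring term is what forces the seven coefficients to sum to zero, so that the weekly component does not absorb part of the level.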

In a series of preliminary experiments, we found that a good estimate of the long-term component is more important than the estimation of the periodic component and that other methods, for example weighted moving average or Holt–Winters-like procedures, give very similar or worse results. Thus, to simplify and better summarize the global results, we decided to consider only the three simplest procedures and to use only one for a global comparison of procedures.

Once the weekly and calendar components have been suitably estimated, we expect that the residual series $p_t = \tilde{p}_t - (\hat{W}_t + \hat{C}_t) = \log P_t - (\hat{T}_t + \hat{W}_t + \hat{C}_t)$ is stationary, without long-run and/or periodic dynamics; that is, it neither depends on time nor shows autocorrelation at lag seven.

4. Evaluation of the methods

In this section, we discuss the basis of evaluating the methods that were presented in the previous section. We cannot directly measure the deviation from a true quantity, as is done in the fitting or prediction context, because the components are not observable. Thus, any evaluation criterion will be, to some extent, subjective. However, intuitively, a good component estimation procedure should be such that:

(i) the residual component pt is stationary. Thus, it should not show level changes or other kinds of dynamic dependence on time (t). This, in turn, would mean that the long-term component has been correctly estimated;

(ii) pt does not show weekly periodicity or, in other words, pt is (at least) uncorrelated with pt − 7;

(iii) given a simple prediction model and a set of explanatory variables, the out-of-sample prediction error for spot prices, Pt, should be lower than that derived from other competing procedures. The underlying motivation is to select a method that positively influences the prediction accuracy of the original prices.

An initial analysis of the fulfillment of these requirements can be conducted by considering Figs. 11–14; however, a visual inspection will not suffice. Statistical tests for points (i), (ii) and (iii) can be conducted as follows.

To test point (i) we estimate the following two models5:

$p_t = \phi_0 + \phi_1 p_{t-1} + \phi_2 p_{t-2} + \varepsilon_t$,   (3)

4 This step is necessary only when seasonal and long-run components have not been appropriately removed.

5 We always omit subscript j, which refers to the load period.


$p_t = \phi_0 + \phi_1 p_{t-1} + \phi_2 p_{t-2} + s(t) + \tilde{\varepsilon}_t$,   (4)

where ϕ0, ϕ1 and ϕ2 are unknown constant parameters, s(t) is a nonparametric smoother and $\varepsilon_t$ and $\tilde{\varepsilon}_t$ are error terms. Regressors pt − 1 and pt − 2 have been included to obtain uncorrelated errors εt and thus fulfill the usual requirements of a regression model.

Estimation of models (3) and (4) has been performed assuming εt to be both a homoscedastic and a GARCH(1,1) process, that is, assuming that $\varepsilon_t = \sigma_t z_t$, with $z_t$ i.i.d.(0,1) and $\sigma_t^2 = \omega + \alpha \varepsilon_{t-1}^2 + \delta \sigma_{t-1}^2$. Even if heteroscedasticity is always present, it affects the AR parameters and, thus, the short-term prediction only marginally. The estimate of the AR(1) parameter is always significant, while the significance of the AR(2) parameter estimate depends on the load period and on the market and, less markedly, on the component estimation procedure. Focusing on the first two points, for the PJM market ϕ2 is always significant; for the APX-PUK market it is significant only from 18:00 to 22:00; for the NP market it is not significant from 00:00 to 10:00/12:00 and from 18:00 to 20:00.

As far as the volatility equation is concerned, the estimates $\hat{\alpha}$ and $\hat{\delta}$ point to a generally high persistence in the conditional variance. The level of persistence, however, depends again on the market and, partially, on the load period. In particular, for the NP market $\hat{\alpha} + \hat{\delta}$ is very close to unity and from 00:00 to 10:00/12:00 and from 18:00 to 20:00 it is greater than one, suggesting nonstationarity. Persistence for PJM, and even more for APX-PUK, is less strong and $\hat{\alpha}$ and $\hat{\delta}$ are always in the stationary region of the parameter space.

Concerning the periodic component, the qualitative behavior of the daily coefficients is the same for the three markets: they are positive for the first four days of the week, they are less markedly positive on Friday and they become negative at the weekend. The situation changes if we also consider the hourly coefficients. In this case, the NP and PJM markets have basically the same behavior: from Monday (about) 7:00 to Friday 13:00–15:00 the daily profiles of the hourly periodic coefficients are more or less the same. From 15:00 on Friday to 5:00–6:00 on Monday, as expected, they assume lower and negative values. Hourly periodic coefficients for the APX-PUK market, instead, have the same qualitative behavior but are much more irregular with respect to the hours and the days, and only Tuesday, Wednesday and Thursday have a quite similar daily profile.

The term s(t) allows us to consider the residual, deterministic and possibly nonlinear dependence on time. Subsequently, we compute the analysis of variance (ANOVA) for the residuals of the models, by means of an F test. In this case, the null hypothesis to be tested is H0: s(t) = 0, which implies that there is no residual pattern depending on time and, consequently, models (3) and (4) are equivalent. The smoother s(t) will be a regression spline. The use of other smoothers, for example, smoothing splines or local polynomial regressions, does not alter the final results; therefore, this choice is not crucial.

Point (ii) is tested by evaluating the significance of parameter ϕ7 in the following model:

$p_t = \phi_0 + \phi_1 p_{t-1} + \phi_2 p_{t-2} + \phi_7 p_{t-7} + \varepsilon_t$,   (5)

with the usual asymptotically normal test.6
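A minimal sketch of the lag-seven check in model (5) is given below, assuming plain OLS with homoscedastic errors (the GARCH variant mentioned earlier is not covered). The names `solve_linear` and `lag7_test`, the simulated series and the seed are assumptions made for the example only.

```python
import math
import random


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def lag7_test(p):
    """OLS fit of p_t on (1, p_{t-1}, p_{t-2}, p_{t-7}); returns (phi7, z, p-value)."""
    rows = [[1.0, p[t - 1], p[t - 2], p[t - 7]] for t in range(7, len(p))]
    y = [p[t] for t in range(7, len(p))]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    rss = sum((v - sum(b * x for b, x in zip(beta, r))) ** 2 for r, v in zip(rows, y))
    sigma2 = rss / (len(y) - k)
    inv_col = solve_linear(xtx, [0.0, 0.0, 0.0, 1.0])  # last column of (X'X)^{-1}
    z = beta[3] / math.sqrt(sigma2 * inv_col[3])
    p_value = math.erfc(abs(z) / math.sqrt(2.0))       # two-sided normal p-value
    return beta[3], z, p_value


# Simulated series with a deliberate weekly dependence (illustrative only)
random.seed(1)
series = [random.gauss(0.0, 1.0) for _ in range(7)]
for t in range(7, 1500):
    series.append(0.6 * series[t - 7] + random.gauss(0.0, 1.0))
phi7, z_stat, p_val = lag7_test(series)
```

Leaving lags three to six out of the regressor matrix is equivalent to constraining their coefficients to zero. A small p-value rejects the null hypothesis ϕ7 = 0, signalling residual weekly periodicity in the filtered series.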

To test point (iii) we refer to the Diebold and Mariano (1995) test of equal predictive accuracy. This test is based on the predictions of the original spot prices that, according to models (1) and (2), are given by

$\hat{P}_{t+1} = \exp(\hat{T}_{t+1} + \hat{W}_{t+1} + \hat{C}_{t+1} + \hat{p}_{t+1})$,   (6)

where $\hat{T}_{t+1}$, $\hat{W}_{t+1}$, $\hat{C}_{t+1}$ and $\hat{p}_{t+1}$ are predictions of the components, which are based only on the information available in t. Clearly, predictions are required for each component.

6 This requires a constrained estimation where coefficients for pt − 3, pt − 4, pt − 5, pt − 6 are set to zero.

Conditionally on the estimated components, $\hat{W}_{t+1}$ and $\hat{C}_{t+1}$ are straightforward. For the prediction of the long-term component, we choose to consider $\hat{T}_{t+1} = \hat{T}_t$, that is, to use the estimated value in t as a forecast for t + 1. Besides its simplicity, the motivation to use this equation comes from the fact that the long-term component, by definition, should be basically the same for two contiguous days. Moreover, a simulation exercise, which is not presented here but is available from the authors upon request, highlighted that the one-day-ahead prediction accuracy of $\hat{T}_t$ is better than that of other ‘natural’ predictors.7 Moreover, some of our methods do not have any ‘natural’ predictor.
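The way the component predictions combine into a price forecast can be shown with a very small sketch. The function name `forecast_spot_price` and the numerical inputs are hypothetical; the only modelling content is the exponential recombination of the components and the naive carry-forward of the long-term component described above.

```python
import math


def forecast_spot_price(trend_t, weekly_next, calendar_next, resid_next):
    """One-day-ahead spot price forecast from the predicted components."""
    # Naive long-term forecast: the trend estimated at t is carried forward to t + 1
    trend_next = trend_t
    return math.exp(trend_next + weekly_next + calendar_next + resid_next)


# Illustrative numbers only: trend at log(40), a small positive weekday effect,
# no calendar effect and a slightly negative predicted stationary component
price = forecast_spot_price(math.log(40.0), 0.05, 0.0, -0.02)
```

Because the components are additive on the log scale, a weekday effect of 0.05 corresponds to roughly a 5% multiplicative adjustment of the price level.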

To predict the stationary component pt, we consider the following regression model:

$p_t = \beta_1 p_{t-1} + \beta_2 p_{t-2} + \beta_3 d_t + \beta_4 m_t + \varepsilon_t$,   (7)

where dt and mt, similar to pt, are the stationary residual components for the day-ahead8 demand (Dt) and margin (Mt). Since margin was available (to us) only for the APX-PUK market, this regressor has not been included for the PJM and NP markets. In general, pt, dt and mt are different for different methods of computing the level component Lt.

Lastly, since the distribution of pt is very asymmetric (see, for example, Fig. 5), the error term εt was assumed to follow a skew-t distribution (Azzalini and Capitanio, 2003). Thus, in general, E(εt) ≠ 0.

Note that we select the simple regression model (7) because we are interested in comparing the forecasting performances for a given model and not in finding the best model. Since the model is the same for all the methods, predictions should only be influenced by the method used for the estimation of the components. On the whole, this approach accounts for nonstationarity of prices and annual seasonal behavior through the Tt component, for weekly periodic behavior through the Wt component, for daily periodic behavior by considering a model for each load period and for mean reversion through parameters β1 and β2 of model (7). When the error term εt follows GARCH dynamics, it also accounts for heteroscedasticity and volatility clustering. This method, instead, is not effective in capturing long-memory effects and jumps. With respect to spikes, however, we think that a good estimation of the above components can indirectly improve the possibility of identifying them correctly (see Janczura et al., 2013).

Once all the components have been predicted for a given load period, we obtain the spot price prediction, $\hat{P}_{tj}$, and, thus, the forecasting error $e_{tj} = P_{tj} - \hat{P}_{tj}$ (t = 1,2,…,N; j = 1,2,…,24 or 48) for each method using transformation (6). The one-tailed Diebold and Mariano test is applied to the forecasting errors obtained for the estimation methods. The null hypothesis is that the prediction accuracy of procedure (say) A is equal to or lower than that of procedure B. The test has been performed with two loss functions: the mean square error (MSE) and the mean absolute error (MAE).
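A one-tailed Diebold and Mariano comparison of two error series can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the long-run variance of the loss differential is estimated with a rectangular window truncated at the given lag, the statistic is compared with the standard normal distribution, and the name `dm_test` and the simulated errors are hypothetical.

```python
import math
import random


def dm_test(e_a, e_b, loss="MSE", lag=2):
    """One-tailed test of H0: A is no more accurate than B. Returns (stat, p-value)."""
    g = (lambda e: e * e) if loss == "MSE" else abs
    d = [g(a) - g(b) for a, b in zip(e_a, e_b)]   # negative values favour A
    n = len(d)
    d_bar = sum(d) / n
    dc = [v - d_bar for v in d]
    # Long-run variance of d: variance plus twice the autocovariances up to `lag`
    lrv = sum(v * v for v in dc) / n
    for h in range(1, lag + 1):
        lrv += 2.0 * sum(dc[t] * dc[t - h] for t in range(h, n)) / n
    stat = d_bar / math.sqrt(lrv / n)
    p_value = 0.5 * math.erfc(-stat / math.sqrt(2.0))  # P(Z <= stat)
    return stat, p_value


# Simulated forecast errors: method A is clearly more accurate than B (illustrative only)
random.seed(0)
e_a = [random.gauss(0.0, 0.1) for _ in range(365)]
e_b = [random.gauss(0.0, 1.0) for _ in range(365)]
stat, p_value = dm_test(e_a, e_b, loss="MSE", lag=2)
stat_mae, p_value_mae = dm_test(e_a, e_b, loss="MAE", lag=2)
```

A small p-value rejects the null hypothesis in favour of A being more accurate. The rectangular window does not guarantee a positive variance estimate in small samples, in which case the square root would fail and a different kernel would be needed.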

Furthermore, the usual diagnostic procedures based on the autocorrelation function of the prediction errors did not show any particular critical point.

5. Empirical results

In this section, the methods described in Section 3 are applied to the data obtained from the APX-PUK, PJM and NP markets and the results are compared according to the criteria defined in Section 4. Analyses are conducted for all load periods of each market. Results will be provided in an aggregated form for each market, whereas conclusions will be drawn by referring to all three markets.

7 By ‘natural’ predictors, we mean the predictors that are related to the methods used for estimating Tt, for example, local linear regression and regression splines.

8 They are provided by the market management in t for t + 1.


Owing to space limitations, graphs are shown only for the peak periods ending at 18:00. This corresponds to load period 36 for the APX-PUK market and 18 for the PJM and NP markets. In addition, we show graphs of load period 14 for the APX-PUK market as representative of the load periods from 6:00 to 8:00 a.m. Indeed, for the APX-PUK market, these load periods show a periodic component that is stronger than that of the other periods.

There are 11 methods for the long-run component estimation and three methods for periodic component estimation. Combining these two categories of methods yields 33 different ways to estimate Lt.

Not surprisingly, our analyses indicate that, within the deterministic component Lt, the long-term behavior is the most important. Moreover, as expected, there is no correlation between the Tt and Wt components.9 The same holds for Tt and Wt + Ct (the weekly and calendar effect components together). Moreover, comparing the procedures for the periodic component estimation (with respect to the criteria indicated in Section 4) shows that although the results are not very different, the TM method provided in Section 3.2 is slightly better than DV and MM. Thus, for the sake of simplicity and because the qualitative conclusions are the same, we consider all the methods for the long-term component estimation and only the TM method for estimating the periodic component while describing the results. Therefore, in the following paragraphs, we label the procedures with the methodology used to model Tt.

The procedure parameters have been selected by optimizing, in sample, criteria (i) and (ii) indicated in Section 4, starting, where possible, from some automatic selection criterion. In particular, in the PS approach, the polynomial order q and the number of sine/cosine functions m were selected by a backward stepwise procedure based on the AIC criterion, starting from q = 5 and m = 2. For LLR we used an Epanechnikov kernel where the bandwidth h, controlling the local averaging, was selected by referring to the direct plug-in methodology proposed by Ruppert et al. (1995). However, since it is well known that bandwidth selection procedures designed for independent data lead to small bandwidths when data are positively correlated (as in our case), we multiplied the selected bandwidth by a factor of approximately 2.5. For LO, the fit is made by using points in a neighborhood, defined by a fraction of points, which defines the bandwidth. This fraction, which is the smoothing parameter, was set to approximately 3–5%. When RS are involved, equally spaced knots are selected, leading to intervals corresponding to approximately 25 days. For SS, using the same logic that was used for LLR, λ was selected as approximately 2.5 times the value selected by a cross-validation criterion. A similar process was followed for the parameter λ in the Hodrick–Prescott (HP) filter. In Wav, we considered the Daubechies least asymmetric wavelet family, LA(8), and the coefficients were estimated via the maximal overlap discrete wavelet transform (MODWT) method (for details, see Percival and Walden, 2000). The smoothing parameter is the level of approximation m. The same role is played by window length L in SSA and by k⁎ in EMD. All these parameters, the length of the moving averages q for KZ and the maximum period of oscillation pu for CF were chosen to optimize criteria (i) and (ii) indicated in Section 4. For the KZ filter we ran K = 3 iterations. Finally, after some preliminary analyses, we decided to use the 10%-trimmed means based on a rolling window of 3 months (k ~ 12–13) for the estimation of the periodic components.

Let us now comment on the component estimation results. With respect to the estimation of Tt, the method based on global PS regressions fared the worst (Fig. 6 provides two examples to support this). Therefore, and in the interest of space, we did not include this method in the graphical analyses. Examples of estimated Tt for the log prices with the other methods are provided in Figs. 7–10.

9 Correlations between the Tt and Wt components are not significant (for each load period and for the three markets, considering all the 11 methods for the long-run component estimation) except for hours from 00:00 to 4:00 in the PJM market; this is probably due to the fact that during the night the periodic component is consistently reduced, losing relevance.

For the same load periods, the resulting filtered series pt, together with their autocorrelation (ACF) and partial autocorrelation (PACF) functions and periodograms, are shown in Figs. 11–14 for the method based on smoothing splines.

Table 1 lists the aggregated results of tests for residual dependence on time (ANOVA) and on residual periodicity (LAG7). The number of load periods for which the null hypothesis is accepted at the 5% significance level is reported for each market and each method. Apart from the method based on PS, which failed to work effectively, almost all methods seemed to lead to stationary pt series, without a clear time-dependent pattern. Among the rest of the methods, however, EMD yielded the most unsatisfactory results.

The situation is less uniform when residual periodicity is considered. The following methods provided the best results across all three markets: LO, SS, Wav, RS, KZ, LLR, HP and CF. SSA works well for APX-PUK but not for PJM.

In general, a good estimation of the components seems more difficult for the NP market than for the APX-PUK and PJM markets. This might also be owing to the sample size; the sample size for the NP market is half of that for the APX-PUK and PJM markets.

Table 2 addresses criterion (iii) and lists the results of the one-tailed Diebold and Mariano test for equal predictive accuracy. It is based on 365 one-day-ahead predictions of spot prices for each market, referring to transformation (6) and model (7), always with the same regressors. Given a method in the column, say SS, the table lists the percentage of cases (load periods) where the prediction accuracy of the other methods is significantly higher. For example, in the APX-PUK sub-table with MSE loss function, we can see that, apart from HP in 4.17% of cases, SS provides the most accurate prediction. On the contrary, the SS row indicates that the accuracy of the predictions yielded by SS is significantly higher than that of PS in 95.83% of the cases (load periods), of LLR in 2.08% of cases, of LO in 16.67% of cases and so on.

Not surprisingly, because the respective target functions are similar, SS and HP yield very similar results. Within the APX-PUK market, apart from HP in very few cases, no other method provides better predictions than SS for all periods, both with respect to MSE and MAE. On the contrary, SS is at least sometimes better than other methods. For PJM, the prediction accuracy of the methods appears to be less distinct: considering both MSE and MAE, predictions based on LO, SS, Wav, KZ and HP are equivalent and those based on RS and LLR are only slightly worse. In the NP market, the best results are provided by SS, HP and LLR.

Overall, considering all the markets and criteria, the best results are provided by the method based on SS. HP and local linear regression form a set of competitive methods. The other methods seem to perform worse with respect to certain criteria or markets. Thus, in conclusion, we suggest the method based on SS as the reference for component estimation.

6. Conclusions

In this paper, several methods have been analyzed for estimating the components of an electricity price time series. The final objective was to find one method that would perform well independently of the market being considered and that could be proposed as the reference for price filtering.

For estimating the long-term component, 11 estimation techniques were considered: methods based on the polynomial–sinusoidal approach, local polynomial regressions, spline functions, wavelets, the Kolmogorov–Zurbenko filter, empirical mode decomposition, singular spectrum analysis, the Hodrick–Prescott filter and the Christiano–Fitzgerald filter. The periodic component was estimated in three ways, that is, by using dummy variables, trimmed means and centered moving medians. Calendar effects were always modeled by dummy variables.

The issue of how different filtering methods should be compared was also considered, and three specific criteria for evaluating filtered time series were proposed. These criteria were based on the consideration that a good filtered price series should not show level changes, time-dependent dynamics, or weekly periodicity. Moreover, a good filtering technique should positively influence the prediction accuracy of the original spot prices.

The analyses were conducted by using data from three important markets: APX Power UK, PJM and Nord Pool. Our results indicated that the filtering technique based on smoothing splines and trimmed means performed the best for long-run and periodic component estimation. Therefore, we propose this technique as the reference for price filtering. Other methods, such as the Hodrick–Prescott filter and local linear regression, formed a set of competitive methods.

Acknowledgments

The authors would like to express their gratitude to two anonymous referees, whose valuable remarks helped to improve the quality of the paper.

References

Aggarwal, S.K., Saini, L.M., Kumar, A., 2009. Electricity price forecasting in deregulated markets: a review and evaluation. Electr. Power Energy Syst. 31, 13–22.

An, X., Jiang, D., Zhao, M., Liu, C., 2012. Short-term prediction of wind power using EMD and chaotic theory. Commun. Nonlinear Sci. Numer. Simul. 17, 1036–1042.

Andalib, A., Atry, F., 2009. Multi-step ahead forecasts for electricity prices using narx: a new approach, a critical analysis of one-step ahead forecasts. Energy Convers. Manag. 50, 739–747.

Azzalini, A., Capitanio, A., 2003. Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t-distribution. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 65, 367–389.

Bisaglia, L., Bordignon, S., Marzovilli, M., 2010. Modelling and forecasting hourly spot electricity prices: some preliminary results. Working Paper No. 11. Department of Statistical Sciences, University of Padova.

Bordignon, S., Bunn, D.W., Lisi, F., Nan, F., 2013. Combining day-ahead forecasts for British electricity prices. Energy Econ. 35, 88–103.

Bosco, B., Parisio, L., Pelagatti, M., 2007. Deregulated wholesale electricity prices in Italy: an empirical analysis. Int. Adv. Econ. Res. 13, 415–432.

Bosco, B., Parisio, L., Pelagatti, M., Baldi, F., 2010. Long-run relations in European electricity prices. J. Appl. Econ. 25, 805–832.

Bunn, D.W., 2004. Modelling Prices in Competitive Electricity Markets. Wiley.

Capilla, C., 2008. Time series analysis and identification of trends in a Mediterranean urban area. Glob. Planet. Chang. 63, 275–281.

Chen, J., Deng, S.J., Huo, X., 2008. Electricity price curve modeling and forecasting by manifold learning. IEEE Trans. Power Syst. 23, 877–888.

Christensen, T.M., Hurn, A.S., Lindsay, K.A., 2012. Forecasting spikes in electricity prices. Int. J. Forecast. 28, 400–411.

Christiano, L.J., Fitzgerald, T.J., 2003. The band pass filter. Int. Econ. Rev. 44, 435–465.

Chui, C.H., 1992. Wavelet Analysis and Its Applications. Academic Press, San Diego.

Cleveland, W.S., 1979. Robust locally weighted regression and smoothing scatterplots. J. Am. Stat. Assoc. 74, 829–836.

Cleveland, W.S., Devlin, S.J., 1988. Locally weighted regression: an approach to regression analysis by local fitting. J. Am. Stat. Assoc. 83, 596–610.

Crespo Cuaresma, J., Hlouskova, J., Kossmeier, S., Obersteiner, M., 2004. Forecasting electricity spot-prices using linear univariate time-series models. Appl. Energy 77, 87–106.

de Boor, C., 1978. A Practical Guide to Splines. Springer-Verlag.

de Jong, C., 2006. The nature of power spikes: a regime-switch approach. Stud. Nonlinear Dyn. Econom. 10 (Article 3).

Diebold, F.X., Mariano, R.S., 1995. Comparing predictive accuracy. J. Bus. Econ. Stat. 13, 253–263.

Dordonnat, V., Koopman, S.J., Ooms, M., 2010. Intra-daily smoothing splines for time-varying regression models of hourly electricity load. J. Energy Markets 3, 17–52.

Erlwein, C., Benth, F.E., Mamon, R., 2012. HMM filtering and parameter estimation of an electricity spot price model. Energy Econ. 32, 1034–1043.

Escribano, A., Peña, J.I., Villaplana, P., 2011. Modelling electricity prices: international evidence. Oxf. Bull. Econ. Stat. 73, 622–650.

Eskridge, R.E., Ku, J.Y., Rao, S.T., Porter, P.S., Zurbenko, I.G., 1997. Separating different scales of motion in time series of meteorological variables. Bull. Am. Meteorol. Soc. 78, 1473–1483.

Fan, J., Gijbels, I., 1996. Local Polynomial Modelling and Its Applications. Chapman & Hall, London.

Fan, J., Yao, Q., 2003. Nonlinear Time Series: Nonparametric and Parametric Methods. Springer-Verlag, New York.

Fan, J., Gasser, T., Gijbels, I., Brockmann, M., Engel, J., 1997. Local polynomial regression: optimal kernels and asymptotic minimax efficiency. Ann. Inst. Stat. Math. 49, 79–99.

Fanone, E., Gamba, A., Prokopczuk, M., 2013. The case of negative day-ahead electricity prices. Energy Econ. 35, 22–34.

Gianfreda, A., Grossi, L., 2012. Forecasting Italian electricity zonal prices with exogenous variables. Energy Econ. 34, 2228–2239.

Golyandina, N., Nekrutkin, V., Zhigljavsky, A.A., 2001. Analysis of Time Series Structure: SSA and Related Techniques. Chapman & Hall/CRC.

Green, P.J., Silverman, B.W., 1994. Nonparametric Regression and Generalized Linear Models: A Roughness Penalty Approach. Chapman & Hall, London.

Härdle, W., Kerkyacharian, G., Picard, D., Tsybakov, A., 1998. Wavelets, Approximation and Statistical Applications. Volume 129 of Lecture Notes in Statistics. Springer-Verlag, New York.

Hellström, J., Lundgren, J., Yu, H., 2012. Why do electricity prices jump? Empirical evidence from the Nordic electricity market. Energy Econ. 34, 1774–1781.

Hodrick, R., Prescott, E.P., 1997. Postwar business cycles: an empirical investigation. J. Money Credit Bank. 29, 1–16.

Huang, N.E., Shen, Z., Long, S.R., Wu, M.C., Shih, H.H., Zheng, Q., Yen, N.C., Tung, C.C., Liu, H.H., 1998. The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc. R. Soc. Lond. A 454, 903–995.

Janczura, J., Weron, R., 2009. Regime-switching models for electricity spot prices: introducing heteroskedastic base regime dynamics and shifted spike distributions. IEEE 6th International Conference on the European Energy Market (EEM 2009).

Janczura, J., Weron, R., 2010. An empirical comparison of alternate regime-switching models for electricity spot prices. Energy Econ. 32, 1059–1073.

Janczura, J., Trück, S., Weron, R., Wolff, R.C., 2013. Identifying spikes and seasonal components in electricity spot price data: a guide to robust modeling. Energy Econ. 38, 96–110.

Koopman, S.J., Ooms, M., Carnero, M.A., 2007. Periodic seasonal reg-ARFIMA-GARCH models for daily electricity spot prices. J. Am. Stat. Assoc. 102, 16–27.

Kosater, P., Mosler, K., 2006. Can Markov regime-switching models improve power-price forecasts? Evidence from German daily power prices. Appl. Energy 83, 943–958.

Kurbatsky, V., Tomin, N., 2010. Forecasting prices in the liberalized electricity market using the hybrid models. 2010 IEEE International Energy Conference and Exhibition (EnergyCon), pp. 363–368.

Lucia, J.J., Schwartz, E.S., 2002. Electricity prices and power derivatives: evidence from the Nordic power exchange. Rev. Deriv. Res. 5, 5–50.

Mendes, E.F., Oxley, L., Reale, M., 2008. Some new approaches to forecasting the price of electricity: a study of Californian market. Working Paper No. 05. Department of Economics, College of Business and Economics, University of Canterbury.

Meyer, Y., 1993. Wavelets: Algorithms and Applications. Society for Industrial and Applied Mathematics, Philadelphia.

Misiorek, A., Trück, S., Weron, R., 2006. Point and interval forecasting of spot electricity prices: linear vs. non-linear time series models. Stud. Nonlinear Dyn. Econom. 10 (Article 2).

Moghtaderi, A., Flandrin, P., Borgnat, P., 2013. Trend filtering via empirical mode decompositions. Comput. Stat. Data Anal. 58, 114–126.

Percival, D., Walden, A., 2000. Wavelet Methods for Time Series Analysis. Cambridge University Press.

Pilipovic, D., 1998. Energy Risk: Valuing and Managing Energy Derivatives. McGraw-Hill, New York.

Pirino, D., Renò, R., 2010. Electricity prices: a nonparametric approach. Int. J. Theor. Appl. Financ. 13, 285–299.

Qian, X.Y., Gu, G.F., Zhou, W.X., 2011. Modified detrended fluctuation analysis based on empirical mode decomposition for the characterization of anti-persistent processes. Phys. A 390, 4388–4395.

Rao, S.T., Zurbenko, I.G., 1994. Detecting and tracking changes in ozone air quality. Air Waste 44, 1089–1092.

Ruppert, D., Sheather, S.J., Wand, M.P., 1995. An effective bandwidth selector for local least squares regression. J. Am. Stat. Assoc. 90, 1257–1270.

Schlueter, S., 2010. A long-term/short-term model for daily electricity prices with dynamic volatility. Energy Econ. 32, 1074–1081.

Sigauke, C., Chikobvu, D., 2011. Prediction of daily peak electricity demand in South Africa using volatility forecasting models. Energy Econ. 33, 882–888.

Swensen, A.R., 2006. Bootstrap algorithms for testing and determining the cointegration rank in VAR models. Econometrica 74, 1699–1714.

Trapero, J.R., Pedregal, D.J., 2009. Frequency domain methods applied to forecasting electricity markets. Energy Econ. 31, 727–735.

Troncoso Lora, A., Riquelme Santos, J.M., Gómez Expósito, A., Martínez Ramos, J.L., Riquelme Santos, J.C., 2007. Electricity market price forecasting based on weighted nearest neighbors techniques. IEEE Trans. Power Syst. 22, 1294–1301.

Trück, S., Weron, R., Wolff, R., 2007. Outlier treatment and robust approaches for modeling electricity spot prices. MPRA Paper No. 4711. Hugo Steinhaus Center, Wroclaw University of Technology.

Vautard, R., Ghil, M., 1989. Singular-spectrum analysis in nonlinear dynamics, with applications to paleoclimatic time series. Phys. D 35, 395–424.

Vautard, R., Yiou, P., Ghil, M., 1992. Singular-spectrum analysis: a toolkit for short, noisy chaotic signals. Phys. D 58, 95–126.

Veraart, A.E.D., Veraart, L.A.M., 2012. Modelling electricity day-ahead prices by multivariate Lévy semi-stationary processes. CREATES Research Paper 2012-13. Aarhus University.

Weron, R., 2005. Heavy tails and electricity prices. Invited paper presented at the Deutsche Bundesbank's 2005 Annual Fall Conference (Eltville, 10–12 November 2005).

Weron, R., 2006. Modelling and Forecasting Electricity Loads and Prices: A Statistical Approach. Wiley, Chichester.

Weron, R., 2009. Heavy-tails and regime-switching in electricity prices. Math. Meth. Oper. Res. 69, 457–473.

Weron, R., Bierbrauer, M., Trück, S., 2004. Modeling electricity prices: jump diffusion and regime switching. Phys. A 336, 39–48.

Young, R.K., 1993. Wavelet Theory and Its Applications. Kluwer Academic Publishers.

Zareipour, H., Bhattacharya, K., Cañizares, C.A., 2006. Forecasting the hourly Ontario energy price by multivariate adaptive regression splines. Proc. of the IEEE PES Annual General Meeting.

Zhang, X., Lai, K.K., Wang, S.Y., 2008. A new approach for crude oil price analysis based on empirical mode decomposition. Energy Econ. 30, 905–918.

Zurbenko, I.G., Porter, P.S., Rao, S.T., Ku, J.Y., Gui, R., Eskridge, R.E., 1996. Detecting discontinuities in time series of upper-air data: development and demonstration of an adaptive filter technique. J. Clim. 9, 3548–3560.

