IBM Research
Nonparametric Time Series Analysis: A review of Peter Lewis’ contributions to the field
Bonnie Ray, IBM T. J. Watson Research Center
Joint Statistical Meetings 2012
© 2011 IBM Corporation
Outline
Background
– My connection to Peter
Joint work: Nonparametric time series analysis
– Motivation
– Methodological foundation
– Applications and Extensions
Current related work
Impact of Peter’s work
How I knew Peter
IBM T. J. Watson Research Center, Yorktown Heights, NY
Carmel Beach, Carmel, CA
Naval Postgraduate School, Monterey, CA
Underlying motivation for much of our joint work
Key idea
Peter recognized that non-parametric regression techniques, under development in the late 80’s and early 90’s, could be applied in the time series context to model non-linear time series phenomena
Focused on the Multivariate Adaptive Regression Splines (MARS) technique
– Fits truncated linear spline functions to the data with optimal knot points selected automatically
Extended MARS to TSMARS
– Used the MARS algorithm to model nonlinear univariate time series using lagged values of the series itself and possible exogenous covariates
– Results in nonlinear threshold models that are continuous in the domain of the predictor variables (ASTAR, SMASTAR)
– More general than self-exciting threshold-type models (SETAR, TARSO), which identify piecewise linear functions over disjoint subregions and are discontinuous at the boundaries between subregions
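As a rough illustration of automatic knot selection, the following Python sketch fits a single truncated linear spline a + b(x − k)_+ by least squares and scans candidate knots k for the smallest residual sum of squares. It is a heavily simplified stand-in for one forward step, not the MARS or TSMARS algorithm itself, which also adds mirrored spline pairs, builds interaction terms, and prunes the model afterwards. The function names and the noise-free demo data are assumptions made for illustration; in the time series setting the predictor would be a lagged value such as X_{t-1}, which `lagged_pairs` produces.

```python
def hinge(u):
    """Truncated linear spline (u)_+ = max(u, 0)."""
    return u if u > 0.0 else 0.0


def lagged_pairs(series, lag=1):
    """Return (predictors, responses) = (x[t-lag], x[t]) for lag >= 1."""
    return series[:-lag], series[lag:]


def fit_one_hinge(x, y, knot):
    """Least-squares fit of y ~ a + b*(x - knot)_+; returns (a, b, rss) or None."""
    h = [hinge(v - knot) for v in x]
    n = len(h)
    h_bar = sum(h) / n
    y_bar = sum(y) / n
    sxx = sum((hv - h_bar) ** 2 for hv in h)
    if sxx == 0.0:  # basis function is constant, slope is not identifiable
        return None
    sxy = sum((hv - h_bar) * (yv - y_bar) for hv, yv in zip(h, y))
    b = sxy / sxx
    a = y_bar - b * h_bar
    rss = sum((yv - a - b * hv) ** 2 for hv, yv in zip(h, y))
    return a, b, rss


def search_knot(x, y):
    """Try every observed predictor value as a knot; keep the lowest-RSS fit."""
    best = None
    for knot in sorted(set(x)):
        fit = fit_one_hinge(x, y, knot)
        if fit is not None and (best is None or fit[2] < best[3]):
            best = (knot, *fit)
    return best  # (knot, a, b, rss)


# Hypothetical noise-free data with a true knot at 1.0
x = [i / 10 for i in range(31)]
y = [0.5 + 0.8 * hinge(v - 1.0) for v in x]
knot, a, b, rss = search_knot(x, y)
```

Because the fitted function is built from hinges, it stays continuous at the selected knot, which is the property that distinguishes these models from SETAR-type fits.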
Main publications
– Lewis, P., and J. Stevens (1991): “Nonlinear modeling of time series using multivariate adaptive regression splines (MARS),” Journal of the American Statistical Association, 86, 864–877. (130+ citations)
– Lewis, P., and B. Ray (1997): “Modeling nonlinearity, long-range dependence, and periodic phenomena in sea surface temperatures using TSMARS,” Journal of the American Statistical Association, 92, 881–893. (50+ citations)
MARS Applied to Granite Canyon Data
(Model for 5 Years of SSTs using Wind Direction and Wind Speed)

$$
\begin{aligned}
X_t ={}& 2.192_{(0.0036)} + 0.878_{(0.0079)}\,(X_{t-1} - 2.13)_+ + 1.616_{(0.2770)}\,(2.22 - X_{t-34})_+ \\
&+ 0.013_{(0.0018)}\,(WS_{t-1} - 1.10)_+\, I(WD_{t-1} \in \{1, 2\}) \\
&- 0.035_{(0.0018)}\,(WS_{t-1} - 1.10)_+\, I(WD_{t-1} \in \{2, 3\}) \\
&- 0.499_{(0.0060)}\,(X_{t-1} - 2.13)_+ (2.75 - X_{t-8})_+ (2.68 - X_{t-17})_+ \\
&- 0.584_{(0.0999)}\,(2.27 - X_{t-34})_+ (WS_{t-1} - 1.10)_+\, I(WD_{t-1} \in \{2, 3\}) \\
&- 0.517_{(0.1174)}\,(X_{t-49} - 2.510)_+ (WS_{t-1} - 3.00)_+\, I(WD_{t-1} \in \{1, 4, 5\}) \\
&+ 4.665_{(1.0344)}\,(2.51 - X_{t-49})_+ (2.26 - X_{t-24})_+\, I(WD_{t-1} \in \{2, 3\})
\end{aligned}
$$

– Suggests that when the wind blows from the Northwest on the previous day, the SST tends to decrease
– Reflects the fact that the average time between storm fronts in the vicinity of Granite Canyon in the winter is about 8 days
– Suggests a coupling of SSTs with SSTs approximately 49 days previous, dependent on the wind direction and speed
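To show how a fitted model of this form is evaluated, here is a minimal Python sketch that computes the right-hand side of the Granite Canyon equation from lagged SST values, the previous day's wind speed, and the previous day's wind direction code. Some details are assumptions: the parenthesized values in the equation are taken to be standard errors and are omitted, the indicator terms are read as set membership of the wind direction code, and the function and argument names are hypothetical. The slide does not give the units or transformation of X, or which direction each code denotes, so the inputs below are purely illustrative.

```python
def hinge(u):
    """Truncated linear spline (u)_+ = max(u, 0)."""
    return u if u > 0.0 else 0.0


def sst_prediction(x, ws1, wd1):
    """Evaluate the fitted model.

    x   : dict mapping lag k -> X_{t-k} (needs lags 1, 8, 17, 24, 34, 49)
    ws1 : wind speed at t-1
    wd1 : wind direction code at t-1
    """
    def ind(codes):
        # Indicator I(WD_{t-1} in codes)
        return 1.0 if wd1 in codes else 0.0

    return (
        2.192
        + 0.878 * hinge(x[1] - 2.13)
        + 1.616 * hinge(2.22 - x[34])
        + 0.013 * hinge(ws1 - 1.10) * ind({1, 2})
        - 0.035 * hinge(ws1 - 1.10) * ind({2, 3})
        - 0.499 * hinge(x[1] - 2.13) * hinge(2.75 - x[8]) * hinge(2.68 - x[17])
        - 0.584 * hinge(2.27 - x[34]) * hinge(ws1 - 1.10) * ind({2, 3})
        - 0.517 * hinge(x[49] - 2.510) * hinge(ws1 - 3.00) * ind({1, 4, 5})
        + 4.665 * hinge(2.51 - x[49]) * hinge(2.26 - x[24]) * ind({2, 3})
    )


# Illustrative inputs chosen so that every spline term is switched off
baseline = {1: 2.13, 8: 3.0, 17: 3.0, 24: 3.0, 34: 2.30, 49: 2.51}
at_intercept = sst_prediction(baseline, ws1=1.0, wd1=1)

# Same lags, but with wind speed above the 1.10 knot and direction code 2
with_wind = sst_prediction(baseline, ws1=2.10, wd1=2)
```

With these inputs the first call returns the intercept alone, and the second is lower because direction code 2 activates both the +0.013 and the −0.035 wind speed terms.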
Periodic Autoregressive Models: Characterizing River Flows
Periodic time series
– Correlation structure does not change from cycle to cycle, but differs from period to period within a cycle
– For example, monthly data may have a yearly cycle, but the correlation between observations in Jan and Feb is different from the correlation between observations in Feb and Mar
Scatter plots of the logarithm of mean monthly flow of the Fraser River over the time period March 1913–December 1991
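To make period-specific correlation concrete, the following Python sketch computes, for each period within a cycle, the sample correlation between observations in that period and the observations immediately following them. The function names are hypothetical, and the tiny two-period series is made up for illustration; it is not the Fraser River data.

```python
import math


def pearson(u, v):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def periodic_lag1_corr(series, period):
    """Correlation between period s and the next observation, for each s.

    For monthly data with period=12 and a series starting in January,
    entry 0 is the Jan/Feb correlation, entry 1 the Feb/Mar correlation,
    and so on.
    """
    out = []
    for s in range(period):
        idx = range(s, len(series) - 1, period)
        u = [series[t] for t in idx]
        v = [series[t + 1] for t in idx]
        out.append(pearson(u, v))
    return out


# Made-up series with a cycle of length 2: the first period is followed by
# an identical value, the second period by a value of opposite sign.
toy = [1.0, 1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0]
corr_by_period = periodic_lag1_corr(toy, period=2)
```

A stationary (non-periodic) series would give roughly the same value in every entry; clearly different entries are the signature of periodic correlation described above.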
Innovations in Modeling Periodic Time Series: P-CASTAR
Adapted nonlinearity tests for threshold-type behavior to the case of periodic time series
Applied MARS algorithm to time series exhibiting periodic behavior to capture non-linear relationships
– Initially modeled each subseries separately using MARS algorithm
– Introduced the use of categorical predictors representing each period within a cycle to simultaneously model nonlinear behavior for each period
– Each response weighted to adjust for heteroskedasticity of the residuals in different periods and weights updated iteratively
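The categorical period label and one reweighting step can be sketched in Python as follows. This is only a plausible reading of the weighting idea: the slide does not state the exact weighting scheme, so the inverse mean squared residual used here is an assumption, as are the function names.

```python
def period_of(t, period=12):
    """Categorical predictor: which period within the cycle index t falls in."""
    return t % period


def period_weights(residuals, periods):
    """One weight per observation: 1 / (mean squared residual of its period).

    Assumes every period has at least one nonzero residual.
    """
    sums, counts = {}, {}
    for r, p in zip(residuals, periods):
        sums[p] = sums.get(p, 0.0) + r * r
        counts[p] = counts.get(p, 0) + 1
    return [counts[p] / sums[p] for p in periods]


# Hypothetical residuals from a first unweighted fit, two periods
weights = period_weights([1.0, -1.0, 2.0, -2.0], [0, 0, 1, 1])
```

In an iterative scheme, these weights would be passed to a weighted refit, new residuals computed, and the weights updated again until they stabilize.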
Change the mean level of the model only
Reduces the correlation between May and June riverflows and July and August riverflows
Lewis, P. and Ray, B. (2002). Modeling periodic threshold autoregressions using TSMARS, Journal of Time Series Analysis, 23, 459-471.
Impact of Peter’s work using MARS to model nonlinear structure in time series
TSMARS methodology has been used to model energy price series, mobile communication channels, foreign exchange rates, brain dynamics, ozone extremes, nuclear safeguards and non-proliferation, … For example,
– Krzyścin, J. W. Nonlinear (MARS) modeling of long-term variations of surface UV-B radiation as revealed from the analysis of Belsk, Poland data for the period 1976–2000. Annales Geophysicae (2003) 21: 1887–1896.
– De Gooijer, J., Ray, B. and Kräger, H. (1998). Forecasting exchange rates using TSMARS, Journal of International Money and Finance, 17, 513–534.
Ideas extended to multivariate time series modeling
– Kooperberg, Bose and Stone (JASA, 1997) developed PolyMARS (PMARS) algorithm to extend the advantages of the MARS algorithm over simple recursive partitioning to the multiple classification problem
– De Gooijer and Ray (CSDA, 2003) applied PMARS algorithm to model vector threshold-type nonlinearity in multivariate time series
Using PMARS to model Electricity Load Data
Three weeks of half-hourly electricity load data (n=1008) from the Australian states of New South Wales (NSW) and Victoria (VIC)
Used temperature data, available only for NSW, as an additional predictor, along with Time of Day (TOD) and Time of Week (TOW) indicator variables
Results
– Interactions between the TODt and lagged loads, suggesting that prior electricity usage acts to modulate TODt effects
– Several terms involving lagged loads contained thresholds, indicating that electricity loads exhibit different behavior when usage is above or below certain levels
– Model showed a feedback relationship between the electricity loads of the two states
[Figure: half-hourly electricity load versus time (in days, 1–21); panels: Victoria and New South Wales]
See De Gooijer, J. and Ray, B. (2003). Modeling vector nonlinear time series using POLYMARS, Computational Statistics and Data Analysis, 42, 73–90.
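A minimal Python sketch of how the candidate predictors for such a model could be assembled from half-hourly data is given below. The lag choices, the dictionary keys, the use of contemporaneous temperature, and the assumption that the series starts at the beginning of a week are all hypothetical; the slide does not state which lags or encodings were used.

```python
PER_DAY = 48             # half-hourly readings per day
PER_WEEK = 7 * PER_DAY   # 336 readings per week


def design_rows(nsw, vic, temp_nsw, lags=(1, 2)):
    """Build one dict of candidate predictors per time index t.

    Rows start at t = max(lags) so that every lagged load is available.
    TOD and TOW are categorical codes for time of day and time of week.
    """
    rows = []
    for t in range(max(lags), len(nsw)):
        row = {
            "TOD": t % PER_DAY,
            "TOW": t % PER_WEEK,
            "TEMP_NSW": temp_nsw[t],
        }
        for k in lags:
            # Lagged loads from both states allow cross-state feedback terms
            row[f"NSW_lag{k}"] = nsw[t - k]
            row[f"VIC_lag{k}"] = vic[t - k]
        rows.append(row)
    return rows


# Placeholder series of the right length (three weeks of half-hours)
n = 3 * PER_WEEK
nsw = [float(i) for i in range(n)]
vic = [float(2 * i) for i in range(n)]
temp = [20.0] * n
rows = design_rows(nsw, vic, temp)
```

Including lagged loads from both states in each row is what lets a vector model pick up the kind of feedback relationship reported on the slide.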
Related work from IBM Research: Scalable Matrix-valued Kernel Learning and High-dimensional Nonlinear Causal Inference
Innovations
– Propose a general matrix-valued multiple kernel learning framework to fit non-parametric models to multivariate time series, i.e. kernels are selected dynamically from a library of kernels based on the local structure of the data
– Allow a broad class of mixed norm regularizers, including those that induce sparsity, to be imposed on a dictionary of vector-valued Reproducing Kernel Hilbert Spaces (RKHS)
Resulting models
– Non-parametric nonlinear, sparse temporal-causal models
– May be viewed as non-parametric multivariate extension of Group Lasso and related sparse learning models
Applications
– Weekly log returns of multiple related stocks
– Time-course gene expression microarray data
• Modeled the expression levels of 2397 unique genes simultaneously measured at 66 time points corresponding to various developmental stages and grouped into 35 functional groups based on their gene to infer causal interactions between functional groups, as well as to obtain insight on within-group relationships between genes.
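For orientation, one standard form of a sparsity-inducing mixed-norm objective over a dictionary of RKHSs is sketched below. It is a generic multiple kernel learning formulation given under assumed notation, not necessarily the one used in the IBM work: $y_t$ is the vector of responses at time $t$, $x_t$ collects the lagged predictors, $\mathcal{H}_1, \dots, \mathcal{H}_m$ are the vector-valued RKHSs in the dictionary, and $\lambda > 0$ is a tuning parameter.

```latex
\min_{f_1 \in \mathcal{H}_1, \dots, f_m \in \mathcal{H}_m}
\; \frac{1}{T} \sum_{t=1}^{T} \Bigl\| y_t - \sum_{j=1}^{m} f_j(x_t) \Bigr\|_2^2
\; + \; \lambda \sum_{j=1}^{m} \| f_j \|_{\mathcal{H}_j}
```

The penalty is a sum of unsquared RKHS norms, so whole components $f_j$ can be driven to zero, which is the sense in which such models extend the Group Lasso to a non-parametric setting.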
Continuing impact of Peter’s work using MARS to model nonlinear structure in time series
Motivated theoretical work on limiting properties, boosting, improved partitioning algorithms, etc., e.g.
– Chan, K. S. and Tsay, R. S. Limiting properties of the least squares estimator of a continuous threshold autoregressive model. Biometrika (1998) 85(2): 413–426.
– Boucher, T. R. and Cline, D. B. H. Stability of cyclic threshold and threshold-like autoregressive time series models. Statistica Sinica 17 (2007), 43–62.
– Bühlmann, P. Dynamic adaptive partitioning for nonlinear time series. Biometrika (1999) 86(3): 555–571.
– Robinzonov, N., Tutz, G. and Hothorn, T. Boosting techniques for nonlinear time series models. AStA Advances in Statistical Analysis (2012) 96(1): 99–122.
Ideas extended to nonlinear transfer function-type models
– Liu, J. M., Chen, R. and Yao, Q. Nonparametric transfer function models. Journal of Econometrics (2010) 157(1): 151–164.
Most recent work found directly linked to work of Lewis and Ray
– Hillebrand, E. and Medeiros, M. C. Nonlinearity, Breaks, and Long-Range Dependence in Time-Series Models. CREATES Research Paper 2012-30, Department of Economics and Business, Aarhus University, Bartholins Allé 10, DK-8000 Aarhus C, Denmark.
Conclusion
Operations Research Distinguished Professor Emeritus Peter Lewis, 1932–2011
A leader in the fields of computer simulation, applied statistics and probability, and operations research. … A common theme running through the comments about Peter Lewis by many of his colleagues and former students is his extraordinary influence on their professional careers and his steadfast encouragement and support of their work.