  • IBM Research

    Nonparametric Time Series Analysis: A review of Peter Lewis’ contributions to the field

    Bonnie Ray, IBM T. J. Watson Research Center

    Joint Statistical Meetings 2012

    © 2011 IBM Corporation

  • IBM Research

    Outline

    Background
    – My connection to Peter

    Joint work: Nonparametric time series analysis
    – Motivation
    – Methodological foundation
    – Applications and extensions

    Current related work

    Impact of Peter’s work

  • IBM Research

    How I knew Peter

    IBM T. J. Watson Research Center, Yorktown Heights, NY

    Carmel Beach, Carmel, CA

    Naval Postgraduate School, Monterey, CA

  • IBM Research

    Underlying motivation for much of our joint work


  • IBM Research

    Key idea

    Peter recognized that nonparametric regression techniques, under development in the late 1980s and early 1990s, could be applied in the time series context to model nonlinear time series phenomena

    Focused on the Multivariate Adaptive Regression Splines (MARS) technique
    – Fits truncated linear spline functions to the data, with optimal knot points selected automatically

    Extended MARS to TSMARS
    – Used the MARS algorithm to model nonlinear univariate time series using lagged values of the series itself and possibly exogenous covariates
    – Results in nonlinear threshold models that are continuous in the domain of the predictor variables (ASTAR, SMASTAR)
    – More general than self-exciting threshold-type models (SETAR, TARSO), which identify piecewise linear functions over disjoint subregions and are discontinuous at the subregion boundaries

    Main publications
    – Lewis, P., and J. Stevens (1991): “Nonlinear modeling of time series using multivariate adaptive regression splines (MARS),” Journal of the American Statistical Association, 86, 864–877. (130+ citations)
    – Lewis, P., and B. Ray (1997): “Modeling nonlinearity, long-range dependence, and periodic phenomena in sea surface temperatures using TSMARS,” Journal of the American Statistical Association, 92, 881–893. (50+ citations)
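The truncated linear spline idea can be sketched in a few lines. This is only an illustration under stated assumptions, not the MARS or TSMARS algorithm: there is no forward/backward term selection and no interaction search, just one hinge pair on lag 1 with the knot chosen from a grid by residual sum of squares. The helper names (`hinge`, `lag_matrix`, `fit_single_knot`) and the simulated series are hypothetical.

```python
import numpy as np

def hinge(x, knot, sign=1):
    """Truncated linear spline: (x - knot)_+ for sign=+1, (knot - x)_+ for sign=-1."""
    return np.maximum(sign * (x - knot), 0.0)

def lag_matrix(series, lags):
    """Return (y, X) where column j of X holds the series lagged by lags[j]."""
    series = np.asarray(series, dtype=float)
    max_lag = max(lags)
    y = series[max_lag:]
    X = np.column_stack([series[max_lag - k: len(series) - k] for k in lags])
    return y, X

def fit_single_knot(y, x, candidate_knots):
    """Least-squares fit of y ~ 1 + (x - t)_+ + (t - x)_+, choosing t by smallest RSS."""
    best = None
    for t in candidate_knots:
        B = np.column_stack([np.ones_like(x), hinge(x, t, 1), hinge(x, t, -1)])
        coef, *_ = np.linalg.lstsq(B, y, rcond=None)
        rss = float(np.sum((y - B @ coef) ** 2))
        if best is None or rss < best[0]:
            best = (rss, t, coef)
    return best

# Simulated continuous threshold autoregression (illustrative data only):
# slope 0.7 above zero, slope 0.3 below zero, continuous at the threshold.
rng = np.random.default_rng(0)
x = np.zeros(600)
for t in range(1, 600):
    x[t] = 0.7 * max(x[t - 1], 0.0) - 0.3 * max(-x[t - 1], 0.0) + rng.normal(scale=0.5)

y, X = lag_matrix(x, lags=[1])
knots = np.quantile(X[:, 0], np.linspace(0.1, 0.9, 33))
rss, knot, coef = fit_single_knot(y, X[:, 0], knots)
print(f"selected knot = {knot:.3f}, coefficients = {np.round(coef, 3)}")
```

Because both hinge functions vanish at the knot, the fitted function is continuous there, which is the property that distinguishes these models from SETAR-type fits.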

  • IBM Research

    MARS Applied to Granite Canyon Data

    (Model for 5 years of SSTs using wind direction and wind speed)

    \[
    \begin{aligned}
    X_t ={}& 2.192\,(0.0036) + 0.878\,(0.0079)\,(X_{t-1} - 2.13)_+ + 1.616\,(0.2770)\,(2.22 - X_{t-34})_+ \\
    & + 0.013\,(0.0018)\,(WS_{t-1} - 1.10)_+\, I(WD_{t-1} \in \{1, 2\}) \\
    & - 0.035\,(0.0018)\,(WS_{t-1} - 1.10)_+\, I(WD_{t-1} \in \{2, 3\}) \\
    & - 0.499\,(0.0060)\,(X_{t-1} - 2.13)_+ (2.75 - X_{t-8})_+ (2.68 - X_{t-17})_+ \\
    & - 0.584\,(0.0999)\,(2.27 - X_{t-34})_+ (WS_{t-1} - 1.10)_+\, I(WD_{t-1} \in \{2, 3\}) \\
    & - 0.517\,(0.1174)\,(X_{t-49} - 2.510)_+ (WS_{t-1} - 3.00)_+\, I(WD_{t-1} \in \{1, 4, 5\}) \\
    & + 4.665\,(1.0344)\,(2.51 - X_{t-49})_+ (2.26 - X_{t-24})_+\, I(WD_{t-1} \in \{2, 3\})
    \end{aligned}
    \]

    – Suggests that when the wind blows from the Northwest on the previous day, the SST tends to decrease
    – Reflects the fact that the average time between storm fronts in the vicinity of Granite Canyon in the winter is about 8 days
    – Suggests a coupling of SSTs with SSTs approximately 49 days previous, dependent on the wind direction and speed
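To make the structure of the fitted model concrete, the following sketch evaluates its right-hand side using the coefficients as transcribed from the slide. It assumes the indicator terms denote set membership of the wind direction code (the membership symbol is lost in the transcript) and it ignores the bracketed values that follow each coefficient. The function name `predict_sst`, its arguments, and the input values are hypothetical.

```python
def pos(u):
    """Positive part (u)_+ = max(u, 0)."""
    return max(u, 0.0)

def predict_sst(x, ws_prev, wd_prev):
    """Evaluate the fitted model for X_t.

    x       : dict mapping lag k to X_{t-k}; lags 1, 8, 17, 24, 34, 49 are required
    ws_prev : wind speed WS_{t-1}
    wd_prev : wind direction category WD_{t-1} (integer code as used on the slide)
    """
    ws_110 = pos(ws_prev - 1.10)
    in_12 = wd_prev in {1, 2}
    in_23 = wd_prev in {2, 3}
    in_145 = wd_prev in {1, 4, 5}
    return (
        2.192
        + 0.878 * pos(x[1] - 2.13)
        + 1.616 * pos(2.22 - x[34])
        + 0.013 * ws_110 * in_12
        - 0.035 * ws_110 * in_23
        - 0.499 * pos(x[1] - 2.13) * pos(2.75 - x[8]) * pos(2.68 - x[17])
        - 0.584 * pos(2.27 - x[34]) * ws_110 * in_23
        - 0.517 * pos(x[49] - 2.510) * pos(ws_prev - 3.00) * in_145
        + 4.665 * pos(2.51 - x[49]) * pos(2.26 - x[24]) * in_23
    )

# Made-up lagged values, chosen only to exercise the function.
history = {1: 2.40, 8: 2.60, 17: 2.50, 24: 2.20, 34: 2.10, 49: 2.45}
print(predict_sst(history, ws_prev=2.0, wd_prev=3))
```

Each term is active only when every factor in its product is positive, so a given day typically switches on a small subset of the terms.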

  • IBM Research

    Periodic Autoregressive Models: Characterizing River Flows

    Periodic time series
    – Correlation structure does not change from cycle to cycle, but differs from period to period within a cycle
    – For example, monthly data may have a yearly cycle, but the correlation between observations in Jan and Feb is different from the correlation between observations in Feb and Mar

    [Figure: scatter plots of the logarithm of mean monthly flow of the Fraser River over the period March 1913–December 1991]
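A small sketch of what "differs from period to period within a cycle" means in practice: the lag-1 correlation is computed separately for each season-to-next-season pair. The series is simulated from a periodic AR(1) with made-up coefficients, not the Fraser River data, and the function name `periodic_lag1_corr` is hypothetical.

```python
import numpy as np

def periodic_lag1_corr(series, period=12):
    """Correlation between each season s and the season that follows it.

    Entry s pairs the observations at positions s, s + period, ... with
    their immediate successors.
    """
    series = np.asarray(series, dtype=float)
    x_prev, x_next = series[:-1], series[1:]
    season = np.arange(len(x_prev)) % period
    return np.array([
        np.corrcoef(x_prev[season == s], x_next[season == s])[0, 1]
        for s in range(period)
    ])

# Periodic AR(1): the autoregressive coefficient depends on the season.
rng = np.random.default_rng(1)
period, n_cycles = 12, 400
phi = np.linspace(0.1, 0.9, period)  # made-up coefficient for each season
x = np.zeros(period * n_cycles)
for t in range(1, len(x)):
    x[t] = phi[t % period] * x[t - 1] + rng.normal()

corr = periodic_lag1_corr(x, period)
print(np.round(corr, 2))
```

A single pooled lag-1 correlation would average over these twelve values and hide the seasonal differences.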

  • IBM Research

    Innovations in Modeling Periodic Time Series: P-CASTAR

    Adapted nonlinearity tests for threshold-type behavior to the case of periodic time series

    Applied the MARS algorithm to time series exhibiting periodic behavior to capture nonlinear relationships
    – Initially modeled each subseries separately using the MARS algorithm
    – Introduced the use of categorical predictors representing each period within a cycle to simultaneously model nonlinear behavior for each period
    – Each response weighted to adjust for heteroskedasticity of the residuals in different periods, with weights updated iteratively

    Figure annotations:
    – Change the mean level of the model only
    – Reduces the correlation between May and June riverflows and July and August riverflows

    Lewis, P. and Ray, B. (2002). Modeling periodic threshold autoregressions using TSMARS, Journal of Time Series Analysis, 23, 459-471.
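The two ingredients named above, categorical period predictors and iteratively updated weights, can be sketched as follows. This is not the P-CASTAR procedure: the knot is fixed rather than selected by MARS, a single hinge term is shared across periods, and the weighting scheme (inverse residual variance by period) is an assumption made for illustration. All names and the simulated data are hypothetical.

```python
import numpy as np

def design(x_prev, season, period, knot):
    """Period dummies (one mean level per period) plus a shared hinge (x_prev - knot)_+."""
    dummies = (season[:, None] == np.arange(period)[None, :]).astype(float)
    hinge = np.maximum(x_prev - knot, 0.0)[:, None]
    return np.hstack([dummies, hinge])

def fit_periodic_wls(y, x_prev, season, period, knot, n_iter=5):
    """Weighted least squares with weights 1 / (residual variance of each period),
    re-estimated from the residuals at every iteration."""
    B = design(x_prev, season, period, knot)
    weights = np.ones_like(y)
    for _ in range(n_iter):
        sw = np.sqrt(weights)
        coef, *_ = np.linalg.lstsq(B * sw[:, None], y * sw, rcond=None)
        resid = y - B @ coef
        var_by_period = np.array([resid[season == s].var() for s in range(period)])
        weights = 1.0 / var_by_period[season]
    return coef, var_by_period

# Simulated series with period-specific mean level and noise level (made up).
rng = np.random.default_rng(2)
period, n = 4, 2000
sd = np.array([0.2, 0.5, 1.0, 2.0])
x = np.zeros(n)
for t in range(1, n):
    s = t % period
    x[t] = 0.5 * s + 0.6 * max(x[t - 1] - 1.0, 0.0) + rng.normal(scale=sd[s])

y, x_prev, season = x[1:], x[:-1], np.arange(1, n) % period
coef, var_by_period = fit_periodic_wls(y, x_prev, season, period, knot=1.0)
print("period intercepts:", np.round(coef[:period], 2))
print("shared hinge slope:", round(float(coef[-1]), 2))
print("residual sd by period:", np.round(np.sqrt(var_by_period), 2))
```

The weights matter here because the hinge term is shared: periods with noisier residuals are down-weighted when estimating the common slope.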

  • IBM Research

    Impact of Peter’s work using MARS to model nonlinear structure in time series

    TSMARS methodology has been used to model energy price series, mobile communication channels, foreign exchange rates, brain dynamics, ozone extremes, nuclear safeguards and non-proliferation, and more. For example,
    – Krzyścin, J. W. Nonlinear (MARS) modeling of long-term variations of surface UV-B radiation as revealed from the analysis of Belsk, Poland data for the period 1976–2000. Annales Geophysicae (2003) 21: 1887–1896.
    – De Gooijer, J., Ray, B. and Kräger, H. (1998). Forecasting exchange rates using TSMARS, Journal of International Money and Finance, 17, 513-534.

    Ideas extended to multivariate time series modeling
    – Kooperberg, Bose and Stone (JASA, 1997) developed the PolyMARS (PMARS) algorithm to extend the advantages of the MARS algorithm over simple recursive partitioning to the multiple classification problem
    – De Gooijer and Ray (CSDA, 2003) applied the PMARS algorithm to model vector threshold-type nonlinearity in multivariate time series

  • IBM Research

    Using PMARS to model Electricity Load Data

    Three weeks of half-hourly electricity load data (n=1008) from the Australian states of New South Wales (NSW) and Victoria (VIC)

    Used temperature data, available only for NSW, as an additional predictor, along with Time of Day (TOD) and Time of Week (TOW) indicator variables

    Results
    – Interactions between TOD_t and lagged loads, suggesting that prior electricity usage acts to modulate TOD_t effects
    – Several terms involving lagged loads contained thresholds, indicating that electricity loads exhibit different behavior when usage is above or below certain levels
    – Model showed a feedback relationship between the electricity loads of the two states

    [Figure: electricity load against time (in days, 1 to 21), in two panels: Victoria and New South Wales]

    See De Gooijer, J. and Ray, B. (2003). Modeling vector nonlinear time series using POLYMARS, Computational Statistics and Data Analysis, 42, 73-90.

  • IBM Research

    Related work from IBM Research: Scalable Matrix-valued Kernel Learning and High-dimensional Nonlinear Causal Inference

    Innovations
    – Propose a general matrix-valued multiple kernel learning framework to fit nonparametric models to multivariate time series, i.e., kernels are selected dynamically from a library of kernels based on the local structure of the data
    – Allow a broad class of mixed-norm regularizers, including those that induce sparsity, to be imposed on a dictionary of vector-valued Reproducing Kernel Hilbert Spaces (RKHS)

    Resulting models
    – Nonparametric, nonlinear, sparse temporal-causal models
    – May be viewed as a nonparametric multivariate extension of Group Lasso and related sparse learning models

    Applications
    – Weekly log returns of multiple related stocks
    – Time-course gene expression microarray data
      • Modeled the expression levels of 2397 unique genes, simultaneously measured at 66 time points corresponding to various developmental stages and grouped into 35 functional groups based on their gene, to infer causal interactions between functional groups, as well as to obtain insight on within-group relationships between genes.

  • IBM Research

    Continuing impact of Peter’s work using MARS to model nonlinear structure in time series

    Motivated theoretical work on limiting properties, boosting, improved partitioning algorithms, etc., e.g.
    – Chan, K. S. and Tsay, R. S. Limiting properties of the least squares estimator of a continuous threshold autoregressive model. Biometrika (1998) 85(2): 413-426.
    – Boucher, T. R. and Cline, D. B. H. Stability of cyclic threshold and threshold-like autoregressive time series models. Statistica Sinica 17 (2007), 43-62.
    – Bühlmann, P. Dynamic adaptive partitioning for nonlinear time series. Biometrika (1999) 86(3): 555-571.
    – Robinzonov, N., Tutz, G. and Hothorn, T. Boosting techniques for nonlinear time series models. AStA Advances in Statistical Analysis (2012) 96(1): 99-122.

    Ideas extended to nonlinear transfer function-type models
    – Liu, J. M., Chen, R. and Yao, Q. Nonparametric transfer function models. Journal of Econometrics (2010) 157(1): 151-164.

    Most recent work found directly linked to work of Lewis and Ray
    – Hillebrand, E. and Medeiros, M. C. Nonlinearity, Breaks, and Long-Range Dependence in Time-Series Models. CREATES Research Paper 2012-30, Department of Economics and Business, Aarhus University, Bartholins Allé 10, DK-8000 Aarhus C, Denmark.

  • IBM Research

    Conclusion

    Operations Research Distinguished Professor Emeritus Peter Lewis, 1932–2011

    A leader in the fields of computer simulation, applied statistics and probability, and operations research. … A common theme running through the comments about Peter Lewis by many of his colleagues and former students is his extraordinary influence on their professional careers and his steadfast encouragement and support of their work.

