
2612 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS: SYSTEMS, VOL. 49, NO. 12, DECEMBER 2019

Incremental Spatiotemporal Learning for Online Modeling of Distributed Parameter Systems

Zhi Wang and Han-Xiong Li, Fellow, IEEE

Abstract—An incremental spatiotemporal learning scheme is proposed for online modeling of distributed parameter systems (DPSs). A novel incremental learning method is developed to recursively update the spatial basis functions and the corresponding temporal model based on the Karhunen–Loève decomposition for time-space separation. The time-space synthesis continually evolves by adding new increment data with more updated information and revising the existing parameters of the dynamic system. In this way, the spatiotemporal structure is inherited and updated efficiently as output data increases over time. The adaptive nature of this evolving structure makes it promising for online modeling of DPSs in a streaming data environment. The proposed incremental modeling scheme is evaluated on the classical benchmark of a catalytic rod problem. The simulation results demonstrate the viability and efficiency of the proposed method for online modeling of DPSs.

Index Terms—Distributed parameter systems (DPSs), incremental learning, Karhunen–Loève decomposition (KLD), online spatiotemporal modeling.

I. INTRODUCTION

DISTRIBUTED parameter systems (DPSs) are a common kind of industrial process in which the input and output may vary in both time and space [1]. Despite the difficulty, modeling such complex systems is essential to industrial simulation, control, and optimization [2]–[4]. Modeling and control of such spatiotemporal systems have been widely investigated in practice owing to recent developments in sensor, actuator, and computing technology. The first-principles description of a known DPS conventionally leads to a partial differential equation (PDE). Since the PDE system is infinite-dimensional, model reduction is always indispensable for real implementation.

The time-space separation methods have been verified to be an efficient model reduction approach for modeling unknown DPSs [5]–[11]. In these spatiotemporal modeling methods, Karhunen–Loève decomposition (KLD) is first utilized for the time-space separation, where the spatiotemporal output is decomposed into a set of dominant spatial basis functions (BFs) with corresponding temporal coefficients. Second, a reduced-order temporal model is identified from the decomposed low-dimensional data. The temporal structure can be approximated by various identification techniques, such as the nonlinear autoregressive with exogenous input (NARX) model [12], the Hammerstein model [5], neural networks (NNs) [7], [8], and so on. Finally, the spatiotemporal dynamics can be reconstructed and predicted over the whole time-space domain through the time-space pairwise data reconstruction of the reduced-order model.

Manuscript received December 11, 2017; accepted February 14, 2018. Date of publication March 19, 2018; date of current version November 19, 2019. This work was supported by the General Research Fund Project from Research Grant Council of Hong Kong SAR under Grant CityU: 11205615. This paper was recommended by Associate Editor A. H. Tan. (Corresponding author: Han-Xiong Li.)

The authors are with the Department of Systems Engineering and Engineering Management, City University of Hong Kong, Hong Kong (e-mail: [email protected]; [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TSMC.2018.2810447

In traditional spatiotemporal modeling, the KLD process and temporal structure identification are performed in the so-called batch mode. The output data over the whole time domain has to be ready for time-space separation during the model training stage. The modeling procedure stops once the whole batch of spatiotemporal outputs has been fully processed. These methods assume that all the output data is available and accessible at the beginning of the modeling process. Therefore, they are feasible for offline implementations only. In online settings, however, new streaming data becomes available continually, even after the spatiotemporal model has been identified at a certain moment. If we want to incorporate additional new output data into the existing time-space synthesis, the time-space separation process has to be restarted from scratch with all the new and old training data; this is the "batch-mode" scheme shown in Fig. 1. Since the amount of training data is growing constantly, the batch-mode method is only feasible at the cost of retraining the whole time-space synthesis, with time-consuming procedures and a great storage burden. Although some DPSs may have relatively slow dynamics, which makes such a retraining scheme feasible, it is difficult to characterize it as adaptation, especially with respect to the model structure of the time-space synthesis. In fact, it is a procedure where completely new reduced-order models are repeatedly generated from scratch given the accumulated data of growing length.

From the aspect of computational effort, calculating the Karhunen–Loève basis for $L$ time steps of $N$ spatial measurements requires roughly $O(NL)$ memory units and $O(L^3)$ flops. The growing data length $L$ results in superlinearly increasing computational complexity and linearly increasing storage requirements for the batch-mode method. In many real applications, these large storage requirements and computational demands may be prohibitive. Moreover, acquisition of representative training data is expensive and time-consuming. It is common for such data to be available only in small increments over a period of time, and the previously visited data may be inaccessible because of limited online storage. Under these circumstances, either we are not capable of collecting all the training data for time-space separation, or the time-space synthesis is identified from scratch inefficiently using all the data.

2168-2216 © 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

Authorized licensed use limited to: Nanjing University. Downloaded on March 17,2020 at 13:13:08 UTC from IEEE Xplore. Restrictions apply.

We can see that the batch calculation nature of the spatiotemporal modeling methods has limited their applications. It is an important obstacle in designing online modeling methods for distributed processes, since the traditional methods are not adaptive. In turn, it is hard to scale up the developed modeling systems. An adaptive method of modeling DPSs is needed for online settings to overcome the above challenges of streaming data and computational limits. A new modeling scheme should be developed with the capability of evolving the time-space synthesis as the process continues. The model structure should be inherited and updated whenever a new data increment is available. The new information carried by the new data should be added into the existing model structure in an incremental way, i.e., by incremental learning.

Recent years have witnessed an increasing interest in the topic of incremental learning from both academia and industry. Incremental learning has been widely addressed in the machine learning and intelligent control communities to cope with learning tasks where the training data becomes available over time or the learning environment is ever-changing [13]. Various methods have been suggested for incremental learning regarding various problems in different areas, including unsupervised learning [14], supervised learning [15], reinforcement learning [16], machine vision [17], evolutionary algorithms [18], and human–robot interaction [19].

For model reduction of DPSs, several results have been reported regarding recursive, adaptive, or incremental methods. Li et al. [20] proposed a recursive principal component analysis approach based on updating the correlation matrix recursively. Varshney et al. [21] and Pourkargar and Armaou [22], [23] developed a kind of adaptive proper orthogonal decomposition on the basis of updating the BFs through orthonormalization of the dominant eigenspace of the covariance matrix. The algorithm requires the dimensionality of the covariance matrix to remain constant by discarding the oldest snapshots, which leads to a certain loss of the system's dynamics. Xu et al. [24] proposed a recursive proper orthogonal decomposition approach through gradient search of the new eigenspace, which aims at minimizing the approximation error. Subsequently, they proposed a rank-1 incremental proper orthogonal decomposition method [25], [26] based on expansion and transformation of the eigenspace by the normalized residue vector. These methods are mostly based on analysis of the covariance matrix, which requires access to all the historical data. Meanwhile, some of them may have deficiencies, such as information loss, limitation to rank-1 updating, and convergence to local minima.

Regarding spatiotemporal modeling of DPSs, few results have been reported on incremental learning methods, which are exactly what is needed for online modeling in streaming-data environments. Although there are several works proposing the concept of incremental modeling of DPSs [27], they refer in particular to adding hierarchical spatiotemporal kernels incrementally, which is completely different from our proposed incremental algorithm for online modeling. The purpose of this paper is to present such an incremental modeling methodology and the results of its application to a number of test cases.

Fig. 1. Traditional batch-mode modeling versus proposed incremental-mode modeling.

The intuitive concept of the proposed incremental learning methodology for online spatiotemporal modeling is shown in Fig. 1, along with a comparison to conventional batch-mode modeling. In online settings, assume that the modeling procedure is carried out continually at set time steps $(\ldots, t_{i-1}, t_i, t_{i+1}, \ldots)$. For the batch-mode method, completely new time-space syntheses $(\ldots, \mathrm{T/S}(i-1), \mathrm{T/S}(i), \mathrm{T/S}(i+1), \ldots)$ are trained from scratch after collecting the whole batch data $(\ldots, \mathrm{BD}(i-1), \mathrm{BD}(i), \mathrm{BD}(i+1), \ldots)$. For our proposed incremental modeling method, in contrast, the time-space synthesis is inherited and updated in a computationally efficient way by adding the new increment data $(\ldots, \mathrm{ID}(i-1), \mathrm{ID}(i), \mathrm{ID}(i+1), \ldots)$ into the existing model structure incrementally. It is not required to store the entire time series of training data before proceeding to the time-space separation, and this evolving structure is capable of approximating and adapting to the system's dynamics well in real time.

In order to demonstrate the performance of the proposed incremental modeling algorithm, simulated experiments are carried out on the benchmark of a catalytic rod problem. We compare the incremental modeling algorithm with the conventional batch-mode method to illustrate the feasibility and advantages of the incremental learning property. Both theoretical analysis and experimental results will demonstrate that the proposed incremental modeling methodology achieves good online performance while being computationally efficient.

The rest of this paper is organized as follows. In Section II, the problem description of online modeling is introduced. In Section III, we present the concept and technical details of the proposed incremental spatiotemporal learning scheme for online modeling, accompanied by a complexity analysis and a summary of the main advantages. Experimental results are demonstrated in Section IV, and the conclusions are presented in Section V.

II. PROBLEM DESCRIPTION

In this paper, a general class of DPSs is considered, which can be represented by the following nonlinear PDE:

$$\frac{\partial y(x,t)}{\partial t} = \mathcal{L}\left(y, \frac{\partial y}{\partial x}, \frac{\partial^2 y}{\partial x^2}, \ldots, \frac{\partial^{n_0} y}{\partial x^{n_0}}\right) + B(x)u(t) \tag{1}$$

subject to the mixed-type boundary conditions

$$q\left(y, \frac{\partial y}{\partial x}, \frac{\partial^2 y}{\partial x^2}, \ldots, \frac{\partial^{n_0-1} y}{\partial x^{n_0-1}}\right)\bigg|_{x=x_a \text{ or } x=x_b} = 0 \tag{2}$$

and the initial condition

$$y(x,0) = y_0(x) \tag{3}$$

where $t \in [0,\infty)$ is the temporal variable, $x \in [x_a, x_b] \subset \mathbb{R}$ is the spatial coordinate, $y(x,t) = [y(x_1,t), \ldots, y(x_N,t)]^T \in \mathbb{R}^N$ is the spatiotemporal output, and $u(t) \in \mathbb{R}^p$ is the temporal input. $\mathcal{L} \in \mathbb{R}^N$ is a complex vector function which contains a nonlinear spatial differential operator of order $n_0$, $B(x)$ is a matrix function of appropriate dimensions which describes how the temporal inputs are distributed in the spatial domain, $q$ is a nonlinear vector function, and $y_0(x)$ is a smooth vector function referring to the initial output.

A common approach to modeling unknown nonlinear DPSs is the time-space separation framework [1], where the spatiotemporal output $y(x,t)$ can be decoupled into a set of orthogonal spatial BFs $\phi(x)$ with corresponding temporal coefficients $a(t)$ as

$$y(x,t) = \sum_{i=1}^{\infty} \phi_i(x)\,a_i(t). \tag{4}$$

In practice, a finite $n$th-order set of BFs $\{\phi_i(x)\}_{i=1}^{n}$ extracted by KLD is used for capturing the most relevant dynamics of the system. Then, the low-order temporal model $F$ is identified from the decomposed low-dimensional coefficients $\{a_i(t)\}_{i=1}^{n}$ as

$$a(t) = F\big(a(t-1), \ldots, a(t-d_a), u(t-1), \ldots, u(t-d_u)\big) + e(t) \tag{5}$$

where $d_u$ and $d_a$ denote the maximum input and output lags, respectively, and $e(t)$ denotes the residual error. A detailed description of spatiotemporal modeling can be found in the Appendix.
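As a minimal sketch of the identification step in (5), the following Python snippet fits a linear ARX model to temporal coefficients by least squares. The linear form is an assumption made purely for illustration (the text allows a general nonlinear $F$, e.g., NARX or NN models), and the names `build_regressors` and `fit_arx` as well as the synthetic data are hypothetical.

```python
import numpy as np

def build_regressors(a, u, da, du):
    """Stack lagged coefficients and inputs into a regression matrix.

    a: (n, T) temporal coefficients, u: (p, T) temporal inputs.
    The row for time t holds [a(t-1), ..., a(t-da), u(t-1), ..., u(t-du)].
    """
    d = max(da, du)
    T = a.shape[1]
    rows = []
    for t in range(d, T):
        past_a = [a[:, t - k] for k in range(1, da + 1)]
        past_u = [u[:, t - k] for k in range(1, du + 1)]
        rows.append(np.concatenate(past_a + past_u))
    return np.array(rows), a[:, d:].T   # regressors Z, targets

def fit_arx(a, u, da=1, du=1):
    """Least-squares estimate of theta in a(t) = theta^T z(t) + e(t).

    This is a linear stand-in for F in (5), not the paper's model.
    """
    Z, targets = build_regressors(a, u, da, du)
    theta, *_ = np.linalg.lstsq(Z, targets, rcond=None)
    return theta

# Synthetic, noise-free check with hypothetical system matrices.
rng = np.random.default_rng(0)
n, p, T = 2, 1, 200
A_true = np.array([[0.8, 0.1], [0.0, 0.5]])
B_true = np.array([[1.0], [0.5]])
u = rng.standard_normal((p, T))
a = np.zeros((n, T))
for t in range(1, T):
    a[:, t] = A_true @ a[:, t - 1] + B_true @ u[:, t - 1]

theta = fit_arx(a, u)
A_est, B_est = theta[:n].T, theta[n:].T   # split theta^T = [A  B]
```

With noise-free data the estimates should match the generating matrices up to floating-point error; with real coefficients the residual $e(t)$ absorbs whatever the linear form cannot capture.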

Nevertheless, traditional spatiotemporal modeling methods are only feasible for offline implementations, since the time-space synthesis is computed only once and remains fixed afterwards. In an online environment, the time-space synthesis would have to be retrained from scratch repeatedly whenever new data becomes available, which leads to a high computational burden in real applications. For online modeling of DPSs, an incremental learning mechanism is needed to inherit and update the model structure efficiently whenever a new data increment is available.

III. INCREMENTAL SPATIOTEMPORAL MODELING

A. Framework

In the online environment, the output data for modeling is collected continuously, instead of being a fixed set. Some parts of the newly collected data may confirm and reinforce the knowledge learned from the previous data, while other parts may bring new information that is sufficiently different from the learned knowledge, which could indicate complex dynamics such as abnormal interference or changes in operating conditions. Online methods are supposed to be adaptive to such dynamics of DPSs during their whole life cycles.

We present the technical details of the proposed incremental spatiotemporal modeling scheme in this section. The whole framework is shown in Fig. 2. The continuous streaming data is collected into data increments $(\ldots, \mathrm{ID}(i), \mathrm{ID}(i+1), \ldots)$ at certain time steps $(\ldots, t_i, t_{i+1}, \ldots)$. First, we propose an efficient method that incrementally updates the spatial BFs when a new data increment arrives. Second, the temporal model is reidentified using the corresponding updated temporal coefficients. Finally, we use the time-space synthesis $(\ldots, \mathrm{T/S}(i), \mathrm{T/S}(i+1), \ldots)$ with updated spatial BFs and temporal model to reconstruct the historical data $(\ldots, \mathrm{HD}(i), \mathrm{HD}(i+1), \ldots)$ and to predict the future outputs $(\ldots, \widehat{\mathrm{ID}}(i+1), \widehat{\mathrm{ID}}(i+2), \ldots)$. Then we repeat the above procedures whenever the next new increment of output data arrives. In this incremental way, the new increment data is added to the existing time-space synthesis continually. The modeling structure and parameters are inherited and updated recursively over time.

B. Online Updating of Time-Space Synthesis

Suppose that the output data at time $t_j$ $(j = 1, \ldots, L)$ is an $N$-dimensional vector $y(x, t_j) = [y(x_1, t_j), \ldots, y(x_N, t_j)]^T$, which is measured at $N$ spatial locations. For simplicity, denote $y_j = y(x, t_j)$. The $n$th-order spatial BFs, denoted as $\{\phi_i\}_{i=1}^{n}$, are typically learned by time-space separation from a set of training data $Y_1 = [y_1, \ldots, y_L]$ over $L$ time steps.

The output data is generated continually, even after the time-space synthesis has been learned at time step $t_L$. Suppose that the time-space synthesis should be processed at a new time step $t_{L+M}$, and $Y_2 = [y_{L+1}, \ldots, y_{L+M}]$ is the new data increment over $M$ new time steps. For the batch-mode method, the time-space separation is performed again from scratch by KLD of the augmented data matrix $Y = [Y_1 \;\; Y_2]$. This method becomes computationally expensive as the online process generates more and more historical data. Instead, we derive the concrete procedure by which the proposed method inherits and updates the time-space synthesis efficiently through incremental learning.

According to (31), the original temporal correlation matrix $C$ can be written as

$$C = \frac{1}{L} Y_1^T Y_1. \tag{6}$$

By singular value decomposition (SVD), the matrix $Y_1^T$ can be decomposed into

$$Y_1^T = U \Sigma V^T. \tag{7}$$

Fig. 2. Incremental spatiotemporal modeling scheme for online modeling of DPSs.

Then $C$ can be rewritten as

$$C = \frac{1}{L} U \Sigma V^T V \Sigma^T U^T = \frac{1}{L} U \Sigma \Sigma^T U^T = U \Lambda U^T \tag{8}$$

where $\Lambda = (1/L)\Sigma\Sigma^T$ is an $L \times L$ diagonal matrix. By KLD, we choose the dominant $n$ features which capture more than 99% of the system's information according to (32). Then the best rank-$n$ approximation of $C$ is

$$C_n = U_n \Lambda_n U_n^T \tag{9}$$

where $U_n$ is formed by the first $n$ columns of $U$, and $\Lambda_n$ is the $n$th leading principal submatrix of $\Lambda$. According to (28), we can construct the $n$ dominant BFs $\Phi = [\phi_1(x), \ldots, \phi_n(x)]$ as

$$\Phi = \left(U_n^T Y_1^T\right)^T = Y_1 U_n. \tag{10}$$

After identifying the dominant spatial BFs $\{\phi_i(x)\}_{i=1}^{n}$, the corresponding temporal coefficients $\{a_i(t)\}_{i=1,t=1}^{n,L}$ of the spatiotemporal output $y(x,t)$ can be obtained using (23). Assume that the acquired temporal coefficient matrix is $A_{n \times L} = [a(1), \ldots, a(L)]$, where $a(t) = [a_1(t), \ldots, a_n(t)]^T$, $t = 1, \ldots, L$. It can be verified that the output data $Y_1$ is reconstructed, as $(Y_1)_n$, using the spatial BFs $\Phi$ and the corresponding temporal coefficients $A$:

$$(Y_1)_n = \Phi A. \tag{11}$$
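The batch steps (6)-(11) can be sketched in a few lines of NumPy. This is a hedged illustration rather than the authors' implementation: the function name `batch_kld`, the synthetic two-mode field, and the use of a least-squares projection for the temporal coefficients (standing in for (23), which is given in the Appendix and not reproduced here) are all assumptions.

```python
import numpy as np

def batch_kld(Y1, energy=0.99):
    """Batch KLD of a snapshot matrix Y1 (N spatial points x L time steps)."""
    L = Y1.shape[1]
    # Economy SVD of Y1^T (L x N), cf. (7); its leading columns coincide
    # with those of the full SVD used in the text.
    U, s, Vt = np.linalg.svd(Y1.T, full_matrices=False)
    lam = s**2 / L                         # nonzero eigenvalues of C, cf. (8)
    ratio = np.cumsum(lam) / np.sum(lam)   # cumulative energy fraction
    n = int(np.searchsorted(ratio, energy) + 1)  # smallest n reaching `energy`
    Un = U[:, :n]
    Phi = Y1 @ Un                          # (10): N x n basis functions
    # Temporal coefficients by least-squares projection (an assumption).
    A, *_ = np.linalg.lstsq(Phi, Y1, rcond=None)
    return Phi, A, Un, s[:n], Vt[:n].T

# Hypothetical two-mode field on a uniform grid.
x = np.linspace(0.0, np.pi, 20)
t = np.linspace(0.0, 1.0, 100)
Y1 = (np.outer(np.sin(x), np.cos(2 * np.pi * t))
      + 0.5 * np.outer(np.sin(2 * x), np.sin(4 * np.pi * t)))

Phi, A, Un, sn, Vn = batch_kld(Y1)
Y1_n = Phi @ A                             # (11): rank-n reconstruction
```

Because the synthetic field has exactly two modes, the 99% criterion keeps two BFs here and the reconstruction in (11) recovers the data; for measured data the truncation error is governed by the discarded eigenvalues.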

When the new data $Y_2$ is added and $Y = [Y_1, Y_2]$, the new temporal correlation matrix $\bar{C}$ is

$$\bar{C} = \frac{1}{L+M} Y^T Y = \frac{1}{L+M}\begin{bmatrix} Y_1^T Y_1 & Y_1^T Y_2 \\ Y_2^T Y_1 & Y_2^T Y_2 \end{bmatrix}. \tag{12}$$

Suppose that the previous data $Y_1$ is no longer accessible; then the new $\bar{C}$ cannot be computed directly. Instead, we update the eigenvectors of the data matrix $Y$ to compute the new BFs in an incremental way, based on the SVD-updating algorithm [37], [38].

As we know, $Y_1^T \in \mathbb{R}^{L \times N}$, $(Y_1^T)_{L \times N} = U \Sigma V^T$, and its best rank-$n$ approximation is $(Y_1^T)_n = U_n \Sigma_n V_n^T$, where $U_n$ and $V_n$ are formed by the first $n$ columns of $U$ and $V$, respectively, and $\Sigma_n$ is the $n$th leading principal submatrix of $\Sigma$. Next, we want to carry out the SVD of the larger matrix $\begin{bmatrix} (Y_1^T)_{L \times N} \\ (Y_2^T)_{M \times N} \end{bmatrix}$, where $Y_2^T$ is an $M \times N$ matrix consisting of $M$ additional rows. Let the QR decomposition of $(I - V_n V_n^T) Y_2$ be

$$\left(I - V_n V_n^T\right) Y_2 = QR \tag{13}$$

where $Q$ has orthonormal columns and $R$ is an $m \times M$ upper triangular matrix, with $m$ $(m \le \min(N, M))$ being the rank of $(I - V_n V_n^T) Y_2$. This step projects the new rows $Y_2^T$ onto the orthogonal complement of the old right singular subspace $\mathrm{span}\{V_n\}$. It can be verified that

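$$Y^T = \begin{bmatrix} (Y_1^T)_{L \times N} \\ (Y_2^T)_{M \times N} \end{bmatrix} = \begin{bmatrix} U_n & 0 \\ 0 & I_M \end{bmatrix} \begin{bmatrix} \Sigma_n & 0 \\ Y_2^T V_n & R^T \end{bmatrix} \begin{bmatrix} V_n & Q \end{bmatrix}^T \tag{14}$$

noticing that $[V_n \;\; Q]$ is orthonormal. Now, obtain the SVD of the $(n+M) \times (n+m)$ matrix

$$\begin{bmatrix} \Sigma_n & 0 \\ Y_2^T V_n & R^T \end{bmatrix} = \tilde{U} \tilde{\Sigma} \tilde{V}^T \tag{15}$$

where $\tilde{U} \in \mathbb{R}^{(n+M) \times (n+M)}$, $\tilde{\Sigma} \in \mathbb{R}^{(n+M) \times (n+m)}$, and $\tilde{V} \in \mathbb{R}^{(n+m) \times (n+m)}$.

Then, the new temporal correlation matrix $\bar{C}$ can be rewritten as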

$$\begin{aligned}
\bar{C} &= \frac{1}{L+M} Y^T Y \\
&= \frac{1}{L+M} \begin{bmatrix} U_n & 0 \\ 0 & I_M \end{bmatrix} \tilde{U} \tilde{\Sigma} \tilde{V}^T \begin{bmatrix} V_n & Q \end{bmatrix}^T \times \begin{bmatrix} V_n & Q \end{bmatrix} \tilde{V} \tilde{\Sigma}^T \tilde{U}^T \begin{bmatrix} U_n & 0 \\ 0 & I_M \end{bmatrix}^T \\
&= \left( \begin{bmatrix} U_n & 0 \\ 0 & I_M \end{bmatrix} \tilde{U} \right) \left( \frac{1}{L+M} \tilde{\Sigma} \tilde{\Sigma}^T \right) \left( \begin{bmatrix} U_n & 0 \\ 0 & I_M \end{bmatrix} \tilde{U} \right)^T \\
&= \bar{U} \bar{\Lambda} \bar{U}^T
\end{aligned} \tag{16}$$

where the updated diagonal matrix is $\bar{\Lambda} = \frac{1}{L+M} \tilde{\Sigma} \tilde{\Sigma}^T$, and the updated eigenvectors are $\bar{U} = \begin{bmatrix} U_n & 0 \\ 0 & I_M \end{bmatrix} \tilde{U}$. By KLD, we choose the new dominant $n'$ features which capture more than 99% of the system's information according to (32). The best rank-$n'$ approximation of $\bar{C}$ is

$$\bar{C}_{n'} = \bar{U}_{n'} \bar{\Lambda}_{n'} \bar{U}_{n'}^T \tag{17}$$

where $\bar{U}_{n'}$ is formed by the first $n'$ columns of $\bar{U}$, and $\bar{\Lambda}_{n'}$ is the $n'$th leading principal submatrix of $\bar{\Lambda}$.

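In accordance with (28), we can update the previous $n$th-order BFs $\Phi = [\phi_1(x), \ldots, \phi_n(x)]$ to new $n'$th-order BFs $\bar{\Phi} = [\bar{\phi}_1(x), \ldots, \bar{\phi}_{n'}(x)]$ as

$$\bar{\Phi} = \left(\bar{U}_{n'}^T Y^T\right)^T = \begin{bmatrix} Y_1 & Y_2 \end{bmatrix} \bar{U}_{n'}. \tag{18}$$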

Since the complete information about the original data $Y_1$ is not accessible due to online storage limits, we use the best rank-$n$ approximation (11) to reconstruct the original data. Then the new $n'$th-order dominant BFs can be computed as

$$\bar{\Phi} = \begin{bmatrix} \Phi A & Y_2 \end{bmatrix} \left( \begin{bmatrix} U_n & 0 \\ 0 & I_M \end{bmatrix} \tilde{U}_{n'} \right) \tag{19}$$

where $\tilde{U}_{n'}$ is formed by the first $n'$ columns of $\tilde{U}$ from (15). In this incremental way, the old BFs $\Phi$ are transformed into the updated BFs $\bar{\Phi}$ when the new increment of output data $Y_2$ arrives, without any requirement to store the previous data $Y_1$. This capability enables recursive calculation, which is important for online implementation of modeling methods.
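The update in (13)-(19) can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the function name `incremental_update` and the random test data are hypothetical, NumPy's reduced QR returns $m = \min(N, M)$ columns rather than the numerical rank, and the residual $(I - V_n V_n^T) Y_2$ is assumed to have full column rank so that $[V_n \;\; Q]$ is orthonormal. The factorization is exact for the rank-$n$ approximation of $Y_1^T$ stacked on $Y_2^T$, which is what remains available once $Y_1$ has been discarded.

```python
import numpy as np

def incremental_update(Un, sn, Vn, Y2):
    """One SVD-updating step following (13)-(15).

    Un: (L, n), sn: (n,), Vn: (N, n) from the rank-n SVD of Y1^T.
    Y2: (N, M) new data increment.
    Returns U_bar, s_bar, V_bar such that
    [ (Y1^T)_n ; Y2^T ] = U_bar @ diag(s_bar) @ V_bar.T
    """
    L, n = Un.shape
    M = Y2.shape[1]
    # (13): QR of the part of Y2 orthogonal to span{Vn}.
    resid = Y2 - Vn @ (Vn.T @ Y2)
    Q, R = np.linalg.qr(resid)             # reduced QR: Q (N, m), R (m, M)
    m = Q.shape[1]
    # (15): SVD of the small (n+M) x (n+m) matrix.
    K = np.block([[np.diag(sn), np.zeros((n, m))],
                  [Y2.T @ Vn,   R.T]])
    Ut, st, Vtt = np.linalg.svd(K, full_matrices=False)
    # (14), (16): rotate the small factors back to the full space.
    blk = np.block([[Un,               np.zeros((L, M))],
                    [np.zeros((M, n)), np.eye(M)]])
    U_bar = blk @ Ut
    V_bar = np.hstack([Vn, Q]) @ Vtt.T
    return U_bar, st, V_bar

# Hypothetical sizes and random data, for checking the algebra only.
rng = np.random.default_rng(1)
N, L, M, n = 30, 40, 5, 3
Y1 = rng.standard_normal((N, L))
Y2 = rng.standard_normal((N, M))
U, s, Vt = np.linalg.svd(Y1.T, full_matrices=False)
Un, sn, Vn = U[:, :n], s[:n], Vt[:n].T
Y1n_T = (Un * sn) @ Vn.T                   # (Y1^T)_n, i.e. (Phi A)^T in (11)

U_bar, s_bar, V_bar = incremental_update(Un, sn, Vn, Y2)
target = np.vstack([Y1n_T, Y2.T])          # what a batch SVD would factor

# Choose n' by the 99% criterion and form the new BFs as in (19).
lam = s_bar**2 / (L + M)
n_new = int(np.searchsorted(np.cumsum(lam) / lam.sum(), 0.99) + 1)
Phi_bar = np.hstack([Y1n_T.T, Y2]) @ U_bar[:, :n_new]
```

Only the $n$ retained singular triplets and the new increment enter the update, so the dominant cost comes from the $N \times M$ QR step and the small SVD rather than from the accumulated history.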

After the spatial BFs are updated, the corresponding temporal coefficients can be updated according to (34), followed by reidentification of the low-order temporal model in (35). At this point, the whole time-space synthesis has been inherited and updated online for reconstructing the system dynamics and predicting future outputs in real time. By incremental learning, the modeling structure evolves continually as new increments of spatial measurements are generated through the whole life cycle of DPSs. Hence, this evolving structure is capable of tracking and adapting to the system's dynamics online.

C. Computational Complexity of Incremental Modeling

The first step of the proposed incremental modeling is the QR decomposition of $[(I - V_n V_n^T) Y_2]_{N \times M}$ in (13), which requires approximately $O(NM^2)$ flops. The next step is the SVD of the smaller matrix $\begin{bmatrix} \Sigma_n & 0 \\ Y_2^T V_n & R^T \end{bmatrix}_{(n+M) \times (n+m)}$, which requires approximately $O((n+m)(n+M)^2)$ flops. In many applications, the number of dominant BFs $n$ is much smaller than the other parameters, that is, $n \ll \{m, N, M\}$, and $m \le \min\{N, M\}$. Neglecting the contribution of the initialization step, the total time complexity of the incremental learning procedure is on the order of $O(NM^2)$, depending on the length $M$ of the new data increment. In contrast, the batch-mode method requires approximately $O((L+M)^3)$ flops to perform the KLD of the new $(L+M) \times (L+M)$ correlation matrix. Hence, the computational complexity of incremental modeling will be much lower than that of the batch-mode method, since the historical data length $L$ in online mode grows continuously, which leads to $\{N, M\} \ll L$ in many practical cases.
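To make the comparison concrete, the leading-order flop estimates above can be evaluated for some sample sizes. The helper names and the values of $N$, $M$, $n$, and $L$ below are hypothetical, and constant factors are ignored, so the printed ratios indicate orders of magnitude only.

```python
def incremental_flops(N, M, n, m):
    """Leading-order cost: QR in (13) plus SVD of the small matrix in (15)."""
    return N * M**2 + (n + m) * (n + M)**2

def batch_flops(L, M):
    """Leading-order cost of the KLD of the (L+M) x (L+M) correlation matrix."""
    return (L + M)**3

# Hypothetical problem sizes: N sensors, increments of M steps, n retained BFs.
N, M, n = 20, 50, 5
m = min(N, M)
for L in (500, 5000, 50000):
    ratio = batch_flops(L, M) / incremental_flops(N, M, n, m)
    print(f"L = {L:6d}: batch / incremental ~ {ratio:.1e}")
```

The incremental estimate does not depend on $L$, so the ratio grows cubically as history accumulates.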

D. Main Advantages of Incremental Modeling

1) Online Computation and Database Update: It deals with a continuous sequence of spatial measurements and processes the streaming data as it arrives in real time rather than waiting for the end of the sequence, without any requirement to keep the previous measurements.

2) Reduced Complexity and Memory Requirements: The incremental learning process requires approximately $O(NM^2)$ flops and $O(NM)$ memory units, in comparison to the $O((L+M)^3)$ flops and $O(N(L+M))$ memory units required by the batch-mode method.

3) Adaptiveness: It develops a continually inherited and updated time-space synthesis according to new increments of output data, which can track and adapt to the system's dynamics in real time.

IV. SIMULATION EXPERIMENTS

In order to evaluate the proposed incremental spatiotemporal modeling methodology, the benchmark distributed process of a catalytic rod is studied. At each time step $t_j$, the spatial measurement vector $y_j = y(x, t_j)$ is acquired. Suppose that during the time period between steps $t_L$ and $t_{L+M}$, we collect the new increment of output data $Y_2 = [y_{L+1}, \ldots, y_{L+M}]$. The existing time-space synthesis learned from the historical data $Y_1 = [y_1, \ldots, y_L]$ at $t_L$ shall be transformed into an updated version at $t_{L+M}$ through incremental learning of the new data set $Y_2$. The up-to-date synthesis is used to reconstruct the system's output and predict the system's dynamics in the future. Then, the updating process is repeated whenever the next new increment of output data arrives. In this incremental way, the time-space synthesis is inherited and updated recursively, resulting in online modeling and prediction in real time.

To demonstrate the performance of the proposed incremental modeling methodology, we compare it with the traditional batch-mode modeling method. In batch mode, the spatial BFs and temporal coefficients are computed directly from all the training data from the initial state to the present, assuming that the previous data $Y_1$ remain accessible. All the algorithms are implemented in MATLAB R2013a running on Windows 7 with an Intel Core i5-4590 3.30 GHz processor and 4 GB of RAM. All the experimental results presented in this paper are averaged over 100 runs.

Let $y(x,t)$ and $y_n(x,t)$ denote the measured output and the predicted output, respectively. The three performance indexes for evaluating the modeling accuracy are defined as follows.

1) Spatiotemporal error: $e(x,t) = y(x,t) - y_n(x,t)$.

2) Spatial normalized absolute error: $\mathrm{SNAE}(t) = \frac{1}{N}\sum_{i=1}^{N} |e(x_i,t)|$.

3) Root of mean squared error: $\mathrm{RMSE} = \left( \int \sum e(x,t)^2\,dx \Big/ \int dx \sum \Delta t \right)^{1/2}$.
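The three indexes can be evaluated on sampled data as in the following minimal sketch. It assumes uniformly spaced sensors and time steps, so that the RMSE reduces to the root of the mean of $e^2$ over all space-time samples; this discrete reading of the integral definition, as well as the function names and the nested-list data layout, are assumptions of the example.

```python
import math


def spatiotemporal_error(y, y_pred):
    """e(x_i, t_j) = y(x_i, t_j) - y_n(x_i, t_j); each row is one time step."""
    return [[a - b for a, b in zip(row, row_pred)]
            for row, row_pred in zip(y, y_pred)]


def snae(e):
    """SNAE(t_j): mean absolute error over the N sensors at each time step."""
    return [sum(abs(v) for v in row) / len(row) for row in e]


def rmse(e):
    """Root of the mean of e^2 over all space-time samples (uniform sampling)."""
    total = sum(v * v for row in e for v in row)
    count = sum(len(row) for row in e)
    return math.sqrt(total / count)


if __name__ == "__main__":
    # Two time steps, two sensors; the numbers are made up for illustration.
    y = [[1.0, 2.0], [3.0, 4.0]]
    y_pred = [[1.0, 1.0], [2.0, 4.0]]
    e = spatiotemporal_error(y, y_pred)
    print(e, snae(e), rmse(e))
```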

A. Case: Catalytic Rod

The benchmark PDE system of a catalytic rod, which consists of a long thin rod in a reactor, is shown in Fig. 3. It is a classical and widely investigated transport-reaction process in the chemical industry [39]. A zeroth-order exothermic chemical reaction of the form A → B takes place inside the reactor, where A is the pure species fed into the reactor. A cooling medium in contact with the catalytic rod is used to cool the exothermic process.

Fig. 3. Catalytic rod.

Assume that the species A in the furnace is in excess and that the following parameters of the catalytic rod are constant: density, heat capacity, conductivity, and temperature at both ends. The following parabolic PDE can then be used to describe the spatiotemporal evolution of the rod temperature [39]:

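$$\frac{\partial y(x,t)}{\partial t} = \frac{\partial^2 y(x,t)}{\partial x^2} + \beta_T\left(e^{-\frac{\gamma}{1+y}} - e^{-\gamma}\right) + \beta_u\left(b^T(x)u(t) - y(x,t)\right) \tag{20}$$

subject to the Dirichlet boundary and initial conditions

$$y(0,t) = 0, \quad y(\pi,t) = 0, \quad y(x,0) = y_0(x)$$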

where $y(x,t)$ is the rod temperature, $u(t)$ is the temporal input function, and $b(x)$ is the spatial distribution of the input actuators. $\beta_T$ is the heat of reaction, $\beta_u$ is the heat transfer coefficient, and $\gamma$ denotes the activation energy. The process parameters are often set as

$$\beta_T = 50, \quad \beta_u = 2, \quad \gamma = 4.$$

There are four input actuators $u(t) = [u_1(t), \ldots, u_4(t)]^T$ with the spatial distribution function $b(x) = [b_1(x), \ldots, b_4(x)]^T$, where

$$b_i(x) = H\left(x - (i-1)\pi/4\right) - H\left(x - i\pi/4\right), \quad i = 1, \ldots, 4$$

and $H(\cdot)$ is the standard Heaviside function. To gather informative data and persistently excite the full spectrum of the nonlinear system's dynamics, the input signals are implemented as a series of sinusoidal functions with different phases, $u_i(t) = 1.1 + 5\sin(t/2 + i/10)$, $i = 1, \ldots, 4$. The number of sensors required for modeling depends on both the intrinsic physical system and the modeling accuracy needed in practice. In this case, the system's output $y(x_i,t)$, $i = 1, \ldots, N$, is collected from 18 identical sensors that are uniformly distributed over the spatial domain ($N = 18$).
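The paper does not state how (20) is integrated numerically. As one possible way to generate such data, the sketch below combines the actuator distribution and the input signals given above with an explicit Euler, central-difference scheme. The convention $H(0) = 1$, the coarse grid placed at assumed sensor positions $x_i = i\pi/19$, the internal step `dt`, the zero initial profile, and all function names are assumptions of this example. The reaction term is singular at $y = -1$, which the sketch does not guard against.

```python
import math

BETA_T, BETA_U, GAMMA = 50.0, 2.0, 4.0   # process parameters from the text
N_ACT = 4                                 # number of actuators


def heaviside(z):
    """Heaviside step; the value H(0) = 1 is a convention chosen here."""
    return 1.0 if z >= 0.0 else 0.0


def b(x):
    """Actuator distribution b_i(x) = H(x - (i-1)pi/4) - H(x - i*pi/4)."""
    return [heaviside(x - (i - 1) * math.pi / 4) - heaviside(x - i * math.pi / 4)
            for i in range(1, N_ACT + 1)]


def u(t):
    """Excitation inputs u_i(t) = 1.1 + 5 sin(t/2 + i/10)."""
    return [1.1 + 5.0 * math.sin(t / 2 + i / 10) for i in range(1, N_ACT + 1)]


def simulate(y0, xs, t_end, dt=1e-3):
    """Integrate (20) from t = 0 to t_end on the uniform interior nodes xs.

    Explicit Euler in time, central differences in space, and zero
    Dirichlet boundary values. Returns the node temperatures at t_end.
    """
    dx = xs[1] - xs[0]
    bx = [b(x) for x in xs]
    y, t = list(y0), 0.0
    for _ in range(round(t_end / dt)):
        ut = u(t)
        y_pad = [0.0] + y + [0.0]            # append the boundary values
        new = []
        for i, yi in enumerate(y):
            diffusion = (y_pad[i] - 2.0 * yi + y_pad[i + 2]) / dx ** 2
            reaction = BETA_T * (math.exp(-GAMMA / (1.0 + yi)) - math.exp(-GAMMA))
            actuation = BETA_U * (sum(bi * ui for bi, ui in zip(bx[i], ut)) - yi)
            new.append(yi + dt * (diffusion + reaction + actuation))
        y, t = new, t + dt
    return y


if __name__ == "__main__":
    xs = [(i + 1) * math.pi / 19 for i in range(18)]   # assumed sensor positions
    profile = simulate([0.0] * 18, xs, t_end=0.1)
    print([round(v, 3) for v in profile])
```

In practice a finer spatial grid than the sensor grid, or an implicit scheme, would usually be preferred; the explicit step must satisfy the usual stability limit $\Delta t \le \Delta x^2 / 2$.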

The noise-free streaming data are generated continually from (20) and sampled at the time interval $\Delta t = 0.01$. The initial condition $y_0(x)$ is set to the steady state under the input $u_i(t) = 1.1$, $i = 1, \ldots, 4$. White Gaussian noise with zero mean and standard deviation $\sigma(x_i) = A_d(x_i)\,n_d$, where $A_d(x_i) = (\max(y(x_i,t)) - \min(y(x_i,t)))/3$, $i = 1, \ldots, N$, and $n_d = 2\%$, is added to the noise-free streaming data to obtain the noisy output. The streaming output data are collected for updating the time-space synthesis at the time interval $\Delta_c t = 10$. That is, the original spatial BFs and the temporal model are computed when the first 1000 output data have been collected at time $t = 10$. New data are then added to the existing time-space synthesis in increments of 1000 at times $t = 20, 30, \ldots$, so the time-space synthesis is inherited and updated through incremental learning every time the next 1000 data are collected. At these moments, the up-to-date time-space synthesis is used to reconstruct the system's output from the initial state to the present. To verify the online modeling performance, it is also used to predict the 1000 output data over the next time interval $\Delta_c t$. In this way, the incremental spatiotemporal model is trained and tested online in real time and can adapt to the system's dynamics.

Fig. 4. Measured output for a period of the online process.
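The noise model described above can be sketched as follows. The function name, the nested-list layout (one record per sensor), and the choice to compute the maximum and minimum over the whole available record are assumptions of this example; the paper does not specify the time window over which $A_d(x_i)$ is evaluated.

```python
import random


def add_measurement_noise(Y, n_d=0.02, seed=None):
    """Add zero-mean Gaussian noise with sigma(x_i) = A_d(x_i) * n_d.

    Y : list of N sensor records, each a list of samples over time.
    A_d(x_i) = (max_t y(x_i, t) - min_t y(x_i, t)) / 3, as in the text.
    """
    rng = random.Random(seed)
    noisy = []
    for record in Y:
        amplitude = (max(record) - min(record)) / 3.0
        sigma = amplitude * n_d
        noisy.append([v + rng.gauss(0.0, sigma) for v in record])
    return noisy


if __name__ == "__main__":
    clean = [[0.0, 1.5, 3.0, 1.5], [2.0, 2.0, 2.0, 2.0]]  # made-up records
    print(add_measurement_noise(clean, seed=0))
```

A record with no variation, such as the second one in the usage example, receives no noise because its amplitude $A_d$ is zero.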

As a short example, the measured output $y(x,t)$ for $t \in (0, 100)$ is shown in Fig. 4. In the experiment, three dominant spatial BFs are selected, since they capture more than 99% of the system's energy at all times. As shown in Fig. 5, the three BFs $\{\phi_i(x)\}_{i=1}^{3}$ are updated every $\Delta_c t = 10$. It can be observed that the first spatial basis oscillates between two sets of values, while the changes in the second and third bases become progressively smaller as the online process continues.

For an intuitive comparison, we use the RMSE as the performance index for the modeling accuracy of the proposed incremental modeling and the traditional batch-mode method. At each time step when 1000 new data are added ($t = 20, 30, \ldots$), we compute the RMSE on the training data from the initial state to the present and the RMSE on the 1000 testing data over the next time interval $\Delta_c t$. To illustrate the necessity of updating the BFs in the online environment, we also compute the modeling errors when the BFs are never updated. The modeling accuracy comparison is shown in Table I. The table shows that the reconstruction error of the incremental method on the training data always stays very close to that of the batch-mode method: the difference in reconstruction error between the two methods is less than 1%. Meanwhile, the testing errors of the two methods for predicting future outputs are comparable. These two indexes indicate that the incremental modeling method achieves almost the same modeling accuracy as the traditional batch-mode method.


Fig. 5. First three dominant BFs derived by the proposed incremental modeling. (a) $\phi_1(x)$. (b) $\phi_2(x)$. (c) $\phi_3(x)$.

TABLE I
MODELING ACCURACY COMPARISON BETWEEN THE TRADITIONAL BATCH UPDATING AND THE PROPOSED INCREMENTAL UPDATING METHODS

On the other hand, the running time for updating the time-space synthesis is taken as the performance index for computational efficiency. As shown in Fig. 6, the running time of the batch-mode method increases superlinearly over time, since its time complexity is $O((L+M)^3)$ and the online process produces a growing amount of historical output data of length $L$. By contrast, the running time of the incremental modeling increases very slowly. This advantage stems from its time complexity of $O(NM^2)$, which depends on the data increment length $M$ rather than the historical data length $L$. This index shows that the incremental modeling is computationally much more efficient than the batch-mode method.

Fig. 6. Running time (s) comparison between the traditional batch-mode modeling and the proposed incremental-mode modeling.

Fig. 7. Predicted output of incremental modeling on training data.

For a more intuitive demonstration and comparison of model training, we present the predicted output $y_n(x,t)$, the spatiotemporal error $e(x,t)$, and the spatial normalized absolute error $\mathrm{SNAE}(t)$ on the training data $t \in (0, 100)$ in Figs. 7–9, respectively. Similarly, to further verify the performance of model testing, the measured output $y(x,t)$, the predicted output $y_n(x,t)$, the spatiotemporal error $e(x,t)$, and the spatial normalized absolute error $\mathrm{SNAE}(t)$ on a new set of 2000 testing data are illustrated in Figs. 10–13, respectively. These results show that the proposed incremental modeling performs as well as the traditional batch-mode method and provides a very close approximation to the original system.

Combined with the theoretical analysis in Section III, it can be concluded that the expected modeling accuracy and computational gains are indeed achieved. In terms of modeling accuracy, the proposed incremental method for inheriting and updating the time-space synthesis gives an extremely close approximation to the traditional batch-mode method. At the same time, the proposed incremental modeling has the advantages of saving much computational effort, adapting to online processes, and not needing to store the previous data.

Fig. 8. Comparison of the spatiotemporal error of (a) incremental-mode and (b) batch-mode methods on training data.

Fig. 9. Comparison of the spatial normalized absolute error of incremental-mode and batch-mode methods on training data.

Fig. 10. Measured output for the testing.

Fig. 11. Predicted output of incremental modeling on testing data.

Fig. 12. Comparison of the spatiotemporal error of (a) incremental-mode and (b) batch-mode methods on testing data.

Remark 1: The time interval for updating the time-space synthesis, denoted as $\Delta_c t$, is a hyperparameter of the incremental modeling algorithm. In the experiment, we evaluate the model training and testing performance of the incremental modeling algorithm at time $t = 100$ for different settings of the updating time interval. As shown in Fig. 14, the incremental modeling algorithm achieves almost the same performance for the different updating time intervals. In practice, the updating time interval can be adjusted according to the process requirements.


Fig. 13. Comparison of the spatial normalized absolute error of incremental-mode and batch-mode methods on testing data.

Fig. 14. Modeling error of the proposed incremental algorithm with respect to the updating time interval.

V. CONCLUSION

An incremental spatiotemporal learning scheme is proposed in this paper for online modeling of DPSs. It is based on recursive updating of the spatial BFs and the corresponding temporal model through incremental learning of new data sets from streaming data. In this way, the time-space synthesis is inherited and updated throughout the whole life cycle of the online process. The proposed incremental method achieves almost the same modeling accuracy as the traditional batch-mode method. Meanwhile, it is computationally much more efficient, since it does not need to retrain the whole model structure from scratch when new output data arrive, and it does not need to store the entire set of process data. The adaptive nature of this methodology makes it promising for online modeling of DPSs over the whole life cycle. The proposed concept of incremental learning is expected to have broad applications in many fields, including modeling, optimal sensor placement, and predictive control of DPSs. Experimental results demonstrate the viability, efficiency, and potential of this incremental-mode approach for online modeling of distributed processes. Future implementation in various engineering problems is under consideration.

APPENDIX

SPATIOTEMPORAL MODELING

A. Time-Space Separation

For time-space separation of the PDE system (1), KLD [28], [29], a data-based model reduction method for representing a stochastic field with the lowest dimension, is widely used to calculate the empirical eigenfunctions and derive accurate reduced-order approximations of many PDE systems [5]–[11]. For simplicity, assume that the system output $\{y(x_i,t)\}_{i=1,t=1}^{N,L}$, referred to as “snapshots,” is uniformly sampled in both time and space, where $L$ is the time length. Define the inner product, norm, and ensemble average as $(f_1(x), f_2(x)) = \int_\Omega f_1(x) f_2(x)\,dx$, $\|f_1(x)\| = (f_1(x), f_1(x))^{1/2}$, and $\langle f_1(x,t) \rangle = (1/L)\sum_{t=1}^{L} f_1(x,t)$.

Motivated by Fourier series, the spatiotemporal variable $y(x,t)$ can be expanded onto an infinite number of orthonormal spatial BFs $\{\phi_i(x)\}_{i=1}^{\infty}$ with temporal coefficients $\{a_i(t)\}_{i=1}^{\infty}$

$$y(x,t) = \sum_{i=1}^{\infty} \phi_i(x) a_i(t). \tag{21}$$

Because the spatial BFs are orthonormal, i.e.,

$$(\phi_i(x), \phi_j(x)) = \int_\Omega \phi_i(x)\phi_j(x)\,dx = \begin{cases} 0, & i \neq j \\ 1, & i = j \end{cases} \tag{22}$$

the temporal coefficients can be obtained from

$$a_i(t) = (\phi_i(x), y(x,t)), \quad i = 1, \ldots, \infty. \tag{23}$$
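The projection onto an orthonormal basis, followed by reconstruction from a finite number of terms, can be illustrated with the minimal sketch below. It uses the sine functions $\sqrt{2/\pi}\,\sin(ix)$ on $(0, \pi)$ purely as a stand-in orthonormal basis; the KLD basis functions in this paper are learned from data instead. The rectangle-rule inner product on a uniform grid and all function names are assumptions of the example.

```python
import math


def inner(f_vals, g_vals, dx):
    """Discrete inner product (f, g) ~ sum_k f(x_k) g(x_k) dx on a uniform grid."""
    return sum(f * g for f, g in zip(f_vals, g_vals)) * dx


def sine_basis(i, xs):
    """Stand-in basis function sqrt(2/pi) sin(i x), orthonormal on (0, pi)."""
    return [math.sqrt(2.0 / math.pi) * math.sin(i * x) for x in xs]


def project(y_vals, basis, dx):
    """Temporal coefficients a_i = (phi_i, y) for one snapshot."""
    return [inner(phi, y_vals, dx) for phi in basis]


def reconstruct(coeffs, basis):
    """Finite expansion: sum of phi_i(x) a_i over the retained terms."""
    return [sum(a * phi[k] for a, phi in zip(coeffs, basis))
            for k in range(len(basis[0]))]


if __name__ == "__main__":
    K = 18
    dx = math.pi / (K + 1)
    xs = [(k + 1) * dx for k in range(K)]          # interior grid points
    basis = [sine_basis(i, xs) for i in (1, 2, 3)]
    # A made-up snapshot composed of the first and third basis functions.
    y = [2.0 * p1 - 0.5 * p3 for p1, p3 in zip(basis[0], basis[2])]
    a = project(y, basis, dx)
    print([round(v, 6) for v in a])
    print(max(abs(u - v) for u, v in zip(y, reconstruct(a, basis))))
```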

For practical use, the expansion can be truncated to a finite-dimensional version

$$y_n(x,t) = \sum_{i=1}^{n} \phi_i(x) a_i(t) \tag{24}$$

where $y_n(x,t)$ denotes the $n$th-order approximation.

Time-space separation aims to compute the most dominant spatial BFs $\{\phi_i(x)\}_{i=1}^{n}$ from the spatiotemporal output $\{y(x_i,t)\}_{i=1,t=1}^{N,L}$ using KLD. The typical $\{\phi_i(x)\}_{i=1}^{n}$ can be found by minimizing the following objective function:

$$\min_{\phi_i(x)} \left\langle \|y(x,t) - y_n(x,t)\|^2 \right\rangle \tag{25}$$

subject to $(\phi_i, \phi_i) = 1$, $\phi_i \in L^2(\Omega)$, $i = 1, \ldots, n$. The orthonormal constraint $(\phi_i, \phi_i) = 1$ is imposed to ensure that the function $\phi_i(x)$ is unique. The Lagrangian function of this constrained optimization problem is

$$J = \left\langle \|y(x,t) - y_n(x,t)\|^2 \right\rangle + \sum_{i=1}^{n} \lambda_i\left((\phi_i, \phi_i) - 1\right) \tag{26}$$

and the necessary condition of this problem can be computed as

$$\int_\Omega R(x,\zeta)\phi_i(\zeta)\,d\zeta = \lambda_i \phi_i(x), \quad (\phi_i, \phi_i) = 1, \quad i = 1, \ldots, n \tag{27}$$

where $R(x,\zeta) = \langle y(x,t)\,y(\zeta,t) \rangle$ is the spatial two-point correlation function.

The solution of (27) can be obtained by the computationally efficient method of snapshots [28]. The eigenfunction (spatial BF) $\phi_i(x)$ is expressed as a linear combination of the snapshots

$$\phi_i(x) = \sum_{t=1}^{L} \gamma_{it}\, y(x,t). \tag{28}$$


After substituting (28) into (27), the necessary condition becomes

$$\int_\Omega \frac{1}{L}\sum_{t=1}^{L} y(x,t)\,y(\zeta,t) \sum_{k=1}^{L}\gamma_{ik}\,y(\zeta,k)\,d\zeta = \lambda_i \sum_{t=1}^{L}\gamma_{it}\,y(x,t). \tag{29}$$

This eigenvalue problem is then transformed into the simpler $L \times L$ matrix eigendecomposition problem

$$C\gamma_i = \lambda_i \gamma_i \tag{30}$$

where $\gamma_i = [\gamma_{i1}, \ldots, \gamma_{iL}]^T$ is the $i$th eigenvector, and

$$C_{tk} = \frac{1}{L}\int_\Omega y(\zeta,t)\,y(\zeta,k)\,d\zeta \tag{31}$$

is defined as the temporal two-point correlation function. The solution of problem (30) yields the eigenvectors $\gamma_1, \ldots, \gamma_L$, which in turn are used to construct the eigenfunctions $\phi_1(x), \ldots, \phi_L(x)$ in (28). Since the matrix $C$ is symmetric and positive semidefinite, the computed eigenfunctions are orthogonal.
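The method of snapshots in (28)–(31) can be sketched with NumPy as follows. The spatial integral is approximated by a rectangle rule with uniform spacing `dx`; this discretization, the function name `snapshot_kld`, the array layout, and the synthetic two-mode data are assumptions of the example rather than details given in the paper.

```python
import numpy as np


def snapshot_kld(Y, dx, n):
    """Method of snapshots on sampled data (illustrative sketch).

    Y  : (N, L) array whose column t is the snapshot y(x_1..x_N, t)
    dx : uniform sensor spacing used to approximate the spatial integral
    n  : number of dominant basis functions to return (n <= rank of Y)
    """
    L = Y.shape[1]
    # Temporal two-point correlation matrix, a discrete form of (31).
    C = (Y.T @ Y) * dx / L
    lam, G = np.linalg.eigh(C)               # eigenvalues in ascending order
    order = np.argsort(lam)[::-1][:n]        # indices of the n largest
    lam, G = lam[order], G[:, order]
    # Basis functions as linear combinations of the snapshots, as in (28).
    Phi = Y @ G
    # Normalize so that (phi_i, phi_i) ~ sum(phi_i^2) * dx = 1.
    Phi /= np.sqrt(np.sum(Phi ** 2, axis=0) * dx)
    return Phi, lam


if __name__ == "__main__":
    N = 18
    dx = np.pi / (N + 1)
    xs = dx * np.arange(1, N + 1)
    t = np.linspace(0.0, 10.0, 200)
    # Synthetic data with two spatial modes (made up for illustration).
    Y = np.outer(np.sin(xs), np.cos(t)) + 0.3 * np.outer(np.sin(2 * xs), np.sin(3 * t))
    Phi, lam = snapshot_kld(Y, dx, n=2)
    A = Phi.T @ Y * dx                        # temporal coefficients a_i(t)
    print("max reconstruction error:", np.max(np.abs(Phi @ A - Y)))
```

The eigenproblem here has size $L \times L$, which is the quantity that grows in batch mode as more snapshots accumulate.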

Denote the maximum number of nonzero eigenvalues as $K \le \min(N, L)$. Let the eigenvalues $\lambda_1 > \lambda_2 > \cdots > \lambda_K$ and the corresponding eigenfunctions $\phi_1(x), \phi_2(x), \ldots, \phi_K(x)$ be arranged in descending order of eigenvalue magnitude. The eigenfunction corresponding to the first eigenvalue is considered the most “energetic.” The total “energy” of the PDE system is taken as the sum of the eigenvalues, and the energy percentage assigned to each eigenfunction based on its associated eigenvalue is

$$E_i = \lambda_i \Big/ \sum_{j=1}^{K} \lambda_j, \quad i = 1, \ldots, K. \tag{32}$$

Usually, a set of eigenfunctions sufficient to capture more than 99% of the system's energy is used to determine the reduced order $n$ in (24). Experience shows that an expansion onto only a small set of dominant spatial BFs can approximate most of the dynamics of many spatiotemporal systems. For any arbitrary set of spatial BFs $\{\varphi_i(x)\}_{i=1}^{n}$, the following result holds [30]:

$$\sum_{i=1}^{n} \left\langle (y(\cdot,t), \phi_i)^2 \right\rangle = \sum_{i=1}^{n} \lambda_i \ge \sum_{i=1}^{n} \left\langle (y(\cdot,t), \varphi_i)^2 \right\rangle. \tag{33}$$

This shows that KLD is optimal on average in the class of representations by linear combination, which is why KLD can provide the lowest dimension $n$.
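The energy criterion for choosing the order $n$ can be written as a short routine. The sketch below assumes the eigenvalues are already sorted in descending order; the function names and the example eigenvalues are made up for illustration.

```python
def energy_fractions(eigenvalues):
    """E_i = lambda_i / sum_j lambda_j for eigenvalues sorted in descending order."""
    total = sum(eigenvalues)
    return [lam / total for lam in eigenvalues]


def select_order(eigenvalues, threshold=0.99):
    """Smallest n whose leading eigenvalues capture more than `threshold` energy."""
    cumulative = 0.0
    for n, fraction in enumerate(energy_fractions(eigenvalues), start=1):
        cumulative += fraction
        if cumulative > threshold:
            return n
    return len(eigenvalues)


if __name__ == "__main__":
    lams = [8.0, 1.5, 0.45, 0.04, 0.01]   # hypothetical eigenvalues
    print(energy_fractions(lams))
    print(select_order(lams))
```

With these hypothetical eigenvalues, the first two modes capture 95% of the energy and the first three capture 99.5%, so three modes are retained.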

B. Temporal Model Identification

After the optimal spatial BFs $\{\phi_i(x)\}_{i=1}^{n}$ are learned by time-space separation, the low-order temporal model of $a_i(t)$ is identified from the decomposed low-dimensional data. The temporal coefficients $a_i(t)$ corresponding to the spatiotemporal output $y(x,t)$ are computed from (21) as

$$a_i(t) = (\phi_i(x), y(x,t)), \quad i = 1, \ldots, n. \tag{34}$$

The time series data $a(t)$ are usually approximated by a deterministic NARX model [31]

$$a(t) = F(a(t-1), \ldots, a(t-d_a), u(t-1), \ldots, u(t-d_u)) + e(t) \tag{35}$$

where $d_u$ and $d_a$ denote the maximum input and output lags, respectively, and $e(t)$ denotes the residual error. The unknown function $F$ can be estimated from the low-dimensional input–output data set $\{u(t), a(t)\}_{t=1}^{L}$ using various function approximators, such as radial BFs (RBFs), polynomial functions, wavelets, and kernel functions [32]. After identification, the model (35) can provide a prediction $a(t)$ at any time $t$ if the initial conditions are given. Combined with (24), this reduced-order model can reconstruct and predict the spatiotemporal dynamics over the entire time-space domain.
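Whatever approximator is chosen for $F$, the data must first be arranged into lagged regressors and targets. A minimal sketch of that bookkeeping is given below; the function name and the flattened regressor ordering are assumptions of the example.

```python
def narx_regressors(a, u, d_a, d_u):
    """Build (regressor, target) pairs for the NARX form (35).

    a : list of coefficient vectors a(t), each a list of length n
    u : list of input vectors u(t), each a list of length m
    Each regressor is [a(t-1), ..., a(t-d_a), u(t-1), ..., u(t-d_u)] flattened,
    and the matching target is a(t).
    """
    start = max(d_a, d_u)                 # first t with all lags available
    pairs = []
    for t in range(start, len(a)):
        phi = []
        for lag in range(1, d_a + 1):
            phi.extend(a[t - lag])
        for lag in range(1, d_u + 1):
            phi.extend(u[t - lag])
        pairs.append((phi, a[t]))
    return pairs


if __name__ == "__main__":
    a = [[0.0], [1.0], [2.0], [3.0]]       # made-up scalar coefficients
    u = [[10.0], [11.0], [12.0], [13.0]]   # made-up scalar inputs
    for phi, target in narx_regressors(a, u, d_a=2, d_u=1):
        print(phi, "->", target)
```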

In this paper, the temporal model is assumed to take the simplified form

$$a(t) = Ba(t-1) + F(a(t-1)) + Du(t-1) + e(t) \tag{36}$$

where the matrices $B \in \mathbb{R}^{n\times n}$ and $D \in \mathbb{R}^{n\times m}$ denote the linear part and the function $F: \mathbb{R}^n \to \mathbb{R}^n$ denotes the nonlinear part. NNs are capable of approximating any continuous function to arbitrary accuracy and have been widely investigated for various industrial processes [7], [8], [33]–[36]. In the temporal identification stage, $F$ is estimated by an RBF network, so the model (36) is rewritten as

$$a(t) = Ba(t-1) + WK(a(t-1)) + Du(t-1) + e(t) \tag{37}$$

where $W = [W_1, \ldots, W_l] \in \mathbb{R}^{n\times l}$ denotes the weights, $K(\cdot) = [K_1(\cdot), \ldots, K_l(\cdot)]^T: \mathbb{R}^n \to \mathbb{R}^l$ denotes the RBFs, and $l$ is the number of neurons. The RBF is usually selected as the Gaussian kernel $K_i(a) = \exp\left(-(a - c_i)^T \Sigma_i^{-1} (a - c_i)/2\right)$, $i = 1, \ldots, l$, with a proper center vector $c_i \in \mathbb{R}^n$ and norm matrix $\Sigma_i \in \mathbb{R}^{n\times n}$. With the KLD as a preprocessor, the size of the temporal model can be greatly reduced. The unknown parameters $B$, $D$, and $W$ of the hybrid RBF network can be estimated by the recursive least squares method [7]. Finally, this time-space synthesis can be used to reconstruct the spatiotemporal dynamics and predict the future outputs of the system.
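Because (37) is linear in $B$, $W$, and $D$ once the RBF centers and widths are fixed, these parameters can be estimated by least squares. The sketch below uses an ordinary batch least-squares solve in place of the recursive least squares method cited above, fixes the centers in advance, and takes $\Sigma_i = \text{width}^2 I$; these choices, the function names, and the array layout are assumptions of the example.

```python
import numpy as np


def rbf_features(a, centers, width):
    """Gaussian RBFs K_i(a) with Sigma_i = width^2 * I (an assumption)."""
    d2 = np.sum((centers - a) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * width ** 2))


def fit_temporal_model(A, U, centers, width):
    """Batch least-squares estimate of B, W, D in (37).

    A : (L, n) temporal coefficients a(t); U : (L, m) inputs u(t).
    """
    n = A.shape[1]
    l = centers.shape[0]
    K = np.array([rbf_features(a, centers, width) for a in A[:-1]])
    X = np.hstack([A[:-1], K, U[:-1]])            # regressors at time t - 1
    theta, *_ = np.linalg.lstsq(X, A[1:], rcond=None)
    B = theta[:n].T                               # (n, n) linear part
    W = theta[n:n + l].T                          # (n, l) RBF weights
    D = theta[n + l:].T                           # (n, m) input matrix
    return B, W, D


def predict_next(a_prev, u_prev, B, W, D, centers, width):
    """One-step prediction a(t) = B a(t-1) + W K(a(t-1)) + D u(t-1)."""
    return B @ a_prev + W @ rbf_features(a_prev, centers, width) + D @ u_prev
```

Repeated calls to `predict_next`, each feeding the previous prediction back in, give a multi-step forecast of $a(t)$, which can then be combined with the spatial BFs as in (24) to predict the spatiotemporal output.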

REFERENCES

[1] H.-X. Li and C. Qi, “Modeling of distributed parameter systems for applications—A synthesized review from time-space separation,” J. Process Control, vol. 20, no. 8, pp. 891–901, Sep. 2010.

[2] J.-W. Wang and H.-N. Wu, “Exponential pointwise stabilization of semilinear parabolic distributed parameter systems via the Takagi–Sugeno fuzzy PDE model,” IEEE Trans. Fuzzy Syst., vol. 26, no. 1, pp. 155–173, Feb. 2018.

[3] J.-W. Wang, S.-H. Tsai, H.-X. Li, and H.-K. Lam, “Spatially piecewise fuzzy control design for sampled-data exponential stabilization of semilinear parabolic PDE systems,” IEEE Trans. Fuzzy Syst., to be published, doi: 10.1109/TFUZZ.2018.2809686.

[4] J.-W. Wang, H.-X. Li, and H.-N. Wu, “A membership-function-dependent approach to design fuzzy pointwise state feedback controller for nonlinear parabolic distributed parameter systems with spatially discrete actuators,” IEEE Trans. Syst., Man, Cybern., Syst., vol. 47, no. 7, pp. 1486–1499, Jul. 2017.

[5] C. Qi and H.-X. Li, “A time/space separation-based Hammerstein modeling approach for nonlinear distributed parameter processes,” Comput. Chem. Eng., vol. 33, no. 7, pp. 1247–1260, Jul. 2009.


[6] M. Wang, H.-X. Li, X. Chen, and Y. Chen, “Deep learning-based model reduction for distributed parameter systems,” IEEE Trans. Syst., Man, Cybern., Syst., vol. 46, no. 12, pp. 1664–1674, Dec. 2016.

[7] C. Qi and H.-X. Li, “Nonlinear dimension reduction based neural modeling for distributed parameter processes,” Chem. Eng. Sci., vol. 64, no. 19, pp. 4164–4170, Oct. 2009.

[8] Z. Liu and H.-X. Li, “A spatiotemporal estimation method for temperature distribution in lithium-ion batteries,” IEEE Trans. Ind. Informat., vol. 10, no. 4, pp. 2300–2307, Nov. 2014.

[9] K.-K. Xu, H.-X. Li, and H.-D. Yang, “Kernel-based random vector functional-link network for fast learning of spatiotemporal dynamic processes,” IEEE Trans. Syst., Man, Cybern., Syst., to be published, doi: 10.1109/TSMC.2017.2694018.

[10] K.-K. Xu, H.-X. Li, and Z. Liu, “ISOMAP-based spatiotemporal modeling for lithium-ion battery thermal process,” IEEE Trans. Ind. Informat., vol. 14, no. 2, pp. 569–577, Feb. 2018.

[11] R. Zhang, J. Tao, R. Lu, and Q. Jin, “Decoupled ARX and RBF neural network modeling using PCA and GA optimization for nonlinear distributed parameter systems,” IEEE Trans. Neural Netw. Learn. Syst., vol. 29, no. 2, pp. 457–469, Feb. 2018.

[12] D. Coca and S. A. Billings, “Identification of finite dimensional models of infinite dimensional dynamical systems,” Automatica, vol. 38, no. 11, pp. 1851–1865, Nov. 2002.

[13] H. He, S. Chen, K. Li, and X. Xu, “Incremental learning from stream data,” IEEE Trans. Neural Netw., vol. 22, no. 12, pp. 1901–1914, Dec. 2011.

[14] G. A. Carpenter and S. Grossberg, “The ART of adaptive pattern recognition by a self-organizing neural network,” Computer, vol. 21, no. 3, pp. 77–88, Mar. 1988.

[15] R. Elwell and R. Polikar, “Incremental learning of concept drift in nonstationary environments,” IEEE Trans. Neural Netw., vol. 22, no. 10, pp. 1517–1531, Oct. 2011.

[16] Z. Wang, C. Chen, H.-X. Li, D. Dong, and T.-J. Tarn, “A novel incremental learning scheme for reinforcement learning in dynamic environments,” in Proc. 12th World Congr. Intell. Control Autom., Guilin, China, Jun. 2016, pp. 2426–2431, doi: 10.1109/WCICA.2016.7578530.

[17] D. A. Ross, J. Lim, R.-S. Lin, and M.-H. Yang, “Incremental learning for robust visual tracking,” Int. J. Comput. Vis., vol. 77, nos. 1–3, pp. 125–141, May 2008.

[18] S. Yang and X. Yao, “Population-based incremental learning with associative memory for dynamic environments,” IEEE Trans. Evol. Comput., vol. 12, no. 5, pp. 542–561, Oct. 2008.

[19] D. Kulic, C. Ott, D. Lee, J. Ishikawa, and Y. Nakamura, “Incremental learning of full body motion primitives and their sequencing through human motion observation,” Int. J. Robot. Res., vol. 31, no. 3, pp. 330–345, Mar. 2012.

[20] W. Li, H. H. Yue, S. Valle-Cervantes, and S. J. Qin, “Recursive PCA for adaptive process monitoring,” J. Process Control, vol. 10, no. 5, pp. 471–486, Oct. 2000.

[21] A. Varshney, S. Pitchaiah, and A. Armaou, “Feedback control of dissipative PDE systems using adaptive model reduction,” AIChE J., vol. 55, no. 4, pp. 906–918, Apr. 2009.

[22] D. B. Pourkargar and A. Armaou, “Modification to adaptive model reduction for regulation of distributed parameter systems with fast transients,” AIChE J., vol. 59, no. 12, pp. 4595–4611, Dec. 2013.

[23] D. B. Pourkargar and A. Armaou, “APOD-based control of linear distributed parameter systems under sensor/controller communication bandwidth limitations,” AIChE J., vol. 61, no. 2, pp. 434–447, Feb. 2015.

[24] C. Xu, L. Luo, and E. Schuster, “On recursive proper orthogonal decomposition via perturbation theory with applications to distributed sensing in cyber-physical systems,” in Proc. Amer. Control Conf., Baltimore, MD, USA, Jun./Jul. 2010, pp. 4905–4910, doi: 10.1109/ACC.2010.5530923.

[25] C. Xu and E. Schuster, “Model order reduction for high dimensional linear systems based on rank-1 incremental proper orthogonal decomposition,” in Proc. Amer. Control Conf., San Francisco, CA, USA, Jun./Jul. 2011, pp. 2975–2981, doi: 10.1109/ACC.2011.5991522.

[26] C. Xu and E. Schuster, “Low-dimensional modeling of linear heat transfer systems using incremental the proper orthogonal decomposition method,” Asia–Pac. J. Chem. Eng., vol. 8, no. 4, pp. 473–482, Jul./Aug. 2013.

[27] H.-X. Li and C. Qi, “Incremental modeling of nonlinear distributed parameter processes via spatiotemporal kernel series expansion,” Ind. Eng. Chem. Res., vol. 48, no. 6, pp. 3052–3058, Jan. 2009.

[28] L. Sirovich, New Perspectives in Turbulence, 1st ed. New York, NY, USA: Springer, 1991.

[29] J. Baker and P. D. Christofides, “Finite-dimensional approximation and control of non-linear parabolic PDE systems,” Int. J. Control, vol. 73, no. 5, pp. 439–456, Nov. 2000.

[30] P. Holmes, J. L. Lumley, and G. Berkooz, Turbulence, Coherent Structures, Dynamical Systems, and Symmetry. New York, NY, USA: Cambridge Univ. Press, 1998.

[31] I. J. Leontaritis and S. A. Billings, “Input–output parametric models for non-linear systems part I: Deterministic non-linear systems,” Int. J. Control, vol. 41, no. 2, pp. 303–328, 1985.

[32] J. Sjöberg et al., “Nonlinear black-box modeling in system identification: A unified overview,” Automatica, vol. 31, no. 12, pp. 1691–1724, Dec. 1995.

[33] X.-G. Zhou, L.-H. Liu, Y.-C. Dai, W.-K. Yuan, and J. L. Hudson, “Modeling of a fixed-bed reactor using the K-L expansion and neural networks,” Chem. Eng. Sci., vol. 51, no. 10, pp. 2179–2188, May 1996.

[34] N. Smaoui and S. Al-Enezi, “Modelling the dynamics of nonlinear partial differential equations using neural networks,” J. Comput. Appl. Math., vol. 170, no. 1, pp. 27–58, Sep. 2004.

[35] H. Deng, H.-X. Li, and G. Chen, “Spectral-approximation-based intelligent modeling for distributed thermal processes,” IEEE Trans. Control Syst. Technol., vol. 13, no. 5, pp. 686–700, Sep. 2005.

[36] E. Aggelogiannaki and H. Sarimveis, “Nonlinear model predictive control for distributed parameter systems using data driven artificial neural network models,” Comput. Chem. Eng., vol. 32, no. 6, pp. 1225–1237, Jun. 2008.

[37] H. Zha and H. D. Simon, “On updating problems in latent semantic indexing,” SIAM J. Sci. Comput., vol. 21, no. 2, pp. 782–791, 1999.

[38] A. Levy and M. Lindenbaum, “Sequential Karhunen–Loéve basis extraction and its application to images,” IEEE Trans. Image Process., vol. 9, no. 8, pp. 1371–1374, Aug. 2000.

[39] P. D. Christofides, Nonlinear and Robust Control of PDE Systems: Methods and Applications to Transport-Reaction Processes. Boston, MA, USA: Birkhäuser, 2001.

Zhi Wang received the B.E. degree in automation from the Department of Control and Systems Engineering, Nanjing University, Nanjing, China, in 2015. He is currently pursuing the Ph.D. degree in machine learning and intelligent modeling with the Department of Systems Engineering and Engineering Management, City University of Hong Kong, Hong Kong.

His current research interests include machine learning, deep learning, reinforcement learning, and system modeling.

Han-Xiong Li (S’94–M’97–SM’00–F’11) received the B.E. degree in aerospace engineering from the National University of Defense Technology, Changsha, China, in 1982, the M.E. degree in electrical engineering from the Delft University of Technology, Delft, The Netherlands, in 1991, and the Ph.D. degree in electrical engineering from the University of Auckland, Auckland, New Zealand, in 1997.

He is a Professor with the Department of Systems Engineering and Engineering Management, City University of Hong Kong, Hong Kong. He has a broad experience in both academia and industry. He has authored two books and seven patents, and published over 200 SCI journal papers with an H-index of 37 (Web of Science). His current research interests include process modeling and control, system intelligence, distributed parameter systems, and battery management systems.

Dr. Li was a recipient of the Distinguished Young Scholar (overseas) by the China National Science Foundation in 2004, the Chang Jiang Professorship by the Ministry of Education, China, in 2006, and the National Professorship in China Thousand Talents Program in 2010. He serves as an Associate Editor for the IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS: SYSTEMS, IEEE TRANSACTIONS ON CYBERNETICS from 2002 to 2016, and IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS from 2009 to 2015. He serves as a Distinguished Expert for Hunan Government and the China Federation of Returned Overseas Chinese.
