Large-Scale Optical Reservoir Computing for Spatiotemporal Chaotic Systems Prediction

Mushegh Rafayelyan,¹ Jonathan Dong,¹,² Yongqi Tan,¹ Florent Krzakala,² and Sylvain Gigan¹

¹Laboratoire Kastler Brossel, Sorbonne Université, École Normale Supérieure-Paris Sciences et Lettres (PSL) Research University, Centre National de la Recherche Scientifique (CNRS) UMR 8552, Collège de France, 24 rue Lhomond, 75005 Paris, France
²Laboratoire de Physique de l'École Normale Supérieure, Université Paris Sciences et Lettres (PSL), Centre National de la Recherche Scientifique (CNRS), Sorbonne Université, Université Paris-Diderot, Sorbonne Paris Cité, 24 rue Lhomond, 75005 Paris, France.

Reservoir computing is a relatively recent computational paradigm that originates from recurrent neural networks and is known for its wide range of implementations using different physical technologies. Large reservoirs are very hard to obtain in conventional computers, as both the computational complexity and the memory usage grow quadratically. We propose an optical scheme performing reservoir computing over very large networks of up to 10^6 fully connected photonic nodes, thanks to its intrinsic properties of parallelism. Our experimental studies confirm that, in contrast to conventional computers, the computation time of our optical scheme is only linearly dependent on the number of photonic nodes of the network, a dependence due to electronic overheads, while the optical part of the computation remains fully parallel and independent of the reservoir size. To demonstrate the scalability of our optical scheme, we perform for the first time predictions on large multidimensional chaotic datasets, using the Kuramoto-Sivashinsky equation as an example of a spatiotemporal chaotic system. Our results are extremely challenging for conventional Turing-von Neumann machines, and they significantly advance the state of the art of unconventional reservoir computing approaches in general.

I. INTRODUCTION

Recent studies in machine learning have shown that large neural networks can dramatically improve performance; however, their realization with conventional computing technologies is to date a significant challenge. Towards this end, a number of alternative computing approaches have emerged recently. Among them, one of the most studied is reservoir computing (RC). RC is a relatively recent computational framework [1, 2] derived from independently proposed Recurrent Neural Network (RNN) models, such as echo state networks (ESNs) [3] and liquid state machines (LSMs) [4]. The main objective of ESNs and LSMs was a significant simplification of the RNN training algorithm by using fixed random injection and fixed internal connectivity matrices. However, it was rapidly understood that the temporally fixed connections allow for the straightforward implementation of RC in optics, electronics, spintronics, mechanics, biology, and other fields [5-12]. Optics is one of the most promising fields in which to realize large and efficient neural networks, owing to its intrinsic parallelism, its ability to process data at the speed of light, and its low energy consumption.

There are many interesting approaches to realizing photonic reservoir networks, based on both time- and spatial-multiplexing of photonic nodes. The first approach is based on a single nonlinear node with time-delayed optoelectronic or all-optical feedback, yielding time-multiplexed virtual nodes in the temporal domain [12-24]. Such architectures can reach supercomputer performance, e.g., gigabyte-per-second data rates for chaotic time-series prediction tasks [25] or a million words per second for speech recognition tasks [26]. However, their information processing rate is inherently limited, as it is inversely proportional to the number of virtual nodes of the reservoir. Furthermore, preprocessing of the input information is required, according to the initially defined virtual nodes, which can add complexity to the problem, especially for large multidimensional inputs. To this end, multi-channel delay-based RC architectures consisting of several nonlinear nodes are of special interest [27-31].

Another popular approach to photonic RC is based on spatially distributed nonlinear nodes, endowed with the intrinsic ability to process large-scale input information without sacrificing computation speed. Several theoretical and experimental studies have been performed using on-chip silicon photonics reservoirs consisting of optical waveguides, splitters, and combiners [32-35]. As reported in [35], even a 16-node reservoir network of modest size can reach high information processing bitrates, up to speeds > 100 Gbit s^-1. Another approach towards spatially extended photonic reservoirs is based on a network of vertical-cavity surface-emitting lasers (VCSELs) and a standard diffractive optical element (DOE) providing the complex interconnections between the reservoir nodes [36]. The bias current of each laser can be controlled individually, which allows the encoding of the input data.

Recently, a new approach to spatially scalable photonic reservoirs has been introduced, based on liquid crystal spatial light modulators (SLMs) and digital micromirror devices (DMDs) [37-40]. In particular, Bueno et al. in [37] demonstrated a reservoir network of up to 2500

arXiv:2001.09131v1 [physics.optics] 24 Jan 2020


diffractively coupled photonic nodes using a liquid crystal SLM coupled with a DOE and a camera. The input and output information in their network is provided via single nodes. This last limitation was lifted by Dong et al. in [38] using a DMD to encode both the reservoir and the input information through binary intensity modulation of the light. Later, Dong et al. in [39] implemented the same approach to obtain large-scale optical reservoir networks using a phase-only SLM that could provide 8-bit encoding of the reservoir and input information through the spatial phase profile of the light instead of the former binary encoding. We stress that the key element in both aforementioned optical networks was the strongly scattering medium, which guaranteed random coupling weights for a very large number of photonic nodes and their parallel processing. Such networks can practically host as many nodes as the number of pixels provided by the DMD and the camera [41, 42].

In this work we exploit the potential of the platform provided by [38, 39] to extend our recent achievements towards predictions of large multidimensional chaotic systems. Accordingly, we report on the first experimental realization of the recently introduced state-of-the-art benchmark test [43], performing recursive predictions on Kuramoto-Sivashinsky (KS) chaotic systems. To highlight the scalability of our approach, we measure the computation time of similar reservoir networks realized either on a high-end conventional computer or with our optical scheme. In contrast to conventional computers, where the computation time scales quadratically with the size of the network, the computation time of our optical scheme is almost independent of the number of photonic nodes. More precisely, we observe a relatively mild linear dependence due to electronic overheads, while the optical computation remains fully parallel and independent of the reservoir size. Our results are hardly reachable by conventional Turing-von Neumann machines, and they significantly advance the state of the art of unconventional reservoir computing approaches in general.

II. CONVENTIONAL RESERVOIR COMPUTING

We now briefly introduce the concept of conventional RC. An input vector i(t) of dimension D_in is injected into a high-dimensional dynamical system called the "reservoir" (see Fig. 1(a)). The reservoir is described by a vector r(t) of dimension D_res, the number of reservoir nodes. The initial state of the reservoir is defined randomly. Let the matrix W_res define the internal connections of the reservoir nodes and the matrix W_in define the connections between the input and the reservoir nodes. Both matrices are initialized randomly and held fixed during the whole RC process. The state of each reservoir node is a scalar r_j(t), which evolves according to the following recursive relation

FIG. 1. Sketch of the conventional reservoir computing paradigm in the (a) training and (b) prediction phases. The vectors i(t), r(t) and o(t) describe the injected input, the corresponding reservoir states and the trained output, respectively. The three layers of the network are connected by the W_in, W_res and W_out interconnection matrices. The first two are initialized randomly and held fixed throughout the whole computation process, while the last is trained by linear regression. In the prediction phase, the feedback loop from the predicted output defines the next injected input.

r(t + Δt) = f[W_in i(t) + W_res r(t)],    (1)

where Δt is the discrete time step of the input and f is an element-wise nonlinear function. According to Eq. (1), the reservoir is a high-dimensional dynamical system endowed with a unique memory property: each successive state of the reservoir contains some exponentially decaying information about its previous states and about all inputs injected up to that moment. Notably, the memory capacity of the reservoir is mainly determined by the number of reservoir nodes and the nonlinear activation function f.
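The recursion of Eq. (1) is straightforward to sketch numerically. The following minimal illustration (not the authors' code) assumes tanh as the element-wise nonlinearity and Gaussian random matrices; the dimensions are arbitrary illustrative values:

```python
import numpy as np

rng = np.random.default_rng(0)
D_in, D_res = 64, 1000          # input and reservoir dimensions (illustrative)

# Fixed random injection and internal connectivity matrices of Eq. (1).
W_in = rng.normal(scale=0.1, size=(D_res, D_in))
W_res = rng.normal(scale=1.0 / np.sqrt(D_res), size=(D_res, D_res))

def reservoir_step(r, i, f=np.tanh):
    """One update r(t + dt) = f(W_in i(t) + W_res r(t))."""
    return f(W_in @ i + W_res @ r)

r = np.zeros(D_res)             # initial state (here zero for simplicity)
i = rng.normal(size=D_in)
r = reservoir_step(r, i)
```

The 1/sqrt(D_res) scaling of W_res is a common heuristic to keep the reservoir dynamics away from saturation; the actual spectral-radius tuning is task-dependent.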

During the training phase, the input i(t), defined in the time interval −T ≤ t ≤ 0, is fed to the reservoir, and the corresponding reservoir states are recursively calculated. The final step of the information processing is a simple linear regression that adjusts the W_out weights so that their linear combination with the calculated reservoir states makes the actual output õ(t) as close as possible to the desired output o(t):

RMSE = sqrt[ (1 / (D_out T)) Σ_{t=−T}^{0} ||õ(t) − o(t)||^2 ],    (2)

where

õ(t) = W_out · r(t),    (3)

W_out = argmin(RMSE).    (4)

Here RMSE is the root mean square error, and D_out is the number of output nodes, i.e., the dimension of the vector o(t). An additional regularization term λ||W_out||^2 (λ a scalar) can be added to the objective of Eq. (4) to avoid overfitting, especially when the number of reservoir nodes is larger than the number of training examples. Note that the output weights are the only parameters modified during training. The random input and reservoir weights are fixed throughout the whole computational process; they serve to randomly project the input into a high-dimensional space, which increases the linear separability of inputs.
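With the regularization term, the training of Eqs. (2)-(4) reduces to ridge regression, which has a closed-form solution. A minimal sketch with hypothetical shapes and random stand-in data (not the authors' implementation):

```python
import numpy as np

rng = np.random.default_rng(1)
D_res, D_out, T = 500, 64, 2000          # illustrative dimensions

R = rng.normal(size=(T, D_res))          # collected reservoir states r(t), one row per t
O = rng.normal(size=(T, D_out))          # desired outputs o(t)
lam = 0.07                               # regularization strength λ

# Closed-form ridge solution: W_out = (R^T R + λ I)^{-1} R^T O, transposed
# so that each output is õ(t) = W_out r(t).
W_out = np.linalg.solve(R.T @ R + lam * np.eye(D_res), R.T @ O).T

O_hat = R @ W_out.T                      # trained outputs õ(t)
rmse = np.sqrt(np.mean((O_hat - O) ** 2))
```

Since the regularized objective at the minimizer is no larger than at W_out = 0, the fitted RMSE is always bounded by the RMSE of the trivial zero predictor.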


In order to predict the future evolution of i(t) for t > 0 using the reservoir states r(t) calculated in −T ≤ t ≤ 0, one trains the output weights W_out to predict the next time step of the input, namely o(t) = i(t + Δt). Afterwards, the future evolution of i(t) for t > 0 can be predicted by replacing the input with the subsequent prediction õ(t), as shown in Fig. 1(b). Consequently, during the prediction phase the reservoir evolves step by step, each time replacing the next input with the last prediction.
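This closed-loop scheme of Fig. 1(b) can be sketched as a short autonomous-prediction loop. Here `reservoir_step` and `W_out` are assumed to come from an already trained model; the trivial linear "reservoir" at the bottom is purely hypothetical, included only so the loop can be run:

```python
import numpy as np

def predict(reservoir_step, W_out, r0, n_steps):
    """Autonomous prediction: feed each output o(t) back in as the next input."""
    r, preds = r0, []
    for _ in range(n_steps):
        o = W_out @ r                # one-step prediction of i(t + dt)
        preds.append(o)
        r = reservoir_step(r, o)     # the prediction replaces the input
    return np.array(preds)

# Toy check with a trivial linear "reservoir" (illustration only, not a real RC model).
D = 4
W_out = np.eye(D)
step = lambda r, i: 0.5 * r + 0.5 * i
traj = predict(step, W_out, np.ones(D), n_steps=10)
```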

III. OPTICAL RESERVOIR COMPUTING

The experimental setup used to perform optical RC is shown in Fig. 2 and detailed in the Appendix. The key optical components in the setup are the phase-only SLM, the scattering medium, and the camera. The SLM encodes both the input vector i(t) of dimension D_in and the current reservoir state r(t) of dimension D_res (total dimension D_in + D_res) into the spatial phase profile of the light. The scattering medium ensures their random linear mixing, which is equivalent to multiplication by large dense random matrices consisting of independent and identically distributed (i.i.d.) random complex variables [44, 45] (see more details about light scattering in the Appendix). Finally, the camera performs a nonlinear readout of the complex field intensity to obtain the next reservoir state r(t + Δt), which is sent back by the computer to the SLM to be displayed with the new input, and the process repeats. The upper and lower insets in Fig. 2 are examples of images displayed on the SLM and detected by the camera, respectively.

There are a number of tunable parameters in the encoding of the input and reservoir states onto the SLM, which we describe here. Without loss of generality, we assume that the number of grey levels of the camera and of the SLM is 256. The SLM is calibrated such that grey levels from 0 to 255 map linearly to phase delays of 0 to 2π. Furthermore, we assume without loss of generality that the whole input dataset is initially scaled from 0 to 255 and that the acquisition time of the camera is adjusted to provide unsaturated reservoir states, again ranging from 0 to 255. Accordingly, the encoding of the input and reservoir states onto the SLM can be described by i(t) → s_in i(t) and r(t) → s_res r(t), with two scaling factors 0 ≤ s_in/res ≤ 1. These rescalings are performed in the computer each time before sending the input and reservoir states to the SLM. Additionally, each scalar value from the input and reservoir states can be encoded into multiple SLM pixels forming a macropixel. The number of pixels in one macropixel is denoted by p_in for the input encoding and p_res for the reservoir-state encoding. Accordingly, reservoir computing in our optical scheme can be described by the following recursive relation

r(t + Δt) = F[s_res r(t) ⊕ J_p_res , s_in i(t) ⊕ J_p_in],    (5)

FIG. 2. Experimental setup for optical reservoir computing. The SLM receives from the computer the current input i(t) concatenated with the reservoir state r(t) and imprints it onto the spatial phase profile of the reflected beam (see the upper inset for a typical example). The scattering medium (SM) provides a complex linear mixing of the whole encoded information. Finally, the camera performs a nonlinear readout for the next reservoir state r(t + Δt) (see the lower inset for a typical example), which is sent by the computer back to the SLM to be displayed with the new input, and the process repeats. LP1, LP2: linear polarizers; HWP: half-wave plate; BE: beam expander; BS: beam splitter; O1, O2: objectives.

where the function F stands for the whole optical setup, i.e., it takes the encoded matrices corresponding to the input and the reservoir state as two arguments, sends them to the SLM, and returns the next reservoir state detected by the camera. The symbol ⊕ refers to the Kronecker product, and J_p_in/res is the all-ones matrix with p_in/res rows and columns, ensuring the macropixel encoding on the SLM.

To give a more detailed description of our optical scheme, we also provide a mathematical relation that models the light propagation and the consequent RC with well-known mathematical functions:

r(t + Δt) = f[W_res g(r(t)) + W_in g(i(t))],    (6)

where W_res and W_in are random dense matrices describing the scattering of the light in the setup, and f and g are nonlinear functions associated with the intensity readout by the camera and the phase encoding by the SLM, respectively. Namely, for a vector q = [q_1, q_2, ...]^T, f(q) = [|q_1|^2, |q_2|^2, ...]^T and g(q) = [exp(iπs q_1), exp(iπs q_2), ...]^T with 0 ≤ s ≤ 2. Note that all of the above operations are implicitly included in the function F in Eq. (5).
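A numerical emulation of Eqs. (5)-(6) is easy to write down. Below is a sketch (an idealized model, not the actual hardware) in which complex i.i.d. Gaussian matrices stand in for the scattering medium, g is the phase encoding, f the intensity readout, and the macropixel replication uses a 1-D analogue of the Kronecker product with the all-ones vector; all dimensions are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
D_in, D_res = 8, 100          # illustrative dimensions
p_in, p_res = 4, 2            # macropixel sizes (1-D replication for simplicity)
s_in = s_res = 0.5            # encoding scale factors

# Complex i.i.d. Gaussian matrices standing in for the scattering medium (Eq. (6)).
n_in, n_res = D_in * p_in, D_res * p_res
W_in = (rng.normal(size=(D_res, n_in)) + 1j * rng.normal(size=(D_res, n_in))) / np.sqrt(n_in)
W_res = (rng.normal(size=(D_res, n_res)) + 1j * rng.normal(size=(D_res, n_res))) / np.sqrt(n_res)

g = lambda q: np.exp(1j * np.pi * q)   # SLM phase encoding
f = lambda q: np.abs(q) ** 2           # camera intensity readout

def optical_step(r, i):
    """One update of Eqs. (5)-(6): scale, replicate into macropixels,
    phase-encode, scatter, and read out the intensity."""
    r_enc = np.kron(s_res * r, np.ones(p_res))   # macropixel replication of r(t)
    i_enc = np.kron(s_in * i, np.ones(p_in))     # macropixel replication of i(t)
    return f(W_res @ g(r_enc) + W_in @ g(i_enc))

r_next = optical_step(np.zeros(D_res), rng.random(D_in))
```

The intensity readout guarantees a real, non-negative next state, mirroring what the camera delivers in the experiment.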

The mathematical framework describing our optical network is very similar to the conventional RC network


provided by Eq. (1). The main difference is that an additional nonlinear function, a complex exponential, is applied in Eq. (6) to account for the phase encoding of the SLM. One can also note that W_res and W_in here are complex-valued matrices, in contrast to conventional RC, where the connection matrices are real-valued. Accordingly, Eq. (5) and Eq. (6) together give the whole picture of information processing in our optical scheme.

During the training phase, as soon as the reservoir states for the given time interval −T ≤ t ≤ 0 are optically calculated, a simple linear regression is executed on the conventional computer to adjust the W_out weights such that their linear combination with the calculated reservoir states makes the actual output as close as possible to the next time step of the input, i(t + Δt) (see Eqs. (2)-(4)). Finally, to predict the future evolution of i(t) for t > 0, we close a feedback loop from the output to the input by replacing the next input i(t + Δt) on the SLM with the one-step prediction W_out r(t), as was done in conventional RC in Fig. 1(b).

In general, RC and its various optical implementations have proven very successful for tasks such as spoken digit recognition, the temporal XOR task, and Santa Fe, MG or NARMA time-series prediction [5, 9, 11, 13, 17, 27, 46]. Recently, Pathak et al. [43, 47] proposed a new state-of-the-art benchmark test performing predictions on KS spatiotemporal chaotic datasets with conventional RC (see more details about the KS equation in the Appendix). In the next section, we use the optical RC setup of Fig. 2 to predict the dynamical evolution of KS spatiotemporal chaotic systems.

IV. EXPERIMENTAL RESULTS

Initially, we apply the optical RC to spatiotemporal KS datasets with a set of parameters similar to those reported in [43]. Namely, the spatial domain size of the scalar field u(x, t) is L = 22 in the KS equation (see Eq. (7) in the Appendix), which is integrated on a grid of N_x = 64 equally spaced spatial points and N_t = 90500 equally spaced time steps with Δt = 0.25, using an open-source code from [48]. The first 9·10^4 time steps of the dataset are used to train the optical reservoir, while the remaining 500 time steps are kept to be compared with the predicted data. The input and reservoir sizes are D_in = 64 and D_res = 10^4, respectively.

In general, it is believed that the optimum prediction performance of RC schemes is reached when the reservoir computer parameters are tuned to the edge of chaos [49]. Accordingly, before starting the actual experiment, we perform a grid search to optimize the set of tunable parameters in our optical scheme. It turns out that the optimal prediction performance is observed when s_res = s_in = 0.5, i.e., the input and reservoir states are encoded between 0 and 128, thus providing a phase modulation of the light from 0 to π. Furthermore, the

FIG. 3. Experimental Kuramoto-Sivashinsky spatiotemporal chaotic dataset prediction by optical reservoir computing. The spatial domain size of the chaotic system is L = 22. The number of photonic nodes in the reservoir is D_res = 10^4. (a) Actual data. (b) Reservoir prediction. (c) Error: panel (a) minus panel (b). t = 0 corresponds to the start of the prediction in the test phase. Each unit on the temporal axis represents the Lyapunov time, defined by the largest Lyapunov exponent Λ_max and detailed in the Appendix.

macropixel sizes are taken as p_res = 64 and p_in = 10000 to ensure equal importance ratios between the input and reservoir states encoded on the SLM. Consequently, during the RC process, the total number of pixels occupied on the SLM by the input and reservoir states together is p_res D_res + p_in D_in = 128·10^4. We also apply a slight regularization with λ = 0.07 during the linear regression (see Eqs. (2)-(4)). Noteworthy, the nonlinear activation function provided by the camera intensity readout could also be tuned within the grid-search process. There are two relatively simple options to tune the nonlinear readout that we could explore in the future: changing the camera gain parameter as an analog solution, or applying an additional nonlinear function in the computer to the detected camera image as a numerical solution. Both approaches may improve the performance of our optical scheme, but for the sake of simplicity we kept the basic nonlinearity provided by the system, which already gives good results.

Fig. 3 shows an example of the true KS dataset (panel (a)), the corresponding prediction (panel (b)), and their difference (panel (c)). As can be seen, the optical reservoir network predicts the dynamical evolution of the KS dataset with excellent accuracy for up to two Lyapunov times. The Lyapunov time is a characteristic quantity of chaotic dynamical systems, defining the minimum amount of time for two infinitesimally close states


FIG. 4. (a) Normalized root mean square errors (NRMSE) calculated for 100 sets of training and testing KS datasets with the same problem parameters as in Fig. 3. (b) The mean NRMSE obtained by averaging panel (a) along its vertical axis.

of the system to diverge by a factor of e. The latter is defined by the largest Lyapunov exponent Λ_max, and in this particular case Λ_max = 0.043 (see the Appendix and Table I). Furthermore, for quantitative analysis, we repeat the experiment of Fig. 3 for 100 different sets of training and testing datasets. The RMSE value for each testing sample is calculated and normalized by the RMSE of a random prediction, i.e., one in which õ(t) is a random matrix with the same dimensions as o. Accordingly, a normalized RMSE (NRMSE) value close to one means that the network performs no better than a random prediction. Fig. 4 shows the NRMSE curves for each testing sample (panel (a)) and the mean NRMSE curve averaged over all 100 samples (panel (b)). We note that the prediction performance varies significantly depending on the test sample, as seen in Fig. 4(a). This effect is related to the RC algorithm in general and is addressed in [50].
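The normalization described above can be written compactly. A sketch with random stand-in data (our reading of the normalization; the exact distribution of the random baseline prediction is an assumption):

```python
import numpy as np

def nrmse(pred, target, rng):
    """RMSE of a prediction divided by the RMSE of a random prediction of the
    same shape; values near 1 mean no better than chance."""
    rmse = np.sqrt(np.mean((pred - target) ** 2))
    random_pred = rng.uniform(target.min(), target.max(), size=target.shape)
    rmse_rand = np.sqrt(np.mean((random_pred - target) ** 2))
    return rmse / rmse_rand

rng = np.random.default_rng(3)
target = np.sin(np.linspace(0, 10, 500))        # stand-in "true" signal
perfect = nrmse(target, target, rng)            # exact prediction -> NRMSE = 0
chance = nrmse(rng.uniform(-1, 1, 500), target, rng)   # random prediction -> NRMSE ~ 1
```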

Although the prediction results of Fig. 3 and Fig. 4 indicate the potential of optical RC to predict large spatiotemporal chaos, we emphasize that for larger problem sizes, i.e., larger values of L, one needs to increase the size of the reservoir to obtain qualitatively similar prediction performance. To this end, we performed experiments applying the same reservoir network hosting D_res = 10^4 photonic nodes to KS datasets with spatial sizes L = 12, 22, 36, 60, and 100. As seen in Fig. 5(a), the prediction performance of optical RC decreases rapidly as the system size L increases. On the other hand, for the given KS dataset of spatial size L = 60, Fig. 5(b) shows that the prediction performance of our optical scheme is recovered by increasing the size of the network. In both plots, the temporal axis is normalized according to the Λ_max = 0.043 corresponding to L = 22; however, we note that the value of the largest Lyapunov exponent depends on the spatial domain size L of the system (see Table I in the Appendix). Finally, the different reservoir dimensions in Fig. 5(a) imply different macropixel encoding sizes in order to maintain

FIG. 5. (a) The mean NRMSE of the predictions of the KS system as a function of time, using the same optical network as in Fig. 3 and Fig. 4 but for different system sizes L = 12, 22, 36, 60, and 100. (b) For the case of L = 60, the prediction performance improves as the number of photonic nodes in the reservoir increases from D_res = 10^4 to D_res = 5·10^4.

the same overall number of encoding pixels on the SLM corresponding to the reservoir states.

Note that the realization of large reservoir networks on conventional computers is not an easy task, since the computation time grows quadratically with the number of network nodes. Therefore, Pathak et al. proposed in [43] a new scheme consisting of a large set of parallel reservoirs of moderate size, each of which predicts a local region of the spatiotemporal chaos. In optical RC, however, we are able to realize large networks thanks to its intrinsic parallelism. As a proof of principle, we performed a number of experiments on our optical scheme for different reservoir sizes and recorded the average time of the reservoir update. We use the same problem parameters as in Fig. 3, but without applying the Kronecker product in Eq. (5), i.e., taking p_res = p_in = 1. Consequently, each pixel of the SLM is one node of the optical network. We also performed numerical computations with conventional RC for the same reservoir sizes.

FIG. 6. The time of one reservoir update period on the conventional computer compared with the proposed optical scheme, for different reservoir sizes. The inset shows a zoom around the crossover point at reservoir size D_res = 0.25·10^5, where conventional RC starts to be slower than optical RC.

Fig. 6 shows that the optical RC is slower than the conventional RC only for small reservoir sizes, D_res < 25000. The situation changes rapidly for large network sizes, since the computation time of optical RC scales with a mild linear dependence on the number of reservoir nodes, in contrast to conventional RC, which exhibits quadratic growth in time. Hence, for large reservoir sizes, our optical network is much faster than conventional reservoir computers. Noteworthy, the optical computation in our setup is inherently parallel, and the linear slope is only due to the limited communication bandwidth from the camera to the SLM. Furthermore, large reservoirs require tremendous amounts of operating memory on conventional computers to store the large random connection matrices W_res and W_in, while our optical scheme can leverage large networks of 10^6 photonic nodes without using large amounts of operating memory. We note that the conventional RC tests were performed on a high-end computer with a latest-generation 14-core Intel processor supported by 64 GB of memory [51]. We emphasize that faster SLMs and cameras are presently available that could considerably lower the absolute computation time of our optical scheme while maintaining its linear dependence on the reservoir size (see more information about the SLM and camera used in our setup in the Appendix).

Finally, we stress that the advantage of our optical scheme over other optical realizations is not only due to the possibility of using a large number of camera and SLM pixels as nodes in the optical network. An important advantage lies in using the complexity of the multiply scattering medium, which corresponds to a random mixing of millions of SLM modes into millions of CCD pixels and allows us to reach such large network sizes [41]. Wavefront shaping techniques have already reached the million-mode milestone, e.g. in [42], where the authors achieved light focusing through a scattering medium with an unprecedented enhancement factor. Relatively large network sizes are also reachable using diffractive optics; for instance, the possibility of reaching up to 30000 nodes has been claimed in [52], however without the all-to-all random connectivity allowed by the complex mixing process.

V. DISCUSSION AND CONCLUSION

To estimate the computing performance our simple setup can reach, we can estimate the average number of operations per second performed during the RC process. As a rough estimate, the optical scheme we propose can host 10^6 photonic nodes in the network (limited by the pixel counts of the SLM and CCD, respectively). One iteration of the network then corresponds to approximately 10^12 elementary mathematical operations in Eq. (1), such as multiplications and sums. Assuming that the SLM and the camera have typical speeds of 100 Hz, our optical setup performs on the order of 10^14 OPS (operations per second). This is not far from the current state of the art of supercomputers, which ranges from 10^15 OPS to 10^17 OPS. Consequently, without significant energy consumption nor a large number of processing units, the optical setup we propose can perform RC close to the performance of supercomputers of current state-of-the-art technology. Note that a similar calculation has been performed using the hardware of LightOn, with a different modulation scheme (binary amplitude modulation), in [38, 53].
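The estimate above is simple enough to check as back-of-the-envelope arithmetic (our restatement of the numbers, not a measurement):

```python
# Rough OPS estimate: Dres photonic nodes, ~Dres^2 elementary operations
# (multiplications and sums) per network iteration, at the frame rate of
# the SLM/camera pair.
D_res = 10**6                      # photonic nodes
ops_per_iter = D_res**2            # ~10^12 operations per iteration of Eq. (1)
frame_rate = 100                   # Hz, typical SLM/camera speed
ops_per_second = ops_per_iter * frame_rate
print(f"{ops_per_second:.0e} OPS")  # -> 1e+14 OPS
```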

Although light propagation in our optical setup provides fully parallel information processing independently of the size of the network, Fig. 6 shows that the electronic feedback from the camera to the SLM is a bottleneck, resulting in a slight linear growth of the overall computation time as the amount of data increases. One way to overcome this might be the use of field-programmable gate arrays (FPGAs) instead of the computer in the setup, to provide information transfer at much larger bandwidths. Furthermore, FPGAs contain an array of programmable logic blocks that can be configured to apply a given complex operation on the data transferred from the camera to the SLM. Another approach that can impact the overall computation speed is based on nonlinear light-matter interactions, where the naturally generated response from the matter can be used as the feedback of the RC network [54-58].

In conclusion, we proposed an optical reservoir computing network that can perform, for the first time to our knowledge, predictions on large multidimensional chaotic datasets. We used the Kuramoto-Sivashinsky equation as an example of a spatiotemporal chaotic system. Our predictions on chaotic systems of large spatial size confirm that, in order to obtain comparable prediction performance, one has to increase the optical network size as well. Finally, we experimentally demonstrated that our optical network can be scaled to a million nodes. Its computation time grows only linearly as the number of nodes increases, due to electronic overheads, while the speed of the optical part (the matrix multiplication) is independent of the reservoir size and does not require any memory storage. Our results, which are very hard to achieve with conventional Turing-von Neumann machines, open the prospect of predictions on very large datasets of practical interest, such as turbulence, at high speed and low energy consumption.

ACKNOWLEDGEMENTS

We acknowledge funding from the Defense Advanced Research Projects Agency (DARPA) under Agreement No. HR00111890042. Sylvain Gigan and Jonathan Dong also acknowledge partial support from the H2020 European Research Council (ERC) (Grant 724473).


VI. APPENDIX: METHODS

A. Experimental setup

The laser beam with 532 nm wavelength is expanded using a beam expander (BE) with 10× optical magnification. The linear polarizer LP1 and the half-wave plate (HWP) are used to polarize the light parallel to the extraordinary axis of the liquid-crystal SLM, to ensure pure phase shaping of the light. The SLM receives from the computer the current input information i(t), concatenated with the reservoir state r(t) at the given moment, and imprints it onto the spatial phase profile of the reflected beam. The light propagates further through the first objective O1, with 10× optical magnification and numerical aperture NA = 0.1, and gets focused on the strongly scattering medium (SM) of approximately 0.5 mm thickness; the scattered light is collected by the second objective O2, with 20× optical magnification and numerical aperture NA = 0.4. The resulting intensity speckle pattern is detected by the CMOS camera. We use a second linear polarizer LP2 in front of the camera, with its polarization axis oriented orthogonally to the initial polarization of the beam, in order to enhance the contrast of the detected speckle pattern. In the final stage, the camera sends the detected speckle pattern back to the computer as the new state of the reservoir, which is then displayed on the SLM together with the new input, and the process repeats. In our experimental setup we used a liquid-crystal SLM from Meadowlark Optics (model: HSP192-532) and a CMOS camera from Basler (model: acA2040-55um), having 1920 × 1152 and 2048 × 1536 pixels, respectively, and providing ∼50 Hz and 64 Hz speeds in fully functioning regimes.

B. Light scattering

When light encounters refractive index inhomogeneities, it gets scattered and its direction of propagation is modified. Light scattering through a thick scattering medium is a complex process accompanied by a tremendously high number of scattering events, and at the exit of the scattering medium one typically observes a speckle pattern. The speckle pattern is the total interference between all complex scattering paths. Thanks to the large number of scattering events, the speckle image is seemingly random and its statistical properties are well characterized [59]. It represents a signature of the particular disordered medium and, for a given incident field, will differ from one scattering sample to another.

Light propagation in the multiple scattering regime is still a linear process. Therefore, the output over a set of detectors for a given set of input sources can be described as the product between the incident electric field and the transmission matrix (TM). The TM is thus characteristic of the particular setup, including the input sources, output detectors, and all the optical elements together with the scattering medium used inside the setup. As shown in [41, 45], the TM is a dense random matrix when a thick disordered medium is placed between the spatial light modulator (SLM) and the camera, and it can be measured experimentally. Nowadays, SLMs and cameras based on silicon photonics can afford a few million pixels, so the TM can reach gigantic sizes. We cannot possibly hope to measure such a large matrix, as it would require a very long measurement time, and it would be impossible to store in the memory of a computer. However, we can leverage the very large dimensionality of the TM without measuring it, by using well-developed algorithms in which the explicit form of the TM is not required [44]. One such algorithm is reservoir computing (RC), which requires large random matrices that are held fixed throughout the whole computation process.
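This picture can be reproduced with a small numerical toy model (an illustrative assumption on our part, not a measured TM): the medium acts as a fixed dense random complex matrix T applied to the incident field, and the camera records the intensity |T x|^2.

```python
import numpy as np

# Toy transmission-matrix model: a fixed dense random complex Gaussian
# matrix T maps SLM modes to camera pixels; the camera detects intensity.
rng = np.random.default_rng(42)
n_in, n_out = 256, 512                       # SLM modes, camera pixels (illustrative)
T = (rng.normal(size=(n_out, n_in))
     + 1j * rng.normal(size=(n_out, n_in))) / np.sqrt(2 * n_in)

phase = rng.uniform(0, 2 * np.pi, n_in)      # phase-only SLM pattern
x = np.exp(1j * phase)                       # unit-amplitude incident field
speckle = np.abs(T @ x) ** 2                 # intensity speckle on the camera

# Fully developed speckle has exponential intensity statistics,
# hence a contrast (std/mean) close to 1.
contrast = speckle.std() / speckle.mean()
```

Reservoir computing only needs this fixed random projection to be applied at each step; the explicit entries of T are never required.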

C. Kuramoto-Sivashinsky time series

The Kuramoto-Sivashinsky (KS) equation is a nonlinear partial differential equation frequently encountered in the study of nonlinear chaotic systems with intrinsic instabilities, such as wave propagation in chemical reaction-diffusion systems, the velocity of laminar flame front instabilities, thin fluid film flow down inclined planes, and hydrodynamic turbulence [60]. Very interestingly, a chimera state, which is an unexpected solution arising in electro-optic delayed dynamical systems, can also be described by the KS equation [18]. The one-dimensional Kuramoto-Sivashinsky partial differential equation is

u_t = −u u_x − u_xx − u_xxxx , (7)

where we assume that the scalar field u = u(x, t) is periodic with period L, u(x + L, t) = u(x, t); thus the solution is defined on the interval [0, L). Note that the dimension of the attractor is determined by the value of L, and the dependence is linear for large values of L. We integrate Eq. (7) on a grid of Q = 64 equally spaced spatial points with time step ∆t = 0.25, as in [43]. The obtained solution contains Q time series, which we denote by the vector u(t) and use as the reservoir input.
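To make this data-generation step concrete, here is a minimal pseudo-spectral integrator for Eq. (7) using the standard ETDRK4 scheme of Kassam and Trefethen; the initial condition and random seed are our illustrative choices, not those used in the paper.

```python
import numpy as np

def ks_simulate(L=22.0, Q=64, dt=0.25, n_steps=200, seed=0):
    """Integrate u_t = -u u_x - u_xx - u_xxxx, Eq. (7), on a periodic
    grid of Q points with the ETDRK4 exponential time-differencing scheme."""
    rng = np.random.default_rng(seed)
    u = 0.1 * rng.standard_normal(Q)             # small random initial field
    v = np.fft.fft(u)
    k = 2 * np.pi * np.fft.fftfreq(Q, d=L / Q)   # wavenumbers
    Lop = k**2 - k**4                            # linear part in Fourier space
    E, E2 = np.exp(dt * Lop), np.exp(dt * Lop / 2)
    # phi-functions evaluated by complex contour averaging (Kassam & Trefethen)
    M = 16
    r = np.exp(1j * np.pi * (np.arange(1, M + 1) - 0.5) / M)
    LR = dt * Lop[:, None] + r[None, :]
    Qc = dt * np.real(np.mean((np.exp(LR / 2) - 1) / LR, axis=1))
    f1 = dt * np.real(np.mean((-4 - LR + np.exp(LR) * (4 - 3 * LR + LR**2)) / LR**3, axis=1))
    f2 = dt * np.real(np.mean((2 + LR + np.exp(LR) * (-2 + LR)) / LR**3, axis=1))
    f3 = dt * np.real(np.mean((-4 - 3 * LR - LR**2 + np.exp(LR) * (4 - LR)) / LR**3, axis=1))
    g = -0.5j * k                                # nonlinear term: -u u_x = -(u^2/2)_x
    N = lambda w: g * np.fft.fft(np.real(np.fft.ifft(w))**2)
    traj = np.empty((n_steps, Q))
    for n in range(n_steps):                     # one ETDRK4 step per iteration
        Nv = N(v)
        a = E2 * v + Qc * Nv
        Na = N(a)
        b = E2 * v + Qc * Na
        Nb = N(b)
        c = E2 * a + Qc * (2 * Nb - Nv)
        Nc = N(c)
        v = E * v + Nv * f1 + 2 * (Na + Nb) * f2 + Nc * f3
        traj[n] = np.real(np.fft.ifft(v))
    return traj

u_series = ks_simulate()   # Q time series u(t), used here as the reservoir input
```

Each row of `u_series` is the vector u(t) at one time step, which plays the role of the reservoir input above.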

The dynamics of chaotic systems can be characterized by a quantity called the Lyapunov exponent, which measures the exponential divergence of initially close trajectories in the phase space of the system. In dynamical systems theory, a phase space is a space in which all possible states of a system are represented as unique points. As is known, the spatial domain size L of the KS system strongly affects its dynamics, thus changing the corresponding largest Lyapunov exponent. We provide in Table I the Λmax values for typical domain sizes, as measured in [61].

TABLE I. The largest Lyapunov exponent for different spatial domain sizes.

L       12      22      36      60      100
Λmax    0.003   0.043   0.080   0.089   0.088
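As a small worked example (our arithmetic, derived from Table I), the inverse of Λmax gives the Lyapunov time, the natural horizon against which prediction performance is usually measured:

```python
# Lyapunov time 1/Lambda_max: the time scale over which forecast errors
# grow by a factor of e, computed from the Table I values.
lyapunov = {12: 0.003, 22: 0.043, 36: 0.080, 60: 0.089, 100: 0.088}
lyapunov_time = {L: 1.0 / lam for L, lam in lyapunov.items()}
# e.g. for L = 22, errors e-fold roughly every 1/0.043 ~ 23 time units,
# i.e. about 93 integration steps at dt = 0.25.
steps_per_lyapunov_time = lyapunov_time[22] / 0.25
```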

[1] D. Verstraeten, B. Schrauwen, M. D'Haene, and D. Stroobandt, An experimental unification of reservoir computing methods, Neural Networks 20, 391 (2007).

[2] M. Lukosevicius and H. Jaeger, Reservoir computing approaches to recurrent neural network training, Computer Science Review 3, 127 (2009).

[3] H. Jaeger, The echo state approach to analysing and training recurrent neural networks - with an erratum note, Bonn, Germany: German National Research Center for Information Technology GMD Technical Report 148, 13 (2001).

[4] W. Maass, T. Natschlager, and H. Markram, Real-time computing without stable states: A new framework for neural computation based on perturbations, Neural Computation 14, 2531 (2002).

[5] G. Tanaka, T. Yamane, J. B. Heroux, R. Nakane, N. Kanazawa, S. Takeda, H. Numata, D. Nakano, and A. Hirose, Recent advances in physical reservoir computing: A review, Neural Networks (2019).

[6] P. Antonik, A. Smerieri, F. Duport, M. Haelterman, and S. Massar, FPGA implementation of reservoir computing with online learning, in 24th Belgian-Dutch Conference on Machine Learning (2015).

[7] C. Donahue, C. Merkel, Q. Saleh, L. Dolgovs, Y. K. Ooi, D. Kudithipudi, and B. Wysocki, Design and analysis of neuromemristive echo state networks with limited-precision synapses, in 2015 IEEE Symposium on Computational Intelligence for Security and Defense Applications (CISDA) (IEEE, 2015) pp. 1-6.

[8] M. Dale, J. F. Miller, S. Stepney, and M. A. Trefzer, Evolving carbon nanotube reservoir computers, in International Conference on Unconventional Computation and Natural Computation (Springer, 2016) pp. 49-61.

[9] C. Fernando and S. Sojakka, Pattern recognition in a bucket, in European Conference on Artificial Life (Springer, 2003) pp. 588-597.

[10] S. Ghosh, A. Opala, M. Matuszewski, T. Paterek, and T. C. Liew, Quantum reservoir processing, npj Quantum Information 5, 35 (2019).

[11] J. Moon, W. Ma, J. H. Shin, F. Cai, C. Du, S. H. Lee, and W. D. Lu, Temporal data classification and forecasting using a memristor-based reservoir computing system, Nature Electronics 2, 480 (2019).

[12] G. Van der Sande, D. Brunner, and M. C. Soriano, Advances in photonic reservoir computing, Nanophotonics 6, 561 (2017).

[13] L. Appeltant, M. C. Soriano, G. Van der Sande, J. Danckaert, S. Massar, J. Dambre, B. Schrauwen, C. R. Mirasso, and I. Fischer, Information processing using a single dynamical node as complex system, Nature Communications 2, 468 (2011).

[14] Y. Paquot, J. Dambre, B. Schrauwen, M. Haelterman, and S. Massar, Reservoir computing: a photonic neural network for information processing, in Nonlinear Optics and Applications IV, Vol. 7728 (International Society for Optics and Photonics, 2010) p. 77280B.

[15] Y. Paquot, F. Duport, A. Smerieri, J. Dambre, B. Schrauwen, M. Haelterman, and S. Massar, Optoelectronic reservoir computing, Scientific Reports 2, 287 (2012).

[16] L. Larger, M. C. Soriano, D. Brunner, L. Appeltant, J. M. Gutierrez, L. Pesquera, C. R. Mirasso, and I. Fischer, Photonic information processing beyond Turing: an optoelectronic implementation of reservoir computing, Optics Express 20, 3241 (2012).

[17] D. Brunner, B. Penkovsky, B. A. Marquez, M. Jacquot, I. Fischer, and L. Larger, Tutorial: Photonic neural networks in delay systems, Journal of Applied Physics 124, 152004 (2018).

[18] J. D. Hart, L. Larger, T. E. Murphy, and R. Roy, Delayed dynamical systems: networks, chimeras and reservoir computing, Philosophical Transactions of the Royal Society A 377, 20180123 (2019).

[19] F. Duport, A. Smerieri, A. Akrout, M. Haelterman, and S. Massar, Fully analogue photonic reservoir computer, Scientific Reports 6, 22381 (2016).

[20] F. Duport, A. Smerieri, A. Akrout, M. Haelterman, and S. Massar, Virtualization of a photonic reservoir computer, Journal of Lightwave Technology 34, 2085 (2016).

[21] R. Martinenghi, S. Rybalko, M. Jacquot, Y. K. Chembo, and L. Larger, Photonic nonlinear transient computing with multiple-delay wavelength dynamics, Physical Review Letters 108, 244101 (2012).

[22] S. Ortın, M. C. Soriano, L. Pesquera, D. Brunner, D. San-Martın, I. Fischer, C. Mirasso, and J. Gutierrez, A unified framework for reservoir computing and extreme learning machines based on a single time-delayed neuron, Scientific Reports 5, 14945 (2015).

[23] Q. Vinckier, F. Duport, A. Smerieri, K. Vandoorne, P. Bienstman, M. Haelterman, and S. Massar, High-performance photonic reservoir computer based on a coherently driven passive cavity, Optica 2, 438 (2015).

[24] B. Schneider, J. Dambre, and P. Bienstman, Using digital masks to enhance the bandwidth tolerance and improve the performance of on-chip reservoir computing systems, IEEE Transactions on Neural Networks and Learning Systems 27, 2748 (2015).

[25] D. Brunner, M. C. Soriano, C. R. Mirasso, and I. Fischer, Parallel photonic information processing at gigabyte per second data rates using transient states, Nature Communications 4, 1364 (2013).

[26] L. Larger, A. Baylon-Fuentes, R. Martinenghi, V. S. Udaltsov, Y. K. Chembo, and M. Jacquot, High-speed photonic reservoir computing using a time-delay-based architecture: Million words per second classification, Physical Review X 7, 011015 (2017).

[27] X. X. Guo, S. Y. Xiang, Y. H. Zhang, L. Lin, A. J. Wen, and Y. Hao, Four-channels reservoir computing based on polarization dynamics in mutually coupled VCSELs system, Optics Express 27, 23293 (2019).

[28] Y.-S. Hou, G.-Q. Xia, E. Jayaprasath, D.-Z. Yue, W.-Y. Yang, and Z.-M. Wu, Prediction and classification performance of reservoir computing system using mutually delay-coupled semiconductor lasers, Optics Communications 433, 215 (2019).

[29] S. Ortın and L. Pesquera, Reservoir computing with an ensemble of time-delay reservoirs, Cognitive Computation 9, 327 (2017).

[30] L. Keuninckx, J. Danckaert, and G. Van der Sande, Real-time audio processing with a cascade of discrete-time delay line-based reservoir computers, Cognitive Computation 9, 315 (2017).

[31] B. Penkovsky, X. Porte, M. Jacquot, L. Larger, and D. Brunner, Coupled nonlinear delay systems as deep convolutional neural networks, arXiv preprint arXiv:1902.05608 (2019).

[32] M. R. Salehi and L. Dehyadegari, Optical signal processing using photonic reservoir computing, Journal of Modern Optics 61, 1442 (2014).

[33] K. Vandoorne, J. Dambre, D. Verstraeten, B. Schrauwen, and P. Bienstman, Parallel reservoir computing using optical amplifiers, IEEE Transactions on Neural Networks 22, 1469 (2011).

[34] K. Vandoorne, W. Dierckx, B. Schrauwen, D. Verstraeten, R. Baets, P. Bienstman, and J. Van Campenhout, Toward optical signal processing using photonic reservoir computing, Optics Express 16, 11182 (2008).

[35] K. Vandoorne, P. Mechet, T. Van Vaerenbergh, M. Fiers, G. Morthier, D. Verstraeten, B. Schrauwen, J. Dambre, and P. Bienstman, Experimental demonstration of reservoir computing on a silicon photonics chip, Nature Communications 5, 3541 (2014).

[36] D. Brunner and I. Fischer, Reconfigurable semiconductor laser networks based on diffractive coupling, Optics Letters 40, 3854 (2015).

[37] J. Bueno, S. Maktoobi, L. Froehly, I. Fischer, M. Jacquot, L. Larger, and D. Brunner, Reinforcement learning in a large-scale photonic recurrent neural network, Optica 5, 756 (2018).

[38] J. Dong, S. Gigan, F. Krzakala, and G. Wainrib, Scaling up echo-state networks with multiple light scattering, in 2018 IEEE Statistical Signal Processing Workshop (SSP) (IEEE, 2018) pp. 448-452.

[39] J. Dong, M. Rafayelyan, F. Krzakala, and S. Gigan, Optical reservoir computing using multiple light scattering for chaotic systems prediction, IEEE Journal of Selected Topics in Quantum Electronics 26, 1 (2019).

[40] U. Paudel, M. Luengo-Kovac, J. Pilawa, T. J. Shaw, and G. C. Valley, Classification of time-domain waveforms using a speckle-based optical reservoir computer, Optics Express 28, 1225 (2020).

[41] S. Rotter and S. Gigan, Light fields in complex media: Mesoscopic scattering meets wave control, Reviews of Modern Physics 89, 015005 (2017).

[42] H. Yu, K. Lee, and Y. Park, Ultrahigh enhancement of light focusing through disordered media controlled by mega-pixel modes, Optics Express 25, 8036 (2017).

[43] J. Pathak, B. Hunt, M. Girvan, Z. Lu, and E. Ott, Model-free prediction of large spatiotemporally chaotic systems from data: A reservoir computing approach, Physical Review Letters 120, 024102 (2018).

[44] A. Saade, F. Caltagirone, I. Carron, L. Daudet, A. Dremeau, S. Gigan, and F. Krzakala, Random projections through multiple optical scattering: Approximating kernels at the speed of light, in 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (IEEE, 2016) pp. 6215-6219.

[45] S. Popoff, G. Lerosey, R. Carminati, M. Fink, A. Boccara, and S. Gigan, Measuring the transmission matrix in optics: an approach to the study and control of light propagation in disordered media, Physical Review Letters 104, 100601 (2010).

[46] N. Bertschinger and T. Natschlager, Real-time computation at the edge of chaos in recurrent neural networks, Neural Computation 16, 1413 (2004).

[47] J. Pathak, Z. Lu, B. R. Hunt, M. Girvan, and E. Ott, Using machine learning to replicate chaotic attractors and calculate Lyapunov exponents from data, Chaos: An Interdisciplinary Journal of Nonlinear Science 27, 121102 (2017).

[48] R. Davidchack, Kuramoto-Sivashinsky simulations, in Chaos: Classical and Quantum, edited by P. Cvitanovic, R. Artuso, R. Mainieri, G. Tanner, and G. Vattay (Niels Bohr Institute, 2012) 14th ed., Chap. 29.

[49] F. Schurmann, K. Meier, and J. Schemmel, Edge of chaos computation in mixed-mode VLSI - a hard liquid, in Advances in Neural Information Processing Systems (2005) pp. 1201-1208.

[50] J. Isensee, G. Datseris, and U. Parlitz, Predicting spatio-temporal time series using dimension reduced local states, arXiv preprint arXiv:1904.06089 (2019).

[51] Dell Precision 7920 Workstation Desktop Tower with 2x Intel Xeon Gold 5120 (14 cores, 2.2-3.7 GHz Turbo, 19 MB cache), 64 GB 2666 MHz DDR4, 2 SSD M.2 2 TB PCIe NVMe.

[52] S. Maktoobi, L. Froehly, L. Andreoli, X. Porte, M. Jacquot, L. Larger, and D. Brunner, Diffractive coupling for photonic networks: how big can we go?, IEEE Journal of Selected Topics in Quantum Electronics 26, 1 (2019).

[53] R. Ohana, J. Wacker, J. Dong, S. Marmin, F. Krzakala, M. Filippone, and L. Daudet, Kernel computations from large-scale random features obtained by optical processing units, arXiv preprint arXiv:1910.09880 (2019).

[54] T. W. Hughes, I. A. Williamson, M. Minkov, and S. Fan, Wave physics as an analog recurrent neural network, arXiv preprint arXiv:1904.12831 (2019).

[55] G. Marcucci, D. Pierangeli, and C. Conti, Theory of neuromorphic computing by waves: machine learning by rogue waves, dispersive shocks, and solitons, arXiv preprint arXiv:1912.07044 (2019).

[56] Y. Zuo, B. Li, Y. Zhao, Y. Jiang, Y.-C. Chen, P. Chen, G.-B. Jo, J. Liu, and S. Du, All optical neural network with nonlinear activation functions, arXiv preprint arXiv:1904.10819 (2019).

[57] T. Yan, J. Wu, T. Zhou, H. Xie, F. Xu, J. Fan, L. Fang, X. Lin, and Q. Dai, Fourier-space diffractive deep neural network, Physical Review Letters 123, 023901 (2019).

[58] X. Guo, T. D. Barrett, Z. M. Wang, and A. Lvovsky, End-to-end optical backpropagation for training neural networks, arXiv preprint arXiv:1912.12256 (2019).

[59] J. W. Goodman, Speckle Phenomena in Optics: Theory and Applications (Roberts and Company Publishers, 2007).

[60] P. Hohenberg and B. I. Shraiman, Chaotic behavior of an extended system, Physica D: Nonlinear Phenomena 37, 109 (1989).

[61] R. A. Edson, J. E. Bunder, T. W. Mattner, and A. J. Roberts, Lyapunov exponents of the Kuramoto-Sivashinsky PDE, The ANZIAM Journal 61, 270 (2019).

