
Forecasting Disruptions in the ADITYA Tokamak

Date post: 10-Apr-2018
  • 8/8/2019 Forecasting Disruptions in the ADITYA Tokamak

    1/16

    Forecasting disruptions in the ADITYA tokamak

    using neural networks

    A. Sengupta, P. Ranjan

    Institute for Plasma Research, Bhat, Gandhinagar, India

    Abstract. A neural network technique has been used to predict disruptions in the ADITYA tokamak. A time series prediction method is employed whereby a series of past values of some time dependent

    quantity is used to predict its value in the future. The time varying observables used in the present

    work are the different diagnostic signals from four Mirnov probes, one soft X ray monitor and one Hα monitor. The predicted quantities are the same observables at some future time. The neural network

    is trained with the past values of the different diagnostic signals as inputs and the future values of the

    same quantities as targets. The trained neural network is used to forecast in a multistep sequence. This

    amounts to a prediction several time steps earlier. Very good prediction results have been obtained

    up to 8 ms earlier with little distortion of the signals and no appreciable time lag, a capability which

    is believed to be well suited to the task of on-line predictions of disruptions in ADITYA. As actual

    experimental signals are used, confidence regarding the performance of the neural network on hardware

    implementation is automatically ensured.

    1. Introduction

    Disruption in tokamaks is a sudden loss of confinement
    and subsequent transfer of plasma energy to the surrounding structures. As a result the machine walls

    and the supporting structures are subjected to enor-

    mous heat load causing moderate to severe damage.

    Disruptions also result in rapid plasma current decay,

    which induces large electric fields that in turn drive

    large eddy currents in the conducting structures and mechanical supports. This results in enormous $\mathbf{j} \times \mathbf{B}$
    forces. The damage caused by these forces determines the lifetime of a machine. Disruption avoidance, or

    minimization of disruptivity, therefore, is important

    for cost effective operation of tokamaks.

    Artificial neural networks (ANNs) have already been used for studying different aspects of tokamak
    plasmas. These include fast estimation of plasma parameters in DIII-D [1], ASDEX Upgrade [2] and
    ITER [3], prediction of disruptions [4–6] and prediction of the vertical position of the plasma
    current centroid [7]. They have also been used to order the magnetic sensors according to their
    importance in the estimation of plasma parameters [2, 3].

    The motivation for using ANNs for prediction of

    disruptions came from the early use of ANNs in

    various forecasting applications [8, 9]. However, the

    ultimate aim of the prediction will be to make an


    attempt to reduce the frequency of disruptions on-

    line in hardware. Therefore, if used as a disruption

    alarm, an ANN should not only give an accurate pre-

    diction of an approaching disruption, but also should

    make this prediction sufficiently early to allow for measures to be taken to soften the impact of
    disruption. In this article, ways to predict plasma disruptions in the ADITYA tokamak [10, 11] are
    discussed, using time series of various time dependent quantities obtained from diagnostics. These
    include fluctuations of the tangential component of the poloidal magnetic field $B_\theta$ as
    measured by Mirnov probes placed at different poloidal locations around the plasma.

    These have been used earlier [4], where only a single

    probe is used as input to the ANN for the prediction.

    The results of that study are not suitable for the goal

    of disruption control, since:

    (a) Large errors are present for predictions more

    than 1.1 ms earlier.

    (b) An increasing time lag appears between the

    actual and the predicted instants of disruption

    as the prediction is made earlier and earlier.

    In a recent work [6], soft X rays have been used as

    inputs instead of magnetic signals, and the prediction
    is made 3.12 ms in advance of the event, which is a 200% improvement over the results of Ref. [4].

    However, the time lag problem persists for predic-

    tions more than 3.12 ms in advance. For effective real

    Nuclear Fusion, Vol. 40, No. 12, © 2000 IAEA, Vienna, p. 1993


    time measures to be taken, this time has to increase

    by at least a factor of 2.

    There are two goals for this article:

    (i) To use an ANN to predict the instant of triggering a disruption.

    (ii) To use an ANN to make the prediction suf-

    ficiently early that measures can be taken to

    soften the impact of disruptions.

    The criterion for (i) above is that the exact instant

    of triggering of the disruptive instabilities should be

    picked up, rather than the instant of current decay,

    because once current quench starts, control measures, even if taken, may prove futile. The triggering

    of instabilities is signalled primarily by:

    (a) Increased MHD activities around the plasma

    edge, primarily the (m,n) = (2, 1) mode, picked

    up by a set of Mirnov coils located around the

    plasma. These immediately precede the thermal

    quench.

    (b) A fall in the soft X ray (SXR) intensity at the plasma core, which immediately follows edge
    cooling.

    (c) Increased Hα emission.

    For (ii), the earliness, i.e. the extent of early prediction, which can be quantified by a time
    interval Δt, is the major issue. This Δt, when applied to disruption avoidance or minimization,
    must be around 5–7 ms for effective measures to be taken.

    The purpose of this article is to find out whether ANN architectures, different from those used
    earlier, and the use of additional diagnostic information help improve upon these results. So in
    addition to several Mirnov probe signals, soft X ray (SXR) and Hα emission signals have also been
    used here. A series of values of the diagnostic signals has been chosen as their

    past values, and a prediction involves a continuation

    of the series. This prediction can be a single time

    step in future, or several time steps. The latter rep-

    resents an earlier forecast, and this earliness can be

    increased by increasing the number of predicted time

    steps. However, since the prediction error increases with the increase in the number of time steps, the

    choice for sufficiently early prediction should neces-

    sarily be within permissible errors.

    The organization of this article is as follows. Section 2 contains a general treatment of time series
    prediction, while Section 3 discusses briefly ANNs and their relation to time series prediction. In
    Section 4 an overview of the different ANN architectures used for time series prediction is given.
    Section 5 shows the preparation of the database, while Section 6 gives our forecasting results in
    detail, Section 7 discusses the results and Section 8 summarizes the results and
    conclusions.

    2. Time series prediction

    A time series [9] basically refers to a set of values

    which are taken to be measurements of an observable over time. The system on which the observable
    is being measured is evolving with time, i.e. it is a dynamical system. The observable is a function
    only of the state of the system; as soon as the system returns to the original state, the observable also

    returns to the original value.

    Let the state of the system at present be represented by $a$ and the observable being measured by
    $p(a)$. It is assumed that state $a$ contains all the information required to predict the state $t$
    time units into the future. Let the state at this future time be $F^{t}(a)$. The prediction refers to
    the calculation of the observable at time $t$ from a knowledge only of the present. Similarly, if one
    goes backwards in time from the present instant, a time series of past values of the observable is
    obtained:

    $$b = [p(a),\, p(F^{-\tau}(a)),\, p(F^{-2\tau}(a)),\, \ldots,\, p(F^{-m\tau}(a))] \qquad (1)$$

    where $\tau$ is the time step length, i.e. the sampling interval of the observable. $b$ is thus a
    segment of a time series where the time dependence is now expressed explicitly:

    $$b = [x_{t_1},\, x_{t_1-\tau},\, \ldots,\, x_{t_1-m\tau}] \qquad (2)$$

    where $x$ is the measured quantity, $x_{t_1} = p(a)$, and $t_1$ is the present time instant.

    Equation (2) is the form of the time series that is generally used [4, 6, 8, 9].

    Prediction means estimating the measured variable at future times, i.e. the continuation of the
    series by way of extrapolation. For the extrapolation, some functional representation of the
    extrapolated (predicted) value is required in terms of the given time series. This should have the
    following form:

    $$x^{\mathrm{pred}}_{t_1+n\tau} = f_n[x_{t_1},\, x_{t_1-\tau},\, \ldots,\, x_{t_1-m\tau}]. \qquad (3)$$

    The left hand side of the above equation gives the predicted value of the dynamical quantity at the
    future time $t_1 + n\tau$ ($n = 1, 2, \ldots$), where again $t_1$ refers to the present. $f_n$ gives
    the functional form for the transformation. The problem, therefore, is


    to find an approximation for $f_n$ to bring about the extrapolation.

    Extrapolation schemes for $f_n$ can be divided into

    two broad categories, linear and non-linear. Linear

    models such as auto-regressive (AR), moving average

    (MA) or auto-regressive moving average (ARMA)

    have been most frequently used for time series anal-

    ysis [9]. These models work well only for simple

    time series and are most likely to fail for stochastic or chaotic series. Analysis of such complex series

    requires a long time history of the series, yielding

    very high order linear models, i.e. models involving

    a very large number of linear terms (corresponding

    to the past temporal points of the series). In practice such high order models are impractical from a

    computational point of view.

    Non-linear techniques, such as the ANN, wavelet

    and chaos analysis can provide good insight into a

    complex time series when linear models fail (Ref. [12]

    and Refs [2–15] therein). The ANN algorithm invokes
    non-linear models that approximate a much broader class of functions than linear models, so that it can

    analyse any complex time series without involving

    large errors due to numerical instabilities.

    3. Artificial neural networks

    The ANN technique, which has its origins as an

    artificial model of the parallel processing capabilities of the human brain, is typically used in
    pattern recognition where a collection of images is presented to the network, and its task is to
    assign the images to one or more classes. Another typical use of the ANN is non-linear regression,
    where the algorithm is used

    to find a smooth interpolation between data points.

    By way of contrast, time series prediction involves

    processing of patterns which evolve over time, the

    response at a particular point of time depending not

    only on the current value of the observable, but also

    on the past. The ANN, of which the multilayer per-

    ceptron (MLP) is the most widely used type, consists

    of several layers of nodes or neurons, and represents

    an analytic mapping between a set of inputs $x_i$ and a set of outputs $y_k$ (shown in Fig. 1, where
    $i$ = 1–5, $k$ = 1–2). The layer(s) not directly accessible to the user, referred to as the hidden
    layer(s), produce the inherent non-linearity in the transformation, and also increase the network's
    ability to model different classes of function. While the sizes of the input and output layers are
    determined by the problem being solved, the size of the hidden layer is determined by trial and
    error, from the training and testing errors. Signals, propagating in the forward direction only,
    i.e. from the input towards the output, impinging on

    a particular neuron j of a hidden layer, are weighted

    by certain factors to give the net input gj to the

    neuron j:

    $$g_j = \sum_{i=0}^{m} w_{ji}\, x_i \qquad (4)$$

    where $x_i$ refers to the output of the $i$th neuron of the input layer, $m$ is the total number of
    input neurons and the weight $w_{ji}$ represents the strength of the connection between the neuron
    $j$ of a hidden layer and the neuron $i$ of the input layer. $i = 0$ corresponds to the bias term,
    whose value is $x_0$. The non-linear function usually chosen for the mapping is a sigmoidal
    function [13], acting on $g_j$, with the form

    $$f(g_j) = \frac{2}{1 + e^{-g_j}} - 1. \qquad (5)$$
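    The weighted sum of Eq. (4) and the symmetric sigmoid of Eq. (5) can be illustrated with a short
    Python sketch. It is not the authors' code: the function names are hypothetical, and the bias input
    is assumed to be x_0 = 1, which the text does not state.

```python
import math


def symmetric_sigmoid(g):
    """Eq. (5): f(g) = 2 / (1 + exp(-g)) - 1, bounded in (-1, +1).

    Algebraically this equals tanh(g / 2).
    """
    return 2.0 / (1.0 + math.exp(-g)) - 1.0


def hidden_layer(x, weights):
    """Eq. (4) followed by Eq. (5) for every neuron j of one hidden layer.

    x       : list of m input values x_1 .. x_m
    weights : one row per hidden neuron, each row [w_j0, w_j1, ..., w_jm]
    """
    x_with_bias = [1.0] + list(x)  # assumption: bias input x_0 = 1
    outputs = []
    for row in weights:
        g_j = sum(w * x_i for w, x_i in zip(row, x_with_bias))  # net input, Eq. (4)
        outputs.append(symmetric_sigmoid(g_j))                   # activation, Eq. (5)
    return outputs
```

    Stacking two such layers and a final output layer would give the MLP-2 structure of Fig. 1.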

    Neural network training refers to an adjustment of

    the weights to achieve the minimization of an error, called the mean square error, defined by

    $$E^2 = \frac{\sum_{k} \sum_{l} \left( y_k^{(l)} - y_k^{(l)\,\mathrm{des}} \right)^2}{N_{\mathrm{out}}\, N_{\mathrm{ex}}} \qquad (6)$$

    where $y_k^{(l)\,\mathrm{des}}$ is the desired value of the $k$th output as determined by the $l$th
    member of a training data set. $N_{\mathrm{out}}$ and $N_{\mathrm{ex}}$ are the total number of
    outputs and examples, respectively, in a given problem. Note that $E^2$ is averaged over all
    examples and all outputs (normalized). Training is stopped when $E^2$ decreases to a pre-defined
    error goal.

    To evaluate the performance of the network, the same network with the correct weights is applied to
    another set of known input/output examples called the test dataset. If the network performance on this

    dataset is satisfactory, it is supposed to have a gener-

    alization capability over any set of similar data, and

    can be used to process the unknown data in those

    data sets.
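    As a small illustration of Eq. (6) and of the stopping rule, the sketch below computes the mean
    square error over a list of examples. The function names and the list-of-lists data layout are
    assumptions made for the example only.

```python
def mean_square_error(outputs, targets):
    """Eq. (6): squared error summed over examples l and outputs k,
    divided by N_out * N_ex.

    outputs, targets : lists of N_ex examples, each a list of N_out values
    """
    n_ex = len(outputs)
    n_out = len(outputs[0])
    total = 0.0
    for y, y_des in zip(outputs, targets):
        for y_k, y_k_des in zip(y, y_des):
            total += (y_k - y_k_des) ** 2
    return total / (n_out * n_ex)


def goal_reached(error, error_goal):
    """Training stops once the error has dropped to the pre-defined goal."""
    return error <= error_goal
```

    The same function can be evaluated on the test dataset to compare training and testing errors.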

    For time series analysis, the inputs to the ANN are the past values of the measured (temporally
    varying) quantity and the output is the predicted value. The more complex the time series, the more
    past information is needed. This results in a larger number of inputs and weights. The $y_k$ in the
    numerator of Eq. (6) are the outputs (ANN calculated and the target) measured at a certain (future)
    time instant and are therefore local in time.

    The functional representation fn as shown in

    Eq. (3) is in general unknown and, for ANN


    Figure 1. Structure of the ANN. This shows a general 5:3:3:2 MLP-2 network. The

    offset bias is not shown.

    modelling, is usually approximated by a sigmoidal function, shown in Eq. (5). Here a very important
    property of ANNs is used, which is the fact that it is only the nature of the function, i.e. whether
    it is linear or non-linear, that determines a transformation, rather than its actual form. It is this
    property which is utilized while defining the inherent non-linearity of the ANN by only certain
    specific forms of sigmoidal functions, while the examples in different problems may involve a broad
    spectrum of non-linear functions. If the time series is multivariate rather than univariate, the
    scalars $x$ and $y$ representing the inputs and outputs are to be replaced by vectors. In that case
    the product in Eq. (4) is also to be substituted by $W \cdot x$.

    4. General methods for the prediction

    There are three possible methods for the prediction of disruption from the past values of a given
    time series, using a feedforward neural network [9]. These methods are used to predict the dynamical
    observable at a future time $t_1 + n$, i.e. $x_{t_1+n}$, from the available data at time $t_1$.

    Method 1. One possibility is to construct a single function $f$ which predicts one point into the
    future, and iterate this function on its own outputs to predict further into the future. Expressed
    mathematically,

    $$x^{\mathrm{pred}}_{t_1+1} = f(x_{t_1},\, x_{t_1-1},\, \ldots) \qquad (7)$$

    $$x^{\mathrm{pred}}_{t_1+2} = f(x^{\mathrm{pred}}_{t_1+1},\, x_{t_1},\, x_{t_1-1},\, \ldots) \qquad (8)$$

    $$\vdots$$

    $$x^{\mathrm{pred}}_{t_1+n-1} = f(x^{\mathrm{pred}}_{t_1+n-2},\, x^{\mathrm{pred}}_{t_1+n-3},\, \ldots,\, x_{t_1},\, x_{t_1-1},\, \ldots) \qquad (9)$$

    $$x^{\mathrm{pred}}_{t_1+n} = f(x^{\mathrm{pred}}_{t_1+n-1},\, x^{\mathrm{pred}}_{t_1+n-2},\, \ldots,\, x_{t_1},\, x_{t_1-1},\, \ldots) \qquad (10)$$

    $x^{\mathrm{pred}}_{t_1+n}$ is the predicted value of $x$ at a time $n$ steps ahead of $t_1$.

    Method 2. One function can be constructed that uses only past data as inputs to directly predict one
    desired future point; i.e.

    $$x^{\mathrm{pred}}_{t_1+n} = f(x_{t_1},\, x_{t_1-1},\, \ldots). \qquad (11)$$

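    Method 3. Another method which can be proposed is to construct functions which take both previous
    predictions and past values as inputs, and predict only the future point as output:

    $$x^{\mathrm{pred}}_{t_1+1} = f_1(x_{t_1},\, x_{t_1-1},\, \ldots) \qquad (12)$$

    $$x^{\mathrm{pred}}_{t_1+2} = f_2(x^{\mathrm{pred}}_{t_1+1},\, x_{t_1},\, x_{t_1-1},\, \ldots) \qquad (13)$$

    $$\vdots$$

    $$x^{\mathrm{pred}}_{t_1+n-1} = f_{n-1}(x^{\mathrm{pred}}_{t_1+n-2},\, x^{\mathrm{pred}}_{t_1+n-3},\, \ldots,\, x_{t_1},\, x_{t_1-1},\, \ldots) \qquad (14)$$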


    Table 1. Major parameters of the tokamak ADITYA

    Parameter                             Design value   Range of the discharges used
    Major radius (cm)                     75
    Minor radius (cm)                     25
    Plasma cross-section shape            Circular
    Plasma current (kA)                   250            80–100
    Toroidal field at plasma centre (T)   1.5            0.75
    Plasma duration (ms)                  300            60–85
    Electron temperature (eV)             500            250–300

    $$x^{\mathrm{pred}}_{t_1+n} = f_n(x^{\mathrm{pred}}_{t_1+n-1},\, x^{\mathrm{pred}}_{t_1+n-2},\, \ldots,\, x_{t_1},\, x_{t_1-1},\, \ldots). \qquad (15)$$

    In all the above cases, n = 1 implies a single step

    prediction. Although both single and multistep pre-

    diction can be used, our primary aim will be the

    latter, since the application here requires long term

    prediction. Methods 1 and 3 are called iterated pre-

    diction methods, while method 2 is a direct predic-

    tion method.
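    The difference between iterated prediction (method 1) and direct prediction (method 2) can be
    sketched in a few lines of Python. The predictor is passed in as an ordinary function, so a trained
    ANN or any other model could be substituted. As a simplifying assumption, the iterated version keeps
    a fixed-length window by dropping the oldest value at each step, whereas Eqs (8)-(10) retain all
    past values. The function names are hypothetical.

```python
def iterated_forecast(one_step, history, n):
    """Method 1: apply the same one-step predictor n times, feeding each
    prediction back in as the newest 'past' value.

    one_step : function mapping a window (oldest value first) to the next value
    history  : measured values up to the present time t1, oldest first
    """
    window = list(history)
    for _ in range(n):
        x_next = one_step(window)
        window = window[1:] + [x_next]  # assumption: fixed-length sliding window
    return window[-1]


def direct_forecast(n_step, history):
    """Method 2: one predictor, built for a fixed horizon n, maps measured
    past values straight to the value at t1 + n."""
    return n_step(list(history))


# Toy stand-ins for a trained network: linear extrapolation of a ramp.
def ramp_one_step(window):
    return 2.0 * window[-1] - window[-2]


def ramp_three_step(window):
    return window[-1] + 3.0 * (window[-1] - window[-2])
```

    With the toy predictors, `iterated_forecast(ramp_one_step, [1.0, 2.0, 3.0], 3)` and
    `direct_forecast(ramp_three_step, [1.0, 2.0, 3.0])` both continue the ramp to the same point. With
    an imperfect predictor the iterated version would accumulate the error of each fed-back value.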

    5. Database preparation

    The database for the prediction task was prepared

    using experimental ADITYA discharges (Table 1 lists

    some major parameters of ADITYA). One disruptive discharge was used for training purposes and one for

    testing. Forecasting was then done with three disrup-

    tive discharges. The plasma discharges chosen for our

    work were all sampled at 0.02 ms.

    Ten past values of each of the input variables were used, and one predicted value at the output,
    which was chosen many steps ahead, given the requirements for real time prediction. This number of
    past temporal points was slightly less than that used for the TEXT studies, where 15 past values of
    a single input were used. However, we shall see later that there would be a total of 60 inputs in
    the present study, that would consist of 10 past temporal values of six different diagnostic
    signals. This would be shown to be the optimum number of inputs.

    The type of network chosen for this work was an

    MLP-2 ANN with two layers of 16 neurons each.

    The reason for using this rather than the MLP-1

    network lay in the quality of fitting. It was found

    that although the training error was less for the

    MLP-1 network with 32 hidden neurons, the testing

    error, as well as the difference between the training and the testing errors, was much smaller for the MLP-2

    network with the same 32 neurons divided equally

    between the two hidden layers. This was not surpris-

    ing, because if the number of input neurons is large

    in comparison with that of the hidden neurons (as in our case), an MLP-2 always contains a smaller

    number of weights and therefore shows a better

    generalization property than an MLP-1 network.
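    The weight counts behind this argument can be checked with a short calculation. The sketch below
    assumes fully connected layers with one bias weight per neuron, and uses the 60 inputs and 6 outputs
    of the final configuration described in this section; the function name is hypothetical.

```python
def count_weights(layer_sizes, bias=True):
    """Number of adjustable weights in a fully connected feedforward network.

    layer_sizes : neurons per layer, e.g. [inputs, hidden..., outputs]
    bias        : assumption that each non-input neuron has one bias weight
    """
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += (n_in + (1 if bias else 0)) * n_out
    return total


mlp1_weights = count_weights([60, 32, 6])      # one hidden layer of 32 neurons
mlp2_weights = count_weights([60, 16, 16, 6])  # two hidden layers of 16 neurons each
```

    Under these assumptions the MLP-1 layout has 2150 weights and the MLP-2 layout 1350, because the
    large input layer connects to only 16 neurons instead of 32.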

    Looking at the iterated methods in Section 4, it

    was observed that since the number of inputs and

    outputs increased with every iteration, long term

    prediction would be computationally intensive, while it is known that for real time prediction of
    disruptions, these predictions should be long enough to ask for iterations of the order of 200–400.
    Moreover, the single step predicted variable $x_{t_1+1}$, that is fed back to the input to predict
    $x_{t_1+2}$, is certainly not as accurate as the measured values $x_{t_1}, x_{t_1-1}, \ldots$
    Therefore, the iterative method was not thought to be well suited to the task of disruption
    prediction. Hence method 2, the direct method, was used for our predictions. In the present study,
    with a sampling time of 0.02 ms, there were 50 predicted time steps corresponding to a prediction
    1 ms earlier (i.e. $n = 50$ in Eq. (11)). Similarly $n = 100$ for a 2 ms early prediction and
    $n = 400$ when a forecast is made 8 ms in advance.
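    The relation between the forecast interval and the number of steps n, and the way training examples
    for the direct method can be assembled from sampled signals, is sketched below. The data layout (one
    Python list per diagnostic) and the function names are assumptions for illustration; the paper does
    not describe its data handling at this level.

```python
SAMPLING_MS = 0.02  # sampling interval quoted in the text
N_PAST = 10         # past values used per diagnostic signal


def steps_ahead(forecast_ms, sampling_ms=SAMPLING_MS):
    """Number of time steps n in Eq. (11) for a forecast made forecast_ms early."""
    return round(forecast_ms / sampling_ms)


def make_examples(signals, n, n_past=N_PAST):
    """Build (inputs, targets) for direct n-step prediction.

    signals : list of equally long lists, one per diagnostic
    Each input holds n_past past values of every signal (oldest first);
    each target holds the value of every signal n steps after the present.
    """
    length = len(signals[0])
    inputs, targets = [], []
    for t1 in range(n_past - 1, length - n):
        x = []
        for s in signals:
            x.extend(s[t1 - n_past + 1 : t1 + 1])
        inputs.append(x)
        targets.append([s[t1 + n] for s in signals])
    return inputs, targets
```

    With six signals this yields 60 inputs and 6 targets per example, matching the network size
    described in this section.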

    The non-linear mapping was brought about by

    the sigmoid of Eq. (5). This is a symmetric sigmoid,

    bounded in the interval [-1, +1]. The inputs and outputs
    were normalized in the same interval. Without

    this normalization, a normalization constant would

    have been required in Eq. (6), as the outputs had

    different dimensions.
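    The text does not say how the signals were mapped onto this interval. A common choice, assumed here
    purely for illustration, is linear min-max scaling with limits taken from the training discharge;
    the function names are hypothetical.

```python
def fit_scaler(values):
    """Return the (lowest, highest) values of a signal, used as scaling limits."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("signal is constant; cannot scale")
    return lo, hi


def to_unit_range(value, lo, hi):
    """Map the interval [lo, hi] linearly onto [-1, +1]."""
    return 2.0 * (value - lo) / (hi - lo) - 1.0


def from_unit_range(scaled, lo, hi):
    """Inverse mapping, to express a network output in the original units."""
    return (scaled + 1.0) * (hi - lo) / 2.0 + lo
```

    Each diagnostic would need its own limits, since the signals have different dimensions.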

    The ANN was trained using the general adap-

    tive recipe (GAR) algorithm [14]. Learning rate

    or gradient descent step length was initialized to

    1.0. On-line modification of the learning rate was

    possible in GAR, through specification of up and

    down adaptation parameters which were set at 0.002 and 0.8, respectively. These values were determined
    by the network training process. A larger up


    [Figure 2: five stacked panels (expt.; predicted Δt = 1 ms; predicted Δt = 2 ms; predicted
    Δt = 3 ms; predicted Δt = 4 ms), each plotted against time (ms) from 55 to 90.]

    Figure 2. Using only soft X ray signals as input to the neural network, the quality of prediction
    for Δt = 1, 2, 3 and 4 ms, respectively, is compared with the actual signal. It is observed that for
    Δt = 3 ms, a time lag appears for the first time in the predicted signal with respect to the actual
    experimental data. This lag increases with higher Δt. The vertical lines represent the instant the
    disruption is actually triggered.

    adaptation increased the gradient descent step length

    so much as to often overshoot the minima, whereby

    the error increased. A smaller down adaptation did

    not reduce the learning rate enough, so that after a few iterations the learning rate increased once again

    to overshoot the minimum. This effectively slowed

    down the training.
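    The exact update rule of the GAR algorithm of Ref. [14] is not given in the text. The sketch below is
    therefore only a generic adaptive step length rule that is consistent with the behaviour described:
    a small increase while the error falls and a sharp cut when it rises. The multiplicative form and
    the function name are assumptions.

```python
def adapt_learning_rate(rate, error, previous_error, up=0.002, down=0.8):
    """Hypothetical adaptive rule (not taken from Ref. [14]).

    If the error fell, grow the rate by the fraction `up`;
    otherwise shrink it by the factor `down`.
    """
    if error < previous_error:
        return rate * (1.0 + up)
    return rate * down
```

    Starting from a rate of 1.0, a step that lowers the error would raise the rate to about 1.002,
    while a step that raises the error would cut it to 0.8.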

    To begin with, the ANN was trained with only one

    diagnostic signal. This was to test the performance of

    the network with similar input information as that

    already used in Refs [4] and [6]. First, one Mirnov

    probe was used as input, followed by the SXR signal.

    Finally, only the Hα signal was used as the single input. From the training stage itself it became
    clear that the network required additional information to learn the trends in the data as the
    learning remained very slow throughout. The only exception was the training with the Hα signal, when
    the error reduced much faster.

    The performance of the trained network, fed

    with SXR signals, in forecasting disruptions is pre-

    sented in Fig. 2. The vertical lines denote the actual

    triggering instant of the instabilities. The main

    observation here is that the instant of prediction

    of the triggering of the disruption started lagging behind with respect to the actual signal when
    prediction was done 3 ms or more in advance. This more or

    less agreed with the results of Ref. [6].

    The number of inputs was then increased by choosing two Mirnov probes and the SXR and Hα signals.
    The Mirnov probes chosen first were two closely located ones, at poloidal angles of 114° and 138°.
    It was observed that the rate of learning worsened, as did the forecasting errors on a new
    discharge. Next, two probes located more or less diametrically opposite to each other were selected,
    at angles of 42° and 234°. For this set of inputs, the learning improved over the SXR case but was
    worse than that of the Hα case. The performance of the ANN on new data, however, remained more or
    less the same as that on the single input cases. The experiment was repeated with similar inputs,
    but now the two Mirnov probes were those located at 138° and 330°. A much improved generalization
    capability of


    Table 2. Comparison of the mean square training errors for the ANN provided with different
    combinations of diagnostic signals as inputs

    Combination of inputs   Training error
    SXR                     0.0165
    Hα                      2.65 × 10⁻⁴
    Four inputs (a)         0.0054
    Four inputs (b)         0.0103
    Six inputs (c)          0.0086

    (a) Four inputs: inputs comprised Mirnov probes at 42° and 234°, together with SXR and Hα.
    (b) Four inputs: inputs comprised Mirnov probes at 138° and 330°, together with SXR and Hα.
    (c) Six inputs: inputs comprised Mirnov probes at 42°, 138°, 234° and 330°, together with SXR and Hα.

    the ANN was noticed. Moreover, the ANN seemed

    to have gained a better tolerance for long term

    predictions.

    The number of inputs was further increased to four magnetic signals from probes located in the four
    quadrants around the plasma at angles 42°, 138°, 234°
    and 330°, together with the SXR and Hα signals.

    Although the execution time increased because of a

    larger ANN structure, this set of inputs clearly pro-

    duced an overall improvement in the fitting.

    This observation was believed to be due to the uniformity of the probe locations around the plasma
    so that more information was now put into the network. This was corroborated by the fact that
    initially the choice of Mirnov probes located diametrically opposite improved the performance, as
    compared with the set of signals from two closely located probes.

    Table 3. Comparison of ANN performance with respect to mean square error E² for single input and
    multiple input cases

    Δt (ms)   E²(SXR)   E²(Hα)        E²(4a)    E²(4b)    E²(6)
    0.02      0.0224    5.62 × 10⁻⁴   0.1396    0.0221    0.0144
    1.00      0.0342    0.1039        0.1758    0.0579    0.0365
    2.00      0.0721    0.1898        0.2032    0.0922    0.0562
    3.00      0.1141    0.2107        0.2143    0.1105    0.0667
    4.00      0.1621    0.2254        0.2278    0.1262    0.0762
    8.00      0.3038    0.2546        0.2662    0.1763    0.1088

    Notes:
    E²(SXR): mean square error for single input with SXR signal.
    E²(Hα): mean square error for single input with Hα signal.
    E²(4a): mean square error for four inputs when the Mirnov probes were chosen from the 42° and 234° locations.
    E²(4b): mean square error for four inputs when the Mirnov probes were chosen from the 138° and 330° locations.
    E²(6): mean square error for six inputs.

    Then the trained ANN behaved still better with four

    probes more uniformly spread out in the four quad-

    rants. Thus, this shows that the poloidal distribution

    of the probes was crucial for the ANN to perform

    well on out of sample discharges. Use of more probes,

    however, did not improve the fitting much, and the

    network ran the risk of being too heavy, resulting in

    unnecessary computation time.

    Table 2 compares the training errors for different ANN inputs. Table 3 displays the performance of
    the trained ANN with various combinations of inputs. These include:

    (a) A single SXR signal input;

    (b) A single Hα signal input;

    (c) Four inputs consisting of the two Mirnov probes at poloidal angles of 42° and 234°, the SXR and
    Hα signals;

    (d) Four inputs consisting of the two Mirnov probes at poloidal angles of 138° and 330°, the SXR and
    Hα signals;

    (e) Six inputs comprising all four Mirnov probe signals, and the SXR and Hα signals.

    When applied to new data, it is clear from Table 3 that the ANN was most tolerant to the increase of
    predicted time steps when six different diagnostics were used, although the training error as well
    as the single step prediction error were the minimum when only the Hα signal was the input.

    Therefore, the final set of diagnostic data used in

    this study consisted of the following:

    (i) Four Mirnov probe signals. The probes chosen are located more or less symmetrically around


    Table 4. Comparison of the instants of disruption triggering as displayed by the indicator for
    various Δt using the Hα signal

    Δt (ms)   Actual instant (ms)   Predicted instant (ms)
    1.00      81.495                81.50
    2.00      81.495                81.50
    3.00      81.495                81.50
    4.00      81.495                81.50
    5.00      81.495                81.50
    6.00      81.495                81.50
    7.00      81.495                81.50
    8.00      81.495                81.52

    the plasma, at poloidal angles of 42°, 138°, 234°
    and 330°.

    (ii) One set of SXR monitor data.

    (iii) One set of Hα monitor data.

    Since each of the inputs to the ANN was an array,

    composed of the past values of the variable, it had to be expressed as a vector rather than a scalar, the
    vector components corresponding to the past values (the number of which in our case was ten). Thus there

    were six input vectors in the network, corresponding

    to the six diagnostic signals listed above. The out-

    puts were the future values of the same signals to be

    predicted, which in this study was at a single time

    instant only, according to Eq. (11). Thus, the ANN

    had six scalar outputs.

    6. Forecasting disruption

    After the ANN was trained and the weight fac-

    tors properly set, it was used to forecast disruption

    on three disruptive discharges from ADITYA. These

    discharges differed in the maximum plasma current

    and the duration, but the general behaviours of the

    fluctuating quantities were similar. Another notable feature was that all these discharges ended in a major

    disruption, without any preceding minor disruption.

    As already mentioned, an important criterion for all

    our forecasting was to choose the instant of disrup-

    tion triggering.

    For the actual detection of the instant of disrup-

    tion triggering, which in fact was our first goal, an

    indicator was made whereby the moment the instabilities
    set in, an alarm would be given to the control system, which then could take measures to soften the

    impact of the disruption. Table 4 shows the trigger-

    ing instants as displayed by the indicator for various

    Table 5. Comparison of ANN performance with respect to mean square error E² for unfiltered and
    filtered input signals
    (The first value of Δt corresponds to a single step ahead prediction.)

    Δt (ms)   Without filter   With filter
    0.02      0.0272           0.0090
    1.00      0.0577           0.0365
    2.00      0.0650           0.0452
    3.00      0.1037           0.0858

    t, using one of the forecasting discharges for the

    H signal. The H radiation in ADITYA was seen

    to remain at a more or less constant value (Figs 3, 6

    and 9) during the ramp-up and flat-top phase of the

    discharge before starting to rise at the instant thedisruption precursors set in (which coincides with

    the instant of disruption triggering). So the crite-

    rion for defining the disruption triggering was thatthe signal value should be greater than 2.00. The

    results showed that the prediction instants remained

    exactly the same up to t = 7 ms (although therewas a very small discrepancy with the actual signal),

    while for t = 8 ms, a small time lag of 0.02 ms was

    observed for the first time. This seemed to be the

    trend in all the discharges used for forecasting, where

    this time lag varied from 0.02 to 0.03 ms. Therefore,

    in our results t was limited to 8 ms. Since the ANNinputs were experimental signals, the inherent noise

    was inevitably there. It was observed that there was

    a good reduction of error after filtering of the noise,as shown in Table 5, so that a better fitting was

    achieved. This motivated us to use filtered experi-

    mental data as inputs in the subsequent cases.
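    The triggering criterion described in this section (signal value greater
    than 2.00) and the resulting time lag between predicted and actual
    triggering instants can be sketched in Python as follows. This is only an
    illustrative sketch: the function name trigger_instant and the synthetic
    traces are assumptions, and the one-sample delay of the "predicted" trace
    is imposed by hand rather than produced by a network.

```python
def trigger_instant(times_ms, values, threshold=2.00):
    """Return the first time at which the signal exceeds the threshold,
    or None if it never does (no alarm)."""
    for t, v in zip(times_ms, values):
        if v > threshold:
            return t
    return None


# Synthetic H-alpha-like trace sampled every 0.02 ms: flat, then rising.
sampling_ms = 0.02
times = [round(80.0 + i * sampling_ms, 2) for i in range(100)]
actual = [1.0 + 0.5 * max(0, i - 74) for i in range(100)]

# Hypothetical prediction that lags the actual trace by exactly one sample.
predicted = [actual[0]] + actual[:-1]

t_actual = trigger_instant(times, actual)
t_pred = trigger_instant(times, predicted)
lag = round(t_pred - t_actual, 2)
print(t_actual, t_pred, lag)  # 81.54 81.56 0.02
```

    A trace that never exceeds the threshold returns None, which corresponds to
    no alarm being raised.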

    Figure 3 shows the first of the discharges used for forecasting, shot 6690.
    This 95.28 kA plasma disrupted at t ≈ 82 ms, while the disruption was
    triggered at t ≈ 81.50 ms, as our indicator shows. Figure 4 compares the
    quality of prediction of this disruptive event Δt = 1, 2, 4 and 8 ms
    earlier, with respect to the SXR experimental signal. Figure 5 does the
    same, but with the Hα signals. With a sampling time of 0.02 ms for these
    discharges, this corresponded to predicted time instants 50, 100, 200 and
    400 time steps ahead, respectively, these being the values of n in
    Eq. (11).
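    The conversion from lead time to number of time steps quoted above is a
    simple division by the sampling time. A short Python check, with the
    hypothetical helper name steps_ahead, is given here for convenience:

```python
def steps_ahead(lead_ms, sampling_ms=0.02):
    """Number of sampling intervals n corresponding to a lead time in ms."""
    return round(lead_ms / sampling_ms)


lead_times_ms = [1, 2, 4, 8]
n_values = [steps_ahead(lead) for lead in lead_times_ms]
print(n_values)  # [50, 100, 200, 400]
```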

    The major observations from these figures were the following.

    (a) Unlike the previous articles [4] and [6], where a time lag was reported
    for the predicted instant of disruption beyond 1.12 and 3.12 ms,
    respectively, the present study did not show any appreciable time lag even
    for a prediction 8 ms earlier. This showed a significant improvement of the
    results through the use of more diagnostic information in our neural
    network.

    (b) As the temporal activities were predicted earlier and earlier, there
    was only a small change in the waveform of the predicted signals with
    respect to the corresponding targets.

    (c) The last 30 ms of the discharge was scanned. This was found to be
    enough for our purpose, as the temporal activities around the time the
    instabilities were triggered were well depicted. Moreover, sawtooth
    phenomena are clearly observed in Fig. 4, around 55 ms, and these are also
    included within the predicted part of the signal.

    (d) The vertical lines in Figs 4 and 5 indicate the instant at which the
    disruptive instabilities have just been triggered. By following this line
    through each of the five plots of each figure, the ANN prediction and the
    actual disruption can be compared directly.

    (e) A prediction at a time Δt early means that the signal at time t is
    predicted at the instant t − Δt. If the prediction results are analysed for
    Δt = 8 ms, it is observed that the instant of observation of disruption
    precursors around 81 ms was predicted by using the temporal behaviour
    around 73 ms.

    [Figure 3 (plot not reproduced): panels Vloop, SXR, mag. fluct., Ip (kA),
    Hα and Bv (kA) versus time (ms); shot 6690, 06 Jan 1999, 01:46:58 PM.]

    Figure 3. The first disruptive discharge, shot 6690, used for forecasting.
    This plasma shot disrupted around 82 ms, and the disruption was triggered
    around 81 ms. The plasma current attained prior to disruption was about
    90 kA.

    [Figure 4 (plot not reproduced): five panels, expt. and predicted Δt = 1,
    2, 4 and 8 ms, versus time (ms).]

    Figure 4. Forecasting disruption using our full network with six inputs for
    shot 6690. Only SXR signals are shown. The actual experimental signal is
    compared with the neural network predictions for Δt = 1, 2, 4 and 8 ms
    early, as shown. The vertical lines indicate the actual instant of
    triggering the disruption.

    [Figure 5 (plot not reproduced): five panels, expt. and predicted Δt = 1,
    2, 4 and 8 ms, versus time (ms).]

    Figure 5. Forecasting disruption using shot 6690. Only Hα signals are
    shown. The actual experimental signal is compared with the neural network
    predictions for Δt = 1, 2, 4 and 8 ms early, as shown. The vertical lines
    indicate the actual instant of triggering the disruption.

    Figure 6 shows the second plasma discharge used for forecasting. This
    83.57 kA discharge disrupted at t ≈ 62 ms, the disruption being triggered
    at t ≈ 60 ms. Figures 7 and 8 display the performance of the neural network
    for prediction of this disruption 1, 2, 4 and 8 ms early, with only two of
    the inputs, the SXR and Hα signals, being shown.

    Analysis of shot 6520 revealed the following:

    (i) Once again a very good prediction of the triggering of the instability,
    the instant of which is given by the vertical lines, was observed even for
    Δt = 8 ms.

    (ii) The last 28 ms of this discharge were predicted. The reason for
    choosing only this portion was that in this temporal range the SXR signal
    was observed to rise along with the current ramp-up. It was observed that
    the signal from the monitor was able to pick up the actual rise of core
    temperature only around 30 ms. However, once again this sufficed, as this
    time regime contained the disruption precursors followed by the current
    quench, as well as a portion of the discharge prior to the triggering of
    the instabilities.

    (iii) The spikes of the SXR signal towards the negative side were only
    noise and obviously did not have any physical significance. These spikes
    continued even after the discharge terminated. However, Fig. 7 shows that
    the noise level was considerably filtered, and the negative spikes were
    greatly reduced.

    [Figure 6 (plot not reproduced): panels Vloop, SXR, mag. fluct., Ip (kA),
    Hα and Bv (kA) versus time (ms); shot 6520, 24 Dec 1998, 05:09:45 PM.]

    Figure 6. The second disruptive discharge, shot 6520, used for forecasting.
    This plasma discharge disrupted around 62 ms, and the disruption was
    triggered around 60 ms. The plasma current attained prior to disruption was
    80 kA.

    [Figure 7 (plot not reproduced): five panels, expt. and predicted Δt = 1,
    2, 4 and 8 ms, versus time (ms).]

    Figure 7. Forecasting disruption using shot 6520. Only SXR signals are
    shown. The actual experimental signal is compared with the neural network
    predictions for Δt = 1, 2, 4 and 8 ms early, as shown. The vertical lines
    indicate the actual instant of triggering the disruption.

    [Figure 8 (plot not reproduced): five panels, expt. and predicted Δt = 1,
    2, 4 and 8 ms, versus time (ms).]

    Figure 8. Forecasting disruption using shot 6520. Only Hα signals are
    shown. The actual experimental signal is compared with the neural network
    predictions for Δt = 1, 2, 4 and 8 ms early, as shown. The vertical lines
    indicate the actual instant of triggering the disruption.

    The third discharge used for forecasting, shot 6688, is shown in Fig. 9. In
    this case the 98.39 kA plasma disrupted at t ≈ 65 ms, while the triggering
    instabilities set in around 63.32 ms, according to the indicator. The
    observations from this discharge are described below:

    (a) The signal from the Mirnov probe at 42°, and the SXR and Hα signals,
    were predicted remarkably well, with very little distortion in the signals
    even for a prediction 8 ms early.

    (b) The SXR signals in this case did not contain any negative spikes. In
    addition, sawtooth oscillations were observed prior to the disruption, for
    the last 30 ms. These sawteeth were picked up very well by the neural
    network.

    (c) The vertical lines in Figs 10–12 show the instant of triggering of the
    disruptive instabilities. From Fig. 10 one observes that the MHD activities
    as picked up by the Mirnov probe started increasing around 63 ms, when the
    magnetic fluctuations increased in amplitude.

    It was seen from the results of all three disruptive discharges that, while
    predicting the disruption occurrence, the ANN did not give any false
    prediction within the non-disruptive part of the discharge. This should be
    a good motivation for using this algorithm as a disruption alarm.

    A general feature of all the predictions was that towards the beginning of
    the predicted interval, several of the predicted signals became a little
    distorted with respect to the actual signal, especially at higher Δt.
    However, for achieving the goals of the present study, this was not likely
    to prove a hurdle, as only the prediction of the signal around the instant
    of the triggering of disruptive instabilities was of prime concern. In the
    earlier part of the discharges, the point that was of real importance for
    our purpose was whether any false alarms were produced by the ANN when
    there were no indications of the triggering of a disruption in the actual
    data.

    [Figure 9 (plot not reproduced): panels Vloop, SXR, mag. fluct., Hα,
    Bv (kA) and Ip (kA) versus time (ms); shot 6688, 06 Jan 1999, 01:34:55 PM.]

    Figure 9. The third disruptive discharge, shot 6688, used for forecasting.
    This plasma discharge shows a major disruption at t ≈ 66 ms. The plasma
    current attained prior to disruption was about 103 kA.

    [Figure 10 (plot not reproduced): five panels, expt. and predicted Δt = 1,
    2, 4 and 8 ms, versus time (ms).]

    Figure 10. Forecasting disruption using shot 6688. Only the signals from
    the Mirnov probe at 42° are shown. The actual experimental signal is
    compared with the neural network predictions for Δt = 1, 2, 4 and 8 ms
    early, as shown. The vertical lines indicate the actual instant of
    triggering the disruption.

    [Figure 11 (plot not reproduced): five panels, expt. and predicted Δt = 1,
    2, 4 and 8 ms, versus time (ms).]

    Figure 11. Forecasting disruption using shot 6688. Only SXR signals are
    shown. The actual experimental signal is compared with the neural network
    predictions for Δt = 1, 2, 4 and 8 ms early, as shown. The vertical lines
    indicate the actual instant of triggering the disruption.

    [Figure 12 (plot not reproduced): five panels, expt. and predicted Δt = 1,
    2, 4 and 8 ms, versus time (ms).]

    Figure 12. Forecasting disruption using shot 6688. Only Hα signals are
    shown. The actual experimental signal is compared with the neural network
    predictions for Δt = 1, 2, 4 and 8 ms early, as shown. The vertical lines
    indicate the actual instant of triggering the disruption.

    None of the discharges used in this work was predicted entirely. To do
    this, fresh training would have been necessary, as the plasma dynamics
    during the startup phase were not picked up by the ANN during training,
    which was also done using the last 35 ms of the training discharge. For
    forecasting the whole discharge, a large error was, therefore, anticipated.
    But although the time series prediction formalism requires the use of more
    past information for an accurate prediction of the future, the initial
    phase of the discharges was unlikely to provide any extra information
    regarding the triggering of the instabilities leading to the disruption.

    7. Discussions

    The forecasting of plasma disruptions in the ADITYA tokamak was described
    in the previous section, using a set of diagnostics different from that
    used in the earlier works [4, 6]. The use of a combination of several
    diagnostic signals, rather than a single type of diagnostic as in the
    studies of Refs [4, 6], is thought to have produced the improved
    forecasting capabilities of the ANN.

    Apart from changing the nature of the inputs, another major change was made
    in the present work from Refs [4, 6]. This concerns the use of a direct
    prediction of the disruption, unlike the iterated prediction methods
    incorporated earlier. But it was found that the improvement in forecasting
    was not really due to this change, as the use of a single input in this
    work did not produce better results. In particular, the performance of a
    trained network with only an SXR signal showed a result similar to that of
    Ref. [6], as the time lag was first observed around 3 ms. A glance at
    Table 3 reveals that the ANN prediction error with only an Hα signal
    worsened further. There was not much change when the SXR and Hα signals
    were used along with two Mirnov probes at 42° and 234°. There was, however,
    a marked improvement when the magnetic signals were from probes at 138° and
    330°. The best predictions were obtained from all four probes, together
    with the SXR and Hα signals. From this it appears that two main factors
    were responsible for the best prediction results in this work:

    (a) A definite combination of diagnostic signals from four Mirnov probes,
    one SXR monitor and one Hα monitor.

    (b) The poloidal distribution of the Mirnov probes around the plasma.
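    The distinction between direct and iterated prediction mentioned in this
    section can be illustrated with a minimal Python sketch. In the iterated
    scheme a one-step-ahead model is applied n times, with each prediction fed
    back into the input window; in the direct scheme a single model maps the
    window straight to the value n steps ahead. The function names and the toy
    geometric series are assumptions made for illustration, and the two lambda
    "models" are exact closed forms standing in for trained networks.

```python
def iterated_forecast(one_step_model, window, n):
    """Apply a one-step-ahead model n times, feeding each prediction back
    into the input window, and return the value n steps ahead."""
    w = list(window)
    for _ in range(n):
        nxt = one_step_model(w)
        w = w[1:] + [nxt]  # drop the oldest value, append the prediction
    return w[-1]


def direct_forecast(n_step_model, window):
    """Apply a model that maps the window directly to the value n steps ahead."""
    return n_step_model(window)


# Toy series x[k] = 0.5 * x[k-1] (hypothetical data, not a plasma signal).
r, n = 0.5, 4
window = [8.0, 4.0, 2.0, 1.0]
one_step = lambda w: r * w[-1]        # stands in for a one-step network
n_step = lambda w: (r ** n) * w[-1]   # stands in for a direct n-step network

iterated = iterated_forecast(one_step, window, n)
direct = direct_forecast(n_step, window)
print(iterated, direct)  # 0.0625 0.0625
```

    With exact models the two schemes agree; with trained networks they
    generally differ, because the iterated scheme reuses its own outputs as
    inputs.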

    The performance of the trained ANN in an actual real time application for
    plasma disruption forecast could not be in doubt, as the discharges used in
    this study were experimental, and noise tolerance of the ANN was
    automatically ensured.

    Regarding the timescales of TEXT and ADITYA, it can be stated that the
    scales are much shorter for ADITYA, as the plasma duration for ADITYA is
    around 100 ms while that for TEXT varies from 250 to 400 ms or more
    [6, 15]. Thus a detection of an approaching disruption 8 ms in advance in
    ADITYA would correspond to a much more reliable situation for a real time
    prediction.
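    To put the timescale comparison in numbers, the lead time can be expressed
    as a fraction of the discharge duration. The sketch below assumes, for
    illustration only, that the 3.12 ms lead quoted earlier for Ref. [6]
    applies to TEXT discharges of 250 to 400 ms; the helper name lead_fraction
    is hypothetical.

```python
def lead_fraction(lead_ms, discharge_ms):
    """Lead time expressed as a fraction of the discharge duration."""
    return lead_ms / discharge_ms


aditya = lead_fraction(8.0, 100.0)       # 8 ms lead, ~100 ms discharge
text_short = lead_fraction(3.12, 250.0)  # assumed TEXT case, 250 ms discharge
text_long = lead_fraction(3.12, 400.0)   # assumed TEXT case, 400 ms discharge

for name, frac in [("ADITYA", aditya), ("TEXT 250 ms", text_short),
                   ("TEXT 400 ms", text_long)]:
    print(f"{name}: {100 * frac:.2f} % of the discharge")
```

    Under these assumptions the 8 ms lead corresponds to about 8 % of an ADITYA
    discharge, against roughly 1 % for the assumed TEXT cases.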

    8. Summary and conclusions

    In this article a neural network was used for forecasting plasma
    disruptions in ADITYA. A number of diagnostic signals were fed into the
    network input. Although this made the structure of the network heavier, it
    is believed that this increased input information from a definite
    combination of the diagnostics was the main reason for the significantly
    improved performance. This combination provided the optimum number of
    inputs to the ANN, with ten past values of each of the temporal variables.

    Confidence about the performance of the ANN in real time could be gained
    from the fact that the algorithm not only predicted the trigger of the
    instabilities correctly, but did so sufficiently early, which was the basic
    requirement for real time operations. A forecast of an approaching
    disruption about 8 ms in advance is extremely crucial, not only for medium
    sized machines like ADITYA, but also for reactor grade tokamaks like ITER
    where the pulse lengths are to be around 1000 s. Since such long pulse
    operations can be strongly inhibited by major disruptions, a forecast as
    proposed in this study can be effectively used to alert the real time
    control systems, and measures such as electron cyclotron resonance heating,
    pellet injection and neutral beam heating can be put into operation to
    soften the harmful effects of disruptive termination of a plasma discharge.
    In addition, it was amply demonstrated that in the absence of any
    approaching disruption, the network would not give any false alarms.
    Finally, since experimental plasma discharges were used in this study, the
    ability of the ANN from the point of view of noise tolerance was
    automatically ensured.

    One crucial observation in this work was that the discharges used were not
    taken on the same day, and yet no effect was noticed in the prediction
    quality. The quality degraded slightly only due to a larger Δt. From this
    it could be concluded that the physical conditions, such as wall
    conditioning and average plasma density, do not have any effect on the
    prediction of disruption. Prediction depends basically on the nature of the
    discharges. The discharges used in this work were, by nature, similar in so
    far as the general variation of the different temporally varying plasma
    parameters is concerned. Moreover, all the discharges ended in a major
    disruption without any intermediate minor disruption. So although the
    maximum plasma current, loop voltage and the duration of the discharges
    varied from discharge to discharge, these had no real effect on the quality
    of prediction.

    Acknowledgements

    The authors take this opportunity to express their sincere thanks to
    J.B. Lister for providing them with the neural network program. They
    gratefully acknowledge H. Ramachandran for his suggestions and critical
    comments after going through this manuscript. One of the authors (AS) would
    like to thank C. Ramdas, who helped in drawing the neural network structure
    of Fig. 1. Finally, the authors thank the entire ADITYA team for supplying
    the experimental data.

    References

    [1] Lister, J.B., Schnurrenberger, H., Nucl. Fusion 31 (1991) 1291.
    [2] Coccorese, E., Morabito, C., Martone, R., Nucl. Fusion 34 (1994) 1349.
    [3] Albanese, R., et al., Fusion Technol. 30 (1996) 219.
    [4] Hernandez, J.V., et al., Nucl. Fusion 36 (1996) 1009.
    [5] Wroblewski, D., Jahns, G.L., Leuer, J.A., Nucl. Fusion 37 (1997) 725.
    [6] Vannucci, A., Oliveira, K.A., Tajima, T., Nucl. Fusion 39 (1999) 255.
    [7] Yoshino, R., Koga, J.K., Takeda, T., Fusion Technol. 30 (1996) 237.
    [8] Hamilton, J.D., Time Series Analysis, Princeton University Press,
        Princeton, NJ (1994).
    [9] Weigend, A.S., Gershenfeld, N.A., Time Series Prediction: Forecasting
        the Future and Understanding the Past, Addison-Wesley, Reading, MA
        (1992).
    [10] Bhatt, S.B., et al., Indian J. Pure Appl. Phys. 27 (1989) 710.
    [11] Saxena, Y.C., Curr. Sci. 65 (1993) 25.
    [12] Geva, A.B., IEEE Trans. Neural Networks NN-9 (1998) 1471.
    [13] Bishop, C.M., Rev. Sci. Instrum. 65 (1994) 1803.
    [14] Lister, J.B., Schnurrenberger, H., Marmillod, P., Implementation of a
        Multilayer Perceptron for a Non-linear Control Problem, Rep. LRP
        398/90, CRPP-EPFL, Lausanne (1990).
    [15] Vannucci, A., McCool, S.C., Nucl. Fusion 37 (1997) 1229.

    (Manuscript received 27 October 1999

    Final manuscript accepted 30 August 2000)

    E-mail address of A. Sengupta:

    [email protected]

    Subject classification: C0, Tm

    2008 Nuclear Fusion, Vol. 40, No. 12 (2000)

