
Artificial neural network based peak load forecasting using Levenberg-Marquardt and quasi-Newton methods

L.M. Saini and M.K. Soni

Abstract: Daily electrical peak-load forecasting has been done using a feedforward neural network based on the Levenberg-Marquardt back-propagation algorithm, the Broyden-Fletcher-Goldfarb-Shanno back-propagation algorithm and the one-step secant back-propagation algorithm, incorporating the effect of eleven weather parameters, the type of day and the previous day's peak-load information. To avoid trapping of the network in local minima, the user-defined parameters, i.e. the learning rate and the error goal, have been optimised. The training data set has been selected using a growing-window concept and is reduced according to the nature of the day and the season for which the forecast is made. To remove redundancy in the input variables, their number has been reduced by the principal component analysis method of factor extraction. The resultant data set is used for the training of a three-layered neural network. To increase the learning speed, the weights and biases are initialised according to the Nguyen and Widrow method. To avoid over-fitting, training is stopped early at the minimum validation error.

1 Introduction

Electrical peak load forecasting (PLF) is very important for power system operators and planners, since many important functions in power system operational planning, such as unit commitment, economic load dispatch, maintenance scheduling and expansion planning, are usually performed based on the forecast loads. Mostly, PLF has been performed using supervised neural learning techniques employing the steepest descent back-propagation (SDBP) or momentum back-propagation (MOBP) algorithm [1-7]. These neural algorithms have the drawback of a slow convergence rate. To accelerate the learning process, the Levenberg-Marquardt back-propagation algorithm (LMBP), the Broyden-Fletcher-Goldfarb-Shanno back-propagation algorithm (BFGS-BP) and the one-step secant back-propagation algorithm (OSS-BP) can be used [8-11]. The electrical peak load depends on weather conditions. Most of the time, only temperature has been used as a weather variable [1-3, 12]. Relative humidity has sometimes been included [4-6], while wind speed has rarely been used as a weather variable for PLF [7]. In this work, eleven weather parameters have been used to make load forecasting more accurate. The parameters used for comparing the performance of the learning techniques are: mean absolute percentage error (MAPE), million floating-point operations (MFLOP) performed per learning, and time taken per learning. The first parameter, MAPE, gives the accuracy of recall; the second, the complexity of the learning technique; and the third, a measure of the speed of the learning technique.

© IEE, 2002. IEE Proceedings online no. 20020462. DOI: 10.1049/ip-gtd:20020462. Paper first received 10th May 2001 and in revised form 19th February 2002. The authors are with the Department of Electrical Engineering, National Institute of Technology, Kurukshetra, Haryana, 136119, India.


The LMBP, BFGS-BP and OSS-BP neural learning techniques are discussed in the following Sections, followed by their implementation for PLF.

2 Levenberg-Marquardt back-propagation algorithm (LMBP)

The Levenberg-Marquardt algorithm is a variation of Newton's method [8]. It is very well suited to neural network (NN) training, where the performance index is the mean squared error. Newton's update for optimising a performance index $F(x)$ is

$$x_{k+1} = x_k - A_k^{-1} g_k \quad (1)$$

where $A_k \equiv \nabla^2 F(x)\big|_{x=x_k}$ and $g_k \equiv \nabla F(x)\big|_{x=x_k}$.

Assuming that $F(x)$ is a sum-of-squares function

$$F(x) = \sum_{i=1}^{M} v_i^2(x) = v^T(x)\,v(x) \quad (2)$$

then the $j$th element of the gradient would be

$$[\nabla F(x)]_j = \frac{\partial F(x)}{\partial x_j} = 2\sum_{i=1}^{M} v_i(x)\,\frac{\partial v_i(x)}{\partial x_j} \quad (3)$$

The gradient can therefore be written in matrix form

$$\nabla F(x) = 2 J^T(x)\, v(x) \quad (4)$$

where

$$J(x) = \begin{bmatrix} \dfrac{\partial v_1(x)}{\partial x_1} & \cdots & \dfrac{\partial v_1(x)}{\partial x_n} \\ \vdots & & \vdots \\ \dfrac{\partial v_M(x)}{\partial x_1} & \cdots & \dfrac{\partial v_M(x)}{\partial x_n} \end{bmatrix} \quad (5)$$


is the Jacobian matrix. Next, the Hessian matrix is to be determined. The $k,j$ element of the Hessian matrix would be

$$[\nabla^2 F(x)]_{k,j} = \frac{\partial^2 F(x)}{\partial x_k\,\partial x_j} = 2\sum_{i=1}^{M}\left\{\frac{\partial v_i(x)}{\partial x_k}\frac{\partial v_i(x)}{\partial x_j} + v_i(x)\frac{\partial^2 v_i(x)}{\partial x_k\,\partial x_j}\right\} \quad (6)$$

The Hessian matrix can then be expressed in matrix form

$$\nabla^2 F(x) = 2J^T(x)J(x) + 2S(x) \quad (7)$$

where

$$S(x) = \sum_{i=1}^{M} v_i(x)\,\nabla^2 v_i(x) \quad (8)$$

If $S(x)$ is assumed to be small, the Hessian can be approximated as

$$\nabla^2 F(x) \cong 2J^T(x)J(x) \quad (9)$$

Substituting (9) and (4) into (1), the Gauss-Newton method is obtained:

$$x_{k+1} = x_k - [2J^T(x_k)J(x_k)]^{-1}\,2J^T(x_k)\,v(x_k) = x_k - [J^T(x_k)J(x_k)]^{-1} J^T(x_k)\,v(x_k) \quad (10)$$

From this, it is evident that the advantage of the Gauss-Newton method over the standard Newton's method is that it does not require the calculation of second-order derivatives. One problem with the Gauss-Newton method is that the matrix $H = J^T J$ may not be invertible. This can be overcome by the following modification to the approximate Hessian matrix:

$$G = H + \mu I \quad (11)$$

To see that this matrix is invertible, suppose that the eigenvalues and eigenvectors of $H$ are $\{\lambda_1, \lambda_2, \ldots, \lambda_n\}$ and $\{z_1, z_2, \ldots, z_n\}$. Then

$$G z_i = [H + \mu I] z_i = H z_i + \mu z_i = \lambda_i z_i + \mu z_i = (\lambda_i + \mu) z_i \quad (12)$$

Therefore the eigenvectors of $G$ are the same as the eigenvectors of $H$, and the eigenvalues of $G$ are $(\lambda_i + \mu)$. $G$ can be made positive definite by increasing $\mu$ until $(\lambda_i + \mu) > 0$ for all $i$, and therefore the matrix will be invertible. This leads to the Levenberg-Marquardt algorithm [13]:

$$x_{k+1} = x_k - [J^T(x_k)J(x_k) + \mu_k I]^{-1} J^T(x_k)\,v(x_k)$$

or

$$\Delta x_k = -[J^T(x_k)J(x_k) + \mu_k I]^{-1} J^T(x_k)\,v(x_k) \quad (13)$$

This algorithm has the useful feature that, as $\mu_k$ is increased, it approaches the steepest descent algorithm with a small learning rate:

$$x_{k+1} \approx x_k - \frac{1}{\mu_k} J^T(x_k)\,v(x_k) = x_k - \frac{1}{2\mu_k}\nabla F(x) \quad \text{for large } \mu_k \quad (14)$$

and as $\mu_k$ is decreased to zero the algorithm becomes Gauss-Newton. The algorithm begins with $\mu_k$ set to some small value (e.g. $\mu_k = 0.01$). If a step does not yield a smaller value for $F(x)$, then the step is repeated with $\mu_k$ multiplied by some factor $\vartheta > 1$ (e.g. $\vartheta = 10$). Eventually $F(x)$ should decrease, since a small step is taken in the direction of steepest descent. If a step does produce a smaller value for $F(x)$, then $\mu_k$ is divided by $\vartheta$ for the next step, so that the algorithm will approach Gauss-Newton, which should provide faster convergence. The algorithm thus provides a nice compromise between the speed of Newton's method and the guaranteed convergence of steepest descent.
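As an illustration of the update in (13) and the $\mu_k$ adjustment rule described above, a minimal NumPy sketch follows; the toy exponential-fit residual, its Jacobian and the parameter names are illustrative assumptions, not the authors' MATLAB implementation.

```python
import numpy as np

def lm_step(x, residual, jacobian, mu=0.01, factor=10.0, max_tries=20):
    """One Levenberg-Marquardt update (eq. (13)) with the mu-adjustment rule
    described in the text: grow mu until F(x) decreases, then shrink it."""
    v = residual(x)                      # residual vector v(x)
    J = jacobian(x)                      # Jacobian J(x)
    F_old = float(v @ v)                 # F(x) = v^T v (sum of squares)
    for _ in range(max_tries):
        # dx = -[J^T J + mu I]^{-1} J^T v
        dx = -np.linalg.solve(J.T @ J + mu * np.eye(x.size), J.T @ v)
        x_new = x + dx
        v_new = residual(x_new)
        if float(v_new @ v_new) < F_old:      # successful step: accept, shrink mu
            return x_new, mu / factor
        mu *= factor                          # unsuccessful step: retry with larger mu
    return x, mu                              # give up on this iteration

# Toy usage: fit y = a*exp(b*t) to synthetic data (illustrative only)
t = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(-1.5 * t)

def residual(p):
    a, b = p
    return a * np.exp(b * t) - y

def jacobian(p):
    a, b = p
    return np.column_stack((np.exp(b * t), a * t * np.exp(b * t)))

x, mu = np.array([1.0, -1.0]), 0.01
for _ in range(50):
    x, mu = lm_step(x, residual, jacobian, mu)
print(x)   # should approach [2.0, -1.5]
```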

3 Quasi-Newton back-propagation methods

The derivatives used in minimisation algorithms are the gradient and the Hessian. The gradient must be known accurately, as the descent directions have to be calculated from it and approximations to the gradient do not provide the required accuracy. On the other hand, the Hessian can be approximated by secant techniques. Since the Hessian is the Jacobian of the nonlinear system of equations $\nabla F(x) = 0$, it can be approximated. The Hessian has two important properties: it is always symmetric and most often positive definite. The incorporation of these two properties into the secant approximation is an important aspect of the BFGS-BP and OSS-BP methods discussed subsequently. They are most often called positive definite secant updates.

3.1 Broyden-Fletcher-Goldfarb-Shanno back-propagation method

In Newton's method a quadratic approximation is used instead of a linear approximation of the function $F(x)$. The next approximate solution is obtained at a point that minimises the quadratic function

$$F(x_{k+1}) = F(x_k + \Delta x_k) = F(x_k) + g_k^T \Delta x_k + \frac{1}{2}\Delta x_k^T A_k \Delta x_k \quad (15)$$

Hence, the obtained sequence is

$$x_{k+1} = x_k - A_k^{-1} g_k \quad (16)$$

The main advantage of Newton's method is that it has a quadratic convergence rate, while steepest descent has a much slower, linear convergence rate. However, each step of Newton's method requires a large amount of computation. Assuming that the dimensionality of the problem is $N$, $O(N^3)$ floating-point operations are needed to compute the search direction $d_k$. A method that uses an approximate Hessian matrix in computing the search direction is the quasi-Newton method. Let $H_k$ be an $N \times N$ symmetric matrix that approximates the Hessian matrix $A_k$; then the search direction for the quasi-Newton method is obtained by minimising the quadratic function


$$F(x_{k+1}) = F(x_k + \Delta x_k) = F(x_k) + g_k^T \Delta x_k + \frac{1}{2}\Delta x_k^T H_k \Delta x_k \quad (17)$$

If $H_k$ is invertible, a descent direction can be obtained from the solution of the quadratic program

$$p_k := x - x_k = -H_k^{-1}\,\nabla F(x)\big|_{x = x_k} \quad (18)$$

As the matrix $H_k$ is to approximate the Hessian of the function $F(x)$ at $x = x_k$, it needs to be updated from iteration to iteration by incorporating the most recent gradient information.

The matrix $H_k$ is updated from iteration to iteration according to the BFGS positive definite secant update [9, 10]:

$$H_{k+1} = H_k + \frac{y_k y_k^T}{y_k^T s_k} - \frac{H_k s_k s_k^T H_k}{s_k^T H_k s_k} \quad (19)$$

where $s_k = x_{k+1} - x_k$ is the last step and $y_k = g_{k+1} - g_k$ is the corresponding change in the gradient.
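A minimal NumPy sketch of this secant update follows; the helper name, the curvature-condition safeguard and the tolerance are illustrative, and the search direction comment refers to (18).

```python
import numpy as np

def bfgs_update(H, s, y):
    """Standard BFGS positive definite secant update of the Hessian
    approximation H_k, as in eq. (19):
    s = x_{k+1} - x_k, y = g_{k+1} - g_k."""
    sy = float(s @ y)
    if sy <= 1e-12:                  # skip update if the curvature condition fails
        return H
    Hs = H @ s
    return H + np.outer(y, y) / sy - np.outer(Hs, Hs) / float(s @ Hs)

# Search direction per eq. (18): p = -np.linalg.solve(H, g)
```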


3.2 One-step secant back-propagation method

One drawback of the BFGS update of (19) is that it requires storage for a matrix of size $N \times N$ and calculations of order $O(N^2)$. Although the available storage is less of a problem now than it was a decade ago, the computational problem still exists for large $N$. It is possible to use a secant approximation with $O(N)$ computation [11]. In this method, the new search direction is obtained from vectors computed from gradients. If $g_{k+1}$ is the current gradient, the new search direction $p_{k+1}$ is obtained as

$$p_{k+1} = -g_{k+1} + B_k y_k + C_k s_k \quad (21)$$

where $g_{k+1} = \nabla F(x_{k+1})$ and the two scalars $B_k$ and $C_k$ are combinations of scalar products of the previously defined vectors $s_k$, $g_{k+1}$ and $y_k$ (the last step, the current gradient and the difference of gradients), as given in (22).

The backtracking line search is used with the quasi-Newton OSS optimisation algorithm [14, 15]. At the beginning of learning, the search direction is $-g_0$, and it is restarted to $-g_{k+1}$ every $N$ steps. A multiplier of 1.1 increases the last successful learning rate and the first tentative step is executed. If the energy of the network is higher than the upper limiting value, then a new tentative step is tried by using successive quadratic interpolation until the requirement is met. The learning rate is halved after each unsuccessful trial.
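A minimal NumPy sketch of the search-direction computation in (21) follows; the explicit $B_k$ and $C_k$ expressions used here are the standard memoryless-BFGS (one-step secant) coefficients and are an assumption, since the paper's own arrangement in (22) is not reproduced here.

```python
import numpy as np

def oss_direction(g_new, s, y):
    """One-step secant search direction, eq. (21):
    p_{k+1} = -g_{k+1} + B_k y_k + C_k s_k.
    s = last step x_{k+1} - x_k, y = gradient difference g_{k+1} - g_k.
    B_k and C_k below are the standard memoryless-BFGS scalars (assumed)."""
    sy = float(s @ y)
    if abs(sy) < 1e-12:
        return -g_new                          # restart with steepest descent
    B = float(s @ g_new) / sy                  # coefficient of y_k
    C = float(y @ g_new) / sy - (1.0 + float(y @ y) / sy) * B   # coefficient of s_k
    return -g_new + B * y + C * s
```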

4 Collection of data

The daily weather and electrical peak load data of four years from a 220 kV substation of Haryana Vidyut Prasaran Nigam Ltd. (HVPNL) have been taken for the present study. Data from 1997 to 1999 have been used for neural network (NN) training. Data from the year 2000 have been used to test the trained NN. For NN training, the concept of the growing window has been used in deciding the training data, i.e. if the current day is the nth day, then data of (n-1) days have been used for the training. Each of the eight NNs with the same structure has been trained for the current day, one day ahead, two days ahead, ..., seven days ahead PLF by incorporating 11 weather parameters, totalling 21 variables, viz. temperature of the day (maximum, minimum), grass minimum temperature, rainfall, evaporation per day, sunshine hours of the previous day and wind speed. Also, the dry bulb temperature, the wet bulb temperature, the vapour pressure, the relative humidity (%) and the soil temperature at depths of 5, 10 and 20 cm were observed at 0722 and 1422 h [16]. The quality of PLF is compared on the basis of MAPE

$$\mathrm{MAPE} = \frac{100}{N}\sum_{i=1}^{N}\frac{|X_i - F_i|}{X_i} \quad (23)$$

where $X_i$ is the actual value, $F_i$ is the forecast value and $N$ is the total number of values predicted. To check the quality of results throughout the year, the year has been divided into four seasons as shown in Table 1. The forecast MAPE has been obtained for a week in every season.
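A short NumPy sketch of (23), with illustrative inputs:

```python
import numpy as np

def mape(actual, forecast):
    """Mean absolute percentage error, eq. (23):
    MAPE = (100/N) * sum(|X_i - F_i| / X_i)."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return 100.0 * np.mean(np.abs(actual - forecast) / actual)

# e.g. mape([100.0, 120.0], [104.0, 114.0]) -> (4% + 5%) / 2 = 4.5
```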


Table 1: Definition of seasons

Season    Period
Winter    December-March
Summer    April-May
Rainy     June-September
Dry       October-November

5 Reduction and preprocessing of data

Neural learning is a slow artificial intelligence process, so optimisation is required at every point, i.e. in deciding the neural architecture, the number of training/validation patterns, the learning rate, the error goal, etc. A training pattern consists of an input/output pair. The number of variables in the input pair is 21. To these 21, variables like the year number, the day of the year, the previous day peak load and the type of day are added, thus making them 28. A day has been classified into four types, viz. working day, holiday, Saturday and Sunday. As the training patterns consist of the data for three years, the minimum size of the input training pattern matrix was 28 x 1095. The output pattern matrix is of size 1 x 1095. The NN may be trained with this pair of pattern matrices. Once trained, it can predict the peak load of the current day, or of one day ahead, ..., or of seven days ahead. If the season of the current day is summer, there is no use in training the NN with data of winter or of any other season. Hence, the training data should consist of data of some previous days (a week before the current day) and the similar days of all the previous years (a fortnight before the current day, the current day, and a fortnight after the current day during the previous years). This relevant selection of input data reduces the number of training patterns from 1095 to 100. Further, a large amount of redundancy may exist in the input variables. To remove it, preprocessing is done.
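A small sketch of this relevant-day selection under the stated window widths (one week back in the current year, a fortnight either side of the same date in each previous year); the helper and date handling are illustrative, and leap-day edge cases are ignored.

```python
from datetime import date, timedelta

def training_days(current, first_available):
    """Relevant-day selection as described above: the week before the
    forecast day, plus a fortnight on either side of the same calendar
    date in every previous year with data."""
    days = [current - timedelta(days=d) for d in range(1, 8)]   # previous week
    for year in range(current.year - 1, first_available.year - 1, -1):
        anchor = current.replace(year=year)                     # same date, earlier year
        days += [anchor + timedelta(days=d) for d in range(-14, 15)]
    return sorted(d for d in days if d >= first_available)

# For a forecast on 15 June 2000 with data from 1 Jan 1997 this gives
# 7 + 3*29 = 94 days, close to the ~100 training patterns quoted above.
print(len(training_days(date(2000, 6, 15), date(1997, 1, 1))))   # 94
```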

Preprocessing consists of a feature extraction process [17]. Feature extraction is a process whereby a data space is transformed into a feature space that, in theory, has exactly the same dimension as the original data space, but the data set is represented by a reduced number of effective features while retaining most of the intrinsic information content of the data. The process used to achieve this is principal component analysis (PCA). The intercorrelation matrix of the input data was factored by applying the PCA method of factor extraction. Mostly, 11 factors were extracted out of the 28 input variables by following Kaiser's recommendation to stop the extraction of factors when their effect on the variance of the output variables is less than 1%. To achieve an approximation to a simple structure, the factors were rotated in accordance with the varimax criterion of orthogonal rotation [17]. Hence, the input matrix was reduced by the relevant-selection and preprocessing procedures from 28 x 1095 to 11 x 100, resulting in a minimum reduction of 96.41% and a maximum reduction of 97.3%. This reduction of the training matrix has three advantages: the NN architecture is simple, the time taken for training is reduced, and the NN is trained with data of a nature similar to the one which is to be predicted.
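A minimal NumPy sketch of this PCA-based reduction, keeping factors whose share of the variance is at least 1% as described; the varimax rotation step is omitted and the function is illustrative only.

```python
import numpy as np

def pca_reduce(X, min_variance_ratio=0.01):
    """Reduce an (n_variables x n_patterns) matrix by PCA on the
    intercorrelation matrix, keeping factors that contribute at least 1%
    of the total variance (the stopping rule described in the text)."""
    Xs = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Xs))   # factor the correlation matrix
    order = np.argsort(eigvals)[::-1]                    # largest eigenvalues first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    k = int(np.sum(eigvals / eigvals.sum() >= min_variance_ratio))
    return eigvecs[:, :k].T @ Xs                         # k x n_patterns feature matrix

# e.g. a 28 x 100 pattern matrix might reduce to roughly 11 x 100, as in the text.
```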

6 Neural network implementation

A three-layer FFNN has been simulated using MATLAB 5.2 for learning the three-year weather and peak load data.




In this network, the numbers of input nodes, first hidden layer neurons, second hidden layer neurons and output layer neurons have been taken as 11, 5, 5 and 1, respectively. The number of input nodes is dynamic, depending on the number of factors extracted by the factor analysis process. The number of neurons in the output layer is one, as either the current day, or one day ahead, or two days ahead, ..., or seven days ahead peak load is to be forecast. The number of neurons in the first and second hidden layers is decided by the number of training patterns and the number of unknown weights and biases to be determined. To avoid the algorithm falling into local minima, the user-defined parameters, viz. the learning rate and the error goal, are optimised as shown in Figs. 1 and 2, respectively. To increase the learning speed, the weights of the NN are initialised according to the method given by Nguyen and Widrow [18]. In this, first the weights are initialised according to a uniform random distribution between -1 and 1; next, the magnitude of the weight vectors is adjusted in a manner such that each hidden node is linear over only a small interval. The optimised values of the learning rate and error goal for this case are 0.002 and 0.04534, respectively.
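A minimal NumPy sketch of a Nguyen-Widrow style initialisation as described above; the 0.7 * n_hidden**(1/n_inputs) scaling constant is the commonly used choice and is an assumption, not a value taken from the paper.

```python
import numpy as np

def nguyen_widrow_init(n_inputs, n_hidden, seed=0):
    """Nguyen-Widrow style initialisation: draw weights uniformly in [-1, 1],
    then rescale each hidden unit's weight vector so that the unit is linear
    over only a small interval of the input range."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1.0, 1.0, size=(n_hidden, n_inputs))
    beta = 0.7 * n_hidden ** (1.0 / n_inputs)            # target vector magnitude (assumed)
    W *= beta / np.linalg.norm(W, axis=1, keepdims=True)
    b = rng.uniform(-beta, beta, size=n_hidden)          # spread the bias values
    return W, b

# e.g. W, b = nguyen_widrow_init(11, 5) for the first hidden layer of the
# 11-5-5-1 network described above.
```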


The NN has been trained for a maximum of 9950 epochs with the training data, monitoring the squared error for the training as well as the validation data at the end of every epoch. Once the minimum validation error has not improved during the last 500 epochs, training is stopped early to avoid over-fitting the network. The weights and biases at the minimum of the validation error are returned [19]. The NN has been trained on an Intel P-IV, 1.4 GHz PC using the various neural learning techniques.
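A minimal sketch of this early-stopping rule; the network object and its training/validation callbacks are illustrative placeholders, not the authors' MATLAB routines.

```python
def train_with_early_stopping(net, train_one_epoch, validation_error,
                              max_epochs=9950, patience=500):
    """Monitor the validation error after every epoch, stop once it has not
    improved for `patience` epochs, and return the weights recorded at its
    minimum (the early-stopping criterion described above)."""
    best_err, best_weights, best_epoch = float("inf"), net.get_weights(), 0
    for epoch in range(1, max_epochs + 1):
        train_one_epoch(net)
        err = validation_error(net)
        if err < best_err:
            best_err, best_weights, best_epoch = err, net.get_weights(), epoch
        elif epoch - best_epoch >= patience:    # no improvement in the last 500 epochs
            break
    net.set_weights(best_weights)               # restore minimum-validation-error weights
    return net, best_err
```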

7 Results

In the present study, SDBP, LMBP, BFGS-BP and OSS-BP have been used for PLF. The MAPE and average MAPE obtained for PLF of a week in the four seasons of the year 2000 are given in Tables 2, 3, 4 and 5, respectively. The reduction of squared error with respect to epoch using the SDBP, LMBP, BFGS-BP and OSS-BP techniques for a random day is shown in Figs. 3, 4, 5 and 6, respectively. From these, it is evident that the goal for the learning data set is achieved in 7, 51 and 57 epochs with LMBP, BFGS-BP and OSS-BP, respectively, whereas using SDBP it is not achieved even in 9950 epochs. The comparison of season and week-ahead average MAPE (SWAAMAPE) is shown in Fig. 7. The minimum, maximum and mean MFLOP and time taken per learning are given in Table 6. The mean time and mean MFLOP per learning are shown in Fig. 8. From Tables 2 to 6 it is evident that the SDBP, LMBP, BFGS-BP and OSS-BP algorithms are able to predict the peak load of a day in 104.79, 6.16, 1.85 and 1.59 s (mean time) with an average error of 2.78, 2.87, 2.38 and 2.41% (SWAAMAPE), respectively. Also from Tables 2 to 6 it is concluded that using the LMBP technique, in comparison to the SDBP technique, the SWAAMAPE has increased by 3.26%, but the mean MFLOP has decreased by 19.49% and the mean time per learning has decreased by 94.11%. Similarly, using BFGS-BP in contrast to the SDBP learning technique, the SWAAMAPE decreased by 14.28%, the mean MFLOP decreased by 93.48% and the mean time decreased by 98.22%. Also, using OSS-BP in contrast to the SDBP learning technique, the SWAAMAPE decreased by 13.26%, the mean MFLOP decreased by 98.98% and the mean time decreased by 98.47%.
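For example, the 19.49% and 94.11% reductions quoted for LMBP follow directly from the Table 6 means: $(642.39 - 517.15)/642.39 \times 100 \approx 19.5\%$ for MFLOP and $(104.79 - 6.16)/104.79 \times 100 \approx 94.1\%$ for time per learning.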

8 Conclusions

The growing window concept has been used to decide the training matrix for PLF. The reduction of the training matrix has been done according to the nature of the day for which the forecast is to be made. For the removal of redundancy in the input variables, PCA, a concept from psychology, has been used.

Table 2: MAPE of PLF using the steepest descent back-propagation (SDBP) algorithm

Season              Winter   Summer   Rainy   Dry    Average
Current day         2.99     4.04     2.66    2.65   3.08
One day ahead       2.60     1.98     1.74    2.36   2.17
Two days ahead      2.11     2.85     2.71    2.83   2.64
Three days ahead    2.39     3.83     3.22    1.53   2.74
Four days ahead     2.70     3.92     2.39    2.92   2.98
Five days ahead     2.89     2.63     2.45    1.66   2.41
Six days ahead      2.51     3.51     2.57    2.95   2.89
Seven days ahead    2.86     3.50     3.44    3.36   3.29
SWAAMAPE                                             2.78


Table 3: MAPE of PLF using the LMBP learning technique

Season              Winter   Summer   Rainy   Dry    Average
Current day         3.23     2.69     2.49    2.33   2.69
One day ahead       2.68     3.01     2.74    1.57   2.50
Two days ahead      2.43     2.60     3.66    3.15   2.96
Three days ahead    2.94     3.20     2.92    2.31   2.84
Four days ahead     3.38     2.42     3.66    2.06   2.89
Five days ahead     2.80     3.42     3.25    2.68   3.04
Six days ahead      4.19     4.07     3.11    2.52   3.47
Seven days ahead    1.78     2.63     2.67    3.09   2.54
SWAAMAPE                                             2.87

Table 4: MAPE of PLF using the BFGS-BP neural learning technique

Season              Winter   Summer   Rainy   Dry    Average
Current day         3.02     2.12     2.22    1.78   2.28
One day ahead       2.85     2.87     1.28    2.89   2.47
Two days ahead      1.82     2.62     2.26    1.60   2.08
Three days ahead    2.53     2.16     2.58    0.96   2.21
Four days ahead     1.92     3.05     3.10    1.94   2.50
Five days ahead     2.70     1.86     2.64    2.10   2.32
Six days ahead      2.63     1.91     3.24    1.82   2.40
Seven days ahead    4.07     2.43     2.28    2.30   2.77
SWAAMAPE                                             2.38

Table 5: MAPE of PLF using the OSS-BP neural learning technique

Season              Winter   Summer   Rainy   Dry    Average
Current day         2.58     1.91     1.90    1.93   2.08
One day ahead       2.80     2.08     2.30    2.39   2.39
Two days ahead      1.70     1.82     2.61    1.93   2.02
Three days ahead    1.54     2.97     2.66    1.74   2.23
Four days ahead     2.62     2.42     2.44    1.70   2.30
Five days ahead     3.29     1.64     3.03    2.37   2.58
Six days ahead      3.50     3.20     2.93    1.52   2.78
Seven days ahead    2.56     3.11     3.88    2.15   2.92
SWAAMAPE                                             2.41

[Figs. 3-6: training and validation error against training epochs for the SDBP, LMBP, BFGS-BP and OSS-BP learning techniques]

The optimisation of the user-defined parameters has been done to avoid trapping of the network in local minima. The performance of the LMBP, BFGS-BP and OSS-BP techniques has been compared with that of the SDBP technique. From Fig. 7 it is evident that, among the four methods, the recall error was maximum with LMBP learning, moderate with SDBP learning and of minimum order with BFGS-BP and OSS-BP learning. Similarly, from Fig. 8 it is evident that the mean MFLOP was of maximum order for SDBP and LMBP learning, moderate for BFGS-BP learning and minimum for OSS-BP learning.


Table 6: MFLOP and time comparison of the SDBP, LMBP, BFGS-BP and OSS-BP learning techniques for PLF

Neural learning   Min.     Max.       Mean     Min.      Max.      Mean
technique         MFLOP    MFLOP      MFLOP    time, s   time, s   time, s
SDBP              40.56    804.27     642.39   6.76      121.93    104.79
LMBP              7.75     35276.70   517.15   0.65      370.63    6.16
BFGS-BP           7.95     2257.96    41.82    0.82      67.84     1.85
OSS-BP            1.01     29.22      6.54     0.76      5.33      1.59

[Fig. 8: mean MFLOP and mean time per learning for the four learning techniques]

Also, from Fig. 8 it is evident that the mean time per learning was maximum with SDBP learning, moderate with LMBP learning and of minimum order with BFGS-BP and OSS-BP learning. In comparison with OSS-BP, the BFGS-BP technique has given a 1.17% lower SWAAMAPE, but taken 538.64% more MFLOP and 16.45% more time. Thus, in overall comparison, the OSS-BP technique takes the minimum MFLOP and the minimum time per learning, and its recall error is of the minimum order. Hence, among the four techniques, OSS-BP has been adjudged the best learning technique for PLF.

9 Acknowledgments

The authors would like to thank Haryana Vidyut Prasaran Nigam Ltd, Karnal, for providing the peak load data and the Central Soil Salinity Research Institute, Karnal, for providing the weather data.

10 References

1 PARK, D.C., EL-SHARKAWI, M.A., MARKS II, R.J., ATLAS, L.E., and DAMBORG, M.J.: 'Electric load forecasting using an artificial neural network', IEEE Trans. Power Syst., 1991, 6, (2), pp. 442-449
2 LU, C.N., WU, H.T., and VEMURI, S.: 'Neural network based short-term load forecasting', IEEE Trans. Power Syst., 1993, 8, (1)
3 PAPALEXOPOULOS, A.D., HAO, S., and PENG, T.-M.: 'An implementation of a neural network based load forecasting model for the EMS', IEEE Trans. Power Syst., 1994, 9, (4), pp. 1956-1962
4 DASH, P.K., LIEW, A.C., and RAMAKRISHNA, G.: 'Power-demand forecasting using a neural network with an adaptive learning algorithm', IEE Proc., Gener. Transm. Distrib., 1995, 142, (6), pp. 560-568
5 KHOTANZAD, A., HWANG, R.-C., ABAYE, A., and MARATUKULAM, D.: 'An adaptive modular artificial neural network hourly load forecaster and its implementation at electric utilities', IEEE Trans. Power Syst., 1995, 10, (3), pp. 1716-1722
6 ALFUHAID, A.S., EL-SAYED, M.A., and MAHMOUD, M.S.: 'Cascaded artificial neural networks for short-term load forecasting', IEEE Trans. Power Syst., 1997, 12, (4), pp. 1524-1529
7 KHOTANZAD, A., and AFKHAMI-ROHANI, R.: 'ANNSTLF - artificial neural network short-term load forecaster - generation three', IEEE Trans. Power Syst., 1998, 13, (4), pp. 1413-1422
8 HAGAN, M.T., and MENHAJ, M.B.: 'Training feedforward networks with the Marquardt algorithm', IEEE Trans. Neural Netw., 1994, 5, (6), pp. 989-993
9 BATTITI, R.: 'First- and second-order methods for learning: between steepest descent and Newton's method', Neural Comput., 1992, 4, (2), pp. 141-166
10 BATTITI, R., and MASULLI, F.: 'BFGS optimization for faster and automated supervised learning'. International neural network conference (INNC-90), Paris, July 1990, pp. 757-760
11 BATTITI, R., and TECCHIOLLI, G.: 'Learning with first, second and no derivatives: a case study in high energy physics', Neurocomputing, 1994, 6, (2), pp. 181-206
12 RANAWEERA, D.K., HUBELE, N.F., and PAPALEXOPOULOS, A.D.: 'Application of radial basis function neural network model for short-term load forecasting', IEE Proc., Gener. Transm. Distrib., 1995, 142, (1), pp. 45-50
13 SCALES, L.E.: 'Introduction to non-linear optimization' (Springer-Verlag, New York, 1985)
14 DENNIS, J.E., and SCHNABEL, R.B.: 'Numerical methods for unconstrained optimization and nonlinear equations' (Prentice-Hall, Englewood Cliffs, New Jersey, 1983)
15 McKEOWN, J.J., MEEGAN, D., and SPREVAK, D.: 'An introduction to unconstrained optimisation' (Adam Hilger, 1990)
16 SAINI, L.M.: 'Some aspects of machine learning using artificial neural network'. PhD dissertation, Department of Electrical Engineering, Kurukshetra University, Kurukshetra, Haryana, India, 2001
17 HARMAN, H.H.: 'Modern factor analysis' (University of Chicago Press, Chicago, 1976, 3rd edn.)
18 NGUYEN, D., and WIDROW, B.: 'Improving the learning speed of 2-layer neural networks by choosing initial values of the adaptive weights'. Proceedings of the international joint conference on neural networks, July 1990, Vol. 3, pp. 21-26
19 SARLE, W.S.: 'Stopped training and other remedies for overfitting'. Presented at the 27th symposium on the Interface: statistics and manufacturing with subthemes in environmental statistics, graphics and imaging, 1995


