
IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, VOL. 50, NO. 3, JUNE 2003 585

A Comprehensive Review for Industrial Applicability of Artificial Neural Networks

Magali R. G. Meireles, Paulo E. M. Almeida, Student Member, IEEE, and Marcelo Godoy Simões, Senior Member, IEEE

Abstract—This paper presents a comprehensive review of the industrial applications of artificial neural networks (ANNs) in the last 12 years. Common questions that arise to practitioners and control engineers while deciding how to use NNs for specific industrial tasks are answered. Workable issues regarding implementation details, training, and performance evaluation of such algorithms are also discussed, based on a judiciously chronological organization of topologies and training methods effectively used in the past years. The most popular ANN topologies and training methods are listed and briefly discussed, as a reference to the application engineer. Finally, ANN industrial applications are grouped and tabulated by their main functions and what they actually performed in the referenced papers. The authors prepared this paper bearing in mind that an organized and normalized review would be suitable to help industrial managing and operational personnel decide which kind of ANN topology and training method would be adequate for their specific problems.

Index Terms—Architecture, industrial control, neural network (NN) applications, training.

I. INTRODUCTION

In engineering and physics domains, algebraic and differential equations are used to describe the behavior and functioning properties of real systems and to create mathematical models to represent them. Such approaches require accurate knowledge of the system dynamics and the use of estimation techniques and numerical calculations to emulate the system operation. The complexity of the problem itself may introduce uncertainties, which can make the modeling nonrealistic or inaccurate. Therefore, in practice, approximate analysis is used and linearity assumptions are usually made. Artificial neural networks (ANNs) implement algorithms that attempt to achieve a neurological related performance, such as learning from experience, making generalizations from similar situations, and judging states where poor results were achieved in the past. ANN history begins in the early 1940s. However, only in the mid-1980s did these algorithms become scientifically sound and capable of application. Since the late 1980s, ANNs started to be utilized in a plethora of industrial applications.

Manuscript received October 23, 2001; revised September 20, 2002. Abstract published on the Internet March 4, 2003. This work was supported by the National Science Foundation under Grant ECS 0134130.

M. R. G. Meireles was with the Colorado School of Mines, Golden, CO 80401 USA. She is now with the Mathematics and Statistics Department, Pontifical Catholic University of Minas Gerais, 30.535-610 Belo Horizonte, Brazil (e-mail: [email protected]).

P. E. M. Almeida was with the Colorado School of Mines, Golden, CO 80401 USA. He is now with the Federal Center for Technological Education of Minas Gerais, 30.510-000 Belo Horizonte, Brazil (e-mail: [email protected]).

M. G. Simões is with the Colorado School of Mines, Golden, CO 80401 USA (e-mail: [email protected]).

Digital Object Identifier 10.1109/TIE.2003.812470

Nowadays, ANNs are being applied to many real-world industrial problems, from functional prediction and system modeling (where physical processes are not well understood or are highly complex) to pattern recognition engines and robust classifiers, with the ability to generalize while making decisions about imprecise input data. The ability of ANNs to learn and approximate relationships between input and output is decoupled from the size and complexity of the problem [49]. Actually, as relationships based on inputs and outputs are enriched, approximation capability improves. ANNs offer ideal solutions for speech, character, and signal recognition. There are many different types of ANN. Some of the more popular include the multilayer perceptron (MLP) (which is generally trained with the back-propagation of error algorithm), learning vector quantization, radial basis function (RBF), Hopfield, and Kohonen networks, to name a few. Some ANNs are classified as feedforward while others are recurrent (i.e., implement feedback), depending on how data is processed through the network. Another way of classifying ANN types is by their learning method (or training), as some ANNs employ supervised training, while others are referred to as unsupervised or self-organizing.

This paper concentrates on industrial applications of neural networks (NNs). It was found that training methodology is more conveniently associated with a classification of how a certain NN paradigm is supposed to be used for a particular industrial problem. There are some important questions to answer in order to adopt an ANN solution for achieving accurate, consistent, and robust modeling. What is required to use an NN? How are NNs superior to conventional methods? What kind of problem functional characteristics should be considered for an ANN paradigm? What kind of structure and implementation should be used for the application in mind? This article will follow a procedure that brings such managerial questions together into a framework that can be used to evaluate where and how such technology fits industrial applications, by laying out a classification scheme by means of clustered concepts and distinctive characteristics of ANN engineering.

II. NN ENGINEERING

Before laying out the foundations for choosing the best ANN topology, learning method, and data handling for classes of industrial problems, it is important to understand how artificial intelligence (AI) evolved with required computational resources.

0278-0046/03$17.00 © 2003 IEEE


Artificial intelligence applications moved away from laboratory experiments to real-world implementations. Therefore, software complexity also became an issue, since conventional Von Neumann machines are not suitable for symbolic processing, nondeterministic computations, dynamic execution, parallel distributed processing, and management of extensive knowledge bases [118].

In many AI applications, the knowledge needed to solve a problem may be incomplete, because the source of the knowledge is unknown at the time the solution is devised, or the environment may be changing and cannot be anticipated at design time. AI systems should be designed with an open concept that allows continuous refinement and acquisition of new knowledge.

There exist engineering problems for which finding the perfect solution requires a practically impossible amount of resources and for which an acceptable solution would be fine. NNs can give good solutions for such classes of problems. Choosing the best ANN topology, learning method, and data handling themselves become engineering tasks. The success of using an ANN for any application depends highly on data processing (i.e., data handling before or during network operation). Once variables have been identified and data has been collected and is ready to use, one can process it in several ways to squeeze more information out of it and to filter it.

A common technique for coping with nonnormal data is to apply a nonlinear transform to the data. To apply a transform, one simply takes some function of the original variable and uses the functional transform as a new input to the model. Commonly used nonlinear transforms include powers, roots, inverses, exponentials, and logarithms [107].
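As an illustrative sketch (not from the paper; the function and the particular transforms chosen are assumptions), each raw input can be expanded into several such nonlinear features before being presented to the model:

```python
import math

def expand_features(x):
    """Augment a raw positive-valued input with common nonlinear
    transforms (log, root, inverse, power) so the model can use
    each transform as an extra input."""
    return [x, math.log(x), math.sqrt(x), 1.0 / x, x * x]

# A single skewed raw reading becomes five candidate inputs.
features = expand_features(50.0)
```

Which transforms actually help is problem dependent; in practice one would keep only those that improve validation performance.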

A. Assessment of NN Performance

An ANN should be used for problems exhibiting knottiness, nonlinearity, and uncertainties that justify its utilization [45]. ANNs present the following features to cope with such complexities:

• learning from training data used for system identification; finding a set of connection strengths that will allow the network to carry out the desired computation [96];
• generalization from inputs not previously presented during the training phase; accepting an input and producing a plausible response determined by the internal ANN connection structure makes such a system robust against noisy data, a feature exploited in industrial applications [59];
• mapping of nonlinearity, making them suitable for identification in process control applications [90];
• parallel processing capability, allowing fast processing for large-scale dynamical systems;
• applicability to multivariable systems; they naturally process many inputs and have many outputs;
• usability as a black-box approach (requiring no prior knowledge about a system), implementable on compact processors for space- and power-constrained applications.

In order to select a good NN configuration, there are several factors to take into consideration. The major points of interest regarding ANN topology selection are related to network design, training, and practical considerations [25].

B. Training Considerations

Considerations such as determining the input and output variables, choosing the size of the training data set, initializing network weights, choosing training parameter values (such as learning rate and momentum rate), and selecting training stopping criteria are important for several network topologies. There is no generic formula that can be used to choose the parameter values. Some guidelines can be followed as an initial trial. After a few trials, the network designer should have enough experience to set appropriate criteria that suit a given problem.

The initial weights of an NN play a significant role in the convergence of the training method. Without a priori information about the final weights, it is common practice to initialize all weights randomly with small absolute values. In linear vector quantization and derived techniques, it is usually required to renormalize the weights at every training epoch. A critical parameter is the speed of convergence, which is determined by the learning coefficient. In general, it is desirable to have fast learning, but not so fast as to cause instability of learning iterations. Starting with a large learning coefficient and reducing it as the learning process proceeds results in both fast learning and stable iterations. The momentum coefficients are usually set according to a schedule similar to the one for the learning coefficients [128].
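A minimal sketch of these two practices, small random weight initialization and a decaying learning coefficient (the 0.1 scale, 0.5 initial rate, and exponential decay schedule are illustrative assumptions, not values prescribed by the text):

```python
import random

def init_weights(n_in, n_out, scale=0.1, seed=0):
    """Initialize all weights randomly with small absolute values,
    as is common practice without a priori weight information."""
    rng = random.Random(seed)
    return [[rng.uniform(-scale, scale) for _ in range(n_in)]
            for _ in range(n_out)]

def learning_rate(epoch, initial=0.5, decay=0.99):
    """Start with a large learning coefficient and reduce it as
    training proceeds; exponential decay is one common schedule."""
    return initial * decay ** epoch
```

A momentum coefficient could follow an analogous schedule, per the text's remark that momentum is handled similarly.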

Selection of training data plays a vital role in the performance of a supervised NN. The number of training examples used to train an NN is sometimes critical to the success of the training process. If the number of training examples is not sufficient, then the network cannot correctly learn the actual input–output relation of the system. If the number of training examples is too large, then the network training time will be longer. For some applications, such as real-time adaptive neural control, training time is a critical variable. For others, such as training the network to perform fault detection, the training can be performed offline, and more training data are preferred over insufficient training data, to achieve greater network accuracy. Generally, rather than focusing on volume, it is better to concentrate on the quality and representational nature of the data set. A good training set should contain routine, unusual, and boundary-condition cases [8].

Popular criteria used to stop network training are a small mean-square training error and small changes in network weights. How small is usually up to the network designer and is based on the desired accuracy level of the NN. As an example, in a motor bearing fault diagnosis process, the authors used a learning rate of 0.01 and a momentum of 0.8; the training was stopped when the root mean-square error of the training set or the change in network weights was sufficiently small for that application (less than 0.005) [72]. Moreover, if any prior information about the relationship between inputs and outputs is available and used correctly, the network structure and training time can be reduced and the network accuracy can be significantly improved.
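The stopping rule can be sketched with a toy one-weight least-mean-square fit (the model, samples, learning rate, and tolerance below are illustrative assumptions; only the "stop when the RMS error or the weight change falls below a threshold" logic mirrors the text):

```python
def train_until_converged(samples, lr=0.01, tol=0.005, max_epochs=10_000):
    """Fit a single weight w so that w*x approximates y, stopping when
    the root-mean-square error or the per-epoch weight change drops
    below tol -- the two popular stopping criteria from the text."""
    w = 0.0
    for epoch in range(max_epochs):
        w_old = w
        for x, y in samples:
            w += lr * (y - w * x) * x          # LMS update per sample
        rmse = (sum((y - w * x) ** 2 for x, y in samples)
                / len(samples)) ** 0.5
        if rmse < tol or abs(w - w_old) < tol:
            return w, epoch
    return w, max_epochs

# Learn y = 2x; training halts once a criterion is met.
w, epochs = train_until_converged([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])
```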


TABLE I. ORGANIZATION OF NNs BASED ON THEIR FUNCTIONAL CHARACTERISTICS

C. Network Design

Some of the design considerations include determining the number of input and output nodes to be used, the number of hidden layers in the network, and the number of hidden nodes used in each hidden layer. The number of input nodes is typically taken to be the same as the number of state variables. The number of output nodes is typically the number that identifies the general category of the state of the system. Each node constitutes a processing element and is connected through various weights to other elements. In the past, there was a general practice of increasing the number of hidden layers to improve training performance. Keeping the number of layers at three and adjusting the number of processing elements in the hidden layer can achieve the same goal. A trial-and-error approach is usually used to determine the number of hidden-layer processing elements, starting with a low number of hidden units and increasing this number as learning problems occur. Even though choosing these parameters is still a trial-and-error process, there are some guidelines that can be used (i.e., testing the network's performance). It is common practice to choose a set of training data and a set of testing data that are statistically significant and representative of the system under consideration. The training data set is used to train the NN, while the testing data set is used to test the network performance after the training phase finishes.
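The train/test separation described above can be sketched as follows (the 80/20 split fraction and shuffling-based approach are illustrative assumptions; the paper prescribes no particular ratio or method):

```python
import random

def split_train_test(data, test_fraction=0.2, seed=0):
    """Hold out a test set: shuffle the examples, then split. The
    test portion is used only after the training phase finishes."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]
```

For a representative split, one would also verify that the held-out set covers the routine, unusual, and boundary-condition cases mentioned earlier.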

D. Practical Considerations

Practical considerations regarding network accuracy, robustness, and implementation issues must be addressed for real-world implementation. For ANN applications, estimation performance is usually considered good when pattern recognition achieves more than 95% accuracy in overall and comprehensive data recalls [25]. Selection and implementation of the network configuration need to be carefully studied, since it is desirable to use the smallest possible number of nodes while maintaining a suitable level of conciseness. Pruning algorithms try to make NNs smaller by trimming unnecessary links or units, so the cost of runtime, memory, and hardware implementation can be minimized and generalization is improved. Depending on the application, some system functional characteristics are important in deciding which ANN topology should be used [81]. Table I summarizes the most common ANN structures used for pattern recognition, associative memory, optimization, function approximation, modeling and control, image processing, and classification purposes.

III. EVOLUTION OF UNDERLYING FUNDAMENTALS THAT PRECEDED INDUSTRIAL APPLICATIONS

While there are several tutorials and reviews discussing the full range of NN topologies, learning methods, and algorithms, the authors in this paper intend to cover what has actually been applied in industrial applications. An initial historical perspective is important to get a picture of the age of industrial applications, which started in 1988, just after the release of [123] by Widrow.

It is well known that the concept of NNs came into existence around the Second World War. In 1943, McCulloch and Pitts proposed the idea that a mind-like machine could be manufactured by interconnecting models based on the behavior of biological neurons, laying out the concept of neurological networks [77]. Wiener gave this new field the popular name cybernetics, whose principle is the interdisciplinary relationship among engineering, biology, control systems, and brain functions [125]. At that time, computer architecture was not fully defined and the research led to what is today defined as the Von Neumann-type computer. With the progress in research on the brain and computers, the objective changed from the mind-like machine to manufacturing a learning machine, for which Hebb’s learning model was initially proposed [53]. In 1958, Rosenblatt from the Cornell Aeronautical Laboratory put together a learning machine, called the “perceptron.” That was the predecessor of current NNs. He gave specific design guidelines used by the early 1960s [91]. Widrow and Hoff proposed the “ADALINE” (ADAptive LINear Element), a variation on the perceptron, based on a supervised learning rule (the “error correction rule”) which could learn in a faster and more accurate way: synaptic strengths were changed in proportion to the error (the difference between what the output is and what it should have been) multiplied by the input. Such a scheme was successfully used for echo cancellation in telephone lines and is considered to be the first industrial application of NNs [124]. During the 1960s, the forerunner of current associative memory systems was the work of Steinbuch with his “Learning Matrix,” which was a binary matrix accepting a binary vector as input, producing a binary vector as output, and capable of forming associations between pairs with a Boolean Hebbian learning procedure [108]. The perceptron received considerable excitement when it was first introduced, because of its conceptual simplicity.
The ADALINE is a weighted sum of the inputs, together with a least-mean-square (LMS) algorithm to adjust the weights and to minimize the difference between the desired signal and the actual output. Because of the rigorous mathematical foundation of the LMS algorithm, ADALINE has become a powerful tool for adaptive signal processing and adaptive control, leading to work on competitive learning and self-organization. However, Minsky and Papert proved mathematically that the perceptron could not be used for a class of problems defined as nonseparable logic functions [80].
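The Widrow–Hoff update described above, a weight change proportional to the error times the input, can be sketched as follows (the learning rate, target function, and epoch count are illustrative assumptions, not historical values):

```python
def adaline_epoch(weights, bias, samples, lr=0.1):
    """One pass of the Widrow-Hoff LMS ("error correction") rule:
    each weight moves in proportion to the error (desired minus
    actual output) multiplied by the corresponding input."""
    for inputs, desired in samples:
        output = bias + sum(w * x for w, x in zip(weights, inputs))
        error = desired - output
        weights = [w + lr * error * x for w, x in zip(weights, inputs)]
        bias += lr * error
    return weights, bias

# Learn the linear map y = x1 + x2 from four examples.
samples = [([0, 0], 0), ([1, 0], 1), ([0, 1], 1), ([1, 1], 2)]
weights, bias = [0.0, 0.0], 0.0
for _ in range(200):
    weights, bias = adaline_epoch(weights, bias, samples)
```

Because the output is a plain weighted sum (no hard threshold during learning), the rule has the clean least-squares interpretation the text credits for ADALINE's success.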

Very few investigators conducted research on NNs during the 1970s. Albus developed his adaptive “Cerebellar Model Articulation Controller” (CMAC), which is a distributed table-lookup system based on his view of models of human memory [1]. In 1974, Werbos originally developed the backpropagation algorithm. Its first practical application was to estimate a dynamic model to predict nationalism and social communications [120]. However, his work remained almost unknown in the scientific community for more than ten years. In the early 1980s, Hopfield introduced a recurrent-type NN, which was based on the Hebbian learning law. The model consisted of a set of first-order (nonlinear) differential equations that minimize a certain energy function [55]. In the mid-1980s, backpropagation was rediscovered by two independent groups led by Parker and by Rumelhart et al. as the learning algorithm of feedforward NNs [88], [95]. Grossberg and Carpenter made significant contributions with the “Adaptive Resonance Theory” (ART) in the mid-1980s, based on the idea that the brain spontaneously organizes itself into recognition codes and neurons organize themselves to tune to various and specific patterns, defined as self-organizing maps [20]. The dynamics of the network were modeled by first-order differential equations based on implementations of pattern clustering algorithms. Furthermore, Kosko extended some of the ideas of Grossberg and Hopfield to develop his adaptive “Bi-directional Associative Memory” (BAM) [67]. Hinton, Sejnowski, and Ackley developed the “Boltzmann Machine,” which is a kind of Hopfield net that settles into solutions by a simulated annealing process as a stochastic technique [54]. Broomhead and Lowe first introduced “RBF networks” in 1988 [15]. Although the basic idea of RBF had been developed before, under the name method of potential functions, their work opened another NN frontier. Chen proposed functional-link networks (FLNs), where a nonlinear functional transform of the network inputs aimed at lower computational effort and fast convergence [22].

The 1988 DARPA NN study listed various NN applications, supporting the importance of such technology for commercial and industrial applications and triggering a lot of interest in the scientific community, leading eventually to applications in industrial problems. Since then, the application of ANNs to sophisticated systems has skyrocketed. NNs found widespread relevance in several different fields. Our literature review showed that practical industrial applications were reported in peer-reviewed engineering journals from as early as 1988. Extensive use has been reported in pattern recognition and classification for image and speech recognition, optimization in planning of actions, motions, and tasks, and modeling, identification, and control.

Fig. 1 shows some industrial applications of NNs reported in the last 12 years. The main purpose here is to give an idea of the most used ANN topologies and training algorithms and to relate them to common fields in the industrial area. For each entry, the type of the application, the ANN topology used, the training algorithm implemented, and the main authors are presented. The collected data give a good picture of what has actually migrated from academic research to practical industrial fields and show some of the authors and groups responsible for this migration.

IV. DILEMMATIC PIGEONHOLE OF NEURAL STRUCTURES

Choosing an ANN solution for an immediate application is a situation that requires a choice between options that are (or seem) equally unfavorable or mutually exclusive. Several issues must be considered from the problem’s point of view. The main features of an ANN can be classified as follows [81].

• Topology of the network: multilayered, single-layered, or recurrent. The network is multilayered if it has distinct layers, such as input, hidden, and output. There are no connections among the neurons within the same layer. If each neuron can be connected with every other neuron in the network through directed edges, except the output node, the network is called single layered (i.e., there is no hidden layer). A recurrent network distinguishes itself from the feedforward topologies in that it has at least one feedback loop [49].
• Data flow: recurrent or nonrecurrent. In a nonrecurrent or feedforward model, the outputs always propagate from left to right in the diagrams: the outputs from the input-layer neurons propagate to the right, becoming inputs to the hidden-layer neurons, and then the outputs from the hidden-layer neurons propagate to the right, becoming inputs to the output-layer neurons. An NN in which the outputs can propagate in both directions, forward and backward, is called a recurrent model.
• Types of input values: binary, bipolar, or continuous. Neurons in an artificial network can be defined to process different kinds of signals. The most common types are binary (restricted to either 0 or 1), bipolar (either −1 or +1), and continuous (real numbers in a certain range).
• Forms of activation: linear, step, or sigmoid. Activation functions define the way neurons behave inside the network structure and, therefore, the kind of relationship that will occur between input and output signals.
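The three activation forms named above might be sketched as follows (the threshold value and these exact formulations are conventional choices, not taken from the paper):

```python
import math

def linear(x):
    """Linear activation: the neuron passes its net input through."""
    return x

def step(x, threshold=0.0):
    """Binary step: the neuron fires 1 at or above the threshold, else 0."""
    return 1 if x >= threshold else 0

def sigmoid(x):
    """Logistic sigmoid: smooth, bounded in (0, 1), and differentiable,
    which is the property back-propagation training relies on."""
    return 1.0 / (1.0 + math.exp(-x))
```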

A common classification of ANNs is based on the way in which their elements are interconnected. One study presents the approximate percentages of network utilization as: MLP, 81.2%; Hopfield, 5.4%; Kohonen, 8.3%; and others, 5.1% [49]. This section will cover the main types of networks that have been used in industrial applications in a reasonable number of reports. A comprehensive listing of all available ANN structures and topologies is outside the scope of this discussion.

A. MLPs

In this structure, each neuron output is connected to every neuron in the subsequent layer; the layers are connected in cascade, with no connections between neurons in the same layer. A typical diagram of this structure is detailed in Fig. 2.
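A minimal sketch of this feedforward cascade, assuming sigmoid activations throughout (the paper does not fix a particular activation here; layer sizes and weights below are illustrative):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    """One fully connected layer: every input feeds every neuron,
    with no connections between neurons of the same layer."""
    return [sigmoid(b + sum(w * x for w, x in zip(ws, inputs)))
            for ws, b in zip(weights, biases)]

def mlp_forward(inputs, layers):
    """Cascade the layers left to right: each layer's outputs
    become the next layer's inputs."""
    for weights, biases in layers:
        inputs = layer(inputs, weights, biases)
    return inputs

# A 2-2-1 network; with all-zero weights every neuron outputs sigmoid(0).
net = [([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0]), ([[0.0, 0.0]], [0.0])]
out = mlp_forward([1.0, 1.0], net)
```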

Fig. 2. MLP basic topology.

MLP has been reported in several applications. Some examples are speed control of dc motors [94], [117], diagnostics of induction motor faults [24], [25], [41], [42], induction motor control [17], [18], [56], [127], and current regulators for pulsewidth-modulation (PWM) rectifiers [31]. Maintenance and sensor failure detection was reported in [82], check valves operating in a nuclear power plant [57], [114], and vibration monitoring in rolling element bearings [2]. It was widely applied in feedback control [19], [40], [52], [59], [87], [89], [109], [110] and fault diagnosis of robotic systems [116]. This structure was also used in a temperature control system [63], [64], monitoring feed water flow rate and component thermal performance of pressurized water reactors [61], and fault diagnosis in a heat exchanger continuous stirred tank reactor system [102]. It was used in a controller for turbogenerators [117], digital current regulation of inverter drives [16], and welding process modeling and control [4], [32]. The MLP was used in modeling chemical process systems [12], to produce quantitative estimation of concentration of chemical components [74], and to select powder metallurgy materials and process parameters [23]. Optimization of the gas industry was reported by [121], as well as prediction of daily natural gas consumption needed by gas utilities [65]. The MLP is indeed the most used structure, spread out across several disciplines such as identification and defect detection on woven fabrics [99], prediction of paper cure in the papermaking industry [39], a controller for steering a backup truck [85], and modeling of plate rolling processes [46].

Fig. 1. Selected industrial applications reported since 1989.

Fig. 3. Typical recurrent network structure.

B. Recurrent NNs (RNNs)

A network is called recurrent when the inputs to the neurons come from the external input as well as from the internal neurons, consisting of both feedforward and feedback connections between layers and neurons. Fig. 3 shows such a structure; it was demonstrated that recurrent networks can be effectively used for modeling, identification, and control of nonlinear dynamical systems [83].

Fig. 4. Hopfield network structure.

The trajectory-tracking problem of controlling the nonlinear dynamic model of a robot was evaluated using an RNN; this network was used to estimate the dynamics of the system and its inverse dynamic model [97]. It was also used to control robotic manipulators, facilitating the rapid execution of the adaptation process [60]. A recurrent network was used to approximate a trajectory tracking to a very high degree of accuracy [27]. It was applied to estimate the spectral content of noisy periodic waveforms that are common in engineering processes [36]. The Hopfield network model is the most popular type of recurrent NN. It can be used as associative memory and can also be applied to optimization problems. The basic idea of the Hopfield network is that it can store a set of exemplar patterns as multiple stable states. Given a new input pattern, which may be partial or noisy, the network can converge to one of the exemplar patterns nearest to the input pattern. As shown in Fig. 4, a Hopfield network consists of a single layer of neurons. The network is recurrent and fully interconnected (every neuron in the network is connected to every other neuron). Each input/output takes a discrete bipolar value of either +1 or −1 [81].

A Hopfield network was applied to the problem of linear system identification, minimizing the least-square error of the estimates of state variables [29]. A modified Hopfield structure was used to determine imperfections by the degree of orthogonality between the automatically extracted feature, from the send-through image, and the class feature of early good samples. The performance measure used for such automatic feature extraction is based on a certain mini-max cost function useful for image classification [112]. Simulation results illustrated the Hopfield network's use, showing that this technique can be used to identify the frequency transfer functions of dynamic plants [28]. An approach to detect and isolate faults in linear dynamic systems was proposed, in which system parameters were estimated by a Hopfield network [105].
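The storage-and-recall behavior described above can be sketched as follows (an illustrative toy implementation, not code from the referenced works); Hebbian outer-product storage and asynchronous threshold updates are assumed:

```python
def hopfield_train(patterns):
    """Store bipolar (+1/-1) patterns via Hebbian outer products."""
    n = len(patterns[0])
    W = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:          # fully interconnected, no self-connections
                    W[i][j] += p[i] * p[j]
    return W

def hopfield_recall(W, state, steps=5):
    """Asynchronous updates: each unit thresholds its weighted input."""
    n = len(state)
    s = list(state)
    for _ in range(steps):
        for i in range(n):
            net = sum(W[i][j] * s[j] for j in range(n))
            s[i] = 1 if net >= 0 else -1
    return s

stored = [1, -1, 1, -1, 1, -1]
W = hopfield_train([stored])
noisy = [1, -1, -1, -1, 1, -1]          # one bit flipped
recalled = hopfield_recall(W, noisy)
```

Starting from the partial (noisy) pattern, the state settles into the nearest stored exemplar, illustrating the multiple-stable-states idea.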

C. Nonrecurrent Unsupervised Kohonen Networks

A Kohonen network is a structure of interconnected processing units that compete for the signal. It uses unsupervised learning and consists of a single layer of computational nodes (and an input layer). This type of network uses lateral feedback, which is a form of feedback whose magnitude is dependent on the lateral distance from the point of application. Fig. 5 shows the architecture with two layers. The first is the input layer and the second is the output layer, called the Kohonen layer.


Fig. 5. Kohonen network structure.

Fig. 6. CMAC network structure.

Every input neuron is connected to every output neuron with its associated weight. The network is nonrecurrent; input information propagates only from left to right. Continuous (rather than binary or bipolar) input values representing patterns are presented sequentially in time through the input layer, without specifying the desired output. The output neurons can be arranged in one or two dimensions. A neighborhood parameter, or radius, can be defined to indicate the neighborhood of a specific neuron. It has been used as a self-organizing map for classification [98] and pattern recognition purposes in general.

D. CMAC

The input mapping of the CMAC algorithm can be seen as a set of multidimensional interlaced receptive fields, each one with finite and sharp borders. Any input vector to the network excites some of these fields, while the majority of the receptive fields remain unexcited and do not contribute to the corresponding output. On the other hand, the weighted average of the excited receptive fields forms the network output. Fig. 6 shows a schematic diagram of this structure. The figure depicts the nonlinear input mapping in the Albus approach and a hashing operation that can be performed to decrease the amount of memory needed to implement the receptive fields.

Fig. 7. Adaptive resonance theory network.

CMAC networks are considered local algorithms because, for a given input vector, only a few receptive fields will be active and contribute to the corresponding network output [3]. In the same way, the training algorithm for a CMAC network should affect only the weights corresponding to active fields, excluding the majority of inactive fields in the network. This increases the efficiency of the training process, minimizing the computational effort needed to perform adaptation in the whole network.
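A minimal one-dimensional sketch of this local behavior follows (an illustrative toy, not the Albus formulation verbatim; the tiling offsets and names such as `active_fields` are assumptions):

```python
def active_fields(x, n_tilings=4, tile_width=4, n_tiles=16):
    """Indices of the receptive fields excited by scalar input x.

    Each of the n_tilings overlapping tilings is offset, so exactly
    n_tilings fields are active for any input; all others stay silent.
    """
    idx = []
    for t in range(n_tilings):
        offset = t * tile_width / n_tilings
        tile = int((x + offset) // tile_width)
        idx.append(t * n_tiles + tile)
    return idx

def cmac_output(weights, x):
    # Output is formed only from the excited fields
    return sum(weights[i] for i in active_fields(x))

def cmac_train(weights, x, target, beta=0.5):
    # Training touches only the active fields (local update)
    idx = active_fields(x)
    err = target - sum(weights[i] for i in idx)
    for i in idx:
        weights[i] += beta * err / len(idx)

w = [0.0] * (4 * 16)
for _ in range(50):
    cmac_train(w, 5.0, 2.0)     # learn f(5.0) = 2.0
```

Because only the excited fields are updated, an input far from the trained region (e.g., x = 12.0) still maps to an untouched, zero output, illustrating the locality property.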

CMAC was primarily applied to complex robotic systems involving multiple feedback sensors and multiple command variables. Common experiments involved control of position and orientation of an object using a video camera mounted at the end of a robot arm, and moving objects with arbitrary orientation relative to the robot [79]. This network was also used for air-to-fuel ratio control of automotive fuel-injection systems. Experimental results showed that the CMAC is very effective in learning the engine nonlinearities and in dealing with the significant time delays inherent in engine sensors [76].

E. Adaptive Resonance Theory (ART) (Recurrent, Unsupervised)

The main feature of ART, when compared to other similar structures, is its ability to not forget after learning. Usually, NNs are not able to learn new information without damaging what was previously ascertained. This is caused by the fact that, when a new pattern is presented to an NN in the learning phase, the network tries to modify the weights at node inputs, which only represent what was previously learned. The ART network is recurrent and self-organizing. Its structure is shown in Fig. 7. It has two basic layers and no hidden layers. The input layer is also called "comparing," while the output layer is called "recognizing." This network is composed of two completely interconnected layers, in both directions [58]. It was successfully used for sensor pattern interpretation problems [122], among others.


Fig. 8. RBF network structure.

F. RBF Networks

Fig. 8 shows the basic structure of an RBF network. The input nodes pass the input values to the connecting arcs, and the first layer connections are not weighted. Thus, each hidden node receives each input value unaltered. The hidden nodes are the RBF units. The second layer of connections is weighted and the output nodes are simple summations. This network does not extend to more layers.

For applications such as fault diagnosis, RBF networks offer advantages over MLP. They are faster to train because training of the two layers is decoupled [70]. This network was used to improve the quantity and the quality of the galvannealed sheet produced on galvanizing lines, integrating various approaches, including quality monitoring, diagnosis, control, and optimization methods [13]. An RBF network was trained to evaluate and compare the different grasping alternatives of a robotic hand, according to geometrical and technological aspects of object surfaces, as well as the specific task to be accomplished. The adoption of the RBF topology was justified for two reasons [34]:

• In most cases, it presents a higher training speed when compared with ANNs based on backpropagation training methods.
• It allows an easier optimization of performance, since the only parameter that can be used to modify its structure is the number of neurons in the hidden layer.

Results using RBF networks were presented to illustrate that it is possible to successfully control a generator system [43]. Power electronic drive controllers have also been implemented with these networks, in digital signal processors (DSPs), to attain robust properties [38].
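The decoupling of the two layers can be illustrated with a toy sketch (ours, not taken from the referenced works): the Gaussian centers are held fixed and only the linear output weights are adapted by LMS; the target data here are generated from a known RBF so that an exact fit exists:

```python
import math

def rbf_features(x, centers, width=1.0):
    # First layer is unweighted: each hidden node is a Gaussian RBF unit
    return [math.exp(-((x - c) ** 2) / (2 * width ** 2)) for c in centers]

def rbf_output(x, centers, w):
    # Second layer is a weighted sum: the only trained parameters
    return sum(wi * hi for wi, hi in zip(w, rbf_features(x, centers)))

centers = [-1.0, 0.0, 1.0]
true_w = [0.5, -0.3, 0.8]        # hidden "true" weights generating the data
samples = [(x / 2.0, rbf_output(x / 2.0, centers, true_w))
           for x in range(-3, 4)]

# Decoupled training: centers stay fixed; only output weights adapt (LMS)
w = [0.0, 0.0, 0.0]
for _ in range(3000):
    for x, d in samples:
        h = rbf_features(x, centers)
        e = d - sum(wi * hi for wi, hi in zip(w, h))
        w = [wi + 0.2 * e * hi for wi, hi in zip(w, h)]
```

Because the output layer is linear in its weights, this single-layer adaptation is fast and has no hidden-layer backpropagation step.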

G. Probabilistic NNs (PNNs)

PNNs are somewhat similar in structure to MLPs. Some basic differences among them are the use of activation by exponential functions and the connection patterns between neurons. As a matter of fact, the neurons at the internal layers are not fully connected, depending on the application. Fig. 9 depicts this structure, showing its basic differences from the ordinary MLP structure. PNN training is normally easy and nearly instantaneous, because of the smaller number of connections.

Fig. 9. Probabilistic ANN structure.

Fig. 10. Polynomial ANN structure.

A practical advantage is that, unlike other networks, it operates completely in parallel and the signal flows in a unique direction, without a need for feedback from individual neurons to the inputs. It can be used for mapping, classification, and associative memory, or to directly estimate a posteriori probabilities [103], [104]. Probabilistic NNs were used to assist operators in identifying transients in nuclear power plants, such as a plant accident scenario, an equipment failure, or an external disturbance to the system, at the earliest stages of their development [6].

H. Polynomial Networks

Fig. 10 depicts a polynomial network. It has its topology formed during the training process. Due to this feature, it is defined as a plastic network. The neuron activation function is based on elementary polynomials of arbitrary order. In this example, the network has seven inputs, although the network uses only five of them. This is due to the automatic input selection capability of the training algorithm. Automatic feature selection is very useful in control applications when the plant model order is unknown. Each neuron output can be expressed by a second-order polynomial function

y = a₀ + a₁x₁ + a₂x₂ + a₃x₁² + a₄x₂² + a₅x₁x₂

where x₁ and x₂ are inputs, a₀, …, a₅ are polynomial coefficients which are equivalent to the network weights, and y is the neuron output. The Group Method of Data Handling (GMDH) is a statistics-based training method largely used in modeling economic, ecological, environmental, and medical problems. The GMDH training algorithm can be


Fig. 11. Functional link ANN structure.

used to adjust the polynomial coefficients and to find the network structure. This algorithm employs two sets of data: one for estimating the network weights and the other for testing which neurons should survive during the training process. A new form of implementation of a filter was proposed using a combination of a recurrent NN and a polynomial NN [101].

I. FLNs

Since NNs are used for adaptive identification and control, the learning capabilities of the networks can have significant effects on the performance of closed-loop systems. If the information content of data input to a network can be modified online, then it will more easily extract salient features of the data. The functional link acts on an element of an input vector, or on all the input vectors, by generating a set of linearly independent functions and then evaluating these functions with the pattern as the argument. Thus, both the training time and the training error of the network can be improved [113]. Fig. 11 shows a functional link NN, which can be considered a one-layer feedforward network with an input preprocessing element. Only the weights in the output layer are adjusted [66].

The application of an FLN was presented for heating, ventilating, and air conditioning (HVAC) thermal dynamic system identification and control. The use of an NN provided a means of adapting a controller online, in an effort to minimize a given cost index. The identification networks demonstrated the capacity to learn changes in the plant dynamics and to accurately predict future plant behavior [113]. A robust ANN controller for the motion control of rigid-link electrically driven robots using an FLN was presented. The method did not require the robot dynamics to be exactly known [69]. Multilayer feedforward and FLN forecasters were used to model the complex relationship between weather parameters and previous gas intake with future consumption [65]. The FLN was used to improve performance in the face of unknown nonlinear characteristics by adding nonlinear effects to the linear optimal controller of robotic systems [66].

J. Functional Polynomial Networks (FPNs)

This network structure merges the functional link and polynomial network models, resulting in a very powerful ANN model due to the automatic input selection capability of the polynomial networks. The FPN presents advantages such as fast convergence, no local minima problem, a structure automatically defined by the training process, and no adjustment of learning parameters. It has been tested for speed control of a dc motor, and the results have been compared with the ones provided by an indirect adaptive control scheme based on MLPs trained by backpropagation [101].

Fig. 12. Common grids of CNNs.

K. Cellular NNs (CNNs)

The most general definition for such networks is that they are arrays of identical dynamical systems, called cells, which are only locally connected. Only adjacent cells interact directly with each other [78]. In the simplest case, a CNN is an array of simple, identical, nonlinear, dynamical circuits placed on a two-dimensional (2-D) geometric grid, as shown in Fig. 12. If these grids are duplicated in a three-dimensional (3-D) form, a multilayer CNN can be constructed [30]. It is an efficient architecture for performing image processing and pattern recognition [51]. This kind of network has been applied to problems of image classification for quality control. Gulglielmi et al. [47] described a fluorescent magnetic particle inspection, which is a nondestructive method for quality control of ferromagnetic materials.

V. TRAINING METHODS

There are basically two main groups of training (or learning) algorithms: supervised learning (which includes reinforcement learning) and unsupervised learning. Once the structure of an NN has been selected, a training algorithm must be attached to minimize the prediction error made by the network (for supervised learning) or to compress the information from the inputs (for unsupervised learning). In supervised learning, the correct results (target values, desired outputs) are known and are given to the ANN during training, so that the ANN can adjust its weights to try to match its outputs to the target values. After training, an ANN is tested as follows: one gives it only input values, not target values, and sees how close the network comes to outputting the correct target values. Unsupervised learning involves no target values; it tries to auto-associate information from the inputs with an intrinsic reduction of data dimensionality, similar to extracting principal components in linear systems. This is the role of the training algorithms, i.e., fitting the model represented by the network to the training data available. The error of a particular configuration of the network can be determined by running all the training cases through the network and comparing the actual output generated with the desired or target outputs or clusters. When considering learning algorithms, one is interested in whether a particular algorithm converges, in its speed of convergence, and in its computational complexity.

Fig. 13. Supervised learning scheme.

In supervised learning, a set of inputs and correct outputs is used to train the network. Before the learning algorithms are applied to update the weights, all the weights are initialized randomly [84]. The network, using this set of inputs, produces its own outputs. These outputs are compared with the correct outputs, and the differences are used to modify the weights, as shown in Fig. 13. A special case of supervised learning is reinforcement learning, shown in Fig. 14, where there is no set of inputs and correct outputs. Training is commanded only by signals indicating whether the produced output is bad or good, according to a defined criterion.

Fig. 15 shows the principles of unsupervised learning, also known as self-organized learning, where a network develops its classification rules by extracting information from the inputs presented to it. In other words, by using the correlation of the input vectors, the learning rule changes the network weights to group the input vectors into clusters. By doing so, similar input vectors will produce similar network outputs, since they belong to the same cluster.

A. Early Supervised Learning Algorithms

Early learning algorithms were designed for single-layer NNs. They are generally more limited in their applicability, but their historical importance is remarkable.

Perceptron Learning: A single-layer perceptron is trained as follows.

1) Randomly initialize all the network weights.
2) Apply the inputs and calculate the sum of each unit, netⱼ = Σᵢ xᵢ wᵢⱼ.
3) The outputs from each unit are thresholded:

yⱼ = 1 if netⱼ ≥ threshold, 0 otherwise.   (1)

4) Compute the error e = d − y, where d is the known desired output value.
5) Update each weight as wᵢⱼ(t+1) = wᵢⱼ(t) + η e xᵢ.
6) Repeat steps 2)–5) until the errors reach a satisfactory level.
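The steps above can be sketched directly (an illustrative example, not from the referenced works; learning the AND function with a fixed threshold is our choice):

```python
def perceptron_train(samples, eta=0.2, threshold=0.5, epochs=20):
    w = [0.0, 0.0]                        # deterministic start for clarity
    for _ in range(epochs):
        for x, d in samples:
            net = sum(wi * xi for wi, xi in zip(w, x))     # step 2
            y = 1 if net >= threshold else 0               # step 3, Eq. (1)
            e = d - y                                      # step 4
            w = [wi + eta * e * xi for wi, xi in zip(w, x)]  # step 5
    return w

def predict(w, x, threshold=0.5):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= threshold else 0

AND = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w = perceptron_train(AND)
```

Because AND is linearly separable with this threshold, the weight updates stop changing once all four patterns are classified correctly.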

LMS or Widrow–Hoff Learning: The LMS algorithm is quite similar to the perceptron learning algorithm. The differences are as follows.

1) The error is based on the linear sum of inputs to the unit, netⱼ, rather than on the binary-valued output of the unit. Therefore,

e = d − netⱼ.   (2)

2) The linear sum of the inputs is passed through a bipolar sigmoid function, which produces the output +1 or −1, depending on the polarity of the sum.

This algorithm can be used in structures such as RBF networks and was successfully applied in [43] and [70]. Some CMAC approaches can also use this algorithm to adapt a complex robotic system involving multiple feedback sensors and multiple command variables [79].

Grossberg Learning: Sometimes known as in-star and out-star training, this algorithm updates the weights as follows:

w(t+1) = w(t) + α [d(t) − w(t)]   (3)

where d could be the desired input values (in-star training) or the desired output values (out-star training), depending on the network structure.
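A minimal numeric sketch of this rule (illustrative only; a scalar weight and a constant desired value are assumed):

```python
def grossberg_update(w, d, alpha=0.3):
    # w(t+1) = w(t) + alpha*[d(t) - w(t)]: the weight drifts toward d
    return w + alpha * (d - w)

w = 0.0
for _ in range(40):
    w = grossberg_update(w, 1.5)   # d held constant for illustration
```

Each step moves the weight a fraction alpha of the remaining distance, so it converges geometrically to the desired value.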

B. First-Order Gradient Methods

Backpropagation: Backpropagation is a generalization of the LMS algorithm. In this algorithm, an error function is defined as the mean-square difference between the desired output and the actual output of the feedforward network [45]. It is based on steepest-descent techniques extended to each of the layers in the network by the chain rule. Hence, the algorithm computes the partial derivative ∂E/∂w of the error function with respect to the weights. The error function is defined as E = (1/2) Σₖ (dₖ − yₖ)², where dₖ is the desired output and yₖ is the network output. The objective is to minimize the error function by taking the error gradient with respect to the parameter or weight vector w that is to be adapted. The weights are then updated by using

Δw(t) = −η ∂E/∂w(t)   (4)

where η is the learning rate, and

w(t+1) = w(t) + Δw(t).   (5)

This algorithm is simple to implement and computationally less complex than other modified forms. Despite some disadvantages, it is popularly used and there are numerous extensions to improve it. Some of these techniques will be presented.
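A compact sketch of these update equations on a 2-2-1 sigmoid network follows (an illustrative toy with arbitrary initial weights, not from the referenced works; XOR is the classic benchmark, and only a decrease in error is claimed):

```python
import math

def sig(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, wh, bh, wo, bo):
    h = [sig(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(wh, bh)]
    y = sig(sum(w * hi for w, hi in zip(wo, h)) + bo)
    return h, y

def train_step(x, d, wh, bh, wo, bo, eta=0.5):
    h, y = forward(x, wh, bh, wo, bo)
    delta_o = (y - d) * y * (1 - y)          # dE/dnet at the output unit
    # Hidden deltas via the chain rule (error backpropagated through wo)
    delta_h = [delta_o * wo[j] * h[j] * (1 - h[j]) for j in range(len(h))]
    for j in range(len(h)):
        wo[j] -= eta * delta_o * h[j]        # Eq. (4)-(5), output layer
        bh[j] -= eta * delta_h[j]
        for i in range(len(x)):
            wh[j][i] -= eta * delta_h[j] * x[i]   # hidden layer
    return bo - eta * delta_o

data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]   # XOR
wh, bh = [[0.2, 0.4], [0.7, -0.3]], [-0.1, 0.2]
wo, bo = [0.5, -0.4], 0.1

def mse():
    return sum((d - forward(x, wh, bh, wo, bo)[1]) ** 2 for x, d in data) / 4

before = mse()
for _ in range(2000):
    for x, d in data:
        bo = train_step(x, d, wh, bh, wo, bo)
after = mse()
```

The mean square error after training is lower than before, though, as discussed later, gradient descent gives no guarantee of reaching the global minimum.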

Backpropagation With Momentum (BPM): The basic improvement to the backpropagation algorithm is to introduce a momentum term in the weight update equation

Δw(t) = −η ∂E/∂w(t) + α Δw(t−1)   (6)

where the momentum factor α is commonly selected inside [0, 1]. Adding the momentum term improves the convergence speed and helps keep the network from being trapped in a local minimum.


Fig. 14. Reinforcement learning scheme.

Fig. 15. Unsupervised learning scheme.

A modification to (6) was proposed in 1990, inserting an additional constant β, defined by the user [84]

Δw(t) = −η ∂E/∂w(t) + α Δw(t−1) + β.   (7)

The idea was to reduce the possibility of the network being trapped in a local minimum.
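A one-dimensional sketch of the momentum update in (6) follows (illustrative only; the quadratic error function and parameter values are our choices):

```python
def gd_momentum(w0, grad, eta=0.1, alpha=0.9, steps=300):
    """Gradient descent with momentum: Eq. (6) reuses the previous step."""
    w, dw = w0, 0.0
    for _ in range(steps):
        dw = -eta * grad(w) + alpha * dw   # momentum term alpha * dw(t-1)
        w += dw
    return w

# Minimize E(w) = w^2 / 2, whose gradient is simply w
w = gd_momentum(5.0, lambda w: w)
```

The accumulated velocity lets the iterate traverse shallow gradient regions faster than plain gradient descent with the same η.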

Delta–Bar–Delta (DBD): The DBD learning rule uses adaptive learning rates to speed up the convergence. The adaptive learning rate adopted is based on a local optimization method. This technique uses gradient descent for the search direction and then applies individual step sizes for each weight. This means the actual direction taken in weight space is not necessarily along the line of the steepest gradient. If the weight updates between consecutive iterations are in opposite directions, the step size is decreased; otherwise, it is increased. This is prompted by the idea that, if the weight changes are oscillating, the minimum is between the oscillations and a smaller step size might find that minimum. The step size may be increased again once the error has stopped oscillating.

If ηᵢⱼ(t) denotes the learning rate for the weight wᵢⱼ, then

Δwᵢⱼ(t) = −ηᵢⱼ(t) ∂E/∂wᵢⱼ(t)   (8)

and the learning-rate update Δηᵢⱼ(t) is as follows:

Δηᵢⱼ(t) = κ, if δ̄ᵢⱼ(t−1) δᵢⱼ(t) > 0
Δηᵢⱼ(t) = −φ ηᵢⱼ(t), if δ̄ᵢⱼ(t−1) δᵢⱼ(t) < 0
Δηᵢⱼ(t) = 0, otherwise   (9)

where

δᵢⱼ(t) = ∂E/∂wᵢⱼ(t)   (10)

δ̄ᵢⱼ(t) = (1 − θ) δᵢⱼ(t) + θ δ̄ᵢⱼ(t−1).   (11)

The positive constant κ and the parameters (φ, θ) are specified by the user. The quantity δ̄ᵢⱼ(t) is basically an exponentially decaying trace of gradient values. When κ and φ are set to zero, the learning rates assume a constant value as in the standard backpropagation algorithm.

Using momentum along with the DBD algorithm can enhance performance considerably. However, it can make the search diverge wildly, especially if κ is moderately large. The reason is that momentum magnifies learning-rate increments and quickly leads to inordinately large learning steps. One possible solution is to keep the factor κ very small, but this can easily lead to a slow increase in ηᵢⱼ and little speedup [84].
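A one-parameter sketch of these rules follows (illustrative only; the quadratic objective and the constants are arbitrary):

```python
def dbd_minimize(grad, w, eta=0.01, kappa=0.01, phi=0.2, theta=0.7, steps=500):
    """Delta-bar-delta on a single weight: the learning rate is grown
    additively (kappa) while the gradient keeps its sign, and shrunk
    multiplicatively (phi) when consecutive gradients disagree."""
    bar = 0.0                          # decaying trace of past gradients
    for _ in range(steps):
        g = grad(w)
        if bar * g > 0:
            eta += kappa               # consistent direction: larger steps
        elif bar * g < 0:
            eta -= phi * eta           # oscillation: smaller steps
        bar = (1 - theta) * g + theta * bar
        w -= eta * g
    return w

w = dbd_minimize(lambda w: 2 * (w - 3.0), 0.0)   # minimize (w - 3)^2
```

The step size ramps up while descent is steady and is automatically cut back once the iterate starts overshooting the minimum.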

C. Second-Order Gradient Methods

These methods use the Hessian matrix H, the matrix of second derivatives of E with respect to the weights w. This matrix contains information about how the gradient changes in different directions in weight space

Hᵢⱼ = ∂²E / (∂wᵢ ∂wⱼ).   (12)

Newton Method: The Newton method weight update is processed as follows:

Δw(t) = −H⁻¹(t) ∂E/∂w(t).   (13)

However, the Newton method is not commonly used because computing the Hessian matrix is computationally expensive. Furthermore, the Hessian matrix may not be positive definite at every point in the error surface. To overcome this problem, several methods have been proposed to approximate the Hessian matrix [84].

Gauss–Newton Method: The Gauss–Newton method produces a matrix H that is an approximation to the Hessian matrix, having elements represented by

Hᵢⱼ = Σₖ (∂eₖ/∂wᵢ)(∂eₖ/∂wⱼ)   (14)

where eₖ is the kth component of the error vector e = d − y, so that H = JᵀJ with J the Jacobian of e with respect to the weights. However, the Gauss–Newton method may still have ill conditioning if JᵀJ is close to singular or is singular [75].

Levenberg–Marquardt (LM) Method: The LM method overcomes this difficulty by including an additional term, which is added to the Gauss–Newton approximation of the Hessian, giving

H = JᵀJ + μI   (15)

where μ is a small positive value and I is the identity matrix. μ could also be made adaptive by having

μ(t+1) = μ(t)/β if E(t+1) < E(t)
μ(t+1) = μ(t)β if E(t+1) ≥ E(t)   (16)

where β > 1 is a value defined by the user. It is important to notice that, when μ is large, the algorithm becomes backpropagation with learning rate 1/μ and, when μ is small, the algorithm becomes Gauss–Newton. An NN trained by using this algorithm can be found in the diagnosis of various motor bearing faults through appropriate measurement and interpretation of motor bearing vibration signals [72].
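A single-parameter sketch of the LM iteration follows (illustrative only; with one weight, JᵀJ + μI reduces to a scalar, and μ is adapted in the spirit of (16)):

```python
def lm_fit(xs, ys, a=0.0, mu=1.0, beta=10.0, iters=30):
    """Fit y ~ a*x by Levenberg-Marquardt (one parameter, scalar 'Hessian')."""
    def sse(a):
        return sum((y - a * x) ** 2 for x, y in zip(xs, ys))
    for _ in range(iters):
        e = [y - a * x for x, y in zip(xs, ys)]        # residuals
        jtj = sum(x * x for x in xs)                   # Gauss-Newton term J^T J
        jte = -sum(x * ei for x, ei in zip(xs, e))     # gradient J^T e
        step = -jte / (jtj + mu)                       # (J^T J + mu I)^-1 step
        if sse(a + step) < sse(a):
            a, mu = a + step, mu / beta    # error fell: trust Gauss-Newton more
        else:
            mu *= beta                     # error rose: back toward gradient descent
    return a

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x for x in xs]                 # data generated with a = 2
a = lm_fit(xs, ys)
```

The damping term μ interpolates between a cautious gradient-descent-like step (large μ) and the fast Gauss–Newton step (small μ).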

D. Reinforcement Learning Algorithms

Reinforcement learning has one of its roots in psychology, from the idea of rewards and penalties used to teach animals to do simple tasks with small amounts of feedback information. Barto and others proposed the adaptive critic learning algorithm to solve discrete-domain decision-making problems in the 1980s. Their approach was later generalized to the NN field by Sutton, who used CMAC and other ANN structures to learn paths for mobile robots in maze-like environments [111].

Linear Reward–Penalty Learning: When the reinforcement signal is positive (+1), the learning rule is

pᵢ(k+1) = pᵢ(k) + a [1 − pᵢ(k)]   (17)

pⱼ(k+1) = (1 − a) pⱼ(k), for j ≠ i.   (18)

If the reinforcement signal is negative (−1), the learning rule is

pᵢ(k+1) = (1 − b) pᵢ(k)   (19)

pⱼ(k+1) = b/(r − 1) + (1 − b) pⱼ(k), for j ≠ i   (20)

where a and b are learning rates, pⱼ(k) denotes the probability of action j at iteration k, i is the current action, and r is the number of actions taken. For positive reinforcement, the probability of the current action is increased with a relative decrease in the probabilities of the other actions. The adjustment is reversed in the case of negative reinforcement.
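A direct sketch of (17)–(20) follows (illustrative only; the learning rates and the rewarded action are arbitrary):

```python
def lrp_update(p, action, signal, a=0.1, b=0.1):
    """Update action probabilities p for reward (+1) or penalty (-1)."""
    r = len(p)
    q = list(p)
    if signal > 0:                    # reward: Eqs. (17)-(18)
        for j in range(r):
            q[j] = p[j] + a * (1 - p[j]) if j == action else (1 - a) * p[j]
    else:                             # penalty: Eqs. (19)-(20)
        for j in range(r):
            q[j] = (1 - b) * p[j] if j == action else b / (r - 1) + (1 - b) * p[j]
    return q

p = [0.25, 0.25, 0.25, 0.25]
for _ in range(20):
    p = lrp_update(p, 0, +1)          # keep rewarding action 0
```

Note that both branches preserve the total probability mass, so the vector p remains a valid distribution while the rewarded action dominates.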

Associative Search Learning: In this algorithm, the weights are updated as follows [9]:

wᵢ(t+1) = wᵢ(t) + η r(t) eᵢ(t)   (21)

where r(t) is the reinforcement signal and eᵢ(t) is the eligibility. Positive r(t) indicates the occurrence of a rewarding event and negative r(t) indicates the occurrence of a punishing event. It can be regarded as a measure of the change in the value of a performance criterion. Eligibility reflects the extent of activity in the pathway or connection link. Exponentially decaying eligibility traces can be generated using the following linear difference equation:

eᵢ(t+1) = δ eᵢ(t) + (1 − δ) y(t) xᵢ(t)   (22)

where δ determines the trace decay rate, xᵢ(t) is the input, and y(t) is the output.

Adaptive Critic Learning: The weights update in a critic network is as follows [9]:

wᵢ(t+1) = wᵢ(t) + β [r(t) + γ p(t) − p(t−1)] x̄ᵢ(t)   (23)

where β > 0, 0 ≤ γ ≤ 1 is a constant discount factor, r(t) is the reinforcement signal from the environment, and x̄ᵢ(t) is the trace of the input variable xᵢ(t). p(t) is the prediction at time t of eventual reinforcement, which can be described as a linear function p(t) = Σᵢ wᵢ(t) xᵢ(t). The adaptive critic network output r̂(t), the improved or internal reinforcement signal, is computed from the predictions as follows:

r̂(t) = r(t) + γ p(t) − p(t−1).   (24)

E. Unsupervised Learning

Hebbian Learning: Weights are updated as follows:

Δwᵢⱼ(t) = η xᵢ(t) yⱼ(t)   (25)

wᵢⱼ(t+1) = wᵢⱼ(t) + Δwᵢⱼ(t)   (26)

where wᵢⱼ(t) is the weight from the ith unit to the jth unit at time step t, xᵢ(t) is the excitation level of the source unit (the ith unit), and yⱼ(t) is the excitation level of the destination unit (the jth output unit). In this system, learning is a purely local phenomenon, involving only two units and a synapse. No global feedback system is required for the neural pattern to develop. A special case of Hebbian learning is correlation learning, which uses binary activations and defines yⱼ(t) as the desired excitation level for the destination unit. While Hebbian learning is performed in unsupervised environments, correlation learning is supervised [128].
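A direct sketch of (25) and (26) follows (illustrative only; binary activations are assumed):

```python
def hebbian_step(W, x, y, eta=0.1):
    # Eqs. (25)-(26): a weight grows only when its source unit x[i]
    # and destination unit y[j] are active together
    return [[W[i][j] + eta * x[i] * y[j] for j in range(len(y))]
            for i in range(len(x))]

W = [[0.0, 0.0], [0.0, 0.0]]
for _ in range(5):
    W = hebbian_step(W, [1, 0], [0, 1])   # unit 0 co-active with output 1
```

Only the synapse linking the co-active pair strengthens; all other weights stay at zero, illustrating the purely local character of the rule.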

Boltzmann Machine Learning: The Boltzmann machine training algorithm uses a kind of stochastic technique known as simulated annealing to avoid being trapped in the local minima of the network energy function. The algorithm is as follows.

1) Initialize weights.
2) Calculate activation as follows.
a) Select an initial temperature.
b) Until thermal equilibrium is reached, repeatedly calculate the probability that the jth unit is active by (27).
c) Exit when the lowest temperature is reached. Otherwise, reduce the temperature by a certain annealing schedule and repeat step 2)

P(yⱼ = 1) = 1 / (1 + e^(−netⱼ/T))   (27)

where T is the temperature, netⱼ is the total input received by the jth unit, and the activation level yⱼ of the unit is set according to this probability.
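The activation probability in (27) can be sketched directly (illustrative only; note how cooling sharpens the unit's response toward a hard threshold):

```python
import math

def p_active(net, T):
    # Eq. (27): probability that a unit is active at temperature T
    return 1.0 / (1.0 + math.exp(-net / T))

# Annealing: the same net input yields an increasingly decisive response
probs = [p_active(1.0, T) for T in (10.0, 1.0, 0.1)]
```

At high temperature the unit is nearly random (probability close to 0.5), while at low temperature a positive net input makes it almost certainly active.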


Kohonen Self-Organizing Learning: The network is trained according to the following algorithm, frequently called the "winner-takes-all" rule.

1) Apply an input vector x.
2) Calculate the distance dⱼ (in n-dimensional space) between x and the weight vectors wⱼ of each unit. In Euclidean space, this is calculated as follows:

dⱼ = sqrt( Σᵢ (xᵢ − wᵢⱼ)² ).   (28)

3) The unit that has the weight vector closest to x is declared the winner. This weight vector, called w_c, becomes the center of a group of weight vectors that lie within a distance D from w_c.
4) For all weight vectors within a distance D of w_c, train this group of nearby weight vectors according to the formula that follows:

wⱼ(t+1) = wⱼ(t) + α [x − wⱼ(t)].   (29)

5) Perform steps 1)–4), cycling through each input vector until convergence.
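A sketch of the winner-takes-all rule with a neighborhood radius of zero follows (our simplification of steps 1)–5); the data and initial weights are arbitrary):

```python
def kohonen_train(data, weights, alpha=0.3, epochs=30):
    for _ in range(epochs):
        for x in data:
            # Step 2: squared Euclidean distance to each weight vector
            dists = [sum((xi - wi) ** 2 for xi, wi in zip(x, w))
                     for w in weights]
            c = dists.index(min(dists))          # step 3: winner unit
            # Step 4 with D = 0: only the winner moves toward the input
            weights[c] = [wi + alpha * (xi - wi)
                          for xi, wi in zip(x, weights[c])]
    return weights

data = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
w = kohonen_train(data, [[0.2, 0.1], [0.8, 0.9]])
```

Without any target values, the two units settle near the centers of the two clusters in the data, which is exactly the clustering behavior described for unsupervised learning.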

F. Practical Considerations

NNs are unsurpassed at identifying patterns or trends in data and are well suited for prediction or forecasting needs, including sales and customer research, data validation, risk management, and industrial process control. One of the fascinating aspects of the practical implementation of NNs in industrial applications is the ability to manage data interaction between electrical and mechanical behavior, and often other disciplines as well. The majority of the reported applications involve fault diagnosis and detection, quality control, pattern recognition, and adaptive control [14], [44], [74], [115]. Supervised NNs can mimic the behavior of human control systems, as long as data corresponding to the human operator and the control input are supplied [7], [126]. Most of the existing successful applications in control use supervised learning, or some form of reinforcement learning approach that is also supervised. Unsupervised learning is not suitable, particularly for online control, due to the slow adaptation and the time required for the network to settle into stable conditions. Unsupervised learning schemes are used mostly for pattern recognition, by grouping patterns into a number of clusters or classes.

There are some advantages of NNs over multiple data regression. There is no need to select the most important independent variables in the data set. The synapses associated with irrelevant variables readily show negligible weight values; in turn, relevant variables present significant synapse weight values. There is also no need to propose a model function, as required in multiple regression. The learning capability of NNs allows them to discover more complex and subtle interactions between the independent variables, contributing to the development of a model with maximum precision. NNs are intrinsically robust, showing more immunity to noise, an important factor in modeling industrial processes. NNs have been applied within industrial domains to address the inherent complexity of interacting processes under the lack of robust analytical models of real industrial processes. In many cases, network topologies and training parameters are systematically varied until satisfactory convergence is achieved. Currently, the most widely used algorithm for training MLPs is the backpropagation algorithm. It minimizes the mean square error between the desired and the actual output of the network. The optimization is carried out with a gradient-descent technique. There are two critical issues in network learning: estimation error and training time. These issues may be affected by the network architecture and the training set. The network architecture includes the number of hidden nodes, the number of hidden layers, and the values of the learning parameters. The training set is related to the number of training patterns, inaccuracies of input data, and preprocessing of data. The backpropagation algorithm does not always find the global minimum, but may stop at a local minimum. In practice, the type of minimum has little importance, as long as the desired mapping or classification is reached with a desired accuracy. The optimization criterion of the backpropagation algorithm is not very good from the pattern recognition point of view. The algorithm minimizes the square error between the actual and the desired output, not the number of faulty classifications, which is the main goal in pattern recognition. The algorithm is also too slow for practical applications, especially if many hidden layers are used. In addition, a backpropagation net has poor memory: when the net learns something new, it forgets the old. Despite its shortcomings, backpropagation is broadly used. Although the backpropagation algorithm has been a significant milestone, many attempts have been made to speed up the convergence, and significant improvements are observed by using various second-order approaches, namely, Newton's method, conjugate gradients, or the LM optimization technique [5], [10], [21], [48]. The issues to be dealt with are as follows [84], [102]:

1) slow convergence speed;
2) sensitivity to initial conditions;
3) trapping in local minima;
4) instability if the learning rate is too large.

One of the alternatives for the problem of being trapped in a local minimum is adding the momentum term using the BPM, which also improves the convergence speed. Another alternative, used when a backpropagation algorithm is difficult to implement, as in analog hardware, is the random weight change (RWC). This algorithm has been shown to be immune to offset sensitivity and nonlinearity errors. It is a stochastic learning method that makes sure the error function decreases on average, even though it may be going up or down at any one time. It is often called simulated annealing because of its operational similarity to annealing processes [73]. Second-order gradient methods use the matrix of second derivatives of E with respect to the weights w. However, computing this matrix is computationally expensive, and the methods presented try to approximate this matrix to make the algorithms more accessible.

598 IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, VOL. 50, NO. 3, JUNE 2003

Linear reward–penalty, associative search, and adaptive critic algorithms are characterized as a special case of supervised learning called reinforcement learning. They do not need to explicitly compute derivatives. Computation of derivatives usually introduces a lot of high-frequency noise into the control loop. Therefore, they are very suitable for some complex systems, where basic training algorithms may fail or produce suboptimal results. On the other hand, those methods present slower learning processes and, because of this, are adopted especially in cases where only a single bit of information (for example, whether the output is right or wrong) is available.
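A minimal sketch of learning from a single bit of feedback is the linear reward–penalty scheme below, here applied to choosing between two actions. The success probabilities, learning rates, and iteration count are illustrative assumptions, not from the referenced works:

```python
import random

random.seed(2)

# Linear reward-penalty: adjust action probabilities using only a single
# bit of feedback (success/failure). Action 0 succeeds with probability
# 0.8, action 1 with probability 0.2 (toy environment).
p = [0.5, 0.5]                     # action probabilities
a_rate, b_rate = 0.1, 0.01        # reward and penalty learning rates

def success(action):
    return random.random() < (0.8 if action == 0 else 0.2)

for _ in range(5000):
    action = 0 if random.random() < p[0] else 1
    other = 1 - action
    if success(action):            # reward: move probability toward action
        p[action] += a_rate * (1.0 - p[action])
    else:                          # penalty: move probability away from it
        p[action] -= b_rate * p[action]
    p[other] = 1.0 - p[action]    # keep the two probabilities normalized
```

After training, the learner strongly prefers the more successful action, having never seen a derivative or an error magnitude, only right/wrong feedback.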

VI. TAXONOMY OF NN APPLICATIONS

From the viewpoint of industrial applications, ANN applications can be divided into four main categories.

A. Modeling and Identification

Modeling and identification are techniques to describe the physical principles relating the input and the output of a system. An ANN can approximate these relationships independently of the size and complexity of the problem, and has been found to be effective for learning discriminants for patterns from a body of examples. The MLP is used as the basic structure for a wide range of applications [4], [12], [17], [18], [32], [46], [56], [85], [94], [119], [127]. The Hopfield network can be used for identification of linear time-varying or time-invariant systems [28]. Recurrent network topologies [36], [83] have received considerable attention for the identification of nonlinear dynamical systems. A functional-link NN (FLN) approach was used to perform thermal dynamical system identification [113].
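The identification setting above can be sketched in series-parallel form: the model predicts the next plant output from delayed inputs and outputs. In this sketch a linear-in-parameters model stands in for the network (an MLP would replace `predict`), and the plant, learning rate, and signal lengths are illustrative assumptions:

```python
import random

random.seed(3)

def plant(y, u):
    return 0.6 * y + 0.3 * u      # "unknown" plant to be identified (toy)

theta = [0.0, 0.0]                # model parameters for [y(k), u(k)]

def predict(y, u):
    return theta[0] * y + theta[1] * u

# Drive the plant with a random excitation signal and adapt the model
# per sample (LMS-style gradient step on the squared prediction error).
y, lr = 0.0, 0.05
for _ in range(3000):
    u = random.uniform(-1, 1)
    y_next = plant(y, u)
    e = predict(y, u) - y_next    # prediction error
    theta[0] -= lr * e * y
    theta[1] -= lr * e * u
    y = y_next
```

With persistent excitation, the parameters converge to the plant's true coefficients; the same data-flow (delayed outputs and inputs as regressors) is what the MLP- and recurrent-network identifiers cited above exploit.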

B. Optimization and Classification

Optimization is often required for design, planning of actions, motions, and tasks. However, as is well known from the Traveling Salesman Problem, many parameters can make the amount of calculation tremendous, so that ordinary methods cannot be applied. An effective approach is to find the optimal solution by defining an energy function and using an NN, with its parallel processing, learning, and self-organizing capabilities, to operate in such a way that the energy is reduced. It has been shown that application of this optimal approach makes effective use of ANN sensing, recognizing, and forecasting capabilities in the control of robotic manipulators with impact taken into account [45].
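The energy-descent idea can be sketched with a small Hopfield-style network: with symmetric weights and zero self-connections, each asynchronous unit update can only keep the energy the same or reduce it. The network size and stored pattern below are illustrative assumptions:

```python
import random

random.seed(4)

# Hopfield-style energy descent: E = -1/2 * sum_ij w_ij s_i s_j never
# increases under asynchronous updates of bipolar units.
n = 8
pattern = [1, -1, 1, -1, 1, -1, 1, -1]          # stored pattern (Hebb rule)
W = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(n):
        if i != j:                               # zero diagonal
            W[i][j] = pattern[i] * pattern[j] / n

def energy(s):
    return -0.5 * sum(W[i][j] * s[i] * s[j]
                      for i in range(n) for j in range(n))

s = [random.choice([-1, 1]) for _ in range(n)]  # random initial state
energies = [energy(s)]
for _ in range(50):                              # asynchronous unit updates
    i = random.randrange(n)
    net = sum(W[i][j] * s[j] for j in range(n))
    s[i] = 1 if net >= 0 else -1
    energies.append(energy(s))

monotone = all(b <= a + 1e-12 for a, b in zip(energies, energies[1:]))
```

Optimization applications encode the problem (e.g., tour constraints and costs in the TSP) into the weights so that low-energy states correspond to good solutions; the descent property is what makes the network settle toward them.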

Classification using an ANN can also be viewed as an optimization problem, provided that the existing rules to distinguish the various classes of events/materials/objects can be described in functional form. In such cases, the network decides whether a particular input belongs to one of the defined classes by optimizing the functional rules and a posteriori evaluating the achieved results. Different authors [34], [70] have proposed RBF approaches. For applications such as fault diagnosis, RBF networks offer clear advantages over MLPs: they are faster to train, because layer training is decoupled [70]. Cellular networks [47], ART networks [122], and Hopfield networks [105], [112] can be used as methods to detect and isolate faults and to promote industrial quality control. MLPs are also widely used for these purposes [23], [25], [35], [102], [114], [121]. This structure can be found in induction motor [41], [42] and bearing [72] fault diagnosis, in nondestructive evaluation of check valve performance and degradation [2], [57], in defect detection on woven fabrics [99], and in robotic systems [116].
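The decoupled layer training that makes RBF networks fast can be sketched in two stages: fix the centers first (here simply placed over the input range), then solve the linear output weights directly by least squares, with no backpropagation through the hidden layer. Centers, widths, and the toy target are illustrative assumptions:

```python
import math

# Stage 1: fix Gaussian centers and widths (no gradient training needed).
def rbf(x, c, width=0.15):
    return math.exp(-((x - c) ** 2) / (2 * width ** 2))

xs = [i / 20.0 for i in range(21)]
ys = [math.sin(2 * math.pi * x) for x in xs]       # toy target function
centers = [0.0, 0.25, 0.5, 0.75, 1.0]
Phi = [[rbf(x, c) for c in centers] for x in xs]   # hidden-layer outputs

# Stage 2: output weights by least squares (normal equations A w = b).
m = len(centers)
A = [[sum(Phi[k][i] * Phi[k][j] for k in range(len(xs))) for j in range(m)]
     for i in range(m)]
b = [sum(Phi[k][i] * ys[k] for k in range(len(xs))) for i in range(m)]

# Gauss-Jordan elimination (small symmetric positive-definite system).
for i in range(m):
    piv = A[i][i]
    for j in range(i, m):
        A[i][j] /= piv
    b[i] /= piv
    for r in range(m):
        if r != i:
            f = A[r][i]
            for j in range(i, m):
                A[r][j] -= f * A[i][j]
            b[r] -= f * b[i]
w = b                                              # solved output weights

def predict(x):
    return sum(w[j] * rbf(x, centers[j]) for j in range(m))

err = sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
```

Because only a linear system is solved, training cost is a single pass plus a small matrix solve, which is the speed advantage over iteratively trained MLPs noted in [70].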

Finally, it is important to mention clustering applications, which are special cases of classification where there is no supervision during the training phase. The relationships between elements of the existing classes, and even the classes themselves, have to be found from the data without supervision.

C. Process Control

NNs make use of nonlinearity, learning, parallel processing, and generalization capabilities for application to advanced intelligent control. The approaches can be classified into some major methods, such as supervised control, inverse control, neural adaptive control, backpropagation of utility (an extended method of backpropagation through time), and adaptive critics (an extended method of the reinforcement learning algorithm) [2]. MLP structures were used for digital current regulation of inverter drives [16], to predict trajectories in robotic environments [19], [40], [52], [73], [79], [87], [89], [110], to control turbogenerators [117], to monitor feedwater flow rate and component thermal performance of pressurized water reactors [61], to regulate temperature [64], and to predict natural gas consumption [65]. Dynamic versions of MLP networks were used to control a nonlinear dynamic model of a robot [60], [97], to control manufacturing cells [92], and to implement a programmable cascaded low-pass filter [101]. A dynamic MLP is a classical MLP structure whose outputs are fed back to the inputs by means of time-delay elements.
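The dynamic MLP structure just defined can be sketched as a forward pass in which the previous output re-enters the network through a unit time delay, making the network itself a dynamic system. The weights here are random placeholders (training would fit them); sizes and the input signal are illustrative assumptions:

```python
import math
import random

random.seed(6)

# Dynamic MLP sketch: a one-hidden-layer MLP whose output y(k-1) is fed
# back as an extra input through a unit time delay.
H = 3
w_in = [[random.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(H)]
w_out = [random.uniform(-0.5, 0.5) for _ in range(H)]

def step(u, y_prev):
    # Inputs: external input u(k) and the delayed output y(k-1).
    h = [math.tanh(w_in[i][0] * u + w_in[i][1] * y_prev) for i in range(H)]
    return sum(w_out[i] * h[i] for i in range(H))

y, trajectory = 0.0, []
for k in range(50):
    u = math.sin(0.2 * k)          # illustrative input sequence
    y = step(u, y)                 # y(k) depends on y(k-1): the feedback path
    trajectory.append(y)
```

Because the hidden units are bounded by tanh and the output weights are small, the feedback loop here stays bounded; in the cited control applications the same delayed-feedback structure is what lets a static MLP represent plant dynamics.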

Other structures can be found as well: functional-link networks to control robots [69] and RBF networks to predict, from operating conditions and from features of a steel sheet, the thermal energy required to correct alloying [13]. RBF networks also appear in predictive controllers for drive systems [38]. Hopfield structures were used for torque minimization control of redundant manipulators [33]. FPNs can be used for function approximation inside specific control schemes [100]. CMAC networks were implemented in research automobiles [76] and to control robots [79].

D. Pattern Recognition

Some specific ANN structures, such as Kohonen and probabilistic networks, are studied and applied mainly for image and voice recognition. Research in image recognition includes initial vision (stereo vision with both eyes, outline extraction, etc.) close to biological (particularly brain) function, handwritten character recognition by cognition at the practical level, and cell recognition for mammalian cell cultivation using NNs [45]. Kohonen networks were used for image inspection and for disease identification from mammographic images [98]. Probabilistic networks were used for transient detection to enhance nuclear reactors' operational safety [6]. As in the other categories, the MLP is widely used as well; the papermaking industry [39] is one such example.
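The competitive, unsupervised update behind Kohonen networks can be sketched in one dimension: the winning unit and its map neighbors move toward each input, so the map tends to organize itself along the input distribution. Map size, rates, and the uniform input distribution are illustrative assumptions:

```python
import random

random.seed(7)

# 1-D Kohonen self-organizing map: 10 units, scalar inputs on [0, 1].
n_units = 10
w = [random.random() for _ in range(n_units)]   # random initial weights

for t in range(3000):
    x = random.random()                          # sample from the input data
    lr = 0.5 * (1.0 - t / 3000)                  # decaying learning rate
    # Competition: the unit closest to the input wins.
    winner = min(range(n_units), key=lambda i: abs(w[i] - x))
    # Cooperation: the winner and its immediate map neighbors are updated.
    for i in range(n_units):
        if abs(i - winner) <= 1:
            w[i] += lr * (x - w[i])
```

After training, the weights spread out to cover the input range, which is the property exploited when such maps are used for inspection or image classification tasks.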

VII. CONCLUSION

This paper has described theoretical aspects of NNs related to their relevance for industrial applications. Common questions that an engineer would ask when choosing an NN for a particular application were answered. Characteristics of industrial processes that justify ANN utilization were discussed, and some areas of importance were proposed. Important structures and training methods were presented, with relevant references illustrating the utilization of those concepts.


This survey observed that, although ANNs have a history of more than 50 years, most industrial applications were launched in the last ten years, in which the investigators provided either an alternative or a complement to other classical techniques. Those ANN applications demonstrated adaptability features integrated with the industrial problem, thus becoming part of the industrial processes. The authors firmly believe that such an intricate field as NNs is just starting to permeate a broad range of interdisciplinary problem-solving streams. The potential of NNs will be integrated into a still larger and all-encompassing field of intelligent systems and will soon be taught to students and engineers as an ordinary mathematical tool.

REFERENCES

[1] J. S. Albus, “A new approach to manipulator control: The cerebellar model articulation controller,” Trans. ASME, J. Dyn. Syst., Meas. Control, vol. 97, pp. 220–227, Sept. 1975.

[2] I. E. Alguíndigue and R. E. Uhrig, “Automatic fault recognition in mechanical components using coupled artificial neural networks,” in Proc. IEEE World Congr. Computational Intelligence, June–July 1994, pp. 3312–3317.

[3] P. E. M. Almeida and M. G. Simões, “Fundamentals of a fast convergence parametric CMAC network,” in Proc. IJCNN’01, vol. 3, 2001, pp. 3015–3020.

[4] K. Andersen, G. E. Cook, G. Karsai, and K. Ramaswamy, “Artificial neural networks applied to arc welding process modeling and control,” IEEE Trans. Ind. Applicat., vol. 26, pp. 824–830, Sept./Oct. 1990.

[5] T. J. Andersen and B. M. Wilamowski, “A modified regression algorithm for fast one layer neural network training,” in Proc. World Congr. Neural Networks, vol. 1, Washington, DC, July 17–21, 1995, pp. 687–690.

[6] I. K. Attieh, A. V. Gribok, J. W. Hines, and R. E. Uhrig, “Pattern recognition techniques for transient detection to enhance nuclear reactors’ operational safety,” in Proc. 25th CNS/CNA Annu. Student Conf., Knoxville, TN, Mar. 2000.

[7] S. M. Ayala, G. Botura Jr., and O. A. Maldonado, “AI automates substation control,” IEEE Comput. Applicat. Power, vol. 15, pp. 41–46, Jan. 2002.

[8] D. L. Bailey and D. M. Thompson, “Developing neural-network applications,” AI Expert, vol. 5, no. 9, pp. 34–41, 1990.

[9] A. G. Barto, R. S. Sutton, and C. W. Anderson, “Neuronlike elements that can solve difficult control problems,” IEEE Trans. Syst., Man, Cybern., vol. SMC-13, pp. 834–846, Sept./Oct. 1983.

[10] R. Battiti, “First- and second-order methods for learning: Between steepest descent and Newton’s method,” Neural Computation, vol. 4, no. 2, pp. 141–166, 1992.

[11] B. Bavarian, “Introduction to neural networks for intelligent control,” IEEE Contr. Syst. Mag., vol. 8, pp. 3–7, Apr. 1988.

[12] N. V. Bhat, P. A. Minderman, T. McAvoy, and N. S. Wang, “Modeling chemical process systems via neural computation,” IEEE Contr. Syst. Mag., vol. 10, pp. 24–30, Apr. 1990.

[13] G. Bloch, F. Sirou, V. Eustache, and P. Fatrez, “Neural intelligent control for a steel plant,” IEEE Trans. Neural Networks, vol. 8, pp. 910–918, July 1997.

[14] Z. Boger, “Experience in developing models of industrial plants by large scale artificial neural networks,” in Proc. Second New Zealand International Two-Stream Conf. Artificial Neural Networks and Expert Systems, 1995, pp. 326–329.

[15] D. S. Broomhead and D. Lowe, “Multivariable functional interpolation and adaptive networks,” Complex Syst., vol. 2, pp. 321–355, 1988.

[16] M. Buhl and R. D. Lorenz, “Design and implementation of neural networks for digital current regulation of inverter drives,” in Conf. Rec. IEEE-IAS Annu. Meeting, 1991, pp. 415–423.

[17] B. Burton and R. G. Harley, “Reducing the computational demands of continually online-trained artificial neural networks for system identification and control of fast processes,” IEEE Trans. Ind. Applicat., vol. 34, pp. 589–596, May/June 1998.

[18] B. Burton, F. Kamran, R. G. Harley, T. G. Habetler, M. Brooke, and R. Poddar, “Identification and control of induction motor stator currents using fast on-line random training of a neural network,” in Conf. Rec. IEEE-IAS Annu. Meeting, 1995, pp. 1781–1787.

[19] R. Carelli, E. F. Camacho, and D. Patiño, “A neural network based feedforward adaptive controller for robots,” IEEE Trans. Syst., Man, Cybern., vol. 25, pp. 1281–1288, Sept. 1995.

[20] G. A. Carpenter and S. Grossberg, “Associative learning, adaptive pattern recognition and cooperative-competitive decision making,” in Optical and Hybrid Computing, H. Szu, Ed. Bellingham, WA: SPIE, 1987, vol. 634, pp. 218–247.

[21] C. Charalambous, “Conjugate gradient algorithm for efficient training of artificial neural networks,” Proc. Inst. Elect. Eng., vol. 139, no. 3, pp. 301–310, 1992.

[22] S. Chen and S. A. Billings, “Neural networks for nonlinear dynamic system modeling and identification,” Int. J. Control, vol. 56, no. 2, pp. 319–346, 1992.

[23] R. P. Cherian, L. N. Smith, and P. S. Midha, “A neural network approach for selection of powder metallurgy materials and process parameters,” Artif. Intell. Eng., vol. 14, pp. 39–44, 2000.

[24] M. Y. Chow, P. M. Mangum, and S. O. Yee, “A neural network approach to real-time condition monitoring of induction motors,” IEEE Trans. Ind. Electron., vol. 38, pp. 448–453, Dec. 1991.

[25] M. Y. Chow, R. N. Sharpe, and J. C. Hung, “On the application and design of artificial neural networks for motor fault detection—Part II,” IEEE Trans. Ind. Electron., vol. 40, pp. 189–196, Apr. 1993.

[26] ——, “On the application and design of artificial neural networks for motor fault detection—Part I,” IEEE Trans. Ind. Electron., vol. 40, pp. 181–188, Apr. 1993.

[27] T. W. S. Chow and Y. Fang, “A recurrent neural-network based real-time learning control strategy applying to nonlinear systems with unknown dynamics,” IEEE Trans. Ind. Electron., vol. 45, pp. 151–161, Feb. 1998.

[28] S. R. Chu and R. Shoureshi, “Applications of neural networks in learning of dynamical systems,” IEEE Trans. Syst., Man, Cybern., vol. 22, pp. 160–164, Jan./Feb. 1992.

[29] S. R. Chu, R. Shoureshi, and M. Tenorio, “Neural networks for system identification,” IEEE Contr. Syst. Mag., vol. 10, pp. 31–35, Apr. 1990.

[30] L. O. Chua, T. Roska, T. Kozek, and Á. Zarándy, “The CNN paradigm—A short tutorial,” in Cellular Neural Networks, T. Roska and J. Vandewalle, Eds. New York: Wiley, 1993, pp. 1–14.

[31] M. Cichowlas, D. Sobczuk, M. P. Kazmierkowski, and M. Malinowski, “Novel artificial neural network based current controller for PWM rectifiers,” in Proc. 9th Int. Conf. Power Electronics and Motion Control, 2000, pp. 41–46.

[32] G. E. Cook, R. J. Barnett, K. Andersen, and A. M. Strauss, “Weld modeling and control using artificial neural networks,” IEEE Trans. Ind. Applicat., vol. 31, pp. 1484–1491, Nov./Dec. 1995.

[33] H. Ding and S. K. Tso, “A fully neural-network-based planning scheme for torque minimization of redundant manipulators,” IEEE Trans. Ind. Electron., vol. 46, pp. 199–206, Feb. 1999.

[34] G. Dini and F. Failli, “Planning grasps for industrial robotized applications using neural networks,” Robot. Comput. Integr. Manuf., vol. 16, pp. 451–463, Dec. 2000.

[35] M. Dolen and R. D. Lorenz, “General methodologies for neural network programming,” in Proc. IEEE Applied Neural Networks Conf., Nov. 1999, pp. 337–342.

[36] ——, “Recurrent neural network topologies for spectral state estimation and differentiation,” in Proc. ANNIE Conf., St. Louis, MO, Nov. 2000.

[37] M. Dolen, P. Y. Chung, E. Kayikci, and R. D. Lorenz, “Disturbance force estimation for CNC machine tool feed drives by structured neural network topologies,” in Proc. ANNIE Conf., St. Louis, MO, Nov. 2000.

[38] Y. Dote, M. Strefezza, and A. Suyitno, “Neuro fuzzy robust controllers for drive systems,” in Proc. IEEE Int. Symp. Industrial Electronics, 1993, pp. 229–242.

[39] P. J. Edwards, A. F. Murray, G. Papadopoulos, A. R. Wallace, J. Barnard, and G. Smith, “The application of neural networks to the papermaking industry,” IEEE Trans. Neural Networks, vol. 10, pp. 1456–1464, Nov. 1999.

[40] M. J. Er and K. C. Liew, “Control of Adept One SCARA robot using neural networks,” IEEE Trans. Ind. Electron., vol. 44, pp. 762–768, Dec. 1997.

[41] F. Filippetti, G. Franceschini, and C. Tassoni, “Neural networks aided on-line diagnostics of induction motor rotor faults,” IEEE Trans. Ind. Applicat., vol. 31, pp. 892–899, July/Aug. 1995.

[42] F. Filippetti, G. Franceschini, C. Tassoni, and P. Vas, “Recent developments of induction motor drives fault diagnosis using AI techniques,” IEEE Trans. Ind. Electron., vol. 47, pp. 994–1004, Oct. 2000.

[43] D. Flynn, S. McLoone, G. W. Irwin, M. D. Brown, E. Swidenbank, and B. W. Hogg, “Neural control of turbogenerator systems,” Automatica, vol. 33, no. 11, pp. 1961–1973, 1997.


[44] D. B. Fogel, “Selecting an optimal neural network,” in Proc. IEEE IECON’90, vol. 2, 1990, pp. 1211–1214.

[45] T. Fukuda and T. Shibata, “Theory and applications of neural networks for industrial control systems,” IEEE Trans. Ind. Electron., vol. 39, pp. 472–489, Dec. 1992.

[46] A. A. Gorni, “The application of neural networks in the modeling of plate rolling processes,” JOM-e, vol. 49, no. 4, electronic document, Apr. 1997.

[47] N. Guglielmi, R. Guerrieri, and G. Baccarani, “Highly constrained neural networks for industrial quality control,” IEEE Trans. Neural Networks, vol. 7, pp. 206–213, Jan. 1996.

[48] M. T. Hagan and M. Menhaj, “Training feedforward networks with the Marquardt algorithm,” IEEE Trans. Neural Networks, vol. 5, pp. 989–993, Nov. 1994.

[49] S. Haykin, Neural Networks: A Comprehensive Foundation, 2nd ed. New York: Prentice-Hall, 1995.

[50] S. M. Halpin and R. F. Burch, “Applicability of neural networks to industrial and commercial power systems: A tutorial overview,” IEEE Trans. Ind. Applicat., vol. 33, pp. 1355–1361, Sept./Oct. 1997.

[51] H. Harrer and J. Nossek, “Discrete-time cellular neural networks,” in Cellular Neural Networks, T. Roska and J. Vandewalle, Eds. New York: Wiley, 1993, pp. 15–29.

[52] H. Hashimoto, T. Kubota, M. Sato, and F. Harashima, “Visual control of robotic manipulator based on neural networks,” IEEE Trans. Ind. Electron., vol. 39, pp. 490–496, Dec. 1992.

[53] D. O. Hebb, The Organization of Behavior. New York: Wiley, 1949.

[54] G. E. Hinton and T. J. Sejnowski, “Learning and relearning in Boltzmann machines,” in Parallel Distributed Processing, D. Rumelhart, J. McClelland, and the PDP Research Group, Eds. Cambridge, MA: MIT Press, 1986.

[55] J. J. Hopfield, “Neural networks and physical systems with emergent collective computational abilities,” in Proc. Nat. Acad. Sci., vol. 79, Apr. 1982, pp. 2554–2558.

[56] C. Y. Huang, T. C. Chen, and C. L. Huang, “Robust control of induction motor with a neural-network load torque estimator and a neural-network identification,” IEEE Trans. Ind. Electron., vol. 46, pp. 990–998, Oct. 1999.

[57] A. Ikonomopoulos, R. E. Uhrig, and L. H. Tsoukalas, “Use of neural networks to monitor power plant components,” in Proc. American Power Conf., vol. 54-II, Apr. 1992, pp. 1132–1137.

[58] M. Jelínek. (1999) Everything you wanted to know about ART neural networks, but were afraid to ask. [Online]. Available: http://cs.felk.cvut.cz/~xjeline1/semestralky/nan

[59] S. Jung and T. C. Hsia, “Neural network impedance force control of robot manipulator,” IEEE Trans. Ind. Electron., vol. 45, pp. 451–461, June 1998.

[60] A. Karakasoglu and M. K. Sundareshan, “A recurrent neural network-based adaptive variable structure model-following control of robotic manipulators,” Automatica, vol. 31, no. 10, pp. 1495–1507, 1995.

[61] K. Kavaklioglu and B. R. Upadhyaya, “Monitoring feedwater flow rate and component thermal performance of pressurized water reactors by means of artificial neural networks,” Nucl. Technol., vol. 107, pp. 112–123, July 1994.

[62] M. Kawato, Y. Uno, M. Isobe, and R. Suzuki, “Hierarchical neural network model for voluntary movement with application to robotics,” IEEE Contr. Syst. Mag., vol. 8, pp. 8–15, Apr. 1988.

[63] M. Khalid and S. Omatu, “A neural network controller for a temperature control system,” IEEE Contr. Syst. Mag., vol. 12, pp. 58–64, June 1992.

[64] M. Khalid, S. Omatu, and R. Yusof, “Temperature regulation with neural networks and alternative control schemes,” IEEE Trans. Neural Networks, vol. 6, pp. 572–582, May 1995.

[65] A. Khotanzad, H. Elragal, and T. L. Lu, “Combination of artificial neural-network forecasters for prediction of natural gas consumption,” IEEE Trans. Neural Networks, vol. 11, pp. 464–473, Mar. 2000.

[66] Y. H. Kim, F. L. Lewis, and D. M. Dawson, “Intelligent optimal control of robotic manipulators using neural networks,” Automatica, vol. 36, no. 9, pp. 1355–1364, 2000.

[67] B. Kosko, “Adaptive bi-directional associative memories,” Appl. Opt., vol. 26, pp. 4947–4960, 1987.

[68] S. Y. Kung and J. N. Hwang, “Neural network architectures for robotic applications,” IEEE Trans. Robot. Automat., vol. 5, pp. 641–657, Oct. 1989.

[69] C. Kwan, F. L. Lewis, and D. M. Dawson, “Robust neural-network control of rigid-link electrically driven robots,” IEEE Trans. Neural Networks, vol. 9, pp. 581–588, July 1998.

[70] J. A. Leonard and M. A. Kramer, “Radial basis function networks for classifying process faults,” IEEE Contr. Syst. Mag., vol. 11, pp. 31–38, Apr. 1991.

[71] F. L. Lewis, A. Yesildirek, and K. Liu, “Multilayer neural-net robot controller with guaranteed tracking performance,” IEEE Trans. Neural Networks, vol. 7, pp. 388–399, Mar. 1996.

[72] B. Li, M. Y. Chow, Y. Tipsuwan, and J. C. Hung, “Neural-network based motor rolling bearing fault diagnosis,” IEEE Trans. Ind. Electron., vol. 47, pp. 1060–1069, Oct. 2000.

[73] J. Liu, B. Burton, F. Kamran, M. A. Brooke, R. G. Harley, and T. G. Habetler, “High speed on-line neural network of an induction motor immune to analog circuit nonidealities,” in Proc. IEEE Int. Symp. Circuits and Systems, June 1997, pp. 633–636.

[74] Y. Liu, B. R. Upadhyaya, and M. Naghedolfeizi, “Chemometric data analysis using artificial neural networks,” Appl. Spectrosc., vol. 47, no. 1, pp. 12–23, 1993.

[75] L. Ljung, System Identification: Theory for the User. New York: Prentice-Hall, 1987.

[76] M. Majors, J. Stori, and D. Cho, “Neural network control of automotive fuel-injection systems,” IEEE Contr. Syst. Mag., vol. 14, pp. 31–36, June 1994.

[77] W. S. McCulloch and W. Pitts, “A logical calculus of the ideas immanent in nervous activity,” Bull. Math. Biophys., vol. 5, pp. 115–133, 1943.

[78] M. Milanova, P. E. M. Almeida, J. Okamoto Jr., and M. G. Simões, “Applications of cellular neural networks for shape from shading problem,” in Proc. Int. Workshop Machine Learning and Data Mining in Pattern Recognition, Lecture Notes in Artificial Intelligence, P. Perner and M. Petrou, Eds., Leipzig, Germany, Sept. 1999, pp. 52–63.

[79] W. T. Miller III, “Real-time application of neural networks for sensor-based control of robots with vision,” IEEE Trans. Syst., Man, Cybern., vol. 19, pp. 825–831, July/Aug. 1989.

[80] M. L. Minsky and S. Papert, Perceptrons: An Introduction to Computational Geometry. Cambridge, MA: MIT Press, 1969.

[81] T. Munakata, Fundamentals of the New Artificial Intelligence—Beyond Traditional Paradigms. Berlin, Germany: Springer-Verlag, 1998.

[82] S. R. Naidu, E. Zafiriou, and T. J. McAvoy, “Use of neural networks for sensor failure detection in a control system,” IEEE Contr. Syst. Mag., vol. 10, pp. 49–55, Apr. 1990.

[83] K. S. Narendra and K. Parthasarathy, “Identification and control of dynamical systems using neural networks,” IEEE Trans. Neural Networks, vol. 1, pp. 4–27, Mar. 1990.

[84] G. W. Ng, Application of Neural Networks to Adaptive Control of Nonlinear Systems. London, U.K.: Research Studies Press, 1997.

[85] D. H. Nguyen and B. Widrow, “Neural networks for self-learning control systems,” IEEE Contr. Syst. Mag., vol. 10, pp. 18–23, Apr. 1990.

[86] J. R. Noriega and H. Wang, “A direct adaptive neural-network control for unknown nonlinear systems and its application,” IEEE Trans. Neural Networks, vol. 9, pp. 27–34, Jan. 1998.

[87] T. Ozaki, T. Suzuki, T. Furuhashi, S. Okuma, and Y. Uchikawa, “Trajectory control of robotic manipulators using neural networks,” IEEE Trans. Ind. Electron., vol. 38, June 1991.

[88] D. B. Parker, “A comparison of algorithms for neuron-like cells,” in Neural Networks for Computing, J. S. Denker, Ed. New York: American Institute of Physics, 1986, pp. 327–332.

[89] P. Payeur, H. Le-Huy, and C. M. Gosselin, “Trajectory prediction for moving objects using artificial neural networks,” IEEE Trans. Ind. Electron., vol. 42, pp. 147–158, Apr. 1995.

[90] M. H. Rahman, R. Fazlur, R. Devanathan, and Z. Kuanyi, “Neural network approach for linearizing control of nonlinear process plants,” IEEE Trans. Ind. Electron., vol. 47, pp. 470–477, Apr. 2000.

[91] F. Rosenblatt, “The perceptron: A probabilistic model for information storage and organization in the brain,” Psych. Rev., vol. 65, pp. 386–408, 1958.

[92] G. A. Rovithakis, V. I. Gaganis, S. E. Perrakis, and M. A. Christodoulou, “Real-time control of manufacturing cells using dynamic neural networks,” Automatica, vol. 35, no. 1, pp. 139–149, 1999.

[93] A. Rubaai and M. D. Kankam, “Adaptive real-time tracking controller for induction motor drives using neural designs,” in Conf. Rec. IEEE-IAS Annu. Meeting, vol. 3, Oct. 1996, pp. 1709–1717.

[94] A. Rubaai and R. Kotaru, “Online identification and control of a DC motor using learning adaptation of neural networks,” IEEE Trans. Ind. Applicat., vol. 36, pp. 935–942, May/June 2000.

[95] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning representations by back-propagating errors,” Nature, vol. 323, pp. 533–536, 1986.

[96] D. E. Rumelhart, B. Widrow, and M. A. Lehr, “The basic ideas in neural networks,” Commun. ACM, vol. 37, no. 3, pp. 87–92, Mar. 1994.

[97] M. Saad, P. Bigras, L. A. Dessaint, and K. A. Haddad, “Adaptive robot control using neural networks,” IEEE Trans. Ind. Electron., vol. 41, pp. 173–181, Apr. 1994.


[98] S. Sardy and L. Ibrahim, “Experimental medical and industrial applications of neural networks to image inspection using an inexpensive personal computer,” Opt. Eng., vol. 35, no. 8, pp. 2182–2187, Aug. 1996.

[99] S. Sardy, L. Ibrahim, and Y. Yasuda, “An application of vision system for the identification and defect detection on woven fabrics by using artificial neural networks,” in Proc. Int. Joint Conf. Neural Networks, 1993, pp. 2141–2144.

[100] A. P. A. Silva, P. C. Nascimento, G. L. Torres, and L. E. B. Silva, “An alternative approach for adaptive real-time control using a nonparametric neural network,” in Conf. Rec. IEEE-IAS Annu. Meeting, 1995, pp. 1788–1794.

[101] L. E. B. Silva, B. K. Bose, and J. O. P. Pinto, “Recurrent-neural-network-based implementation of a programmable cascaded low-pass filter used in stator flux synthesis of vector-controlled induction motor drive,” IEEE Trans. Ind. Electron., vol. 46, pp. 662–665, June 1999.

[102] T. Sorsa, H. N. Koivo, and H. Koivisto, “Neural networks in process fault diagnosis,” IEEE Trans. Syst., Man, Cybern., vol. 21, pp. 815–825, July/Aug. 1991.

[103] D. F. Specht, “Probabilistic neural networks for classification, mapping, or associative memory,” in Proc. IEEE Int. Conf. Neural Networks, July 1988, pp. 525–532.

[104] ——, “Probabilistic neural networks,” Neural Networks, vol. 3, pp. 109–118, 1990.

[105] A. Srinivasan and C. Batur, “Hopfield/ART-1 neural network-based fault detection and isolation,” IEEE Trans. Neural Networks, vol. 5, pp. 890–899, Nov. 1994.

[106] W. E. Staib and R. B. Staib, “The intelligent arc furnace controller: A neural network electrode position optimization system for the electric arc furnace,” presented at the IEEE Int. Joint Conf. Neural Networks, New York, NY, 1992.

[107] R. Stein, “Preprocessing data for neural networks,” AI Expert, pp. 32–37, Mar. 1993.

[108] K. Steinbuch and U. A. W. Piske, “Learning matrices and their applications,” IEEE Trans. Electron. Comput., vol. EC-12, pp. 846–862, Dec. 1963.

[109] F. Sun, Z. Sun, and P. Y. Woo, “Neural network-based adaptive controller design of robotic manipulators with an observer,” IEEE Trans. Neural Networks, vol. 12, pp. 54–67, Jan. 2001.

[110] M. K. Sundareshan and C. Askew, “Neural network-assisted variable structure control scheme for control of a flexible manipulator arm,” Automatica, vol. 33, no. 9, pp. 1699–1710, 1997.

[111] R. S. Sutton, “Generalization in reinforcement learning: Successful examples using sparse coarse coding,” in Advances in Neural Information Processing Systems. Cambridge, MA: MIT Press, 1996, vol. 8, pp. 1038–1044.

[112] H. H. Szu, “Automatic fault recognition by image correlation neural network techniques,” IEEE Trans. Ind. Electron., vol. 40, pp. 197–208, Apr. 1993.

[113] J. Teeter and M. Y. Chow, “Application of functional link neural network to HVAC thermal dynamic system identification,” IEEE Trans. Ind. Electron., vol. 45, pp. 170–176, Feb. 1998.

[114] L. Tsoukalas and J. Reyes-Jimenez, “Hybrid expert system-neural network methodology for nuclear plant monitoring and diagnostics,” in Proc. SPIE Applications of Artificial Intelligence VIII, vol. 1293, Apr. 1990, pp. 1024–1030.

[115] R. E. Uhrig, “Application of artificial neural networks in industrial technology,” in Proc. IEEE Int. Conf. Industrial Technology, 1994, pp. 73–77.

[116] A. T. Vemuri and M. M. Polycarpou, “Neural-network-based robust fault diagnosis in robotic systems,” IEEE Trans. Neural Networks, vol. 8, pp. 1410–1420, Nov. 1997.

[117] G. K. Venayagamoorthy and R. G. Harley, “Experimental studies with a continually online-trained artificial neural network controller for a turbo generator,” in Proc. Int. Joint Conf. Neural Networks, vol. 3, Washington, DC, July 1999, pp. 2158–2163.

[118] B. W. Wah and G. J. Li, “A survey on the design of multiprocessing systems for artificial intelligence applications,” IEEE Trans. Syst., Man, Cybern., vol. 19, pp. 667–692, July/Aug. 1989.

[119] S. Weerasooriya and M. A. El-Sharkawi, “Identification and control of a DC motor using back-propagation neural networks,” IEEE Trans. Energy Conversion, vol. 6, pp. 663–669, Dec. 1991.

[120] P. J. Werbos, “Beyond regression: New tools for prediction and analysis in the behavioral sciences,” Ph.D. dissertation, Harvard Univ., Cambridge, MA, 1974.

[121] ——, “Maximizing long-term gas industry profits in two minutes in Lotus using neural network methods,” IEEE Trans. Syst., Man, Cybern., vol. 19, pp. 315–333, Mar./Apr. 1989.

[122] J. R. Whiteley, J. F. Davis, A. Mehrotra, and S. C. Ahalt, “Observations and problems applying ART2 for dynamic sensor pattern interpretation,” IEEE Trans. Syst., Man, Cybern. A, vol. 26, pp. 423–437, July 1996.

[123] B. Widrow, DARPA Neural Network Study. Fairfax, VA: Armed Forces Communications and Electronics Assoc. Int. Press, 1988.

[124] B. Widrow and M. E. Hoff Jr., “Adaptive switching circuits,” 1960 IRE WESCON Conv. Rec., pt. 4, pp. 96–104, Aug. 1960.

[125] N. Wiener, Cybernetics. Cambridge, MA: MIT Press, 1961.

[126] M. J. Willis, G. A. Montague, D. C. Massimo, A. J. Morris, and M. T. Tham, “Artificial neural networks and their application in process engineering,” in IEE Colloq. Neural Networks for Systems: Principles and Applications, 1991, pp. 71–74.

[127] M. Wishart and R. G. Harley, “Identification and control of induction machines using artificial neural networks,” IEEE Trans. Ind. Applicat., vol. 31, pp. 612–619, May/June 1995.

[128] J. M. Zurada, Introduction to Artificial Neural Networks. Boston, MA: PWS–Kent, 1995.

Magali R. G. Meireles received the B.E. degree from the Federal University of Minas Gerais, Belo Horizonte, Brazil, in 1986, and the M.Sc. degree from the Federal Center for Technological Education, Belo Horizonte, Brazil, in 1998, both in electrical engineering.

She is an Associate Professor in the Mathematics and Statistics Department, Pontifical Catholic University of Minas Gerais, Belo Horizonte, Brazil. Her research interests include applied artificial intelligence and engineering education. In 2001, she was a Research Assistant in the Division of Engineering, Colorado School of Mines, Golden, where she conducted research in the Mechatronics Laboratory.

Paulo E. M. Almeida (S’00) received the B.E. and M.Sc. degrees from the Federal University of Minas Gerais, Belo Horizonte, Brazil, in 1992 and 1996, respectively, both in electrical engineering, and the Dr.E. degree from São Paulo University, São Paulo, Brazil.

He is an Assistant Professor at the Federal Center for Technological Education of Minas Gerais, Belo Horizonte, Brazil. His research interests are applied artificial intelligence, intelligent control systems, and industrial automation. In 2000–2001, he was a Visiting Scholar in the Division of Engineering, Colorado School of Mines, Golden, where he conducted research in the Mechatronics Laboratory.

Dr. Almeida is a member of the Brazilian Automatic Control Society. He received a Student Award and a Best Presentation Award from the IEEE Industrial Electronics Society at the 2001 IEEE IECON, held in Denver, CO.

Marcelo Godoy Simões (S’89–M’95–SM’98) received the B.S. and M.Sc. degrees in electrical engineering from the University of São Paulo, São Paulo, Brazil, in 1985 and 1990, respectively, the Ph.D. degree in electrical engineering from the University of Tennessee, Knoxville, in 1995, and the Livre-Docencia (D.Sc.) degree in mechanical engineering from the University of São Paulo, in 1998.

He is currently an Associate Professor at the Colorado School of Mines, Golden, where he is working to establish several research and education activities. His interests are in the research and development of intelligent applications, fuzzy logic and neural network applications to industrial systems, power electronics, drives, machine control, and distributed generation systems.

Dr. Simões is a recipient of a National Science Foundation (NSF) Faculty Early Career Development (CAREER) Award, which is the NSF's most prestigious award for new faculty members, recognizing activities of teacher/scholars who are considered most likely to become the academic leaders of the 21st century.
