
Multiobjective Neuroevolutionary Control for a Fuel Cell Turbine Hybrid Energy System

Mitchell Colby
Oregon State University
Corvallis, [email protected]

Logan Yliniemi
University of Nevada, Reno
Reno, [email protected]

Paolo Pezzini
Ames Laboratory
Ames, [email protected]

David Tucker
National Energy Technology Laboratory
Morgantown, WV
[email protected]

Kenneth “Mark” Bryden
Ames Laboratory
Ames, [email protected]

Kagan Tumer
Oregon State University
Corvallis, [email protected]

ABSTRACT

Increased energy demands are driving the development of new power generation technologies with high efficiency. Direct fired fuel cell turbine hybrid systems are one such development; they have the potential to dramatically increase power generation efficiency, respond quickly to transient loads (and are generally flexible), and offer fast start up times. However, traditional control techniques are often inadequate in these systems because of extremely high nonlinearities and coupling between system parameters. In this work, we develop a multi-objective neural network controller via neuroevolution and the Pareto Concavity Elimination Transformation (PaCcET). In order for the training process to be computationally tractable, we develop a computationally efficient plant simulator based on physical plant data, allowing for rapid fitness assignment. Results demonstrate that the multi-objective algorithm is able to develop a Pareto front of control policies which represent tradeoffs between tracking desired turbine speed profiles and minimizing transient operation of the fuel cell.

CCS Concepts

• Hardware → Fuel-based energy;

Keywords

Multi-objective optimization; PaCcET

GECCO ’16, July 20-24, 2016, Denver, CO, USA
© 2016 ACM. ISBN 978-1-4503-4206-3/16/07
DOI: http://dx.doi.org/10.1145/2908812.2908924

1. INTRODUCTION

Increases in population as well as per capita energy consumption have driven the demand for energy sources which are thermally efficient, economically viable, and environmentally friendly. One potential class of solutions involves hybrid power generation systems, which combine existing technologies resulting in synergistic relationships between components. In particular, research is currently being conducted on fuel cell turbine hybrid energy systems.

A key difficulty in fuel cell turbine hybrid systems is that the combination of recuperative turbine cycles and solid oxide fuel cells results in high nonlinearities as well as extreme coupling between state variables, making high fidelity models difficult to develop. Without accurate models, traditional control techniques such as optimal control or model predictive control become difficult to implement.

One potential solution is the use of model-free control techniques such as evolving controllers. While this does eliminate the need for a model, it introduces the problem of fitness assignment. Often, the performance of a controller needs to be determined in computationally expensive numerical simulations, or tested in hardware. In either case, the cost of fitness assignment results in evolutionary algorithms becoming computationally intractable.

In this work, we create an efficient statistical model of the Hybrid Performance Project (Hyper), a fuel cell turbine hybrid power plant located at the National Energy Technology Laboratory in Morgantown, West Virginia. Using this model as the basis for fitness assignment, we evolve multi-objective neural network controllers for the facility.

The specific contributions of this paper are to:

• Model the behavior of an advanced fuel cell turbine hybrid energy system.

• Use multi-objective neuroevolutionary control algorithms to develop controllers for the Hyper facility.

Figure 1 shows an outline of the process used to develop a multiobjective neural network controller for Hyper, and is referred to in later sections of the paper as each component of the process is discussed.

The rest of this paper is organized as follows. Section 2 introduces Hyper, Section 3 details background and related work, Section 4 details the development of the simulator for Hyper, Section 5 discusses the multi-objective neuroevolutionary algorithm, Section 6 presents experimental results, and Section 7 concludes the paper.

2. HYBRID PERFORMANCE PROJECT

The following sections provide an overview of the Hybrid Performance Project and its current control techniques.


Figure 1: Flow chart of the approach for the multi-objective neuroevolutionary control of the Hyper facility.

2.1 Hyper Overview

The Hybrid Performance Project facility (Figure 2), Hyper, is located at the Department of Energy’s National Energy Technology Laboratory (NETL) campus in Morgantown, West Virginia. The purpose of this experimental plant is to study the complex interactions of the direct fired Solid Oxide Fuel Cell (SOFC) gas turbine hybrid configuration, as well as to develop control strategies for such a system.

Hyper is a small scale SOFC gas turbine hybrid hardware based simulation, capable of emulating 320 to 820 kW hybrid plants [19]. Hyper contains a hardware simulation of a 200 kW to 700 kW solid oxide fuel cell (SOFC) system coupled with a 120 kW turbine (Figure 2) [17].

The hardware based fuel cell simulation makes Hyper unique in that it allows for a wide range of fuel cells to be simulated without additional cost, and allows for testing of control strategies without risk of damaging a costly fuel cell. The fuel cell can be reset in software rather than being rebuilt in hardware, allowing for significant progress in the understanding of such hybrid configurations [8, 12, 14].

The standard recuperative cycle is a fundamental building block of Hyper. The recuperative cycle is a gas turbine cycle model used for power generation. Gas cycle turbines are effective in power generation because they have fast start up times, can be built for a wide range of power outputs, and can use readily available fuels such as natural gas [19].

Figure 2: Diagram for Hybrid Performance Project facility at NETL.

Typically, atmospheric air is drawn into the system through a compressor. The pressurized air is then mixed with fuel in a combustion chamber and the fuel is ignited by adding thermal energy to the air mixture. The exhaust gases from the combustion chamber are then expanded through a gas turbine, generating mechanical work in a rotating shaft used to drive the compressor and electric generator. A standard gas turbine then vents the exhaust gases from the turbine out into the atmosphere. In a turbine with regeneration, the hot exhaust from the turbine is used to preheat the air entering the combustion chamber with a heat exchanger, reducing the amount of fuel required to heat the air.

Solid oxide fuel cells utilize a ceramic electrolyte to channel oxygen ions to react with hydrogen, producing an electric current. Fuel cell turbine hybrids operate at very high efficiency, typically up to 60-75%, and with low carbon emissions [21]. These fuel cells operate at a high temperature, reforming natural gas or other hydrocarbons to produce the hydrogen needed for the reaction, and ionizing the hydrogen and oxygen to be transported across the electrolyte. Temperatures can reach up to 1000 °C, and much of this heat is wasted in the exhaust. Fuel cells are typically slow to heat and start up, limiting their use in applications requiring fast startup [19]. Pressurized air enters the cathode of the fuel cell, and fuel enters the anode. Hot exhaust leaves the fuel cell at a high temperature, along with any unconverted fuel.

The Hyper project places a hardware simulation of a fuel cell between the regeneration heat exchangers and combustion chamber of a typical regeneration cycle. Primary heat generation to run the turbine comes from the fuel cell exhaust. The combustion chamber burns any unspent fuel, assists in start up, and regulates turbine inlet temperature. Exhaust from the turbine runs through a set of parallel heat exchangers to preheat air into the fuel cell. Nearly 200 sensors are located across the plant, designed to provide real-time information to the controller and log the system state during experiments.

In this work, we focus on the control of three bypass air valves, which can adjust the state of the Hyper system. These bypass valves (seen in Figure 2) are:

• Cold air bypass (FV-170): regulates cold air flow to the cathode of the fuel cell.

• Hot air bypass (FV-380): regulates flow of high pressure hot air from the heat exchangers to the combustion chamber.

• Bleed air valve (FV-162): regulates flow of high pressure air from the compressor to the atmosphere.

System wide analytic models and software simulations have yet to be perfected for the Hyper facility, limiting the study of different control strategies for the plant. The following section discusses the current methods for controlling Hyper.

2.2 Hyper Control Strategies

Modern control theory was designed to deal with large, multi-input, multi-output systems, and is the basis for modern power plant control strategies. Historically, highly accurate models for traditional coal-fired power plants have been developed using thermodynamics, heat transfer, and fluid dynamics laws. Further, many model-based control strategies for power systems, including model predictive control, allow for some error in the system model without significantly affecting the performance of the control strategy [9]. As a result of the availability of relatively accurate plant models, model based control methods are prevalent in most power plants. These control strategies rely on two major assumptions:

1. (Reasonably) accurate models exist for the system.

2. Models can be linearized about a feasible operation point.

Several methods of modeling and control of Hyper have been used to provide a demonstration of the potential of hybrid fuel cell technology [18, 15]. Most notably, an adaptive control scheme offers some solutions to the coupled, nonlinear control system in the Hyper facility [16]. Using a model reference adaptive control, the plant was shown to be controllable in simulation. This control strategy utilizes empirical transfer functions built from plant characterization tests as a non-linear model of the plant. A control loop uses these plant models to determine the optimal control input.

While empirical transfer functions of the plant contain information that captures coupling and nonlinearities in the system, these functions are extremely costly to acquire; developing these transfer functions requires running multiple plant characterization experiments, where the frequency of actuation is varied and the plant state is captured and recorded. Further, if hardware changes are incorporated into the plant, the transfer functions are invalidated, requiring the plant characterization experiments to be rerun. The model reference adaptive control can also result in unwanted oscillations due to the nature of the empirical transfer functions. In this paper, we present a model-free control strategy to relax the requirement for a high-fidelity system model.

3. BACKGROUND

The following sections describe related work on multi-objective optimization, neuroevolutionary control, and fitness modeling.

3.1 Multi-Objective Optimization

With multiple simultaneous objectives to optimize, our definition of optimality must change to include not only the single-objective optima, but also the optimal tradeoffs between these extremes. This comes from the concept of Pareto optimality, which is a central solution concept in multi-objective optimization. A point is said to be Pareto optimal if it is not dominated by another solution, where domination is defined as follows. A solution $u$ dominates another solution $v$ ($u \prec v$) if it scores lower on all criteria (objectives $c$): $\forall c\, [f_c(u) < f_c(v)]$. A solution $u$ weakly dominates another solution $v$ ($u \preceq v$) if it scores equal on some objectives, but less on others: $\forall c\, [f_c(u) \le f_c(v)] \wedge \exists j \mid [f_j(u) < f_j(v)]$ [20]. The set of Pareto optimal solutions is termed the Pareto front.

Multi-objective optimization has the convention of minimization, which we use in the definitions above. In this work, however, we are maximizing accuracy; to resolve this, one can treat the maximization problem as one of minimizing the negative accuracy, without loss of generality.
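The domination relations above translate directly into code. The following is a minimal sketch under the minimization convention; the function names (`dominates`, `weakly_dominates`, `non_dominated`) are illustrative assumptions and do not come from the original work.

```python
from typing import List, Sequence


def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """Strict domination: u scores lower than v on every objective."""
    return all(a < b for a, b in zip(u, v))


def weakly_dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """u is no worse than v on every objective and lower on at least one."""
    no_worse = all(a <= b for a, b in zip(u, v))
    strictly_better_somewhere = any(a < b for a, b in zip(u, v))
    return no_worse and strictly_better_somewhere


def non_dominated(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep every point that no other point weakly dominates.

    A point never weakly dominates itself (no strict improvement),
    so comparing each point against the full list is safe.
    """
    return [p for p in points
            if not any(weakly_dominates(q, p) for q in points)]
```

For the four points (1, 4), (2, 2), (3, 3), and (4, 1), only (3, 3) is removed, because (2, 2) is lower on both objectives.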

Figure 3: PaCcET maps the current-best-estimate of the Pareto front ($P_I^*$, grey points) onto a line that is equally valuable to a linear combination (red line) by moving points radially away from or toward the utopia point (origin in this figure). Non-dominated points are more valuable to a linear combination (than dominated points or points in $P_I^*$) after the transform. Axes: Objective 1 (horizontal), Objective 2 (vertical).

There are a host of multi-objective optimization algorithms, but in this work we limit our consideration to the Pareto Concavity Elimination Transformation (PaCcET, [22]) for the following reasons: 1) it executes over an order of magnitude faster than some of the other state-of-the-art algorithms, 2) it has theoretical guarantees stating that the generated solutions will be Pareto optimal, 3) it has been shown to function well in domains with arbitrary Pareto fronts, and 4) it is agnostic to the optimizer used, and is typically not sensitive to optimizer parameters [22].

The primary function of PaCcET (Figure 3) is to constantly use the current best-estimate of the Pareto front ($P_I^*$) as an approximation of the nonlinear preference function that would make every Pareto optimal solution equally valuable. PaCcET then projects these solutions onto a line that is equally valuable to a linear combination of (transformed) objectives, which moves all unattained areas of the objective space into the more valuable portion of the space. These solutions will then be preferred, and once attained, they will update the approximation of the Pareto front to be a better approximation of the true Pareto front. PaCcET, in this way, allows any single-objective optimizer to perform a multi-objective optimization, with these guarantees.

3.2 Neuroevolutionary Control

Neural networks as controllers have the capability to reject sensor and actuator noise, generalize to previously unseen system states, and can outperform traditional linear control methods for many difficult control problems [5, 10]. Nonlinearities and coupling between state variables in the system make many problems difficult for traditional control techniques, as linearizations about operation points too coarsely approximate the system dynamics. Training these networks is possible when no explicit system model exists through neuroevolution [13].

Neuroevolutionary control has been shown to be successful in many difficult and highly nonlinear domains, including micro-air vehicle control in air currents [10], dynamic quadrotor flight [11], wave energy converter ballast design [4], and finless rocket altitude and attitude control [5].

In the context of power plant control, fitness assignment is difficult in neuroevolutionary algorithms; fitness of a controller cannot be evaluated on the physical plant due to safety considerations, and high fidelity numerical simulations are too slow for a computationally tractable search through the solution space. Thus, in order to develop controllers for power plants via neuroevolution, fitness assignment must be made computationally cheaper.

3.3 Fitness Modeling

In cases where the system performance cannot be directly modeled (or such a model is computationally expensive), the system performance can be approximated in order to provide feedback for an agent learning a control strategy for the system [1, 2]. These techniques to approximate fitness functions are known as fitness modeling.

Fitness modeling has been shown to successfully allow for evolutionary algorithms to be computationally tractable in complex domains [3, 7]. Typically, the fitness evaluation dominates the computational cost of an evolutionary algorithm [6]. In the Hyper facility, hardware simulations run in real-time, and high fidelity simulations can run in near real-time. Thus, in order to develop control strategies for Hyper using neuroevolution, the time needed to evaluate a particular control policy must be dramatically reduced.

4. HYPER SIMULATOR

In order to evolve a controller for Hyper, the performance of a controller in Hyper must be evaluated with minimal computational cost. Thus, we develop a neural network based simulator for Hyper. There are two key difficulties in modeling Hyper. First, sensor data from experimental Hyper runs is noisy, so care must be taken to ensure the simulator does not overfit the data. Second, there exists extremely high coupling between state variables in Hyper, so the neural network based simulator must be designed to account for the coupling between state variables, while minimizing the effects of crosstalk within the networks.

By using real plant data from physical Hyper runs, we can ensure the simulation of the plant learns the underlying dynamics of the system. Once the neural network based simulator is developed, we can quickly evaluate different control policies, allowing for a neuroevolutionary algorithm to become computationally tractable. The following sections detail the process of developing the neural network based simulator for Hyper.

Data Selection.

To train a network that can simulate the plant, a suitable data set was found, corresponding to Box 1 in Figure 1. Many distinct experiments have been run at the Hyper facility, with data collected and recorded for use as a training set for the Hyper simulator. The chosen data set was a recent experiment involving a characterization of the cold air valve. In this experiment, the cold air valve was opened to a position and was systematically changed once the plant had reached steady state. The valve was changed between 10% and 80% open. The resultant data set contains 19 plant state variables and 2 control variables.

The data set contained 30,626 data points sampled at 12.5 Hz, about 10 minutes of run data. To mitigate potential problems with sensor noise affecting simulator training, data was down-sampled by averaging datapoints every 25 timesteps. Thus, 1225 points (at a corresponding frequency of 0.5 Hz) are presented to the simulator to learn. All data points are then scaled between 0.1 and 0.9, corresponding to the observed variable minimum value and maximum value respectively, such that the input to the simulator is scaled consistently.
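The down-sampling and scaling steps can be illustrated with a short sketch. This is not the original preprocessing code; the function names are assumptions, and the choices to drop leftover samples that do not fill a complete block and to map a constant column to the lower bound are assumptions as well.

```python
from typing import List

Row = List[float]


def downsample_mean(rows: List[Row], block: int = 25) -> List[Row]:
    """Average each run of `block` consecutive rows, column by column.

    Assumption: trailing rows that do not fill a full block are dropped.
    """
    n_blocks = len(rows) // block
    averaged = []
    for b in range(n_blocks):
        chunk = rows[b * block:(b + 1) * block]
        averaged.append([sum(col) / block for col in zip(*chunk)])
    return averaged


def scale_columns(rows: List[Row], lo: float = 0.1, hi: float = 0.9) -> List[Row]:
    """Map each column's observed minimum to `lo` and maximum to `hi`."""
    mins = [min(col) for col in zip(*rows)]
    maxs = [max(col) for col in zip(*rows)]
    scaled = []
    for row in rows:
        scaled.append([
            # Assumption: a constant column carries no range, so map it to lo.
            lo + (hi - lo) * (x - mn) / (mx - mn) if mx > mn else lo
            for x, mn, mx in zip(row, mins, maxs)
        ])
    return scaled
```

With a block size of 25, the 30,626 raw samples reduce to 30,626 // 25 = 1225 averaged points, and the 12.5 Hz sampling rate becomes 12.5 / 25 = 0.5 Hz, matching the figures quoted above.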

Backpropagation.

Once the data was collected, a neural network based simulator was trained to learn the state transition mapping $T(s_t, a_t) \rightarrow s_{t+1}$; in other words, the simulator maps a state $s_t$ and action $a_t$ at time $t$ to the resultant state $s_{t+1}$ at time $t+1$. It was determined that a single two-layer feedforward neural network was insufficient to accurately model the dataset, due to crosstalk in the first weight layer. To eliminate crosstalk, we developed an ensemble based simulator (Box 2 in Figure 1).

The Hyper training data contains $n = 19$ state variables and $m = 2$ control variables. Our simulator contains 19 neural networks, each of which maps the system state $s_t \in \mathbb{R}^n$ and action $a_t \in \mathbb{R}^m$ to a single state variable $s^i_{t+1} \in \mathbb{R}$. The ensemble is presented with a state $s_t$ and action $a_t$, and each neural network $NN_i$ finds a single value in the resultant state vector $s_{t+1}$. Each network was trained with backpropagation, with a learning rate of 0.2 and a momentum term of 0.5. These parameters were chosen using a parameter sweep, but results are similar as long as the learning rate is below 0.4 (higher learning rates caused poor results). A single state transition from the neural network ensemble is shown in Algorithm 1.

Input: Current plant state s_t, control action a_t

Initialize output = {}
foreach Network NN_i do
    s^i_{t+1} = NN_i(s_t, a_t)    // find ith term of s_{t+1}
    output.Add(s^i_{t+1})         // add state variable to output
end
return output

Algorithm 1: Neural Network Ensemble State Transition Mapping
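Algorithm 1 could be written in Python roughly as follows. This is a sketch only: each trained network is represented by a hypothetical callable that takes the state and action and returns one scalar, since the network implementation itself is not given here.

```python
from typing import Callable, List, Sequence

# Hypothetical stand-in for a trained network NN_i:
# it maps (state, action) to one component of the next state.
Network = Callable[[Sequence[float], Sequence[float]], float]


def ensemble_step(networks: Sequence[Network],
                  state: Sequence[float],
                  action: Sequence[float]) -> List[float]:
    """Build the next state one component at a time.

    The i-th network sees the full state and action but predicts only
    the i-th next-state variable, so no two outputs share weights.
    """
    return [net(state, action) for net in networks]


if __name__ == "__main__":
    # Toy networks for illustration only: component i drifts by the first action.
    toy = [lambda s, a, i=i: s[i] + a[0] for i in range(3)]
    print(ensemble_step(toy, [1.0, 2.0, 3.0], [0.5]))
```

Keeping one network per output variable is what removes the shared first weight layer that was identified as the source of crosstalk.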

After training the neural network ensemble, its approximation performance was evaluated to determine its effectiveness. We found areas of the state space where approximation error was high, and collected more data from physical Hyper runs to improve the resolution of training data in those operational regimes; data from a plant startup experiment was added to the training data set. This active data collection process is shown in the data collection loop in Figure 1.

Once the training data set was finalized, the simulator training process was repeated for 25 statistical trials to demonstrate the statistical significance of the training algorithm. For each statistical run, 100 data points were randomly removed from the training set to create a validation set. The resulting mean squared error on the validation set had an average of 0.12% with a variance of 0.02%. The maximum error for each state variable had an average of 1.71% with a variance of 0.11%.

4.1 Time Domain Simulation

The best network ensemble found during the training process described above is used to develop a time domain simulation of Hyper, corresponding to Box 3 in Figure 1. The simulator takes an initial plant state $s_0$, a desired plant trajectory $Z^d$, and a controller $C(s_t, z^d_t) \rightarrow a_t$. The controller maps the current plant state $s_t$ and the desired plant state $z^d_t$ to a control action $a_t$. The simulator outputs the mean squared error of each state variable with respect to the desired plant trajectory. The time domain simulation is detailed in Algorithm 2.

Inputs: Controller C(s_t, z^d_t) → a_t, initial state s_0, desired plant trajectory Z^d
Output: Plant state evolution, tracking error

Initialize error = 0 (zero vector)
foreach Time Step t do
    e_t = z^d_t − s_t          // find error in state
    error += (e_t)^2           // keep track of aggregate error
    a_t = C(s_t, z^d_t)        // find control action
    s_{t+1} = NN(s_t, a_t)     // update state
end
foreach state variable i do
    error[i] /= numTimeSteps   // find MSE
end
return error

Algorithm 2: Time Domain Simulation
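A compact Python rendering of Algorithm 2 is sketched below. The `controller` and `step` arguments are hypothetical callables standing in for the evolved controller and the neural network ensemble; the desired trajectory is assumed to be non-empty.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def simulate(controller: Callable[[Vector, Vector], Vector],
             step: Callable[[Vector, Vector], Vector],
             s0: Sequence[float],
             desired: Sequence[Sequence[float]]) -> Vector:
    """Roll the plant model forward and return per-variable tracking MSE."""
    n = len(s0)
    error = [0.0] * n
    state = list(s0)
    for z in desired:
        target = list(z)
        for i in range(n):
            # Accumulate squared tracking error for state variable i.
            error[i] += (target[i] - state[i]) ** 2
        action = controller(state, target)   # a_t = C(s_t, z_t^d)
        state = step(state, action)          # s_{t+1} = NN(s_t, a_t)
    # Divide by the number of time steps to obtain the mean squared error.
    return [e / len(desired) for e in error]


if __name__ == "__main__":
    # Toy plant for illustration: the action is added directly to the state,
    # and the controller requests exactly the remaining error.
    ctrl = lambda s, z: [zi - si for si, zi in zip(s, z)]
    plant = lambda s, a: [si + ai for si, ai in zip(s, a)]
    print(simulate(ctrl, plant, [0.0, 0.0], [[1.0, 2.0]] * 4))
```

In the toy example the state reaches the target after the first step, so only the first time step contributes error and the result is [0.25, 1.0].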

Figure 4: Comparison of true plant data with the simulation for turbine outlet temperature.

To validate the Hyper simulator, the simulation is given the initial plant state from the original data and then allowed to respond to the control inputs from the original data. Figure 4 shows the response of the system to the original control inputs found in the data. Mean squared error across all plant state variables is 0.121%. As can be seen in Figure 4, the maximum error of the network approximation corresponds to about 5% at any point in the simulation for turbine outlet temperature. This error is acceptable for developing controllers, because it is of a similar magnitude compared to noise inherent in turbine temperature sensors. This simulator along with desired Hyper system state trajectories allows us to quickly evaluate the fitness of controllers in a neuroevolutionary algorithm.

Inputs: Pareto approximation P*_I, vector to evaluate v
Output: fitness f

P*_{I,norm} ← normalize(P*_I)
v_norm ← normalize(v)
v_τ ← PaCcET(v_norm, P*_{I,norm})
f ← Σ_i v_{τ,i}
return f

Algorithm 3: PaCcET

5. MULTI-OBJECTIVE NEUROEVOLUTIONARY CONTROL ALGORITHM

The following sections detail the Pareto Concavity Elimination Transformation (PaCcET) algorithm used for multi-objective fitness assignment, as well as the multi-objective neuroevolutionary control algorithm used in this work.

5.1 PaCcET

In this section we provide an overview of the PaCcET algorithm. For the full details and a more thorough treatment, we refer the reader to the other works on PaCcET [22, 23].

Determining the fitness of a two-objective solution with PaCcET requires $P_I^*$, the current best-estimate of the Pareto front, and the solution $v$ to be examined. For each solution, the PaCcET module finds the boundaries of the Pareto front and normalizes it, $P^*_{I,norm} \leftarrow normalize(P_I^*)$, so that each point in the normalized front has each of its elements in the range [0, 1]. It then uses this same normalization for the vector in question, $v$: $v_{norm} \leftarrow normalize(v)$.
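The normalization step alone can be sketched as follows. This covers only the rescaling by the boundaries of the front; the PaCcET transformation itself is described in [22] and is not reproduced here. The function names, and the handling of an objective on which the front has zero extent, are assumptions.

```python
from typing import List, Sequence, Tuple


def front_bounds(front: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Per-objective minimum and maximum over the current Pareto estimate."""
    lows = [min(col) for col in zip(*front)]
    highs = [max(col) for col in zip(*front)]
    return lows, highs


def normalize(v: Sequence[float], lows: Sequence[float],
              highs: Sequence[float]) -> List[float]:
    """Rescale v with the front's bounds so front points land in [0, 1].

    Assumption: if the front has zero extent on an objective, that
    component is mapped to 0.0 to avoid dividing by zero.
    """
    return [(x - lo) / (hi - lo) if hi > lo else 0.0
            for x, lo, hi in zip(v, lows, highs)]


if __name__ == "__main__":
    front = [(1.0, 5.0), (2.0, 3.0), (4.0, 1.0)]
    lows, highs = front_bounds(front)
    print([normalize(p, lows, highs) for p in front])
    print(normalize((2.5, 3.0), lows, highs))
```

Because the bounds come from the front rather than from the vector being evaluated, a vector that lies outside the front's extent can normalize to values outside [0, 1].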

Once these normalizations have been completed, PaCcET then transforms $P_I^*$ to lie on the line between $(0, 1)$ and $(1, 0)$. This line is equally valuable to a linear combination of transformed objectives. Using this same transform for $v$, $v_\tau \leftarrow PaCcET(v_{norm})$, the resulting $fitness \leftarrow \sum_i v_{\tau,i}$ will be less than 1 if the solution is non-dominated, and greater than 1 if it is dominated.

After evaluating the fitness, $P_I^*$ must be maintained, and as the estimate is updated, it will continually improve to be a better approximation of the true Pareto front.

5.2 Neuroevolutionary Control Algorithm

The multi-objective neural network controllers for Hyper are evolved according to the neuroevolutionary algorithm presented in Algorithm 4. The neuroevolutionary algorithm corresponds to Box 5 in Figure 1, while the scalarization of performance across multiple objectives to assign a fitness (Box 4 in Figure 1) is achieved via the PaCcET algorithm. Throughout the evolutionary algorithm, the Non-Dominated Set (NDS, $P_I^*$) is updated to keep all non-dominated solutions found during learning.

The neural network controllers are two-layer, fully connected, feedforward neural networks. The networks have 38 inputs (corresponding to 19 state variables and 19 desired state variables), 10 hidden units, and 2 outputs (corresponding to the control actions for the bleed air valve and cold air valve). $N$ neural networks are initialized by drawing weights from the normal distribution $N(\sigma_{init}, \mu_{init})$, where the standard deviation for weight initialization $\sigma_{init}$ is set to 0.5 and the mean for weight initialization $\mu_{init}$ is set to 0.0.

For each generation, we implement a mutation operator, find the vector fitness (multi-objective fitness values) for each network, scalarize the multi-objective fitness vectors and assign fitness to each network, and downselect the population. These operators are now described in detail. For mutation, the population size is doubled from $N$ to $2N$ by creating a mutated copy of each of the $N$ networks in the population. To mutate a network, 10 weights in each weight layer are selected at random, and values drawn from the normal distribution $N(\sigma_{mutate}, \mu_{mutate})$ are added to each selected weight. For the normal distribution corresponding to mutation, the standard deviation $\sigma_{mutate}$ is set to 1.0, and the mean $\mu_{mutate}$ is set to 0.0. Learning parameters (number of hidden units, $N$, $\sigma_{init}$, $\mu_{init}$, $\sigma_{mutate}$, $\mu_{mutate}$) were chosen using a parameter sweep, but results were fairly consistent as long as the number of hidden units was above 5.
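The initialization and mutation operators can be sketched as follows. This is an illustrative sketch, not the original implementation: each weight layer is stored as a flat list, bias weights are omitted because they are not specified above, and the 10 mutated weights per layer are assumed to be distinct (sampled without replacement).

```python
import random
from typing import List

Network = List[List[float]]  # one flat list of weights per weight layer


def init_network(n_in: int = 38, n_hidden: int = 10, n_out: int = 2,
                 mu: float = 0.0, sigma: float = 0.5, rng=random) -> Network:
    """Draw every weight from a normal distribution (mean mu, std sigma)."""
    return [
        [rng.gauss(mu, sigma) for _ in range(n_in * n_hidden)],   # input -> hidden
        [rng.gauss(mu, sigma) for _ in range(n_hidden * n_out)],  # hidden -> output
    ]


def mutate(network: Network, n_weights: int = 10,
           mu: float = 0.0, sigma: float = 1.0, rng=random) -> Network:
    """Return a mutated copy; the parent network is left unchanged."""
    child = [layer[:] for layer in network]
    for layer in child:
        # Assumption: pick distinct positions within each weight layer.
        k = min(n_weights, len(layer))
        for idx in rng.sample(range(len(layer)), k):
            layer[idx] += rng.gauss(mu, sigma)
    return child


if __name__ == "__main__":
    rng = random.Random(0)
    parent = init_network(rng=rng)
    child = mutate(parent, rng=rng)
    print([len(layer) for layer in parent])
```

Passing a seeded `random.Random` instance as `rng` makes a run repeatable, which is convenient when comparing statistical trials.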

To assign fitness, each of the $2N$ neural network controllers is tested in the Hyper simulation, and its performance at tracking a desired plant trajectory is determined. The simulator gives a vector of mean squared error values, one for each state variable in the plant. Next, $w$ different state variables are chosen as the parameters of interest; these are the different objectives being optimized by the neuroevolutionary algorithm and correspond to critical state variables in the Hyper facility. The mean squared error $MSE_i$ associated with each parameter of interest $i$ is used to compute fitness values according to:

$$fit_i = \frac{1.0}{1.0 + MSE_i}$$
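As a small worked illustration of this mapping, the sketch below converts a vector of per-variable errors into a vector fitness over the chosen parameters of interest. The function names and the use of integer indices to identify the parameters of interest are assumptions made for the example.

```python
from typing import List, Sequence


def fitness_from_mse(mse: float) -> float:
    """fit = 1 / (1 + MSE): equals 1 at zero error and decays toward 0."""
    return 1.0 / (1.0 + mse)


def vector_fitness(mse_by_variable: Sequence[float],
                   parameters_of_interest: Sequence[int]) -> List[float]:
    """One fitness value per objective, taken from the selected variables."""
    return [fitness_from_mse(mse_by_variable[p]) for p in parameters_of_interest]


if __name__ == "__main__":
    # Three state variables; variables 0 and 2 are the objectives.
    print(vector_fitness([0.0, 3.0, 1.0], [0, 2]))  # [1.0, 0.5]
```

The mapping bounds every fitness value in (0, 1], with higher values corresponding to lower tracking error.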

Once the vector fitness (one fitness value for each objective) is found, this vector is used as an input to the PaCcET algorithm, which converts the vector fitness into a scalar value. This scalar is used to assign fitness to a network.

After each of the $2N$ networks is assigned a scalar fitness value, $N$ networks are selected to survive to the next generation. For selection, a binary tournament operator is used. Two networks are taken from the population at random, and the network with the higher fitness value is placed into the population for the next generation. At the end of each generation, the NDS is updated by adding any population members to the NDS that are not dominated by any solutions in the NDS. Once the maximum number of generations is reached, the evolutionary algorithm returns the NDS, that is, the collection of neural network controllers along the approximated Pareto front.
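Selection and the non-dominated set update can be sketched as below. The sketch makes several assumptions that are not stated above: the two tournament entrants are distinct, a network may win more than one tournament, ties go to the first entrant, and the NDS compares the per-objective fitness vectors, where higher is better.

```python
import random
from typing import Any, List, Sequence, Tuple

Entry = Tuple[Tuple[float, ...], Any]  # (objective vector, controller)


def binary_tournament(population: Sequence[Any], fitness: Sequence[float],
                      n_survivors: int, rng=random) -> List[Any]:
    """Repeatedly draw two distinct members and keep the fitter one."""
    survivors = []
    for _ in range(n_survivors):
        a, b = rng.sample(range(len(population)), 2)
        winner = a if fitness[a] >= fitness[b] else b
        survivors.append(population[winner])
    return survivors


def update_nds(nds: List[Entry], candidates: List[Entry]) -> List[Entry]:
    """Merge candidates into the NDS and drop every dominated entry."""
    merged = nds + candidates

    def is_dominated(p: Tuple[float, ...]) -> bool:
        # q dominates p if q is at least as good everywhere and better somewhere.
        return any(all(qi >= pi for qi, pi in zip(q, p)) and
                   any(qi > pi for qi, pi in zip(q, p))
                   for q, _ in merged)

    return [entry for entry in merged if not is_dominated(entry[0])]


if __name__ == "__main__":
    nds = [((0.9, 0.2), "a")]
    new = [((0.5, 0.5), "b"), ((0.4, 0.4), "c"), ((0.95, 0.25), "d")]
    print([name for _, name in update_nds(nds, new)])  # ['b', 'd']
```

Merging first and filtering afterwards mirrors the two calls in Algorithm 4, where the population is added to the set and dominated solutions are then removed.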

6. RESULTS

For the experimental analysis of the neuroevolutionary multi-objective control algorithm for Hyper, we chose two objectives. The first objective corresponds to tracking a desired turbine speed profile, which has various oscillations corresponding to load demands. The second objective corresponds to keeping the fuel cell temperature as steady as possible, in order to minimize the transient operation of the fuel cell. In other words, we aim to develop a controller that simultaneously tracks load demands and keeps the fuel cell as close to steady-state as possible. For the experiments, we ran 50 statistical trials of the neuroevolutionary algorithm. For each trial, the population size $N$ was set to 25, and each evolutionary algorithm was run for 7500 generations.

Inputs: initial plant state s_0, desired trajectory Z^d, parameters of interest P = {p_1, p_2, ..., p_w}

pop = {}
nds = {}
for i = 1 to N do
    create random network n
    pop.Add(n)
end
foreach Generation do
    for i = 1 to N do
        create copy n' of network n_i
        mutate n'
        pop.Add(n')
    end
    for i = 1 to 2N do
        MSE = Simulator(n_i, s_0, Z^d)
        vectorFitness = {}
        for j = 1 to w do
            vectorFitness.Add(1/(1 + MSE[p_j]))
        end
        n_i.fitness = PaCcET(vectorFitness)
    end
    Select N members of population to survive
    nds.Add(pop)
    nds.RemoveDominatedSolutions()
end
Return: nds

Algorithm 4: Neuroevolutionary algorithm to develop Hyper controllers.

The empirical attainment functions for the non-dominated sets are shown in Figure 5. This figure suggests that the Pareto front may have a strongly concave shape. The tradeoff solutions near the middle of the front were found in nearly every trial, but the extremes were found in less than half.

We now consider the two policies corresponding to the endpoints of the median NDS in Figure 5. The policy on the upper left of the NDS corresponds to maximizing the steady state operation of the fuel cell without considering turbine speed, while the policy on the lower right of the NDS corresponds to optimizing turbine speed tracking without considering fuel cell temperature. The performance of these policies on turbine tracking is shown in Figure 6.

As seen in Figure 6, the policy which maximized turbine tracking accuracy tracks the desired turbine speed well, while the policy which maximized the time the fuel cell spent in steady state does not track the desired turbine speed well. Instead of accurate turbine tracking, this policy keeps the turbine at a higher than needed speed. Although this is wasteful, this policy does ensure that load demands would always be met. The performance of these two policies on the fuel cell temperature is shown in Figure 7.

As seen in Figure 7, the policy which optimized turbine tracking accuracy does very poorly at keeping the fuel cell operation near steady state. This is because the fuel cell and turbine are extremely coupled, making it difficult to track a dynamic turbine speed while keeping the fuel cell temperature constant. Conversely, the policy that aimed to keep the fuel cell operating at steady state does very well, keeping the fuel cell temperature essentially constant after the initial heating period.

Figure 5: Empirical Attainment Function for multi-objective controllers. “Best:” collection of non-dominated points developed across all statistical runs. “Median:” area which was attained in half of the statistical runs. “Worst:” area attained by every statistical run. Axes: Turbine Speed Tracking Accuracy (horizontal), Fuel Cell Inlet Temperature Tracking Accuracy (vertical).

Figure 6: Pareto front endpoint control policies for turbine speed.

Figure 7: Endpoint control policies for fuel cell temperature.

We now analyze policies along the interior of the non-dominated set, which represent tradeoffs between the objectives. Figures 8 and 9 show three tradeoff policies and their performances in tracking desired turbine and fuel cell trajectories, respectively. The green curves show high oscillation in both the turbine speed and fuel cell temperatures, demonstrating that policies which appear viable mathematically (based on the fitness function) may not result in desirable behaviors on a physical system. The orange curves show a policy which does not track the desired turbine speed well (although it does normally meet or exceed output requirements), and keeps the fuel cell closer to steady state. Finally, the magenta curve simply exceeds the desired turbine speed at all times (ensuring output requirements are always met), and keeps the fuel cell at essentially steady state operation. These tradeoffs demonstrate that after finding a non-dominated set of control policies, a system designer can choose which option is best suited for application in the physical system.
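One way a designer might screen out highly oscillatory policies such as the green curves is to compare the total variation of each policy's trajectory against a threshold. The sketch below is an illustrative assumption and not part of the method in this work: the total-variation metric, the function names, the threshold, and the sample trajectories are all hypothetical.

```python
from typing import Dict, List, Sequence


def total_variation(trajectory: Sequence[float]) -> float:
    """Sum of absolute step-to-step changes; larger values indicate more oscillation."""
    return sum(abs(b - a) for a, b in zip(trajectory, trajectory[1:]))


def screen_policies(trajectories: Dict[str, Sequence[float]],
                    max_variation: float) -> List[str]:
    """Return the names of policies whose trajectory variation stays within the limit."""
    return [name for name, trajectory in trajectories.items()
            if total_variation(trajectory) <= max_variation]


# Made-up normalized turbine speed trajectories for two hypothetical policies.
trajectories = {
    "smooth": [0.70, 0.72, 0.74, 0.75, 0.75],
    "oscillating": [0.70, 0.80, 0.70, 0.80, 0.70],
}
acceptable = screen_policies(trajectories, max_variation=0.1)
```

A screen of this kind would be applied after the non-dominated set is found, so it narrows the designer's choices without changing the fitness function used during evolution.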

7. DISCUSSION

Increased energy demands as well as the desire for more sustainable energy production have led to the development of more efficient energy sources. One such source is a fuel cell hybrid energy system, where a synergistic relationship between the fuel cell and turbine power cycle results in increased efficiency, decreased emissions, and faster transient plant responses. A key difficulty in such plants is the high degree of nonlinearity and coupling, which makes the development of models extremely difficult.

In this work, we implement a model-free approach to developing multi-objective controllers for the Hyper plant using neuroevolution. In order for the evolutionary algorithm to be computationally tractable, we developed a statistical simulator of the Hyper facility based on data from physical runs. This simulator was combined with PaCcET to form a multi-objective fitness assignment operator.

Figure 8: Tradeoff control policies for turbine speed.

Our results demonstrate that a set of non-dominated controllers can be found that either provide excellent performance on single objectives, or provide tradeoff choices in which the objectives are balanced.


Future work involves refining and validating the simulator to allow for better system modeling, followed by hardware tests of the derived control policies at the Hyper facility.

Figure 9: Tradeoff control policies for fuel cell temperature.

8. ACKNOWLEDGEMENTS

This research was supported in part by the US Department of Energy - Office of Fossil Energy under Contract No. DE-AC02-07CH11358 through the Ames Laboratory. It was also supported in part by the US Department of Energy - National Energy Technology Laboratory under Contract No. DE-FE0012302.

9. REFERENCES[1] K. Anderson and Y. Hsu. Genetic crossover strategy

using an approximation concept. In IEEE Congress onEvolutionary Computation, 1999.

[2] J. Branke, C. Schmidt, and H. Schmeck. Efficientfitness estimation in noisy environments. InProceedings of the Genetic and EvolutionaryComputation Conference, 2001.

[3] D. Bunche, N. Schraudolph, and P. Koumoutsakos.Accelerating evolutionary algorithms using fitnessfunction models. In Proceedings of the Genetic andEvolutionary Computation Conference, 2003.

[4] M. Colby, E. Nasroullahi, and K. Tumer. Optimizingballast design of wave energy converters usingevolutionary algorithms. In Proceedings of the Geneticand Evolutionary Computation Conference, 2011.

[5] F. Gomez and R. Miikkulainen. Active guidance for afinless rocket using neuroevolution. In Proceedings ofthe Genetic and Evolutionary ComputationConference, 2003.

[6] Y. Jin. A comprehensive survey of fitnessapproximation in evolutionary computation. SoftComputing, 9(1), 2005.

[7] J. Martikainen and S. Ovaska. Fitness functionapproximation by neural networks in the optimizationof mgp-fir filters. In the 2006 IEEE MountainWorkshop on Adaptive and Learning Systems, 2006.

[8] P. Pezzini, D. Tucker, and A. Traverso. Avoidingcompressor surge during emergency shutdown hyrbid

turbine systems. Journal of Engineering for GasTurbines and Power, 2015.

[9] S. Qin and T. Badgwell. A survey of industrial modelpredictive control technology. Control EngineeringPractice, 11(7), 2003.

[10] M. Salichon and K. Tumer. A neuro-evolutionaryapproach to micro aerial vehicle control. InProceedings of the Genetic and EvolutionaryComputation Conference, 2010.

[11] J. Shepard and K. Tumer. Robust neuro-control for amicro quadrotor. In Proceedings of the Genetic andEvolutionary Computation Conference, 2010.

[12] T. Smitch, C. Haynes, W. Wepfer, D. Tucker, andE. Liese. Hardware-based simulation of a fuel cellturbine hybrid response to imposed fuel cell loadtransients. ASME 2006 International MechanicalEngineering Congress and Exposition, 2006.

[13] J. Spall and J. Cristion. Model-free control ofnonlinear stochastic systems with discrete timemeasurements. IEEE Transactions on AutomaticControl, 43(9), 1998.

[14] A. Traverso, D. Tucker, and C. Haymes. Preliminaryexperimental results of igfc operation using hardwaresimulation. ASME J. Eng. Gas Turbines Power,134(7), 2011.

[15] A. Tsai, L. Banta, L. Lawson, and D. Tucker.Determination of an empirical transfer function of asolid oxide fuel cell gas turbine hybrid system viafrequency response analysis. Journal of Fuel CellScience and Technology, 6(3), 2009.

[16] A. Tsai, D. Tucker, and T. Emami. Adaptive controlof a nonlinear fuel cell gas turbine balance of plantsimulation facility. Journal of Fuel Cell Science andTechnology, 11(6), 2014.

[17] D. Tucker, L. Lawson, and R. Gemmen. Preliminaryresult of a cold flow test in a fuel cell gas turbinehybrid simulation facility. ASME Turbo Expo, 2003.

[18] D. Tucker, L. Lawson, and R. Gemmen.Characterization of air flow management and controlin a fuel cell turbine hybrid power system usinghardware simulation. In ASME 2005 PowerConference, 2005.

[19] D. Tucker, M. Shelton, and A. Manivannan. The roleof solid oxide fuel cells in advanced hybrid powersystems of the future. The Electrochemical SocietyInterface, 18(3), 2009.

[20] D. A. V. Veldhuizen and G. B. Lamont.Multiobjective evolutionary algorithms: Analyzing thestate-of-the-art. Evolutionary Computation,8(2):125–147, 2000.

[21] W. Winkler, P. Nehter, M. Williams, D. Tucker, andR. Gemmen. General fuel cell hybrid synergies andhybrid system testing status. Journal of PowerSources, 159(1), 2006.

[22] L. Yliniemi and K. Tumer. PaCcET: An objectivespace transformation to iteratively convexify thepareto front. In 10th International Conference onSimulated Evolution And Learning (SEAL), 2014.

[23] L. Yliniemi and K. Tumer. Complete coverage in themulti-objective PaCcET framework. In S. Silva,editor, Genetic and Evolutionary ComputationConference, 2015.
