
arXiv:1103.3099v2 [cs.PF] 19 Mar 2011

Optimal Power Cost Management Using Stored Energy in Data Centers

Rahul Urgaonkar, Bhuvan Urgaonkar†, Michael J. Neely, and Anand Sivasubramaniam†

Dept. of EE-Systems, University of Southern California, {urgaonka,mjneely}@usc.edu
† Dept. of CSE, The Pennsylvania State University, {bhuvan,anand}@cse.psu.edu

Abstract—Since the electricity bill of a data center constitutes a significant portion of its overall operational costs, reducing this has become important. We investigate cost reduction opportunities that arise by the use of uninterrupted power supply (UPS) units as energy storage devices. This represents a deviation from the usual use of these devices as mere transitional fail-over mechanisms between utility and captive sources such as diesel generators. We consider the problem of opportunistically using these devices to reduce the time average electric utility bill in a data center. Using the technique of Lyapunov optimization, we develop an online control algorithm that can optimally exploit these devices to minimize the time average cost. This algorithm operates without any knowledge of the statistics of the workload or electricity cost processes, making it attractive in the presence of workload and pricing uncertainties. An interesting feature of our algorithm is that its deviation from optimality reduces as the storage capacity is increased. Our work opens up a new area in data center power management.

I. INTRODUCTION

Data centers spend a significant portion of their overall operational costs towards their electricity bills. As an example, one recent case study suggests that a large 15MW data center (on the more energy-efficient end) might spend about $1M on its monthly electricity bill. In general, a data center spends between 30-50% of its operational expenses towards power. A large body of research addresses these expenses by reducing the energy consumption of these data centers. This includes designing/employing hardware with better power/performance trade-offs [1]–[3], software techniques for power-aware scheduling [4], workload migration, resource consolidation [5], among others. Power prices exhibit variations along time, space (geography), and even across utility providers. As an example, consider Fig. 1 that shows the average hourly spot market prices for the Los Angeles Zone LA1 obtained from CAISO [6]. These correspond to the week of 01/01/2005-01/07/2005 and denote the average price of 1 MW-Hour of electricity. Consequently, minimization of energy consumption need not coincide with that of the electricity bill.

Given the diversity within power price and availability, attention has recently turned towards demand response (DR) within data centers. DR within a data center (or a set of related data centers) attempts to optimize the electricity bill by adapting its needs to the temporal, spatial, and cross-utility diversity exhibited by power price. The key idea behind these techniques is to preferentially shift power draw (i) to times and places or (ii) from utilities offering cheaper prices. Typically some constraints in the form of performance requirements for the workload (e.g., response times offered to the clients of a Web-based application) limit the cost reduction benefits that can result from such DR. Whereas existing DR techniques have relied on various forms of workload scheduling/shifting, a complementary knob to facilitate such movement of power needs is offered by energy storage devices, typically uninterrupted power supply (UPS) units, residing in data centers.

Fig. 1. Average hourly spot market price during the week of 01/01/2005 - 01/07/2005 for LA1 Zone [6]. (Plot of price in $/MW-Hour versus hour.)

A data center deploys captive power sources, typically diesel generators (DG), that it uses for keeping itself powered up when the utility experiences an outage. The UPS units serve as a bridging mechanism to facilitate this transition from utility to DG: upon a utility failure, the data center is kept powered by the UPS unit using energy stored within its batteries, before the DG can start up and provide power. Whereas this transition takes only 10-20 seconds, UPS units have enough battery capacity to keep the entire data center powered at its maximum power needs for anywhere between 5-30 minutes. Tapping into the energy reserves of the UPS unit can allow a data center to improve its electricity bill. Intuitively, the data center would store energy within the UPS unit when prices are low and use this to augment the draw from the utility when prices are high.

In this paper, we consider the problem of developing an online control policy to exploit the UPS unit along with the presence of delay-tolerance within the workload to optimize the data center’s electricity bill. This is a challenging problem because data centers experience time-varying workloads and power prices with possibly unknown statistics. Even when statistics can be approximated (say by learning using past observations), traditional approaches to construct optimal control policies involve the use of Markov Decision Theory and Dynamic Programming [7]. It is well known that these techniques suffer from the “curse of dimensionality” where the complexity of computing the optimal strategy grows with the system size. Furthermore, such solutions result in hard-to-implement systems, where significant re-computation might be needed when statistics change.

In this work, we make use of a different approach that can overcome the challenges associated with dynamic programming. This approach is based on the recently developed technique of Lyapunov optimization [8] [9] that enables the design of online control algorithms for such time-varying systems. These algorithms operate without requiring any knowledge of the system statistics and are easy to implement. We design such an algorithm for optimally exploiting the UPS unit and delay-tolerance of workloads to minimize the time average cost. We show that our algorithm can get within O(1/V) of the optimal solution where the maximum value of V is limited by battery capacity. We note that, for the same parameters, a dynamic programming based approach (if it can be solved) will yield a better result than our algorithm. However, this gap reduces as the battery capacity is increased. Our algorithm is thus most useful when such scaling is practical.

II. RELATED WORK

One recent body of work proposes online algorithms for using UPS units for cost reduction via shaving workload “peaks” that correspond to higher energy prices [10], [11]. This work is highly complementary to ours in that it offers a worst-case competitive ratio analysis while our approach looks at the average case performance. Whereas a variety of work has looked at workload shifting for power cost reduction [2] or other reasons such as performance and availability [5], our work differs both due to its usage of energy storage as well as the cost optimality guarantees offered by our technique. Some research has considered consumers with access to multiple utility providers, each with a different carbon profile, power price and availability and looked at optimizing cost subject to performance and/or carbon emissions constraints [12]. Another line of work has looked at cost reduction opportunities offered by geographical variations within utility prices for data centers where portions of workloads could be serviced from one of several locations [12], [13]. Finally, [14] considers the use of rechargeable batteries for maximizing system utility in a wireless network. While all of this research is highly complementary to our work, there are three key differences: (i) our investigation of energy storage as an enabler of cost reduction, (ii) our use of the technique of Lyapunov optimization which allows us to offer a provably cost optimal solution, and (iii) combining energy storage with delay-tolerance within workloads.

III. BASIC MODEL

We consider a time-slotted model. In the basic model, we assume that in every slot, the total power demand generated by the data center in that slot must be met in the current slot itself (using a combination of power drawn from the utility and the battery). Thus, any buffering of the workload generated by the data center is not allowed. We will relax this constraint later in Sec. VI when we allow buffering of some of the workload while providing worst case delay guarantees. In the following, we use the terms UPS and battery interchangeably.

Fig. 2. Block diagram for the basic model. (Blocks: Grid, Battery, Data Center; labeled flows: P(t), R(t), D(t), P(t) - R(t), W(t).)

A. Workload Model

Let W(t) be the total workload (in units of power) generated in slot t. Let P(t) be the total power drawn from the grid in slot t, out of which R(t) is used to recharge the battery. Also, let D(t) be the total power discharged from the battery in slot t. Then in the basic model, the following constraint must be satisfied in every slot (Fig. 2):

W (t) = P (t)−R(t) +D(t) (1)

Every slot, a control algorithm observes W(t) and makes decisions about how much power to draw from the grid in that slot, i.e., P(t), and how much to recharge and discharge the battery, i.e., R(t) and D(t). Note that by (1), having chosen P(t) and R(t) completely determines D(t).

Assumptions on the statistics of W(t): The workload process W(t) is assumed to vary randomly, taking values from a set $\mathcal{W}$ of non-negative values, and is not influenced by past control decisions. The set $\mathcal{W}$ is assumed to be finite, with potentially arbitrarily large size. The underlying probability distribution or statistical characterization of W(t) is not necessarily known. We only assume that its maximum value is finite, i.e., W(t) ≤ Wmax for all t. Note that unlike existing work in this domain, we do not make assumptions such as Poisson arrivals or exponential service times.

For simplicity, in the basic model we assume that W(t) evolves according to an i.i.d. process, noting that the algorithm developed for this case can be applied without any modifications to non i.i.d. scenarios as well. The analysis and performance guarantees for the non i.i.d. case can be obtained using the delayed Lyapunov drift and T-slot drift techniques developed in [8] [9].

B. Battery Model

Ideally, we would like to incorporate the following idiosyncrasies of battery operation into our model. First, batteries become unreliable as they are charged/discharged, with higher depth-of-discharge (DoD), the percentage of maximum charge removed during a discharge cycle, causing faster degradation in their reliability. This dependence between the useful lifetime of a battery and how it is discharged/charged is expressed via battery lifetime charts [15]. For example, with lead-acid batteries that are commonly used in UPS units, 20% DoD yields 1400 cycles [16]. Second, batteries have conversion loss whereby a portion of the energy stored in them is lost when discharging them (e.g., about 10-15% for lead-acid batteries). Furthermore, certain regions of battery operation (high rate of discharge) are more inefficient than others. Finally, the storage itself may be “leaky”, so that the stored energy decreases over time, even in the absence of any discharging.

For simplicity, in the basic model we will assume that there is no power loss either in recharging or discharging the batteries, noting that this can be easily generalized to the case where a fraction of R(t), D(t) is lost. We will also assume that the batteries are not leaky, so that the stored energy level decreases only when they are discharged. This is a reasonable assumption when the time scale over which the loss takes place is much larger than that of interest to us. To model the effect of repeated recharging and discharging on the battery’s lifetime, we assume that with each recharge and discharge operation, a fixed cost (in dollars) of Crc and Cdc respectively is incurred. The choice of these parameters would affect the trade-off between the cost of the battery itself and the cost reduction benefits it offers. For example, suppose a new battery costs B dollars and it can sustain N discharge/charge cycles (ignoring DoD for now). Then setting Crc = Cdc = B/N would amount to expecting the battery to “pay for itself” by augmenting the utility N times over its lifetime.

In any slot, we assume that one can either recharge or discharge the battery or do neither, but not both. This means that for all t, we have:

R(t) > 0 ⟹ D(t) = 0, D(t) > 0 ⟹ R(t) = 0 (2)

Let Y(t) denote the battery energy level in slot t. Then, the dynamics of Y(t) can be expressed as:

Y (t+ 1) = Y (t)−D(t) +R(t) (3)

The battery is assumed to have a finite capacity Ymax so that Y(t) ≤ Ymax for all t. Further, for the purpose of reliability, it may be required to ensure that a minimum energy level Ymin ≥ 0 is maintained at all times. For example, this could represent the amount of energy required to support the data center operations until a secondary power source (such as DG) is activated in the event of a grid outage. Recall that the UPS unit is integral to the availability of power supply to the data center upon utility outage. Indiscriminate discharging of the UPS can leave the data center in situations where it is unable to safely fail over to DG upon a utility outage. Therefore, discharging the UPS must be done carefully so that it still possesses enough charge to reliably carry out its role as a transition device between utility and DG. Thus, the following condition must be met in every slot under any feasible control algorithm:

Ymin ≤ Y (t) ≤ Ymax (4)

The effectiveness of the online control algorithm we present in Sec. V will depend on the magnitude of the difference Ymax − Ymin. In most practical scenarios of interest, this value is expected to be at least moderately large: recent work suggests that storing energy Ymin to last about a minute is sufficient to offer reliable data center operation [17], while Ymax can vary between 5-20 minutes (or even higher) due to reasons such as UPS units being available only in certain sizes and the need to keep room for future IT growth. Furthermore, the UPS units are sized based on the maximum provisioned capacity of the data center, which is itself often substantially (up to twice [18]) higher than the maximum actual power demand.

The initial charge level in the battery is given by Yinit and satisfies Ymin ≤ Yinit ≤ Ymax. Finally, we assume that the maximum amounts by which we can recharge or discharge the battery in any slot are bounded. Thus, we have ∀t:

0 ≤ R(t) ≤ Rmax, 0 ≤ D(t) ≤ Dmax (5)

We will assume that Ymax − Ymin > Rmax + Dmax while noting that in practice, Ymax − Ymin is much larger than Rmax + Dmax. Note that any feasible control decision on R(t), D(t) must ensure that both of the constraints (4) and (5) are satisfied. This is equivalent to the following:

0 ≤ R(t) ≤ min[Rmax, Ymax − Y (t)] (6)

0 ≤ D(t) ≤ min[Dmax, Y (t)− Ymin] (7)
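As an illustration of how (3), (6), and (7) fit together, the following Python sketch computes the per-slot caps on R(t) and D(t) and applies the battery update. The function names and the numeric values in the usage lines are illustrative assumptions, not part of the model.

```python
def feasible_caps(Y, Y_min, Y_max, R_max, D_max):
    """Upper limits on R(t) and D(t) implied by constraints (6) and (7)."""
    r_cap = min(R_max, Y_max - Y)  # (6): cannot charge past Y_max
    d_cap = min(D_max, Y - Y_min)  # (7): cannot discharge below Y_min
    return r_cap, d_cap


def step_battery(Y, R, D, Y_min, Y_max, R_max, D_max):
    """Check constraints (2), (6), (7), then apply the dynamics (3)."""
    r_cap, d_cap = feasible_caps(Y, Y_min, Y_max, R_max, D_max)
    if R > 0 and D > 0:
        raise ValueError("constraint (2): recharge and discharge in the same slot")
    if not (0 <= R <= r_cap and 0 <= D <= d_cap):
        raise ValueError("constraint (6) or (7) violated")
    return Y - D + R  # equation (3)


# Hypothetical numbers: a nearly full battery can only accept 5 more units.
print(feasible_caps(95, 0, 100, 10, 10))          # (5, 10)
print(step_battery(95, 5, 0, 0, 100, 10, 10))     # 100
```

Any decision that passes these checks keeps Y(t+1) inside [Ymin, Ymax], which is why (6) and (7) can replace (4) and (5).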

C. Cost Model

The cost per unit of power drawn from the grid in slot t is denoted by C(t). In general, it can depend on both P(t), the total amount of power drawn in slot t, and an auxiliary state variable S(t) that captures parameters such as time of day, identity of the utility provider, etc. For example, the per unit cost may be higher during business hours. Similarly, for any fixed S(t), it may be the case that C(t) increases with P(t) so that the per unit cost of electricity increases as more power is drawn. This may be because the utility provider wants to discourage heavier loading on the grid. Thus, we assume that C(t) is a function of both S(t) and P(t) and we denote this as:

C(t) = C(S(t), P (t)) (8)

For notational convenience, we will use C(t) to denote the per unit cost in the rest of the paper, noting that the dependence of C(t) on S(t) and P(t) is implicit.

The auxiliary state process S(t) is assumed to evolve independently of the decisions taken by any control policy. For simplicity, we assume that every slot it takes values from a finite but arbitrarily large set $\mathcal{S}$ in an i.i.d. fashion according to a potentially unknown distribution. This can again be generalized to non i.i.d. Markov modulated scenarios using the techniques developed in [8] [9]. For each S(t), the unit cost is assumed to be a non-decreasing function of P(t). Note that it is not necessarily convex or strictly monotonic or continuous. This is quite general and can be used to model a variety of scenarios. A special case is when C(t) is only a function of S(t). The optimal control action for this case has a particularly simple form and we will highlight this in Sec. V-A1. The unit cost is assumed to be non-negative and finite for all S(t), P(t).

We assume that the maximum amount of power that can be drawn from the grid in any slot is upper bounded by Ppeak. Thus, we have for all t:

0 ≤ P (t) ≤ Ppeak (9)

Note that if we consider the original scenario where batteries are not used, then Ppeak must be such that all workload can be satisfied. Therefore, Ppeak ≥ Wmax.

Finally, let Cmax and Cmin denote the maximum and minimum per unit cost respectively over all S(t), P(t). Also let χmin > 0 be a constant such that for any P1, P2 ∈ [0, Ppeak] where P1 ≤ P2, the following holds for all χ ≥ χmin:

$$P_1(-\chi + C(P_1, S)) \ge P_2(-\chi + C(P_2, S)) \quad \forall S \in \mathcal{S} \tag{10}$$

For example, when C(t) does not depend on P(t), then χmin = Cmax satisfies (10). This follows by noting that (−Cmax + C(t)) ≤ 0 for all t. Similarly, suppose C(t) does not depend on S(t), but is continuous, convex, and increasing in P(t). Then, it can be shown that χmin = C(Ppeak) + Ppeak C′(Ppeak) satisfies (10), where C′(Ppeak) denotes the derivative of C(t) evaluated at Ppeak. In the following, we assume that such a finite χmin exists for the given cost model. We further assume that χmin > Cmin. The case of χmin = Cmin corresponds to the degenerate case where the unit cost is fixed for all times and we do not consider it.
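Condition (10) says that P(C(P, S) − χ) is non-increasing in P on [0, Ppeak] for every S. The sketch below checks this numerically on a grid for the two examples just given. The helper name, the grid resolution, and the linear cost C(P) = 0.5P with Ppeak = 20 are assumptions chosen only for illustration.

```python
def satisfies_condition_10(cost, chi, p_peak, steps=400):
    """Grid check of (10): h(P) = P * (cost(P) - chi) must be non-increasing."""
    grid = [p_peak * i / steps for i in range(steps + 1)]
    h = [p * (cost(p) - chi) for p in grid]
    # Non-increasing between neighbours implies non-increasing for all grid pairs.
    return all(h[i] >= h[i + 1] - 1e-12 for i in range(steps))


P_PEAK = 20.0

# Case 1: cost independent of P, one constant per state; chi_min = C_max = 10.
flat_costs = (2.0, 6.0, 10.0)
print(all(satisfies_condition_10(lambda p, c=c: c, 10.0, P_PEAK) for c in flat_costs))

# Case 2 (assumed cost): C(P) = 0.5 * P, so C(P_peak) + P_peak * C'(P_peak) = 10 + 10 = 20.
linear = lambda p: 0.5 * p
print(satisfies_condition_10(linear, 20.0, P_PEAK))  # chi = chi_min
print(satisfies_condition_10(linear, 19.0, P_PEAK))  # chi slightly below chi_min
```

For the linear cost, h(P) = 0.5P² − χP has derivative P − χ, which stays non-positive on [0, 20] only when χ ≥ 20, so the last check fails while the first two pass.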

What is known in each slot?: We assume that the value of S(t) and the form of the function C(P(t), S(t)) for that slot is known. For example, this may be obtained beforehand using pre-advertised prices by the utility provider. We assume that given an S(t) = s, C(t) is a deterministic function of P(t) and this holds for all s. Similarly, the amount of incoming workload W(t) is known at the beginning of each slot.

Given this model, our goal is to design a control algorithm that minimizes the time average cost while meeting all the constraints. This is formalized in the next section.

IV. CONTROL OBJECTIVE

Let P(t), R(t) and D(t) denote the control decisions made in slot t by any feasible policy under the basic model as discussed in Sec. III. These must satisfy the constraints (1), (2), (6), (7), and (9) every slot. We define the following indicator variables that are functions of the control decisions regarding a recharge or discharge operation in slot t:

$$1_R(t) = \begin{cases} 1 & \text{if } R(t) > 0 \\ 0 & \text{else} \end{cases} \qquad 1_D(t) = \begin{cases} 1 & \text{if } D(t) > 0 \\ 0 & \text{else} \end{cases}$$

Note that by (2), at most one of 1_R(t) and 1_D(t) can take the value 1. Then the total cost incurred in slot t is given by P(t)C(t) + 1_R(t)Crc + 1_D(t)Cdc. The time-average cost under this policy is given by:

$$\lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} E\{P(\tau)C(\tau) + 1_R(\tau)C_{rc} + 1_D(\tau)C_{dc}\} \tag{11}$$

where the expectation above is with respect to the potential randomness of the control policy. Assuming for the time being that this limit exists, our goal is to design a control algorithm that minimizes this time average cost subject to the constraints described in the basic model. Mathematically, this can be stated as the following stochastic optimization problem:

P1:
$$\text{Minimize: } \lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} E\{P(\tau)C(\tau) + 1_R(\tau)C_{rc} + 1_D(\tau)C_{dc}\}$$
Subject to: Constraints (1), (2), (6), (7), (9)
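For a single simulated sample path, the objective of P1 can be approximated by a finite-horizon average of the per-slot cost. The sketch below is one way to compute it; the function names and the two-slot trajectory are hypothetical, and a finite average over one path only approximates the limit of expectations in (11).

```python
def slot_cost(P, R, D, C, C_rc, C_dc):
    """Cost of one slot: P(t)C(t) + 1_R(t) C_rc + 1_D(t) C_dc."""
    recharge_fee = C_rc if R > 0 else 0.0    # indicator 1_R(t)
    discharge_fee = C_dc if D > 0 else 0.0   # indicator 1_D(t)
    return P * C + recharge_fee + discharge_fee


def average_cost(trajectory, C_rc, C_dc):
    """Empirical time average over a finite list of (P, R, D, C) tuples."""
    total = sum(slot_cost(P, R, D, C, C_rc, C_dc) for P, R, D, C in trajectory)
    return total / len(trajectory)


# Hypothetical two-slot path: recharge 10 units at price 2, then discharge 10 at price 10.
path = [(20, 10, 0, 2), (10, 0, 10, 10)]
print(average_cost(path, C_rc=5, C_dc=5))  # (45 + 105) / 2 = 75.0
```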

The finite capacity and underflow constraints (6), (7) make this a particularly challenging problem to solve even if the statistical descriptions of the workload and unit cost process are known. For example, the traditional approach based on Dynamic Programming [7] would have to compute the optimal control action for all possible combinations of the battery charge level and the system state (S(t), W(t)). Instead, we take an alternate approach based on the technique of Lyapunov optimization, taking the finite size queues constraint explicitly into account.

Note that a solution to the problem P1 is a control policy that determines the sequence of feasible control decisions P(t), R(t), D(t) to be used. Let φopt denote the value of the objective in problem P1 under an optimal control policy. Define the time-average rate of recharge and discharge under any policy as follows:

$$\overline{R} = \lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} E\{R(\tau)\}, \qquad \overline{D} = \lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} E\{D(\tau)\} \tag{12}$$

Now consider the following problem:

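P2:
$$\text{Minimize: } \lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} E\{P(\tau)C(\tau) + 1_R(\tau)C_{rc} + 1_D(\tau)C_{dc}\}$$
Subject to: Constraints (1), (2), (5), (9)
$$\overline{R} = \overline{D} \tag{13}$$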

Let φ denote the value of the objective in problem P2 under an optimal control policy. By comparing P1 and P2, it can be shown that P2 is less constrained than P1. Specifically, any feasible solution to P1 would also satisfy P2. To see this, consider any policy that satisfies (6) and (7) for all t. This ensures that constraints (4) and (5) are always met by this policy. Then summing equation (3) over all τ ∈ {0, 1, 2, . . . , t − 1} under this policy and taking expectation of both sides yields:

$$E\{Y(t)\} - Y_{init} = \sum_{\tau=0}^{t-1} E\{R(\tau) - D(\tau)\}$$

Since Ymin ≤ Y(t) ≤ Ymax for all t, dividing both sides by t and taking limits as t → ∞ yields $\overline{R} = \overline{D}$. Thus, this policy satisfies constraint (13) of P2. Therefore, any feasible solution to P1 also satisfies P2. This implies that the optimal value of P2 cannot exceed that of P1, so that φ ≤ φopt.
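The limiting step can be written out explicitly. Using the bounds (4) on Y(t) in the summed identity and dividing by t gives a two-sided bound whose outer terms vanish, assuming the limits in (12) exist:

```latex
\frac{Y_{min} - Y_{init}}{t}
  \;\le\; \frac{E\{Y(t)\} - Y_{init}}{t}
  \;=\; \frac{1}{t}\sum_{\tau=0}^{t-1} E\{R(\tau)\} - \frac{1}{t}\sum_{\tau=0}^{t-1} E\{D(\tau)\}
  \;\le\; \frac{Y_{max} - Y_{init}}{t}
\qquad\Longrightarrow\qquad
\overline{R} - \overline{D} = 0 \quad \text{as } t \to \infty .
```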

Our approach to solving P1 will be based on this observation. We first note that it is easier to characterize the optimal solution to P2. This is because the dependence on Y(t) has been removed. Specifically, it can be shown that the optimal solution to P2 can be achieved by a stationary, randomized control policy that chooses control actions P(t), D(t), R(t) every slot purely as a function (possibly randomized) of the current state (W(t), S(t)) and independent of the battery charge level Y(t). This fact is presented in the following lemma:

Lemma 1: (Optimal Stationary, Randomized Policy): If the workload process W(t) and auxiliary process S(t) are i.i.d. over slots, then there exists a stationary, randomized policy that takes control decisions P^stat(t), R^stat(t), D^stat(t) every slot purely as a function (possibly randomized) of the current state (W(t), S(t)) while satisfying the constraints (1), (2), (5), (9) and providing the following guarantees:

$$E\{R^{stat}(t)\} = E\{D^{stat}(t)\} \tag{14}$$

$$E\{P^{stat}(t)C(t) + 1_R^{stat}(t)C_{rc} + 1_D^{stat}(t)C_{dc}\} = \phi \tag{15}$$

where the expectations above are with respect to the stationary distribution of (W(t), S(t)) and the randomized control decisions.

Proof: This result follows from the framework in [8], [9] and is omitted for brevity.

It should be noted that while it is possible to characterize and potentially compute such a policy, it may not be feasible for the original problem P1 as it could violate the constraints (6) and (7). However, the existence of such a policy can be used to construct an approximately optimal policy that meets all the constraints of P1 using the technique of Lyapunov optimization [8] [9]. This policy is dynamic and does not require knowledge of the statistical description of the workload and cost processes. We present this policy and derive its performance guarantees in the next section. This dynamic policy is approximately optimal where the approximation factor improves as the battery capacity increases. Also note that the distance from optimality for our policy is measured in terms of φ. However, since φ ≤ φopt, in practice, the approximation factor is better than the analytical bounds.

V. OPTIMAL CONTROL ALGORITHM

We now present an online control algorithm that approximately solves P1. This algorithm uses a control parameter V > 0 that affects the distance from optimality as shown later. This algorithm also makes use of a “queueing” state variable X(t) to track the battery charge level, defined as follows:

X(t) = Y (t)− V χmin −Dmax − Ymin (16)

Recall that Y(t) denotes the actual battery charge level in slot t and evolves according to (3). It can be seen that X(t) is simply a shifted version of Y(t) and its dynamics are given by:

X(t+ 1) = X(t)−D(t) +R(t) (17)

Note that X(t) can be negative. We will show that this definition enables our algorithm to ensure that the constraint (4) is met.
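Because (16) is a constant shift, the battery constraint (4) translates directly into a pair of bounds on X(t). Substituting Y(t) = Ymin and Y(t) = Ymax into (16) gives the following equivalence, which matches the form of the bound stated in Theorem 1:

```latex
Y_{min} \le Y(t) \le Y_{max}
\quad\Longleftrightarrow\quad
-V\chi_{min} - D_{max} \;\le\; X(t) \;\le\; Y_{max} - Y_{min} - D_{max} - V\chi_{min},
\qquad
X(t+1) - X(t) = Y(t+1) - Y(t) = R(t) - D(t).
```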

We are now ready to state the dynamic control algorithm. Let (W(t), S(t)) and X(t) denote the system state in slot t. Then the dynamic algorithm chooses control action P(t) as the solution to the following optimization problem:

P3:
$$\text{Minimize: } X(t)P(t) + V\left[P(t)C(t) + 1_R(t)C_{rc} + 1_D(t)C_{dc}\right]$$
Subject to: Constraints (1), (2), (5), (9)

Fig. 3. Periodic W(t) process in the example. (Plot of W(t) versus t with levels Wlow, Wmid, Whigh.)

The constraints above result in the following constraint on P(t):

Plow ≤ P(t) ≤ Phigh (18)

where Plow = max[0, W(t) − Dmax] and Phigh = min[Ppeak, W(t) + Rmax]. Let P*(t) denote the optimal solution to P3. Then, the dynamic algorithm chooses the recharge and discharge values as follows.

$$R^*(t) = \begin{cases} P^*(t) - W(t) & \text{if } P^*(t) > W(t) \\ 0 & \text{else} \end{cases} \qquad D^*(t) = \begin{cases} W(t) - P^*(t) & \text{if } P^*(t) < W(t) \\ 0 & \text{else} \end{cases}$$

Note that if P*(t) = W(t), then both R*(t) = 0 and D*(t) = 0 and all demand is met using power drawn from the grid. It can be seen from the above that the control decisions satisfy the constraints 0 ≤ R*(t) ≤ Rmax and 0 ≤ D*(t) ≤ Dmax. That the finite battery constraints and the constraints (6), (7) are also met will be shown in Sec. V-C.
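The bounds (18) and the mapping from P*(t) to R*(t) and D*(t) can be sketched in a few lines of Python. The function names are illustrative, and the numbers in the usage lines are assumptions chosen to exercise each branch.

```python
def grid_draw_bounds(W, P_peak, R_max, D_max):
    """Bounds (18): P_low = max[0, W - D_max], P_high = min[P_peak, W + R_max]."""
    return max(0.0, W - D_max), min(P_peak, W + R_max)


def split_grid_draw(P_star, W):
    """Recover (R*, D*) from P* so that the balance W = P - R + D of (1) holds."""
    R = P_star - W if P_star > W else 0.0  # surplus draw recharges the battery
    D = W - P_star if P_star < W else 0.0  # shortfall is met by discharging
    return R, D


W = 15.0
P_low, P_high = grid_draw_bounds(W, P_peak=20.0, R_max=10.0, D_max=10.0)
print(P_low, P_high)              # 5.0 20.0
print(split_grid_draw(P_high, W)) # (5.0, 0.0): recharge
print(split_grid_draw(P_low, W))  # (0.0, 10.0): discharge
print(split_grid_draw(W, W))      # (0.0, 0.0): grid only
```

Since P* lies in [Plow, Phigh], the resulting R* is at most Rmax and D* is at most Dmax, as the text notes.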

After computing these quantities, the algorithm implements them and updates the queueing variable X(t) according to (17). This process is repeated every slot. Note that in solving P3, the control algorithm only makes use of the current system state values and does not require knowledge of the statistics of the workload or unit cost processes. Thus, it is myopic and greedy in nature. From P3, it is seen that the algorithm tries to recharge the battery when X(t) is negative and the per unit cost is low, and it tries to discharge the battery when X(t) is positive. That this is sufficient to achieve optimality will be shown in Theorem 1. The queueing variable X(t) plays a crucial role as making decisions purely based on prices is not necessarily optimal.

To get some intuition behind the working of this algorithm, consider the following simple example. Suppose W(t) can take three possible values from the set {Wlow, Wmid, Whigh} where Wlow < Wmid < Whigh. Similarly, C(t) can take three possible values in {Clow, Cmid, Chigh} where Clow < Cmid < Chigh and does not depend on P(t). We assume that the workload process evolves in a frame-based periodic fashion. Specifically, in every odd numbered frame, W(t) = Wmid for all except the last slot of the frame when W(t) = Wlow. In every even numbered frame, W(t) = Wmid for all except the last slot of the frame when W(t) = Whigh. This is illustrated in Fig. 3. The C(t) process evolves similarly, such that C(t) = Clow when W(t) = Wlow, C(t) = Cmid when W(t) = Wmid, and C(t) = Chigh when W(t) = Whigh.

TABLE I
AVERAGE COST VS. Ymax

Ymax       20    30    40    50    75     100
V          0     1.25  2.5   3.75  6.875  10.0
Avg. Cost  94.0  92.5  91.1  88.5  88.0   87.0

In the following, we assume a frame size of 5 slots with Wlow = 10, Wmid = 15, and Whigh = 20 units. Also, Clow = 2, Cmid = 6, and Chigh = 10 dollars. Finally, Rmax = Dmax = 10, Ppeak = 20, Crc = Cdc = 5, Yinit = Ymin = 0 and we vary Ymax > Rmax + Dmax. In this example, intuitively, an optimal algorithm that knows the workload and unit cost process beforehand would recharge the battery as much as possible when C(t) = Clow and discharge it as much as possible when C(t) = Chigh. In fact, it can be shown that the following strategy is feasible and achieves minimum average cost:

• If C(t) = Clow, W(t) = Wlow, then P(t) = Wlow + Rmax, R(t) = Rmax, D(t) = 0.

• If C(t) = Cmid, W(t) = Wmid, then P(t) = Wmid, R(t) = 0, D(t) = 0.

• If C(t) = Chigh, W(t) = Whigh, then P(t) = Whigh − Dmax, R(t) = 0, D(t) = Dmax.

The time average cost resulting from this strategy can be easily calculated and is given by 87.0 dollars/slot for all Ymax > 10. Also, we note that the cost resulting from an algorithm that does not use the battery in this example is given by 94.0 dollars/slot.
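Both figures can be checked by hand over one pair of consecutive frames (10 slots). The short Python calculation below uses the example parameters given above; the variable names are illustrative.

```python
W = {"low": 10, "mid": 15, "high": 20}   # workload levels
C = {"low": 2, "mid": 6, "high": 10}     # unit costs (dollars)
R_max = D_max = 10
C_rc = C_dc = 5
frame = 5

# Four Wmid slots per frame at unit cost Cmid: 4 * 15 * 6 = 360.
mid_slots = (frame - 1) * W["mid"] * C["mid"]

# No battery: P(t) = W(t) in every slot of an odd frame and an even frame.
no_battery = (mid_slots + W["low"] * C["low"]
              + mid_slots + W["high"] * C["high"]) / (2 * frame)

# Offline strategy: recharge R_max in the cheap slot, discharge D_max in the expensive one.
odd_last = (W["low"] + R_max) * C["low"] + C_rc      # 20 * 2 + 5 = 45
even_last = (W["high"] - D_max) * C["high"] + C_dc   # 10 * 10 + 5 = 105
with_battery = (mid_slots + odd_last + mid_slots + even_last) / (2 * frame)

print(no_battery, with_battery)  # 94.0 87.0
```

The saving of 7 dollars/slot comes entirely from moving 10 units of energy from the Chigh slot to the Clow slot, net of the two fixed operation costs.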

Now we simulate the dynamic algorithm for this example for different values of Ymax for 1000 slots (200 frames). The value of V is chosen to be (Ymax − Ymin − Rmax − Dmax)/(Chigh − Clow) = (Ymax − 20)/8 (this choice will become clear in Sec. V-B when we relate V to the battery capacity). Note that the number of slots for which a fully charged battery can sustain the data center at maximum load is Ymax/Whigh.

In Table I, we show the time average cost achieved for different values of Ymax. It can be seen that as Ymax increases, the time average cost approaches the optimal value. (This behavior will be formalized in Theorem 1.) This is remarkable given that the dynamic algorithm operates without any knowledge of the future workload and cost processes. To examine the behavior of the dynamic algorithm in more detail, we fix Ymax = 100 and look at the sample paths of the control decisions taken by the optimal offline algorithm and the dynamic algorithm during the first 200 slots. This is shown in Figs. 4 and 5. It can be seen that initially, the dynamic algorithm tends to perform suboptimally. But eventually it learns to make close to optimal decisions.

Fig. 4. P(t) under the offline optimal solution with Ymax = 100. (Plot of P(t) versus time over the first 200 slots.)

Fig. 5. P(t) under the Dynamic Algorithm with Ymax = 100. (Plot of P(t) versus time over the first 200 slots.)

It might be tempting to conclude from this example that an algorithm based on a price threshold is optimal. Specifically, such an algorithm makes a recharge vs. discharge decision depending on whether the current price C(t) is smaller or larger than a threshold. However, it is easy to construct examples where the dynamic algorithm outperforms such a threshold based algorithm. Specifically, suppose that the W(t) process takes values from the interval [10, 90] uniformly at random every slot. Also, suppose C(t) takes values from the set {2, 6, 10} dollars uniformly at random every slot. We fix the other parameters as follows: Rmax = Dmax = 10, Ppeak = 90, Crc = Cdc = 1, Yinit = Ymin = 0 and Ymax = 100. We then simulate a threshold based algorithm for different values of the threshold in the set {2, 6, 10} and select the one that yields the smallest cost. This was found to be 280.7 dollars/slot. We then simulate the dynamic algorithm for 10000 slots with V = (Ymax − 20)/(10 − 2) = 10.0 and it yields an average cost of 275.5 dollars/slot. We also note that the cost resulting from an algorithm that does not use the battery in this example is given by 300.73 dollars/slot.

We now establish two properties of the structure of the optimal solution to P3 that will be useful in analyzing its performance later.

Lemma 2: The optimal solution to P3 has the following properties:

1) If X(t) > −V Cmin, then the optimal solution always chooses R*(t) = 0.

2) If X(t) < −V χmin, then the optimal solution always chooses D*(t) = 0.

Proof: See Appendix A.

A. Solving P3

In general, the complexity of solving P3 depends on the structure of the unit cost function C(t). For many cases of practical interest, P3 is easy to solve and admits closed form solutions that can be implemented in real time. We consider two such cases here. Let θ(t) denote the value of the objective in P3 when there is no recharge or discharge. Thus θ(t) = W(t)(X(t) + V C(t)).

1) C(t) does not depend on P(t): Suppose that C(t) depends only on S(t) and not on P(t). We can rewrite the expression in the objective of P3 as P(t)(X(t) + V C(t)) + 1_R(t) V Crc + 1_D(t) V Cdc. Then, the optimal solution has the following simple threshold structure.

1) If X(t)+V C(t) > 0, thenR∗(t) = 0 so that there is norecharge and we have the following two cases:

a) If Plow(X(t) + V C(t)) + V Cdc < θ(t), then dis-charge as much as possible, so that we getD∗(t) =min[W (t), Dmax], P ∗(t) = max[0,W (t)−Dmax].

b) Else, draw all power from the grid. This yieldsD∗(t) = 0 andP ∗(t) = W (t).

2) Else ifX(t) + V C(t) ≤ 0, thenD∗(t) = 0 so that thereis no discharge and we have the following two cases:

a) If Phigh(X(t)+V C(t))+V Crc < θ(t), then rechargeas much as possible. This yieldsR∗(t) = min[Ppeak−W (t), Rmax] andP ∗(t) = min[Ppeak,W (t) +Rmax].

b) Else, draw all power from the grid. This yieldsR∗(t) =0 andP ∗(t) = W (t).

We will show that this solution is feasible and does notviolate the finite battery constraint in Sec. V-C.
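The threshold structure above can be sketched as a small per-slot decision function. This is only an illustrative sketch: the function name and argument names are hypothetical, and Plow and Phigh are assumed to equal max[0, W(t) − Dmax] and min[Ppeak, W(t) + Rmax], which is what the discharge and recharge cases above yield for P∗(t).

```python
def solve_p3_threshold(W, X, C, V, R_max, D_max, P_peak, C_rc, C_dc):
    """Return (P, R, D) for one slot when the unit cost C does not depend on P."""
    weight = X + V * C      # coefficient of P(t) in the P3 objective
    theta = W * weight      # objective value with no recharge or discharge
    if weight > 0:
        # Case 1: no recharge; discharge fully only if it beats theta.
        P_low = max(0.0, W - D_max)  # assumed definition of Plow
        if P_low * weight + V * C_dc < theta:
            return P_low, 0.0, min(W, D_max)
    else:
        # Case 2: no discharge; recharge fully only if it beats theta.
        P_high = min(P_peak, W + R_max)  # assumed definition of Phigh
        if P_high * weight + V * C_rc < theta:
            return P_high, min(P_peak - W, R_max), 0.0
    # Otherwise draw all power from the grid.
    return W, 0.0, 0.0
```

In every branch the returned triple satisfies W = P − R + D, so the demand in the slot is met exactly.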

2) C(t) convex, increasing in P(t): Next suppose for each S(t), C(t) is convex and increasing in P(t). For example, C(S(t), P(t)) may have the form α(S(t))P²(t) where α(S(t)) > 0 for all S(t). In this case, P3 becomes a standard convex optimization problem in a single variable P(t) and can be solved efficiently. The full solution is provided in Appendix E.
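For the quadratic example C = αP²(t), one possible way to solve the single-variable problem is sketched below. It is not the solution of Appendix E; it is an assumption-laden illustration that minimizes the convex term P·X + VαP³ over the discharge range [Plow, W] and the recharge range [W, Phigh] by clipping the stationary point, then adds the fixed costs V Cdc or V Crc. The function and argument names are hypothetical.

```python
import math

def solve_p3_quadratic(W, X, V, alpha, R_max, D_max, P_peak, C_rc, C_dc):
    """Return (P, R, D) minimizing P*(X + V*alpha*P**2) plus fixed battery costs."""
    def variable_cost(P):
        return P * X + V * alpha * P ** 3  # convex in P for P >= 0

    # Stationary point of the variable cost on P >= 0 (0 if the cost is increasing).
    p_free = math.sqrt(-X / (3.0 * V * alpha)) if X < 0 else 0.0
    P_low = max(0.0, W - D_max)
    P_high = min(P_peak, W + R_max)

    candidates = [(variable_cost(W), W)]   # no battery use
    p_dis = min(max(p_free, P_low), W)     # best grid draw within [P_low, W]
    if p_dis < W:
        candidates.append((variable_cost(p_dis) + V * C_dc, p_dis))
    p_rec = min(max(p_free, W), P_high)    # best grid draw within [W, P_high]
    if p_rec > W:
        candidates.append((variable_cost(p_rec) + V * C_rc, p_rec))

    _, P = min(candidates)
    return P, max(P - W, 0.0), max(W - P, 0.0)
```

Because the variable cost is convex on each range, clipping the stationary point to the range gives the minimizer there, so only three candidates need to be compared.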

B. Performance Theorem

We first define an upper bound Vmax on the maximum value that V can take in our algorithm.

\[ V_{max} \triangleq \frac{Y_{max} - Y_{min} - R_{max} - D_{max}}{\chi_{min} - C_{min}} \tag{19} \]

Then we have the following result.

Theorem 1: (Algorithm Performance) Suppose the initial battery charge level Yinit satisfies Ymin ≤ Yinit ≤ Ymax. Then implementing the algorithm above with any fixed parameter V such that 0 < V ≤ Vmax for all t ∈ {0, 1, 2, . . .} results in the following performance guarantees:

1) The queue X(t) is deterministically upper and lower bounded for all t as follows:

\[ -V\chi_{min} - D_{max} \le X(t) \le Y_{max} - Y_{min} - D_{max} - V\chi_{min} \tag{20} \]

2) The actual battery level Y(t) satisfies Ymin ≤ Y(t) ≤ Ymax for all t.

3) All control decisions are feasible.

4) If W(t) and S(t) are i.i.d. over slots, then the time-average cost under the dynamic algorithm is within B/V of the optimal value:

\[ \lim_{t\to\infty} \frac{1}{t} \sum_{\tau=0}^{t-1} E\{P(\tau)C(\tau) + 1_R(\tau)C_{rc} + 1_D(\tau)C_{dc}\} \le \phi_{opt} + B/V \tag{21} \]

where B is a constant given by B = max[R²max, D²max]/2 and φopt is the optimal solution to P1 under any feasible control algorithm (possibly with knowledge of future events).

Theorem 1 part 4 shows that by choosing larger V, the time-average cost under the dynamic algorithm can be pushed closer to the minimum possible value φopt. However, Vmax limits how large V can be chosen.
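To make the interplay between battery size, Vmax in (19) and the B/V gap in (21) concrete, the following sketch evaluates both for a few battery sizes. The helper names are hypothetical, and the numbers reuse the illustrative example given earlier in this section (Rmax = Dmax = 10, Cmin = 2), with χmin = 10 assumed to equal the largest unit price because the cost there does not depend on P(t).

```python
def v_max(Y_max, Y_min, R_max, D_max, chi_min, C_min):
    """Upper bound on V from (19)."""
    return (Y_max - Y_min - R_max - D_max) / (chi_min - C_min)

def optimality_gap(V, R_max, D_max):
    """The B/V term of (21), with B = max(R_max^2, D_max^2) / 2."""
    B = max(R_max ** 2, D_max ** 2) / 2.0
    return B / V

# Larger batteries allow a larger V and hence a smaller guaranteed gap.
for Y_max in (100, 200, 400):
    V = v_max(Y_max, 0, 10, 10, chi_min=10, C_min=2)
    print(Y_max, V, optimality_gap(V, 10, 10))
```

Since Vmax grows linearly in Ymax, the guaranteed gap B/Vmax shrinks roughly in inverse proportion to the battery capacity.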

C. Proof of Theorem 1

Here we prove Theorem 1.

Proof: (Theorem 1 part 1) We first show that (20) holds for t = 0. We have that

\[ Y_{min} \le Y(0) = Y_{init} \le Y_{max} \tag{22} \]

Using the definition (16), we have that Y(0) = X(0) + V χmin + Dmax + Ymin. Using this in (22), we get:

\[ Y_{min} \le X(0) + V\chi_{min} + D_{max} + Y_{min} \le Y_{max} \]

This yields

\[ -V\chi_{min} - D_{max} \le X(0) \le Y_{max} - Y_{min} - D_{max} - V\chi_{min} \]

Now suppose (20) holds for slot t. We will show that it also holds for slot t + 1. First, suppose −V Cmin < X(t) ≤ Ymax − Ymin − Dmax − V χmin. Then, from Lemma 2, we have that R∗(t) = 0. Thus, using (17), we have that X(t + 1) ≤ X(t) ≤ Ymax − Ymin − Dmax − V χmin. Next, suppose X(t) ≤ −V Cmin. Then, the maximum possible increase is Rmax so that X(t + 1) ≤ −V Cmin + Rmax. Now for all V such that 0 < V ≤ Vmax, we have that −V Cmin + Rmax ≤ Ymax − Ymin − Dmax − V χmin. This follows from the definition (19) and the fact that χmin ≥ Cmin. Thus, we have X(t + 1) ≤ Ymax − Ymin − Dmax − V χmin.

Next, suppose −V χmin − Dmax ≤ X(t) < −V χmin. Then, from Lemma 2, we have that D∗(t) = 0. Thus, using (17) we have that X(t + 1) ≥ X(t) ≥ −V χmin − Dmax. Next, suppose −V χmin ≤ X(t). Then the maximum possible decrease is Dmax so that X(t + 1) ≥ −V χmin − Dmax for this case as well. This shows that X(t + 1) ≥ −V χmin − Dmax. Combining these two bounds proves (20).

Proof: (Theorem 1 parts 2 and 3) Part 2 directly follows from (20) and (16). Using Y(t) = X(t) + V χmin + Dmax + Ymin in the lower bound in (20), we have: −V χmin − Dmax ≤ Y(t) − V χmin − Dmax − Ymin, i.e., Ymin ≤ Y(t). Similarly, using Y(t) = X(t) + V χmin + Dmax + Ymin in the upper bound in (20), we have: Y(t) − V χmin − Dmax − Ymin ≤ Ymax − Ymin − Dmax − V χmin, i.e., Y(t) ≤ Ymax.

Part 3 now follows from part 2 and the constraint on P(t) in P3.
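Parts 1 to 3 can also be checked numerically. The sketch below is an illustration, not the authors' simulation code: it runs the threshold rule of Sec. V-A1 on the parameters of the illustrative example from earlier in this section, assumes χmin equals the largest unit price (10), takes the update (17) to be X(t + 1) = X(t) − D(t) + R(t), and asserts the bound (20), the battery limits, and W(t) = P(t) − R(t) + D(t) in every slot. The function name `simulate` and the seed are arbitrary choices.

```python
import random

def simulate(T=10000, seed=0):
    """Run the threshold rule and return (average cost, average no-battery cost)."""
    rng = random.Random(seed)
    R_max = D_max = 10.0
    P_peak, C_rc, C_dc = 90.0, 1.0, 1.0
    Y_min, Y_max, Y_init = 0.0, 100.0, 0.0
    C_min, chi_min = 2.0, 10.0  # chi_min = largest price is an assumption here
    V = (Y_max - Y_min - R_max - D_max) / (chi_min - C_min)  # V = Vmax from (19)
    shift = V * chi_min + D_max + Y_min  # Y(t) = X(t) + shift, from (16)
    X = Y_init - shift
    x_lo = -V * chi_min - D_max
    x_hi = Y_max - Y_min - D_max - V * chi_min
    cost = base = 0.0
    for _ in range(T):
        W = rng.uniform(10.0, 90.0)
        C = rng.choice([2.0, 6.0, 10.0])
        weight, P, R, D = X + V * C, W, 0.0, 0.0
        if weight > 0:
            P_low = max(0.0, W - D_max)
            if P_low * weight + V * C_dc < W * weight:
                P, D = P_low, min(W, D_max)
        else:
            P_high = min(P_peak, W + R_max)
            if P_high * weight + V * C_rc < W * weight:
                P, R = P_high, min(P_peak - W, R_max)
        assert abs(P - R + D - W) < 1e-9 and 0.0 <= P <= P_peak  # part 3
        X = X - D + R                                            # update (17)
        assert x_lo - 1e-9 <= X <= x_hi + 1e-9                   # part 1, bound (20)
        assert Y_min - 1e-9 <= X + shift <= Y_max + 1e-9         # part 2
        cost += P * C + (C_rc if R > 0 else 0.0) + (C_dc if D > 0 else 0.0)
        base += W * C
    return cost / T, base / T

avg_cost, no_battery_cost = simulate()
print(avg_cost, no_battery_cost)
```

The second returned value is the cost of serving every slot directly from the grid, which gives a baseline against which the dynamic rule can be compared.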

Proof: (Theorem 1 part 4) We make use of the technique of Lyapunov optimization to show (21). We start by defining a Lyapunov function as a scalar measure of congestion in the system. Specifically, we define the following Lyapunov function: \( L(X(t)) \triangleq \frac{1}{2} X^2(t) \). Define the conditional 1-slot Lyapunov drift as follows:

\[ \Delta(X(t)) \triangleq E\{L(X(t+1)) - L(X(t)) \mid X(t)\} \tag{23} \]

Using (17), Δ(X(t)) can be bounded as follows (see Appendix B for details):

\[ \Delta(X(t)) \le B - X(t)E\{D(t) - R(t) \mid X(t)\} \tag{24} \]

where B = max[R²max, D²max]/2. Following the Lyapunov optimization framework, we add to both sides of (24) the penalty term V E{P(t)C(t) + 1_R(t)Crc + 1_D(t)Cdc | X(t)} to get the following:

\[ \begin{aligned} \Delta(X(t)) + VE\{P(t)C(t) + 1_R(t)C_{rc} + 1_D(t)C_{dc} \mid X(t)\} \le{} & B - X(t)E\{D(t) - R(t) \mid X(t)\} \\ & + VE\{P(t)C(t) + 1_R(t)C_{rc} + 1_D(t)C_{dc} \mid X(t)\} \end{aligned} \tag{25} \]

Using the relation W(t) = P(t) − R(t) + D(t), we can rewrite the above as:

\[ \begin{aligned} \Delta(X(t)) + VE\{P(t)C(t) + 1_R(t)C_{rc} + 1_D(t)C_{dc} \mid X(t)\} \le{} & B - X(t)E\{W(t) \mid X(t)\} + X(t)E\{P(t) \mid X(t)\} \\ & + VE\{P(t)C(t) + 1_R(t)C_{rc} + 1_D(t)C_{dc} \mid X(t)\} \end{aligned} \tag{26} \]

Comparing this with P3, it can be seen that given any queue value X(t), our control algorithm is designed to minimize the right hand side of (26) over all possible feasible control policies. This includes the optimal, stationary, randomized policy given in Lemma 1. Then, plugging the control decisions corresponding to the stationary, randomized policy, it can be shown that:

\[ \begin{aligned} \Delta(X(t)) + VE\{P(t)C(t) + 1_R(t)C_{rc} + 1_D(t)C_{dc} \mid X(t)\} &\le B + VE\{P^{stat}(t)C^{stat}(t) + 1_R^{stat}(t)C_{rc} + 1_D^{stat}(t)C_{dc} \mid X(t)\} \\ &= B + V\hat{\phi} \le B + V\phi_{opt} \end{aligned} \]

Taking the expectation of both sides and using the law of iterated expectations and summing over t ∈ {0, 1, 2, . . . , T − 1}, we get:

\[ \sum_{t=0}^{T-1} VE\{P(t)C(t) + 1_R(t)C_{rc} + 1_D(t)C_{dc}\} \le BT + VT\phi_{opt} - E\{L(X(T))\} + E\{L(X(0))\} \]

Dividing both sides by V T and taking limit as T → ∞ yields:

\[ \lim_{T\to\infty} \frac{1}{T} \sum_{t=0}^{T-1} E\{P(t)C(t) + 1_R(t)C_{rc} + 1_D(t)C_{dc}\} \le \phi_{opt} + B/V \]

where we have used the fact that E{L(X(0))} is finite and that E{L(X(T))} is non-negative.
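The summation step in this proof relies on a telescoping sum, which can be spelled out as follows. As a shorthand introduced here only for illustration, write c(t) for the per-slot cost P(t)C(t) + 1_R(t)Crc + 1_D(t)Cdc.

```latex
\begin{aligned}
\sum_{t=0}^{T-1} E\{L(X(t+1)) - L(X(t))\} &= E\{L(X(T))\} - E\{L(X(0))\}, \\
E\{L(X(T))\} - E\{L(X(0))\} + V \sum_{t=0}^{T-1} E\{c(t)\} &\le BT + VT\phi_{opt}, \\
\frac{1}{T} \sum_{t=0}^{T-1} E\{c(t)\} &\le \phi_{opt} + \frac{B}{V} + \frac{E\{L(X(0))\}}{VT}.
\end{aligned}
```

The last line drops the non-negative term E{L(X(T))}/(VT); the remaining term E{L(X(0))}/(VT) vanishes as T → ∞ because E{L(X(0))} is finite.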

VI. EXTENSIONS TO BASIC MODEL

In this section, we extend the basic model of Sec. III to the case where portions of the workload are delay-tolerant in the sense they can be postponed by a certain amount without affecting the utility the data center derives from executing them. We refer to such postponement as buffering the workload. Specifically, we assume that the total workload consists of both delay tolerant and delay intolerant components. Similar to the workload in the basic model, the delay intolerant workload cannot be buffered and must be served immediately. However, the delay tolerant component may be buffered and served later. As an example, data centers run virus scanning programs on most of their servers routinely (say once per day). As long as a virus scan is executed once a day, its purpose is served; it does not matter what time of the day is chosen for this. The ability to delay some of the workload gives more opportunities to reduce the average power cost in addition to using the battery. We assume that our data center has system mechanisms to implement such buffering of specified workloads.

Fig. 6. Block diagram for the extended model with delay tolerant and delay intolerant workloads.

In the following, we will denote the total workload generated in slot t by W(t). This consists of the delay tolerant and intolerant components denoted by W1(t) and W2(t) respectively, so that W(t) = W1(t) + W2(t) for all t. Similar to the basic model, we use P(t), R(t), D(t) to denote the total power drawn from the grid, the total power used to recharge the battery and the total power discharged from the battery in slot t, respectively. Thus, the total amount available to serve the workload is given by P(t) − R(t) + D(t). Let γ(t) denote the fraction of this that is used to serve the delay tolerant workload in slot t. Then the amount used to serve the delay intolerant workload is (1 − γ(t))(P(t) − R(t) + D(t)). Note that the following constraint must be satisfied every slot:

\[ 0 \le \gamma(t) \le 1 \tag{27} \]

We next define U(t) as the unfinished work for the delay tolerant workload in slot t. The dynamics for U(t) can be expressed as:

\[ U(t+1) = \max[U(t) - \gamma(t)(P(t) - R(t) + D(t)), 0] + W_1(t) \tag{28} \]

For the delay intolerant workload, there are no such queues since all incoming workload must be served in the same slot. This means:

\[ W_2(t) = (1 - \gamma(t))(P(t) - R(t) + D(t)) \tag{29} \]

The block diagram for this extended model is shown in Fig. 6. Similar to the basic model, we assume that for i = 1, 2, Wi(t) varies randomly in an i.i.d. fashion, taking values from a set Wi of non-negative values. We assume that W1(t) + W2(t) ≤ Wmax for all t. We also assume that W1(t) ≤ W1,max < Wmax and W2(t) ≤ W2,max < Wmax for all t. We further assume that Ppeak ≥ Wmax + max[Rmax, Dmax]. We use the same model for battery and unit cost as in Sec. III.

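Our objective is to minimize the time-average cost subject to meeting all the constraints (such as finite battery size and (29)) and ensuring finite average delay for the delay tolerant workload. This can be stated as:

P4:

\[ \text{Minimize:} \quad \lim_{t\to\infty} \frac{1}{t} \sum_{\tau=0}^{t-1} E\{P(\tau)C(\tau) + 1_R(\tau)C_{rc} + 1_D(\tau)C_{dc}\} \]

Subject to: Constraints (2), (5), (6), (7), (9), (27), (29)

Finite average delay for W1(t)

Similar to the basic model, we consider the following relaxed problem:

P5:

\[ \text{Minimize:} \quad \lim_{t\to\infty} \frac{1}{t} \sum_{\tau=0}^{t-1} E\{P(\tau)C(\tau) + 1_R(\tau)C_{rc} + 1_D(\tau)C_{dc}\} \]

Subject to: Constraints (2), (5), (9), (27), (29)

\[ \bar{R} = \bar{D} \tag{30} \]

\[ \bar{U} < \infty \tag{31} \]

where \( \bar{U} \) is the time average expected queue backlog for the delay tolerant workload and is defined as:

\[ \bar{U} \triangleq \limsup_{t\to\infty} \frac{1}{t} \sum_{\tau=0}^{t-1} E\{U(\tau)\} \tag{32} \]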

Let \( \phi_{ext} \) and \( \hat{\phi}_{ext} \) denote the optimal value for problems P4 and P5 respectively. Since P5 is less constrained than P4, we have that \( \hat{\phi}_{ext} \le \phi_{ext} \). Similar to Lemma 1, the following holds:

Lemma 3: (Optimal Stationary, Randomized Policy) If the workload process W1(t), W2(t) and auxiliary process S(t) are i.i.d. over slots, then there exists a stationary, randomized policy that takes control decisions P(t), R(t), D(t), γ(t) every slot purely as a function (possibly randomized) of the current state (W1(t), W2(t), S(t)) while satisfying the constraints (29), (2), (5), (9), (27) and providing the following guarantees:

\[ E\{R(t)\} = E\{D(t)\} \tag{33} \]

\[ E\{\gamma(t)(P(t) - R(t) + D(t))\} \ge E\{W_1(t)\} \tag{34} \]

\[ E\{P(t)C(t) + 1_R(t)C_{rc} + 1_D(t)C_{dc}\} = \hat{\phi}_{ext} \tag{35} \]

where the expectations above are with respect to the stationary distribution of (W1(t), W2(t), S(t)) and the randomized control decisions.

Proof: This result follows from the framework in [8], [9] and is omitted for brevity.

The condition (34) only guarantees queueing stability, not bounded worst case delay. We will now design a dynamic control algorithm that will yield bounded worst case delay while guaranteeing average cost that is within O(1/V) of φext.

A. Delay-Aware Queue

In order to provide worst case delay guarantees to the delay tolerant workload, we will make use of the technique of ε-persistent queue [19]. Specifically, we define a virtual queue Z(t) as follows:

\[ Z(t+1) = [Z(t) - \gamma(t)(P(t) - R(t) + D(t)) + \epsilon 1_{U(t)}]^+ \tag{36} \]

where ε > 0 is a parameter to be specified later, 1_U(t) is an indicator variable that is 1 if U(t) > 0 and 0 else, and [x]+ = max[x, 0]. The objective of this virtual queue is to enable the provision of worst-case delay guarantee on any buffered workload W1(t). Specifically, if any control algorithm ensures that U(t) ≤ Umax and Z(t) ≤ Zmax for all t, then the worst case delay can be bounded. This is shown in the following:

Lemma 4: (Worst Case Delay) Suppose a control algorithm ensures that U(t) ≤ Umax and Z(t) ≤ Zmax for all t, where Umax and Zmax are some positive constants. Then the worst case delay for any delay tolerant workload is at most δmax slots where:

\[ \delta_{max} \triangleq \lceil (U_{max} + Z_{max})/\epsilon \rceil \tag{37} \]

Proof: Consider a new arrival W1(t) in any slot t. We will show that this is served on or before time t + δmax. We argue by contradiction. Suppose this workload is not served by t + δmax. Then for all slots τ ∈ {t + 1, t + 2, . . . , t + δmax}, it must be the case that U(τ) > 0 (else W1(t) would have been served before τ). This implies that 1_U(τ) = 1 and using (36), we have:

\[ Z(\tau+1) \ge Z(\tau) - \gamma(\tau)(P(\tau) - R(\tau) + D(\tau)) + \epsilon \]

Summing for all τ ∈ {t + 1, t + 2, . . . , t + δmax}, we get:

\[ Z(t+\delta_{max}+1) - Z(t+1) \ge \delta_{max}\epsilon - \sum_{\tau=t+1}^{t+\delta_{max}} [\gamma(\tau)(P(\tau) - R(\tau) + D(\tau))] \]

Using the fact that Z(t + δmax + 1) ≤ Zmax and Z(t + 1) ≥ 0, we get:

\[ \sum_{\tau=t+1}^{t+\delta_{max}} [\gamma(\tau)(P(\tau) - R(\tau) + D(\tau))] \ge \delta_{max}\epsilon - Z_{max} \tag{38} \]

Note that by (28), W1(t) is part of the backlog U(t + 1). Since U(t + 1) ≤ Umax and since the service is FIFO, it will be served on or before time t + δmax whenever at least Umax units of power are used to serve the delay tolerant workload during the interval (t + 1, . . . , t + δmax). Since we have assumed that W1(t) is not served by t + δmax, it must be the case that \( \sum_{\tau=t+1}^{t+\delta_{max}} [\gamma(\tau)(P(\tau) - R(\tau) + D(\tau))] < U_{max} \). Using this in (38), we have:

\[ U_{max} > \delta_{max}\epsilon - Z_{max} \]

This implies that δmax < (Umax + Zmax)/ε, which contradicts the definition of δmax in (37).
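The queue updates (28) and (36) and the delay bound (37) translate directly into code. The sketch below is illustrative only; the function names are hypothetical and `served` stands for the quantity γ(t)(P(t) − R(t) + D(t)) allocated to the delay tolerant workload in the slot.

```python
import math

def update_U(U, served, W1):
    """Unfinished delay tolerant work, following (28)."""
    return max(U - served, 0.0) + W1

def update_Z(Z, served, U, eps):
    """Epsilon-persistent virtual queue, following (36); it grows by eps while U > 0."""
    return max(Z - served + (eps if U > 0 else 0.0), 0.0)

def worst_case_delay(U_max, Z_max, eps):
    """Worst case delay bound (37), in slots."""
    return math.ceil((U_max + Z_max) / eps)
```

Note that Z(t) keeps growing by ε in every slot in which backlog exists but little service is given, which is what lets a bound on Z(t) translate into a bound on delay.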

In Sec. VI-D, we will show that there are indeed constants Umax, Zmax such that the dynamic algorithm ensures that U(t) ≤ Umax, Z(t) ≤ Zmax for all t.

B. Optimal Control Algorithm

We now present an online control algorithm that approximately solves P4. Similar to the algorithm for the basic model, this algorithm also makes use of the following queueing state variable X(t) to track the battery charge level and is defined as follows:

\[ X(t) = Y(t) - Q_{max} - D_{max} - Y_{min} \tag{39} \]

where Qmax is a constant to be specified in (44). Recall that Y(t) denotes the actual battery charge level in slot t and evolves according to (3). It can be seen that X(t) is simply a shifted version of Y(t) and its dynamics is given by:

\[ X(t+1) = X(t) - D(t) + R(t) \tag{40} \]

We will show that this definition enables our algorithm to ensure that the constraint (4) is met. We are now ready to state the dynamic control algorithm. Let (W1(t), W2(t), S(t)) be the system state in slot t. Define \( Q(t) \triangleq (U(t), Z(t), X(t)) \) as the queue state that includes the workload queue as well as auxiliary queues. Then the dynamic algorithm chooses control decisions P(t), R(t), D(t) and γ(t) as the solution to the following optimization problem:

P6:

\[ \text{Max:} \quad [U(t) + Z(t)]P(t) - V[P(t)C(t) + 1_R(t)C_{rc} + 1_D(t)C_{dc}] + [X(t) + U(t) + Z(t)](D(t) - R(t)) \]

where V > 0 is a control parameter that affects the distance from optimality. Let P∗(t), R∗(t), D∗(t) and γ∗(t) denote the optimal solution to P6. Then, the dynamic algorithm allocates (1 − γ∗(t))(P∗(t) − R∗(t) + D∗(t)) power to service the delay intolerant workload and the remaining is used for the delay tolerant workload.

After computing these quantities, the algorithm implements them and updates the queueing variable X(t) according to (40). This process is repeated every slot. Note that in solving P6, the control algorithm only makes use of the current system state values and does not require knowledge of the statistics of the workload or unit cost processes.

We now establish two properties of the structure of the optimal solution to P6 that will be useful in analyzing its performance later.

Lemma 5: The optimal solution to P6 has the following properties:

1) If X(t) > −V Cmin, then the optimal solution always chooses R∗(t) = 0.

2) If X(t) < −Qmax (where Qmax is specified in (44)), then the optimal solution always chooses D∗(t) = 0.

Proof: See Appendix C.

C. Solving P6

Similar to P3, the complexity of solving P6 depends on the structure of the unit cost function C(t). For many cases of practical interest, P6 is easy to solve and admits closed form solutions that can be implemented in real time. We consider one such case here.

1) C(t) does not depend on P(t): For notational convenience, let Q1(t) = [U(t) + Z(t) − V C(t)] and Q2(t) = [X(t) + U(t) + Z(t)].

Let θ1(t) denote the optimal value of the objective in P6 when there is no recharge or discharge. When C(t) does not depend on P(t), this can be calculated as follows: If U(t) + Z(t) ≥ V C(t), then θ1(t) = Q1(t)Ppeak. Else, θ1(t) = Q1(t)W2(t).

Next, let θ2(t) denote the optimal value of the objective in P6 when the option of recharge is chosen, so that R(t) > 0, D(t) = 0. This can be calculated as follows:

1) If Q1(t) ≥ 0, Q2(t) ≥ 0, then θ2(t) = Q1(t)Ppeak − V Crc.

2) If Q1(t) ≥ 0, Q2(t) < 0, then θ2(t) = Q1(t)Ppeak − Q2(t)Rmax − V Crc.

3) If Q1(t) < 0, Q2(t) ≥ 0, then θ2(t) = Q1(t)W2(t) − V Crc.

4) If Q1(t) < 0, Q2(t) < 0, then we have two cases:

a) If Q1(t) ≥ Q2(t), then θ2(t) = Q1(t)(Rmax + W2(t)) − Q2(t)Rmax − V Crc.

b) If Q1(t) < Q2(t), then θ2(t) = Q1(t)W2(t) − V Crc.

Finally, let θ3(t) denote the optimal value of the objective in P6 when the option of discharge is chosen, so that D(t) > 0, R(t) = 0. This can be calculated as follows:

1) If Q1(t) ≥ 0, Q2(t) ≥ 0, then θ3(t) = Q1(t)Ppeak + Q2(t)Dmax − V Cdc.

2) If Q1(t) ≥ 0, Q2(t) < 0, then θ3(t) = Q1(t)Ppeak − V Cdc.

3) If Q1(t) < 0, Q2(t) ≥ 0, then θ3(t) = Q1(t) max[0, W2(t) − Dmax] + Q2(t)Dmax − V Cdc.

4) If Q1(t) < 0, Q2(t) < 0, then we have two cases:

a) If Q1(t) ≤ Q2(t), then θ3(t) = Q1(t) max[0, W2(t) − Dmax] + Q2(t) min[W2(t), Dmax] − V Cdc.

b) If Q1(t) > Q2(t), then θ3(t) = Q1(t)W2(t) − V Cdc.

After computing θ1(t), θ2(t), θ3(t), we pick the mode that yields the highest value of the objective and implement the corresponding solution.
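The case analysis above amounts to evaluating three candidate objective values and taking the largest. The following sketch transcribes the listed cases as written; the function names and the dictionary keys ("none", "recharge", "discharge") are hypothetical labels chosen for illustration, and the sketch returns only the mode, not the corresponding P∗(t), R∗(t), D∗(t), γ∗(t).

```python
def p6_mode_values(U, Z, X, C, V, W2, P_peak, R_max, D_max, C_rc, C_dc):
    """Return theta1, theta2, theta3 for the case where C does not depend on P."""
    Q1 = U + Z - V * C
    Q2 = X + U + Z

    # theta1: no recharge or discharge.
    theta1 = Q1 * P_peak if Q1 >= 0 else Q1 * W2

    # theta2: recharge (R > 0, D = 0), cases 1) to 4b).
    if Q1 >= 0:
        theta2 = Q1 * P_peak - (Q2 * R_max if Q2 < 0 else 0.0)
    elif Q2 < 0 and Q1 >= Q2:
        theta2 = Q1 * (R_max + W2) - Q2 * R_max
    else:
        theta2 = Q1 * W2
    theta2 -= V * C_rc

    # theta3: discharge (D > 0, R = 0), cases 1) to 4b).
    if Q1 >= 0:
        theta3 = Q1 * P_peak + (Q2 * D_max if Q2 >= 0 else 0.0)
    elif Q2 >= 0:
        theta3 = Q1 * max(0.0, W2 - D_max) + Q2 * D_max
    elif Q1 <= Q2:
        theta3 = Q1 * max(0.0, W2 - D_max) + Q2 * min(W2, D_max)
    else:
        theta3 = Q1 * W2
    theta3 -= V * C_dc

    return {"none": theta1, "recharge": theta2, "discharge": theta3}

def pick_mode(values):
    """Pick the mode with the highest objective value."""
    return max(values, key=values.get)
```

Each slot therefore costs only a constant number of comparisons, which is what makes the rule implementable in real time.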

Fig. 7. One period of the unit cost process. (Axes: Hour vs. Price ($/MW-Hour).)

D. Performance Theorem

We define an upper bound \( V_{ext}^{max} \) on the maximum value that V can take in our algorithm for the extended model.

\[ V_{ext}^{max} \triangleq \frac{Y_{max} - Y_{min} - (R_{max} + D_{max} + W_{1,max} + \epsilon)}{\chi_{min} - C_{min}} \tag{41} \]

Then we have the following result.

Theorem 2: (Algorithm Performance) Suppose U(0) = 0, Z(0) = 0 and the initial battery charge level Yinit satisfies Ymin ≤ Yinit ≤ Ymax. Then implementing the algorithm above with any fixed parameter ε ≥ 0 such that ε ≤ Wmax − W2,max and a parameter V such that \( 0 < V \le V_{ext}^{max} \) for all t ∈ {0, 1, 2, . . .} results in the following performance guarantees:

1) The queues U(t) and Z(t) are deterministically upper bounded by Umax and Zmax respectively for all t where:

\[ U_{max} \triangleq V\chi_{min} + W_{1,max} \tag{42} \]

\[ Z_{max} \triangleq V\chi_{min} + \epsilon \tag{43} \]

Further, the sum U(t) + Z(t) is also deterministically upper bounded by Qmax where

\[ Q_{max} \triangleq V\chi_{min} + W_{1,max} + \epsilon \tag{44} \]

2) The queue X(t) is deterministically upper and lower bounded for all t as follows:

\[ -Q_{max} - D_{max} \le X(t) \le Y_{max} - Y_{min} - Q_{max} - D_{max} \tag{45} \]

3) The actual battery level Y(t) satisfies Ymin ≤ Y(t) ≤ Ymax for all t.

4) All control decisions are feasible.

5) The worst case delay experienced by any delay tolerant request is given by:

\[ \left\lceil \frac{2V\chi_{min} + W_{1,max} + \epsilon}{\epsilon} \right\rceil \tag{46} \]

6) If W1(t), W2(t) and S(t) are i.i.d. over slots, then the time-average cost under the dynamic algorithm is within Bext/V of the optimal value:

\[ \lim_{t\to\infty} \frac{1}{t} \sum_{\tau=0}^{t-1} E\{P(\tau)C(\tau) + 1_R(\tau)C_{rc} + 1_D(\tau)C_{dc}\} \le \phi_{ext} + B_{ext}/V \tag{47} \]

where Bext is a constant given by \( B_{ext} = (P_{peak} + D_{max})^2 + \frac{(W_{1,max})^2 + \epsilon^2}{2} + B \) and φext is the optimal solution to P4 under any feasible control algorithm (possibly with knowledge of future events).

Thus, by choosing larger V, the time-average cost under the dynamic algorithm can be pushed closer to the minimum possible value φext. However, this increases the worst case delay bound, yielding an O(1/V, V) utility-delay tradeoff. Also note that \( V_{ext}^{max} \) limits how large V can be chosen.

Proof: See Appendix D.
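The utility-delay tradeoff can be made tangible by evaluating the constants of Theorem 2 for a given V and ε. The sketch below simply codes (41) to (44) and (46); the function names are hypothetical and the numeric values in the usage lines are made-up illustrations, not parameters taken from the experiments.

```python
import math

def v_ext_max(Y_max, Y_min, R_max, D_max, W1_max, eps, chi_min, C_min):
    """Upper bound on V for the extended model, from (41)."""
    return (Y_max - Y_min - (R_max + D_max + W1_max + eps)) / (chi_min - C_min)

def extended_bounds(V, eps, chi_min, W1_max):
    """Return (U_max, Z_max, Q_max, worst case delay in slots) from (42)-(44), (46)."""
    U_max = V * chi_min + W1_max
    Z_max = V * chi_min + eps
    Q_max = V * chi_min + W1_max + eps
    delay = math.ceil((2 * V * chi_min + W1_max + eps) / eps)
    return U_max, Z_max, Q_max, delay

# Hypothetical values: doubling V roughly doubles the delay bound.
print(extended_bounds(V=1.0, eps=0.5, chi_min=100, W1_max=0.5))
print(extended_bounds(V=2.0, eps=0.5, chi_min=100, W1_max=0.5))
```

The delay bound grows linearly in V while the cost gap Bext/V shrinks, which is the O(1/V, V) tradeoff noted above.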

VII. SIMULATION-BASED EVALUATION

We evaluate the performance of our control algorithms using both synthetic and real pricing data. To gain insights into the behavior of our algorithms and to compare with the optimal offline solution, we first consider the basic model and use a simple periodic unit cost and workload process as shown in Figs. 7 and 8. These values repeat every 24 hours and the unit cost does not depend on P(t). From Fig. 7, it can be seen that Cmax = $100 and Cmin = $50. Further, we have that χmin = Cmax = 100. We assume a slot size of 1 minute so that the control decisions on P(t), R(t), D(t) are taken once every minute. We fix the parameters Rmax = 0.2 MW-slot, Dmax = 1.0 MW-slot, Crc = Cdc = 0, Ymin = 0. We now simulate the basic control algorithm of Sec. V-A1 for different values of Ymax and with V = Vmax. For each Ymax, the simulation is performed for a duration of 4 weeks.

Fig. 8. One period of the workload process. (Axes: Hour vs. Workload (MW).)

Fig. 9. Average Cost per Hour vs. Ymax. (Axes: Ymax vs. Average Cost ($/Hour); curves: Dynamic Control Algorithm, Optimal Offline Cost, Minimum Cost, Cost with No Battery.)

In Fig. 9, we plot the average cost per hour under the dynamic algorithm for different values of battery size Ymax. It can be seen that the average cost reduces as Ymax is increased and converges to a fixed value for large Ymax, as suggested by Theorem 1. For this simple example, we can compute the minimum possible average cost per hour (over all battery sizes) and this is given by $33.23, which is also the value to which the dynamic algorithm converges as Ymax is increased. Moreover, in this example, we can also compute the optimal offline cost for each value of Ymax. These are also plotted in Fig. 9. It can be seen that, for each Ymax, the dynamic algorithm performs quite close to the corresponding optimal value, even for smaller values of Ymax. Note that Theorem 1 provides such guarantees only for sufficiently large values of Ymax. Finally, the average cost per hour when no battery is used is given by $39.90.

We next consider a six-month data set of average hourly spot market prices for the Los Angeles Zone LA1 obtained from CAISO [6]. These prices correspond to the period 01/01/2005-06/30/2005 and each value denotes the average price of 1 MW-Hour of electricity. A portion of this data corresponding to the first week of January is plotted in Fig. 1. We fix the slot size to 5 minutes so that control decisions on P(t), R(t), D(t), etc. are taken once every 5 minutes. The unit cost C(t) obtained from the data set for each hour is assumed to be fixed for that hour. Furthermore, we assume that the unit cost does not depend on the total power drawn P(t).

Fig. 10. Total Cost over 6 months with i.i.d. W(t) and different Ymax. (Axes: Days vs. Total Cost (Dollars, ×10^4); curves: No Battery, No WP; Battery Size 15, No WP; No Battery, WP with V=0.93; Battery Size 15, WP with V=0.93; Battery Size 30, No WP; No Battery, WP with V=1.98; Battery Size 30, WP with V=1.98; Battery Size 50; No Battery, WP with V=3.39; Battery Size 50, WP with V=3.39.)

In our experiments, we assume that the data center receives workload in an i.i.d. fashion. Specifically, every slot, W(t) takes values from the interval [0.1, 1.5] MW uniformly at random. We fix the parameters Dmax and Rmax to 0.5 MW-slot, Cdc = Crc = $0.1, and Ymin = 0. Also, Ppeak = Wmax + Rmax = 2.0 MW. We now simulate four algorithms on this setup for different values of Ymax. The length of time the battery can power the data center if the draw were Wmax starting from a fully charged battery is given by Ymax/Wmax slots, each of length 5 minutes. We consider the following four control techniques: (A) “No battery, No WP,” which meets the demand in every slot using power from the grid and without postponing any workload, (B) “Battery, No WP,” which employs the algorithm in the basic model without postponing any workload, (C) “No Battery, WP,” which employs the extended model for WP but without any battery, and (D) “Complete,” the complete algorithm of the extended model with both battery and WP. For (C) and (D), we assume that during every slot, half of the total workload is delay-tolerant.
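The battery runtime Ymax/Wmax mentioned above is easy to work out. The short sketch below does so for the battery sizes Ymax ∈ {15, 30, 50} MW-slot used in these experiments, with Wmax = 1.5 MW and 5-minute slots; the function name is a hypothetical helper.

```python
def battery_runtime_minutes(Y_max, W_max, slot_minutes=5.0):
    """Minutes a fully charged battery can carry a constant draw of W_max."""
    return (Y_max / W_max) * slot_minutes

for Y_max in (15, 30, 50):
    print(Y_max, battery_runtime_minutes(Y_max, W_max=1.5))
```

Under these assumptions the three battery sizes correspond to 10, 20 and about 33 slots of peak draw.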

We simulate these algorithms to obtain the total cost over the 6 month period for Ymax ∈ {15, 30, 50} MW-slot. For (B), we use V = Vmax while for (C) and (D), we use \( V = V_{ext}^{max} \) with ε = Wmax/2. Note that an increased battery capacity should have no effect on the performance under (C). In order to get a fair comparison with the other schemes, we assume that the worst case delay guarantee that case (C) must provide for the delay tolerant traffic is the same as that under (D).

Fig. 10 plots the total cost under these schemes over the 6 month period. In Table II, we show the ratio of the total cost under schemes (B), (C), (D) to the total cost under (A) for these values of Ymax over the 6 month period. It can be seen that (D) provides the most cost savings over the baseline case.

VIII. CONCLUSIONS AND FUTURE WORK

In this paper, we studied the problem of opportunistically using energy storage devices to reduce the time average electricity bill of a data center. Using the technique of Lyapunov optimization, we designed an online control algorithm that achieves close to optimal cost as the battery size is increased.

TABLE II
RATIO OF COST UNDER SCHEMES (B), (C), (D) TO THE COST UNDER (A) FOR DIFFERENT VALUES OF Ymax WITH I.I.D. W(t).

Ymax              15     30     50
Battery, No WP    95%    92%    89%
WP, No Battery    96%    92%    88%
WP, Battery       92%    85%    79%

We would like to extend our current framework along several important directions including: (i) multiple utilities (or captive sources such as DG) with different price variations and availability properties (e.g., certain renewable sources of energy are not available at all times), (ii) tariffs where the utility bill depends on peak power draw in addition to the energy consumption, and (iii) devising online algorithms that offer solutions whose proximity to the optimal has a smaller dependence on battery capacity than is currently the case. We also plan to explore implementation and feasibility related concerns such as: (i) what are appropriate trade-offs between investments in additional battery capacity and the cost reductions that this offers? (ii) what is the extent of cost reduction benefits for realistic data center workloads? and (iii) does stored energy make sense as a cost optimization knob in other domains besides data centers? Our technique could be viewed as a design tool which, when parameterized well, can assist in determining suitable configuration parameters such as battery size, usage rules-of-thumb, time-scale at which decisions should be made, etc. Finally, we believe that our work opens up a whole set of interesting issues worth exploring in the area of consumer-end (not just data centers) demand response mechanisms for power cost optimization.

ACKNOWLEDGMENTS

This work was supported in part by the NSF Career grant CCF-0747525.

REFERENCES

[1] S. Park, W. Jiang, Y. Zhou, and S. Adve, “Managing energy-performance tradeoffs for multithreaded applications on multiprocessor architectures,” in Proc. ACM SIGMETRICS, 2007.

[2] Q. Zhu, F. David, C. Devaraj, Z. Li, Y. Zhou, and P. Cao, “Reducing energy consumption of disk storage using power-aware cache management,” in Proc. HPCA, 2004.

[3] S. Gurumurthi, A. Sivasubramaniam, M. Kandemir, and H. Franke, “DRPM: Dynamic speed control for power management in server class disks,” in Proc. ISCA ’03, 2003.

[4] A. R. Lebeck, X. Fan, H. Zeng, and C. Ellis, “Power aware page allocation,” SIGOPS Oper. Syst. Rev., vol. 34, pp. 105–116, Nov. 2000.

[5] J. S. Chase, D. C. Anderson, P. N. Thakar, A. M. Vahdat, and R. P. Doyle, “Managing energy and server resources in hosting centers,” SIGOPS Oper. Syst. Rev., vol. 35, pp. 103–116, Oct. 2001.

[6] California ISO Open Access Same-time Information System (OASIS) Hourly Average Energy Prices. http://oasisis.caiso.com.

[7] D. P. Bertsekas, Dynamic Programming and Optimal Control, vols. 1 and 2. Athena Scientific, 2007.

[8] L. Georgiadis, M. J. Neely, and L. Tassiulas, “Resource allocation and cross-layer control in wireless networks,” Found. and Trends in Networking, vol. 1, pp. 1–144, 2006.

[9] M. J. Neely, Stochastic Network Optimization with Application to Communication and Queueing Systems. Morgan & Claypool, 2010.

[10] A. Bar-Noy, M. P. Johnson, and O. Liu, “Peak shaving through resource buffering,” in Proc. WAOA, 2008.

[11] A. Bar-Noy, Y. Feng, M. P. Johnson, and O. Liu, “When to reap and when to sow: Lowering peak usage with realistic batteries,” in Proc. 7th International Conference on Experimental Algorithms, 2008.

[12] K. Le, R. Bianchini, M. Martonosi, and T. Nguyen, “Cost- and energy-aware load distribution across data centers,” in Workshop on Power-Aware Computing and Systems (HOTPOWER), 2009.

[13] A. Qureshi, R. Weber, H. Balakrishnan, J. Guttag, and B. Maggs, “Cutting the electric bill for internet-scale systems,” in Proc. SIGCOMM, 2009.

[14] M. Gatzianas, L. Georgiadis, and L. Tassiulas, “Control of wireless networks with rechargeable batteries,” IEEE Trans. Wireless. Comm., vol. 9, pp. 581–593, Feb. 2010.

[15] D. Linden and T. B. Reddy, Handbook of Batteries. McGraw Hill Handbooks, 2002.

[16] Lead-acid batteries: Lifetime vs. Depth of discharge. http://www.windsun.com/Batteries/Battery_FAQ.htm.

[17] M. Marwah, P. Maciel, A. Shah, R. Sharma, T. Christian, V. Almeida, C. Araujo, E. Souza, G. Callou, B. Silva, S. Galdino, and J. Pires, “Quantifying the sustainability impact of data center availability,” SIGMETRICS Perform. Eval. Rev., vol. 37, pp. 64–68, March 2010.

[18] U. Hoelzle and L. A. Barroso, The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines. Morgan and Claypool Publishers, 2009.

[19] M. J. Neely, A. S. Tehrani, and A. G. Dimakis, “Efficient algorithms for renewable energy allocation to delay tolerant consumers,” in Proc. IEEE SmartGridComm, 2010.

APPENDIX A - PROOF OF LEMMA 2

We can rewrite the expression in the objective of P3 as [X(t) + V C(t)]P(t) + V[1_R(t)Crc + 1_D(t)Cdc]. To show part 1, suppose R∗(t) = δ > 0 when X(t) > −V Cmin, so that we have P∗(t) − δ = W(t), D∗(t) = 0, 1_R(t) = 1, and 1_D(t) = 0. Then the value of the objective is given by:

\[ \begin{aligned} \bigl[X(t) + VC(P^*(t))\bigr]P^*(t) + VC_{rc} &= \bigl[X(t) + VC(W(t) + \delta)\bigr](W(t) + \delta) + VC_{rc} \\ &> \bigl[X(t) + VC(W(t))\bigr]W(t) \end{aligned} \]

where the last step follows by noting that X(t) + V C(W(t)) > 0 when X(t) > −V Cmin and that C(t) is non-negative and non-decreasing in P(t). The last expression denotes the value of the objective when R∗(t) = D∗(t) = 0 and all demand is met using power drawn from the grid, and it is smaller. This shows that when X(t) > −V Cmin, the optimal solution cannot choose R∗(t) > 0.

Next, to show part 2, suppose D∗(t) = δ > 0 when X(t) < −V χmin, so that we have P∗(t) + δ = W(t), R∗(t) = 0, and 1_D(t) = 1. Then the value of the objective is given by:

\[ \begin{aligned} \bigl[X(t) + VC(P^*(t))\bigr]P^*(t) + VC_{dc} &= \bigl[X(t) + VC(W(t) - \delta)\bigr](W(t) - \delta) + VC_{dc} \\ &\ge \bigl[X(t) + VC(W(t) - \delta)\bigr](W(t) - \delta) \\ &> \bigl[X(t) + VC(W(t))\bigr]W(t) \end{aligned} \]

where in the last step, we used the property (10) together with the fact that X(t) < −V χmin. The last expression denotes the value of the objective when R∗(t) = D∗(t) = 0 and all demand is met using power drawn from the grid, and it is smaller. This shows that when X(t) < −V χmin, the optimal solution cannot choose D∗(t) > 0.

APPENDIX B - PROOF OF BOUND (24)

Squaring both sides of (17), dividing by 2, and rearranging yields:

\[ \frac{X^2(t+1) - X^2(t)}{2} = \frac{(D(t) - R(t))^2}{2} - X(t)[D(t) - R(t)] \]

Now note that under any feasible algorithm, at most one of R(t) and D(t) can be non-zero. Further, since R(t) ≤ Rmax, D(t) ≤ Dmax for all t, we have:

\[ \frac{(D(t) - R(t))^2}{2} \le \frac{\max[R_{max}^2, D_{max}^2]}{2} = B \]

Taking conditional expectations of both sides given X(t), we have: Δ(X(t)) ≤ B − X(t)E{D(t) − R(t) | X(t)}.

APPENDIX C - PROOF OF LEMMA 5

Suppose X(t) > −V Cmin and R∗(t) > 0, D∗(t) = 0. Then, we have that W2(t) = (1 − γ∗(t))(P∗(t) − R∗(t)). After rearranging, the value of the objective of P6 can be expressed as:

\[ \begin{aligned} & \bigl[U(t) + Z(t)\bigr](P^*(t) - R^*(t)) - VP^*(t)C(P^*(t)) - VC_{rc} - X(t)R^*(t) \\ & \quad < \bigl[U(t) + Z(t)\bigr](P^*(t) - R^*(t)) - V\bigl[P^*(t) - R^*(t)\bigr]C(P^*(t) - R^*(t)) \end{aligned} \]

where we used the inequalities P∗(t)C(P∗(t) − R∗(t)) < P∗(t)C(P∗(t)) and X(t) + V C(P∗(t) − R∗(t)) > 0. The first follows from the non-negative and non-decreasing property of C(t) in P(t). The second follows by noting that X(t) > −V Cmin. Now note that the right hand side denotes the value of the objective when power P(t) = P∗(t) − R∗(t) is drawn from the grid and the battery is not recharged or discharged. This is a feasible option since by choosing γ(t) = 0, we have that W2(t) = (P∗(t) − R∗(t)). This shows that when X(t) > −V Cmin, R∗(t) > 0 is not optimal. This shows part 1.

Next, suppose X(t) < −Qmax < −V χmin

and D∗(t) > 0, R∗(t) = 0. Then, we have thatW2(t) = (1 − γ∗(t))(P ∗(t) + D∗(t)). We consider twocases:

(1) $P^*(t) + D^*(t) \le P_{peak}$: After rearranging, the value of the objective of P6 can be expressed as:

$$\begin{aligned}
&[U(t) + Z(t)](P^*(t) + D^*(t)) - V P^*(t) C(P^*(t)) - V C_{dc} + X(t) D^*(t) \\
&\quad < [U(t) + Z(t)](P^*(t) + D^*(t)) - V P^*(t) C(P^*(t)) - V \chi_{\min} D^*(t)
\end{aligned}$$

where we used the fact that $X(t) < -V\chi_{\min}$ and $D^*(t) > 0$. Using the property (10), we have:

$$(P^*(t) + D^*(t))\, C(P^*(t) + D^*(t)) - P^*(t) C(P^*(t)) \le \chi_{\min} D^*(t)$$

Using this in the inequality above, we have:

$$\begin{aligned}
&[U(t) + Z(t)](P^*(t) + D^*(t)) - V P^*(t) C(P^*(t)) - V C_{dc} + X(t) D^*(t) \\
&\quad < [U(t) + Z(t)](P^*(t) + D^*(t)) - V (P^*(t) + D^*(t))\, C(P^*(t) + D^*(t))
\end{aligned}$$



Note that the last term denotes the value of the objective when power $P(t) = P^*(t) + D^*(t) \le P_{peak}$ is drawn from the grid and the battery is not recharged or discharged. This is a feasible option since by choosing $\gamma(t) = 0$, we have that $W_2(t) = P^*(t) + D^*(t)$. This shows that for this case, when $X(t) < -V\chi_{\min}$, $D^*(t) > 0$ is not optimal.

(2) $P^*(t) + D^*(t) > P_{peak}$: The value of the objective of P6 is given by:

$$\begin{aligned}
&[U(t) + Z(t)] P^*(t) - V P^*(t) C(P^*(t)) - V C_{dc} + [U(t) + Z(t) + X(t)] D^*(t) \\
&\quad < [U(t) + Z(t)] P^*(t) - V P^*(t) C(P^*(t))
\end{aligned}$$

where we used the fact that since $X(t) < -Q_{\max}$ and $U(t) + Z(t) \le Q_{\max}$ (Theorem 2 part 1), we have $[U(t) + Z(t) + X(t)] D^*(t) < 0$. The last term in the inequality above denotes the value of the objective when power $P(t) = P^*(t)$ is drawn from the grid and the battery is not recharged or discharged. To see that this is feasible, note that we need $(1 - \gamma(t)) P^*(t) = W_2(t)$ where $\gamma(t)$ must be $\le 1$. Since $W_2(t) \le W_{2,\max}$, this implies:

$$1 - \gamma(t) = \frac{W_2(t)}{P^*(t)} \le \frac{W_{2,\max}}{P^*(t)} \le \frac{W_{2,\max}}{P_{peak} - D_{\max}}$$

where we used the fact that $P^*(t) \ge P_{peak} - D^*(t) \ge P_{peak} - D_{\max}$. Now since $W_{2,\max} \le P_{peak} - D_{\max}$, the last term above is $\le 1$, so that choosing $P(t) = P^*(t)$ and $D(t) = 0$ is a feasible option. This shows that for this case as well, $D^*(t) > 0$ is not optimal.

APPENDIX D - PROOF OF THEOREM 2, PARTS 1-6

Here, we prove parts 1-6 of Theorem 2.

Proof: (Theorem 2 part 1) We first show (42). Clearly, (42) holds for $t = 0$. Now suppose it holds for slot $t$. We will show that it also holds for slot $t + 1$. First suppose $U(t) \le V\chi_{\min}$. Then, by (28), the most that $U(t)$ can increase in one slot is $W_{1,\max}$ so that $U(t+1) \le V\chi_{\min} + W_{1,\max}$. Next, suppose $V\chi_{\min} < U(t) \le V\chi_{\min} + W_{1,\max}$. Now consider the terms involving $P(t)$ in the objective of P6: $[U(t) + Z(t) - V C(P(t))] P(t)$. Since $U(t) > V\chi_{\min}$, using property (10), we have:

$$[U(t) + Z(t) - V C(P(t))] P(t) \le [U(t) + Z(t) - V C(P_{peak})] P_{peak}$$

Thus, the optimal solution to problem P6 chooses $P^*(t) = P_{peak}$. Now, let $R^*(t)$, $D^*(t)$ and $\gamma^*(t)$ denote the other control decisions by the optimal solution to P6. Then the amount of power remaining for the data center (after recharging or discharging the battery) is $P_{peak} - R^*(t) + D^*(t)$. Out of this, a fraction $1 - \gamma^*(t)$ is used to serve the delay intolerant workload. Thus:

$$(1 - \gamma^*(t))[P_{peak} - R^*(t) + D^*(t)] = W_2(t)$$

Using this, and the fact that $R^*(t) \le R_{\max}$, we have:

$$\begin{aligned}
\gamma^*(t)[P_{peak} - R^*(t) + D^*(t)] &= P_{peak} - R^*(t) + D^*(t) - W_2(t) \\
&\ge P_{peak} - R_{\max} - W_2(t) \ge W_1(t)
\end{aligned}$$

where we used the fact that $W_1(t) + W_2(t) \le W_{\max} \le P_{peak} - R_{\max}$. Thus, using (28), it can be seen that the amount of new arrivals to $U(t)$ cannot exceed the total service and this yields $U(t+1) \le U(t)$.

(43) can be shown by similar arguments. Clearly, (43) holds for $t = 0$. Now suppose it holds for slot $t$. We will show that it also holds for slot $t + 1$. First suppose $Z(t) \le V\chi_{\min}$. Then, by (36), the most that $Z(t)$ can increase in one slot is $\epsilon$ so that $Z(t+1) \le V\chi_{\min} + \epsilon$. Next, suppose $V\chi_{\min} < Z(t) \le V\chi_{\min} + \epsilon$. Then, by a similar argument as before, the optimal solution to problem P6 chooses $P^*(t) = P_{peak}$. Now, let $R^*(t)$, $D^*(t)$ and $\gamma^*(t)$ denote the other control decisions by the optimal solution to P6. Then the amount of power remaining for the data center is $P_{peak} - R^*(t) + D^*(t)$. Out of this, a fraction $(1 - \gamma^*(t))$ is used for the delay intolerant workload. Thus:

$$(1 - \gamma^*(t))[P_{peak} - R^*(t) + D^*(t)] = W_2(t)$$

Using this, and the fact that $R^*(t) \le R_{\max}$, we have:

$$\begin{aligned}
\gamma^*(t)[P_{peak} - R^*(t) + D^*(t)] &\ge P_{peak} - R_{\max} - W_2(t) \\
&\ge P_{peak} - R_{\max} - W_{2,\max} \ge W_{\max} - W_{2,\max} \ge \epsilon
\end{aligned}$$

where we used the fact that $W_2(t) \le W_{2,\max}$ and $W_{\max} \le P_{peak} - R_{\max}$. Thus, using (36), it can be seen that the amount of new arrivals to $Z(t)$ cannot exceed the total service and this yields $Z(t+1) \le Z(t)$.

(44) can be shown by similar arguments and the proof is omitted for brevity.

Proof: (Theorem 2 part 2) We first show that (45) holds for $t = 0$. Using the definition of $X(t)$ from (39), we have that $Y(0) = X(0) + Q_{\max} + D_{\max} + Y_{\min}$. Since $Y_{\min} \le Y(0)$, we have:

$$Y_{\min} \le X(0) + Q_{\max} + D_{\max} + Y_{\min} \implies -Q_{\max} - D_{\max} \le X(0)$$

Next, we have that $Y(0) \le Y_{\max}$, so that:

$$X(0) + Q_{\max} + D_{\max} + Y_{\min} \le Y_{\max} \implies X(0) \le Y_{\max} - Y_{\min} - Q_{\max} - D_{\max}$$

Combining these two shows that $-Q_{\max} - D_{\max} \le X(0) \le Y_{\max} - Y_{\min} - Q_{\max} - D_{\max}$.

Now suppose (45) holds for slot $t$. We will show that it also holds for slot $t + 1$. First, suppose $-V C_{\min} < X(t) \le Y_{\max} - Y_{\min} - Q_{\max} - D_{\max}$. Then, from Lemma 5, we have that $R^*(t) = 0$. Thus, using (40) we have that $X(t+1) \le X(t) \le Y_{\max} - Y_{\min} - Q_{\max} - D_{\max}$. Next, suppose $X(t) \le -V C_{\min}$. Then, the maximum possible increase is $R_{\max}$ so that $X(t+1) \le -V C_{\min} + R_{\max}$. Now for all $V$ such that $0 \le V \le V^{ext}_{\max}$, we have that $-V C_{\min} + R_{\max} \le Y_{\max} - Y_{\min} - (D_{\max} + W_{1,\max} + \epsilon) - V\chi_{\min} = Y_{\max} - Y_{\min} - Q_{\max} - D_{\max}$. This follows from the definition (41) and the fact that $\chi_{\min} \ge C_{\min}$. Using this, we have $X(t+1) \le Y_{\max} - Y_{\min} - Q_{\max} - D_{\max}$ for this case as well. This establishes that $X(t+1) \le Y_{\max} - Y_{\min} - Q_{\max} - D_{\max}$.

Next, suppose $-Q_{\max} - D_{\max} \le X(t) < -Q_{\max}$. Then, from Lemma 5, we have that $D^*(t) = 0$. Thus, using (40) we have that $X(t+1) \ge X(t) \ge -Q_{\max} - D_{\max}$. Next, suppose $-Q_{\max} \le X(t)$. Then, the maximum possible decrease is $D_{\max}$ so that $X(t+1) \ge -Q_{\max} - D_{\max}$ for this case as well. This shows that $X(t+1) \ge -Q_{\max} - D_{\max}$. Combining these two bounds proves (45).

Proof: (Theorem 2 parts 3 and 4) Part 3 directly follows from (45) and (39). Using $Y(t) = X(t) + Q_{\max} + D_{\max} + Y_{\min}$ in the lower bound in (45), we have:

$$-Q_{\max} - D_{\max} \le Y(t) - Q_{\max} - D_{\max} - Y_{\min} \implies Y_{\min} \le Y(t)$$

Similarly, using $Y(t) = X(t) + Q_{\max} + D_{\max} + Y_{\min}$ in the upper bound in (45), we have:

$$Y(t) - Q_{\max} - D_{\max} - Y_{\min} \le Y_{\max} - Y_{\min} - Q_{\max} - D_{\max} \implies Y(t) \le Y_{\max}$$

Part 4 now follows from part 3 and the constraint on $P(t)$ in P6.
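The induction in parts 2 and 3 uses only two facts from Lemma 5: the battery is never recharged when $X(t) > -V C_{\min}$ and never discharged when $X(t) < -Q_{\max}$. The sketch below illustrates this with a randomized policy that respects those two restrictions and is otherwise arbitrary. It is not the paper's control algorithm. It assumes that update (40) has the form $X(t+1) = X(t) - D(t) + R(t)$ and takes $Q_{\max} = V\chi_{\min} + W_{1,\max} + \epsilon$, as implied by the equality used in the proof of part 2. All numeric values are hypothetical.

```python
import random

def simulate_shifted_level(x0, steps, v, c_min, q_max, r_max, d_max, seed=0):
    """Random walk on X(t) that respects the two restrictions from Lemma 5."""
    rng = random.Random(seed)
    x, path = x0, [x0]
    for _ in range(steps):
        r = d = 0.0
        action = rng.choice(["recharge", "discharge", "idle"])
        # Lemma 5: no recharge when X(t) > -V*C_min,
        # and no discharge when X(t) < -Q_max.
        if action == "recharge" and x <= -v * c_min:
            r = rng.uniform(0.0, r_max)
        elif action == "discharge" and x >= -q_max:
            d = rng.uniform(0.0, d_max)
        x = x - d + r  # assumed form of update (40)
        path.append(x)
    return path

# Hypothetical parameters with -V*C_min + R_max below the upper bound in (45).
V, C_MIN, CHI_MIN = 1.0, 1.0, 2.0
W1_MAX, EPS = 0.5, 0.1
R_MAX, D_MAX = 0.2, 0.5
Y_MIN, Y_MAX = 1.0, 10.0
Q_MAX = V * CHI_MIN + W1_MAX + EPS

lower = -Q_MAX - D_MAX
upper = Y_MAX - Y_MIN - Q_MAX - D_MAX
shift = Q_MAX + D_MAX + Y_MIN   # Y(t) = X(t) + shift, from (39)

path = simulate_shifted_level(upper, 20000, V, C_MIN, Q_MAX, R_MAX, D_MAX)
x_ok = all(lower - 1e-9 <= x <= upper + 1e-9 for x in path)
y_ok = all(Y_MIN - 1e-9 <= x + shift <= Y_MAX + 1e-9 for x in path)
```

Both flags evaluate to true for these parameters: the shifted level stays inside the interval in (45), and the implied battery level $Y(t)$ stays within $[Y_{\min}, Y_{\max}]$ without the capacity constraints being enforced explicitly.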

Proof: (Theorem 2 part 5) This follows from part 1 and Lemma 4.

Proof: (Theorem 2 part 6) We use the following Lyapunov function: $L(Q(t)) \triangleq \frac{1}{2}(U^2(t) + Z^2(t) + X^2(t))$. Define the conditional 1-slot Lyapunov drift as follows:

$$\Delta(Q(t)) \triangleq \mathbb{E}\{L(Q(t+1)) - L(Q(t)) \mid Q(t)\} \qquad (48)$$

Using (28), (36), (40), the drift + penalty term can be bounded as follows:

$$\begin{aligned}
&\Delta(Q(t)) + V\,\mathbb{E}\{P(t)C(t) + 1_R(t) C_{rc} + 1_D(t) C_{dc} \mid Q(t)\} \\
&\quad \le B_{ext} - [U(t) + Z(t)]\,\mathbb{E}\{P(t) \mid Q(t)\} - W_2(t)[U(t) + Z(t)] \\
&\qquad - [X(t) + U(t) + Z(t)]\,\mathbb{E}\{D(t) - R(t) \mid Q(t)\} \\
&\qquad + V\,\mathbb{E}\{P(t)C(t) + 1_R(t) C_{rc} + 1_D(t) C_{dc} \mid Q(t)\} \qquad (49)
\end{aligned}$$

where $B_{ext} = (P_{peak} + D_{\max})^2 + \frac{(W_{1,\max})^2 + \epsilon^2}{2} + B$. Comparing this with P6, it can be seen that given any queue value $X(t)$, our control algorithm is designed to minimize the right hand side of (49) over all possible feasible control policies. This includes the optimal, stationary, randomized policy given in Lemma 3. Using the same argument as before, we have the following:

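$$\begin{aligned}
&\Delta(Q(t)) + V\,\mathbb{E}\{P(t)C(t) + 1_R(t) C_{rc} + 1_D(t) C_{dc} \mid Q(t)\} \\
&\quad \le B_{ext} + V\,\mathbb{E}\{P(t)C(t) + 1_R(t) C_{rc} + 1_D(t) C_{dc} \mid Q(t)\} = B_{ext} + V\phi_{ext}
\end{aligned}$$

Here the expectation on the right hand side is evaluated under the stationary, randomized policy of Lemma 3.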

Taking the expectation of both sides, using the law of iterated expectations, and summing over $t \in \{0, 1, 2, \ldots, T-1\}$, we get:

$$\sum_{t=0}^{T-1} V\,\mathbb{E}\{P(t)C(t) + 1_R(t) C_{rc} + 1_D(t) C_{dc}\} \le B_{ext} T + V T \phi_{ext} - \mathbb{E}\{L(Q(T))\} + \mathbb{E}\{L(Q(0))\}$$

Dividing both sides by $VT$ and taking the limit as $T \to \infty$ yields:

$$\lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}\{P(t)C(t) + 1_R(t) C_{rc} + 1_D(t) C_{dc}\} \le \phi_{ext} + B_{ext}/V$$

where we used the fact that $\mathbb{E}\{L(Q(0))\}$ is finite and that $\mathbb{E}\{L(Q(T))\}$ is non-negative.
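To make the shape of this bound concrete, the following sketch evaluates $B_{ext}$ and the right hand side $\phi_{ext} + B_{ext}/V$ for a few values of $V$. The function names and all numeric values, including the value used for $\phi_{ext}$, are hypothetical and chosen only for illustration. Only the formulas for $B$ and $B_{ext}$ are taken from the text.

```python
def b_const(r_max, d_max):
    """B = max[R_max^2, D_max^2] / 2, as defined in Appendix B."""
    return max(r_max ** 2, d_max ** 2) / 2.0

def b_ext(p_peak, d_max, r_max, w1_max, eps):
    """B_ext = (P_peak + D_max)^2 + (W1_max^2 + eps^2)/2 + B."""
    return ((p_peak + d_max) ** 2
            + (w1_max ** 2 + eps ** 2) / 2.0
            + b_const(r_max, d_max))

def cost_upper_bound(phi_ext, v, **params):
    """Right hand side of the final bound: phi_ext + B_ext / V."""
    return phi_ext + b_ext(**params) / v

# Hypothetical system parameters and optimal cost.
params = dict(p_peak=2.0, d_max=0.5, r_max=0.2, w1_max=0.5, eps=0.1)
PHI_EXT = 3.0

# Gap to the optimal cost for several values of V.
gaps = {v: cost_upper_bound(PHI_EXT, v, **params) - PHI_EXT
        for v in (1.0, 10.0, 100.0)}
```

The gap to $\phi_{ext}$ scales as $1/V$, so increasing $V$ tenfold shrinks it tenfold. Note that $V$ cannot be made arbitrarily large, since the bounds of Theorem 2 are established only for $0 \le V \le V^{ext}_{\max}$.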

APPENDIX E - FULL SOLUTION FOR CONVEX, INCREASING C(t)

Let $C'(S, P)$ be the derivative of $C(S, P)$ with respect to $P$. Also, let $P'$ denote the solution to the equation $C'(S, P) = 0$ and $C(P') = C(S, P')$. Then, the optimal solution can be obtained as follows:

1) If $P_{low} \le P' \le W(t)$, then $R^*(t) = 0$ and we have the following two cases:

a) If $P'(X(t) + V C(P')) + V C_{dc} < \theta(t)$, then $P^*(t) = P'$, $D^*(t) = W(t) - P'$.


2) If $W(t) < P' \le P_{high}$, then $D^*(t) = 0$ and we have the following two cases:

a) If $P'(X(t) + V C(P')) + V C_{rc} < \theta(t)$, then $P^*(t) = P'$, $R^*(t) = P' - W(t)$.


3) If $P_{high} < P'$, then $R^*(t) = 0$ and we have the following two cases:

a) If $P_{high}(X(t) + V C(P_{high})) + V C_{dc} < \theta(t)$, then $P^*(t) = P_{high}$, $D^*(t) = W(t) - P_{high}$.


4) If $P' < P_{low}$, then $D^*(t) = 0$ and we have the following two cases:

a) If $P_{low}(X(t) + V C(P_{low})) + V C_{rc} < \theta(t)$, then $P^*(t) = P_{low}$, $R^*(t) = P_{low} - W(t)$.
