Stochastic Scheduling and Dynamic Programming

Ger Koole

CWI Tract 113, CWI, Amsterdam, 1995


Preface

This is a revision of my Ph.D. thesis, which was written in the winter of 1991-92, based on four years of research at Leiden University. During that time I studied various routing and scheduling problems, for which I (partially) characterized the optimal policies using the same technique: dynamic programming.

Over the last three years I found several related articles of which I was previously unaware, some new interesting results appeared, and I strengthened a few results myself. Based on that I prepared this revision. The sections which changed the most are 1.8, 1.9, 2.4, 3.7, and appendix A.

I would like to thank three people who contributed greatly to my thesis: my advisor Arie Hordijk, for his guidance, my office-mate Floske Spieksma, and Carly Giezen, for her support outside the office.

Sophia Antipolis, June 1995 Ger Koole


Contents

Introduction

1. Models with Markov Arrival Processes
   1.1. Markov Arrival Processes
   1.2. Symmetric customer assignment model
   1.3. Customer assignment model without waiting room
   1.4. Discrete-time customer assignment model
   1.5. Pathwise optimality
   1.6. Customer assignment model with rejection
   1.7. Series of parallel processors
   1.8. Customer assignment model with workloads
   1.9. Customer assignment model without information
   1.10. Markov Arrival Processes with multiple customer classes and server vacations
   1.11. Server assignment model with a single server
   1.12. Server assignment model with multiple servers

2. Models with Markov Decision Arrival Processes
   2.1. Markov Decision Arrival Processes
   2.2. Symmetric customer assignment model
   2.3. Tandems of customer assignment models
   2.4. Networks of customer assignment models with workloads
   2.5. Server assignment model with multiple servers
   2.6. Tandems of server assignment models with a single server
   2.7. Tandems of server assignment models with a single server and identical centers

3. Models with Dependent Markov Decision Arrival Processes
   3.1. Dependent Markov Decision Arrival Processes
   3.2. Asymmetric customer assignment model
   3.3. Symmetric customer assignment models
   3.4. Customer assignment models without waiting room
   3.5. Customer assignment models with rejections
   3.6. Server assignment model with multiple servers
   3.7. Server assignment model with a single server and a finite source

4. Proofs of dynamic programming results
   4.1. Proofs of chapter 1
   4.2. Proofs of chapter 2
   4.3. Proofs of chapter 3

5. Uniformization
   5.1. Introduction
   5.2. Uniformization with fixed parameter
   5.3. Continuous-time Bellman equation

Appendices
   A. The approximation of point processes
   B. Phase-type distributions of DFR/IFR distributions
   C. Majorization
   D. Computational issues

References
Index


Introduction

The title of this monograph consists of two parts, stochastic scheduling and dynamic programming. The former refers to a class of models, the latter to the method used to find optimal policies for these models. The models studied here can be divided into two classes: those in which customers are to be assigned on arrival to one of a number of queues, and those in which one or more servers are to be assigned to different customer classes or queues. Of great importance is the way in which customers arrive at the stations. Models with independent arrival streams are studied in chapter 1. Then we allow the arrival stream to depend on the numbers of customers in the queues, in such a way that controllable networks can be modeled with it. These and other network results can be found in chapter 2. In chapter 3 we generalize the arrival process even further, for example to include finite source models. Many results of chapters 1 and 2 are special cases of the results of chapter 3. Chapter 4 contains the proofs of the dynamic programming results. Chapter 5 considers methods by which we can translate the discrete-time results of chapters 1 to 3 into continuous-time results. We conclude with four appendices, respectively on weak convergence of arrival streams, on phase-type distributions with a monotone failure rate, on majorization, and on algorithms to compute optimal policies.

Summarizing, chapter 1 can be seen as an introduction, chapter 2 contains the network results, and in chapter 3 the dynamic programming results are handled in their greatest generality.

Chapter 1 starts by introducing the Markov Arrival Process (MAP), an arrival process based on a Markov process. In appendix A it is shown that the class of Markov Arrival Processes is dense in the class of all independent arrival processes. The MAP is taken as input to a model consisting of m parallel queues, with possibly finite buffers, each with its own exponential server. These types of models, in which each arriving customer is to be assigned to one of the queues, are called customer assignment models. When the service rates of all servers are equal, the policy that assigns arriving customers to the shortest non-full queue (the SQP) is optimal for a large class of cost functions, including the total number of customers. This is shown in section 1.2, by inductively proving properties of the discrete-time dynamic programming equation. A related model has no buffers at the servers, but different service rates. Here arriving customers should be sent to the fastest available server. For both models, there is a complete characterization of the allowable cost functions, to be found in appendix C.



The previous results are only interesting in continuous time, due to the way of modeling. Section 1.4 considers a simple symmetric discrete-time model with simultaneous events, in which the SQP is optimal. In section 1.5 we generalize the result of section 1.2 to pathwise optimality of the SQP. So far, all cost functions depend on the number of customers in the queues. We can also consider the number of departed customers. In section 1.6 it is shown that the SQP is again optimal, and that we can allow rejections. Section 1.7 deals with maintenance models closely related to the models of the earlier sections.

In sections 1.2 to 1.7 the information available to the controller consists of the numbers of customers in the queues. In section 1.8 the amount of work in each queue is known. Here the policy that assigns to the queue with the shortest workload (the SWP) is optimal. In section 1.9 there is no information at all, not even on previous assignments. It is shown that the optimal policy divides the arrivals equally among the queues. It is the only result in this chapter not obtained by dynamic programming.

Now we move to the server assignment models. First we generalize the MAP to include server vacations and arrivals in multiple classes. In sections 1.11 and 1.12 we deal with the following model. Customers arrive in m different classes, and all customers in the same class have an exponential service time with the same mean. There are one or more identical servers available, which have to be assigned to the customers present. Both models with a single and with multiple servers are studied, giving conditions on the cost functions for list policies to be optimal. As special cases we find the following well-known results. In the single-server case the µc-rule minimizes the weighted number of customers. In the multiple-server case the makespan is minimized by the LEPT policy (LEPT stands for longest expected processing time first). In the single-server case we generalize the results to IFR and DFR service time distributions.

In chapter 2 we consider controllable tandems and networks of centers, each center being of one of the types discussed in chapter 1. Consider the last center in a tandem system, in which the control in each center is allowed to depend on the state of the whole network. Then we cannot use the optimality results of chapter 1 to obtain the optimal policy in the last center, because the arrivals, through the control in the previous centers, depend on the state of that center. With a Markov Decision Arrival Process (MDAP) we deal with this type of dependency, by using it to model all but the last center. It is shown that the SQP, for the model of section 1.2, is still optimal for this type of arrival stream. An interesting question is what the optimal policy is in the first of two centers in tandem. Some results and counterexamples are given in section 2.3. We also analyze the model where the policies are allowed to depend on the workloads. It appears that the results are stronger than the results for the model based on the numbers of customers.

The results on the server assignment models are not as easily generalized to arrivals according to an MDAP. More precisely, the generalization holds only if the policy which is optimal in the case of an MAP processes the jobs in decreasing order of expected processing times. This means that LEPT also minimizes the makespan for dependent arrivals, but in the single-server case the µc-rule is only optimal if it coincides with LEPT. Counterexamples are given for the case that it does not. In sections 2.6 and 2.7 we consider tandem systems with each center having a single server. Section 2.6 deals with heavy traffic results. In section 2.7 we assume that the service time distribution of each customer is the same in both centers. Then we have the striking result that each work-conserving policy minimizes the makespan.

Chapter 3 starts with generalizing the MDAP to a Dependent Markov Decision Arrival Process (DMDAP). Now we can also model a finite source. In section 3.2 a customer assignment model with asymmetric service times is studied. The following partial characterization of the optimal policy is given: if queue k has fewer customers and a faster server than queue l, then an arriving customer is better assigned to queue k than to queue l. From this result the results of sections 1.2 and 1.3 follow. In section 3.3 we again study symmetric models, but now with batch arrivals, and with non-routable arrivals and an assignable server. In section 3.4 we consider a model with asymmetric servers, multiple customer classes and no buffer space. Each customer has blocking costs, depending on its class. Various monotonicity results are proved. Then we move again to the server assignment models. Results for the multiple-server case are generalized to partial availability of servers. Here we cannot model a finite source. We end the chapter by considering a model with a single server and a finite source.

Most results are obtained by proving structural properties for discrete-time models. Typically, we formulate the dynamic programming equation and prove certain inequalities by induction, provided that they hold for the cost functions. In most models we have an inequality giving the optimal policy, an inequality showing monotonicity, and, in the customer assignment models, an inequality showing symmetry of the costs, all in n steps. The decision points of the discrete-time model are the jump times of the original continuous-time model. In fact, the sojourn times of the embedded chain are all exponentially distributed with parameter α. By increasing this uniformization parameter we show in section 5.3 that the optimal policies in the continuous-time models have the same properties as the optimal policies in the discrete-time models. If the optimal policy is myopic, that is, the same decision rule is optimal for each horizon, then we can prove the continuous-time results by considering a fixed α. This is the subject of section 5.2. All models considered in chapter 1 have myopic optimal policies.

The main result of appendix A has already been discussed. There multi-dimensional phase-type distributions are used, and it is shown that they are dense in the class of all distributions. In appendix B we deal with one-dimensional phase-type distributions. By the Markovian structure of our models, we cannot deal with general service time distributions. To prove results for (service time) distributions with monotone failure rates, we need a characterization of the approximating phase-type distributions. This is provided in appendix B.



As we said, our inductive results give conditions on cost functions. For several customer assignment models, complete characterizations of the sets of allowable cost functions are given in appendix C.

In some models where optimal policies could not be given, numerical experiments were done. Computational methods were also used to provide counterexamples. Appendix D deals with these methods.

Most models of chapter 1 can already be found in the literature. Existing results are generalized, for example to finite buffers and to more general cost functions. Detailed discussions of the existing literature can be found in the appropriate sections of chapter 1. The main generalizations of chapters 2 and 3 are the dependent arrival processes. Chapter 5 adapts existing results for use in the models of chapters 1, 2 and 3. Also in the appendices several new results are presented.


Chapter 1

Models with Markov Arrival Processes

1.1. Markov Arrival Processes

We start this chapter by introducing the arrival process.

1.1.1. Definition. (Markov Arrival Process) Let Λ be the countable state space of a Markov process with transition intensities λ_{xy}, x, y ∈ Λ. When this process moves from x to y, an arrival occurs with probability q_{xy}. We call the triple (Λ, λ, q) a Markov Arrival Process (MAP).

Arrival processes with the arrivals on the jumps of a Markov process were first introduced by Rudemo [61]. For computational results we refer to an article by Neuts [51] and to chapter 5 of his latest book [53].

With the MAP the departure process of most queueing systems with exponentially distributed sojourn times can easily be modeled, which can then be used as input to another system. As an example, take the M|M|1 queue with a Poisson(λ) arrival stream and service intensity μ. Construct the MAP (Λ, λ, q) corresponding to the departures as follows: take Λ = {0, 1, . . .}, λ_{i,i+1} = λ and λ_{i,i−1} = μ if i ≥ 1. All other transitions have intensity 0. Take q_{i,i+1} = 0 and q_{i,i−1} = 1.
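This construction can be checked by simulation. The following sketch (not part of the thesis; Python, with illustrative parameter values) simulates the MAP above and records the jump times that carry an arrival, i.e. the departure epochs of the M|M|1 queue:

```python
import random

def simulate_mm1_departure_map(lam, mu, horizon, seed=0):
    """Simulate the MAP (Lambda, lambda, q) whose arrivals are the
    departures of an M|M|1 queue: state i is the queue length,
    lambda_{i,i+1} = lam carries no arrival mark, and lambda_{i,i-1} = mu
    (for i >= 1) has q = 1, so every down-jump is a MAP arrival."""
    rng = random.Random(seed)
    t, i = 0.0, 0
    departures = []
    while True:
        rate = lam + (mu if i > 0 else 0)  # total intensity out of state i
        t += rng.expovariate(rate)         # exponential sojourn time
        if t > horizon:
            return departures
        # choose the transition in proportion to its intensity
        if rng.random() < lam / rate:
            i += 1                         # up-jump: no MAP arrival
        else:
            i -= 1                         # down-jump: MAP arrival (departure)
            departures.append(t)
```

With λ < μ the queue is stable, so over a long horizon the rate of recorded arrivals (departures) approaches λ.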

Now we show how to model a phase-type renewal process with an MAP.

Phase-type renewal processes. Assume we have a renewal process with independent interarrival times of phase-type, as discussed in Neuts [52]. Phase-type distributions are defined as follows. We have a Markov process with m+1 states, where state m+1 is absorbing and the other m states are transient. The transition intensity from state x to y is denoted by t_{xy}, and α_x is the probability that the system starts in state x. The time until absorption has the phase-type distribution. Assume α_{m+1} = 0, i.e. there is no atom at 0. To model this renewal process with an MAP (Λ, λ, q), we take the parameters as follows: Λ = {1, . . . , m}, λ_{xy} = t_{xy} + t_{x,m+1} α_y and q_{xy} = t_{x,m+1} α_y / (t_{xy} + t_{x,m+1} α_y). We see that when the original process moves to m+1, it is immediately restarted and moves to state y with probability α_y.
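The parameter construction above is mechanical, so it can be written down directly. The following hypothetical helper (not from the thesis; it assumes the intensities t and the initial distribution α are given as plain lists) builds the λ and q of the MAP:

```python
def phase_type_to_map(t, alpha):
    """Build the MAP (Lambda, lam, q) of a phase-type renewal process.
    t[x][y] for x, y in 0..m-1 are the transient intensities, t[x][m] is
    the absorption intensity from x, and alpha[y] is the restart
    probability. Returns (lam, q) with lam[x][y] = t[x][y] +
    t[x][m]*alpha[y], and q[x][y] the probability that the jump x -> y
    carries an arrival (a renewal)."""
    m = len(alpha)
    lam = [[t[x][y] + t[x][m] * alpha[y] for y in range(m)]
           for x in range(m)]
    q = [[(t[x][m] * alpha[y] / lam[x][y]) if lam[x][y] > 0 else 0.0
          for y in range(m)] for x in range(m)]
    return lam, q
```

For instance, for an Erlang-2 interarrival time with rate ν per phase (t_{12} = t_{2,3} = ν, α_1 = 1) the construction yields λ_{12} = λ_{21} = ν, with an arrival only on the jump from state 2 back to state 1.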

Also the Markov Modulated Poisson Process (MMPP) can be modeled with an MAP.



Markov Modulated Poisson Process. An MMPP is governed by a Markov process with state space Λ̂ and transition intensities λ̂_{xy}. When the system is in state x, customers arrive with intensity μ_x. As this does not change the arrival process, we can assume λ̂_{xx} = 0 for all x. This process can easily be modeled with an MAP (Λ, λ, q): take

    Λ = Λ̂,
    λ_{xy} = μ_x if x = y and λ_{xy} = λ̂_{xy} otherwise,
    q_{xy} = 1 if x = y and q_{xy} = 0 otherwise.

The MMPP is often used, both theoretically and practically, as it is easy to implement. However, models like the M|M|1 queue above cannot be modeled with it. More details on the MMPP are given in Asmussen & Koole [3].

It can be shown that the class of MAPs is dense in the class of all arrival processes. This is shown in appendix A. The approximating MAPs used there have bounded rates in each state, i.e. ∑_y λ_{xy} ≤ γ for all x, for some constant γ. By adding transitions from x to x with q_{xx} = 0, we can modify the MAP such that ∑_y λ_{xy} = γ in each x. This is assumed throughout.

1.2. Symmetric customer assignment model

Now consider the following model. Customers arrive according to an MAP at a system consisting of m parallel queues. On arrival the customers have to be assigned to one of the queues. This assignment may depend on the state of the MAP reached at the arrival instant, and on the previous queue lengths. Queue j has a buffer of size B_j, including the customer being served. We write B = (B_1, . . . , B_m). It is not allowed to assign a customer to a full queue, unless all queues are full. Each queue has a server which serves with rate μ. Our goal is to show that each arriving customer should be assigned to the shortest non-full queue, for various objective functions.

The total transition rate out of each state is bounded by γ + mμ. Now the system can be seen to operate as follows. The time between two transitions is exponentially distributed with parameter α ≥ γ + mμ. The transitions at the jump times have probabilities proportional to their rates. Central in our approach is the analysis of the Markov chain on the jump times, the embedded Markov chain. This method is called uniformization. (For more details, see chapter 5.)

At a jump, the probability of a transition from x to y in the arrival process is λ_{xy}/α, the probability of a departure at a queue is μ/α, and the arrival probabilities remain q_{xy}. For notational simplicity we assume α = 1, i.e. we use the same variables for the embedded discrete-time model as for the original continuous-time model. Note that a transition in the MAP and a departure at one of the queues cannot happen simultaneously. The state of our model is denoted by (x, i), with x ∈ Λ the state of the MAP and i = (i_1, . . . , i_m) the state of the queues, i_j being the number of customers in queue j. Then, at each decision epoch, with probability λ_{xy} the arrival process moves from x to y, giving an arrival with probability q_{xy}, and there is a (potential) departure of a customer at each queue with probability μ. With probability 1 − γ − mμ a dummy transition occurs. Now define v^n_{(x,i)} as the expected costs over n jumps of the embedded Markov chain, starting in state (x, i). The v^n_{(x,i)} can be computed recursively, using the following dynamic programming (dp) equation:

    v^{n+1}_{(x,i)} = ∑_y λ_{xy} ( q_{xy} min_j v^n_{(y,i+e_j)} + (1 − q_{xy}) v^n_{(y,i)} )
                      + ∑_{j=1}^{m} μ v^n_{(x,(i−e_j)^+)} + (1 − γ − mμ) v^n_{(x,i)}.        (1.2.1)

The minimization ranges over all j for which the queues are not full, i.e. for which i_j < B_j. If i = B, add action 0 with e_0 = 0.

Note that there are no immediate costs. The only costs are the v^0, meaning that there are costs associated with the state reached in the end. Omitting the immediate costs does not restrict generality, but makes the analysis more elegant. Also note that relation (1.2.1) is not in the standard dynamic programming form, because the action taken may depend on the current state of the arrival process y and not just on x. (In chapter 5 it is rewritten to bring it into the standard form.) The following lemma gives relations between the expected minimal costs in different states of the model.
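Equation (1.2.1) can be iterated numerically. The sketch below (not from the thesis) specializes to Poisson arrivals, which form a one-state MAP with q = 1, and to m = 2 queues; the illustrative parameters satisfy λ + 2μ ≤ 1, so that α = 1 uniformizes. The resulting v^n can be used to check the inequalities of lemma 1.2.1 on a concrete instance:

```python
from itertools import product

def value_iteration_sqp(lam=0.3, mu=0.3, B=(3, 3), n_steps=20):
    """Value iteration for dp equation (1.2.1), specialized to Poisson
    arrivals (a one-state MAP with q = 1) and m = 2 symmetric servers.
    Cost function v0(i) = i1 + i2, the total number of customers."""
    states = list(product(range(B[0] + 1), range(B[1] + 1)))
    v = {i: i[0] + i[1] for i in states}                 # v^0
    for _ in range(n_steps):
        nv = {}
        for i in states:
            # admissible assignments: non-full queues (dummy if all full)
            opts = [tuple(i[j] + (j == k) for j in range(2))
                    for k in range(2) if i[k] < B[k]] or [i]
            arrival = lam * min(v[o] for o in opts)
            service = mu * sum(v[(max(i[0] - (k == 0), 0),
                                  max(i[1] - (k == 1), 0))]
                               for k in range(2))
            nv[i] = arrival + service + (1 - lam - 2 * mu) * v[i]
        v = nv
    return v
```

For this instance one can verify that v^n inherits (1.2.2) to (1.2.4) from v^0: joining the shorter queue is never worse, fewer customers is better, and the value function is symmetric.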

1.2.1. Lemma. If

    w_{(x,i+e_{j1})} ≤ w_{(x,i+e_{j2})}   for i_{j1} ≤ i_{j2}, i + e_{j1} + e_{j2} ≤ B,   (1.2.2)
    w_{(x,i)} ≤ w_{(x,i+e_{j1})}          for i + e_{j1} ≤ B,                             (1.2.3)

and

    w_{(x,i)} = w_{(x,i*)}                for i* a permutation of i, i* ≤ B                (1.2.4)

hold for the cost function w = v^0, then they hold for all v^n.

For the proof we refer to the proof of corollary 3.3.2, because the model studied here is a special case of the model studied in section 3.3.

Note that the lemma gives conditions on v^0, the cost function. Let us interpret the equations. Equation (1.2.2) gives the optimal policy. If we have to decide between assigning a customer to queue j1 or to queue j2, we have to choose j1 if there are fewer customers in that queue. Thus amongst the non-full queues, the shortest is selected. This policy is called the Shortest Queue Policy (SQP). The SQP tries to balance the number of customers in the queues.

Equations (1.2.3) and (1.2.4) are needed to prove (1.2.2). The former gives a general objective: fewer customers is better. The latter shows that the value function is symmetric, even though the buffer sizes can be different.

We assume that the costs are bounded, either from above or from below. This ensures that, in the continuous-time model, the costs at time T, for all T, are well defined. By (1.2.3), the costs are bounded from below by v^n_{(x,0)} for fixed x, meaning that the assumption is not very restrictive. We assume it throughout this chapter. Now we can prove that the SQP minimizes the costs at time T, using corollary 5.2.2.

1.2.2. Theorem. For all T, the SQP minimizes the costs at T (from 0 to T) for all cost functions satisfying (1.2.2) to (1.2.4).

It remains to study the cost functions that satisfy the conditions. An obvious cost function is v^0_{(x,i)} = i_1 + · · · + i_m = |i|, meaning that the SQP minimizes the total number of customers in the system in expectation, both at the time horizon T and from 0 to T. Another cost function that satisfies the conditions is v^0_{(x,i)} = max_j i_j, the maximum queue length. Note that the dependence on the state of the arrival process can be quite general: if we associate costs c_x with state x, cost functions like v^0_{(x,i)} = c_x + |i| and v^0_{(x,i)} = c_x max_j i_j for c_x ≥ 0 are allowed. In fact, a necessary and sufficient condition is that for x fixed the costs must be weak Schur convex in i, as shown in appendix C. Not only v^0_{(x,i)} = |i| but also v^0_{(x,i)} = I{|i| > s} is allowed for all s. (I{·} is the indicator function.) This means that the SQP minimizes the probability that there are more than s customers in the system at T, i.e. the SQP stochastically minimizes the number of customers in the system. It is easy to see that if v^0_{(x,i)} = c_{(x,i)} satisfies (1.2.2) to (1.2.4), so does v^0_{(x,i)} = I{c_{(x,i)} > s}. This means that each cost function which is minimized by the SQP in expectation at T is minimized stochastically too. Summarizing, the SQP minimizes all Schur convex functions stochastically.

The first to prove the optimality of the SQP for minimizing the number of customers in the system was Winston [82], in 1977. He assumed Poisson arrivals and infinite buffers. Weber [75] extended this to arbitrary arrivals, but his argument for service time distributions with an increasing failure rate was shown to be false in Sparaggis & Towsley [68]. Whitt [81] showed that the SQP is not optimal for a model with U-shaped failure rates. Proposition 8.3.2 of Walrand [74] gives a coupling proof for the exponential server case. Another proof of the pathwise optimality of the SQP is given in Hordijk & Koole [22]. We give yet another coupling proof based on dynamic programming in section 1.5. In Hordijk & Koole [21] finite buffers are introduced. There the number of departed customers is considered, rather than the number of customers in the system. Blocking is allowed. This model is discussed in section 1.6. The model of Towsley et al. [71] is exactly the model studied here. Johri [31] and recently Menich & Serfozo [46] weakened the conditions on the arrival and service rates. Similar conditions are studied in chapter 3. Finally, Sparaggis & Towsley [68] obtained the result for service times with an increasing likelihood ratio.

As said, in section 1.6 we consider a model in which the reward is related to the number of departed customers. Other customer assignment models can be found in sections 1.3 to 1.9. In chapter 2 we generalize the present result to a model in which a certain dependency of the arrival process on the state of the queues is allowed. In chapter 3 we study models with different service rates for different queues. There we give a partial characterization of the optimal policy. Together with this, assumptions on arrival and service rates are generalized.

1.3. Customer assignment model without waiting room

When we drop the condition that the service rates must be equal in each queue, we get an interesting problem. Numerical computations indicate that there is no optimal policy with a nice structure; for example, in the case of Poisson arrivals, the optimal policy depends on the arrival rate. In chapter 3 we give a partial characterization of the optimal policy using dynamic programming, and there we go into more detail on the numerical results obtained by various researchers. Here we consider a special case where the optimal policy can be completely described, namely the case where there is, besides the customer in service, no space in the queues, i.e. B = (1, . . . , 1). Queue j has a server with service rate μ_j, and we take μ_1 ≥ · · · ≥ μ_m for convenience. We show that for various cost functions it is optimal to assign each arriving customer to the fastest available server. We call this policy the Fastest Queue Policy (FQP).

The first to address this problem was Seth, whose paper [64] appeared in 1977, the same year as Winston's seminal paper on the SQP [82]. He analyzed the model with m = 2 servers and Poisson arrivals. Then there are only two policies to be considered, for which the stationary distribution is easily computed. The FQP minimizes the blocking probability. Derman et al. [17] generalize this result to multiple servers and general arrivals. Recent results for this type of model are discussed in section 3.4, where we consider a similar model with class-dependent blocking costs.

Seth [64] also gives a counterexample to the optimality of the FQP for non-exponential service times. A similar result is obtained by Cooper & Palakurthi [14]. These results show the sensitivity of this model to the shape of the service time distributions.

Now we derive the optimality of the FQP. As in the previous section, the model is uniformizable. Assume γ + μ_1 + · · · + μ_m ≤ 1. The dynamic programming formulation is:

    v^{n+1}_{(x,i)} = ∑_y λ_{xy} ( q_{xy} min_j v^n_{(y,i+e_j)} + (1 − q_{xy}) v^n_{(y,i)} )
                      + ∑_{j=1}^{m} μ_j v^n_{(x,(i−e_j)^+)} + (1 − γ − μ_1 − · · · − μ_m) v^n_{(x,i)}.        (1.3.1)

The minimization ranges over all queues for which i_j = 0. Note the similarity with (1.2.1).
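As with (1.2.1), equation (1.3.1) can be iterated numerically. The following sketch (not from the thesis) uses Poisson arrivals as a one-state MAP, m = 3 servers with μ_1 ≥ μ_2 ≥ μ_3, illustrative parameters with λ + μ_1 + μ_2 + μ_3 ≤ 1, and the reading that an arrival to a fully occupied system is lost:

```python
from itertools import product

def value_iteration_fqp(lam=0.3, mu=(0.3, 0.2, 0.1), n_steps=20):
    """Value iteration for dp equation (1.3.1) with B = (1,...,1),
    Poisson arrivals (one-state MAP) and servers ordered fastest first.
    Cost function v0(i) = |i|, the number of busy servers."""
    m = len(mu)
    states = list(product((0, 1), repeat=m))
    v = {i: sum(i) for i in states}                      # v^0
    for _ in range(n_steps):
        nv = {}
        for i in states:
            empty = [k for k in range(m) if i[k] == 0]
            # an arrival to a full system is blocked (state unchanged)
            arr = min((v[tuple(i[j] + (j == k) for j in range(m))]
                       for k in empty), default=v[i])
            srv = sum(mu[k] * v[tuple(max(i[j] - (j == k), 0)
                                      for j in range(m))]
                      for k in range(m))
            nv[i] = lam * arr + srv + (1 - lam - sum(mu)) * v[i]
        v = nv
    return v
```

Comparing v^n at neighbouring states then confirms inequalities (1.3.2) and (1.3.3) for this instance: sending an arrival to the fastest empty server is never worse.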

The following lemma gives the optimality of the FQP.

1.3.1. Lemma. If

    w_{(x,i+e_{j1})} ≤ w_{(x,i+e_{j2})}   for i_{j1} = i_{j2} = 0, j1 < j2,   (1.3.2)

and

    w_{(x,i)} ≤ w_{(x,i+e_{j1})}          for i_{j1} = 0                       (1.3.3)

hold for the cost function w = v^0, then they hold for all v^n.

Equation (1.3.2) gives the optimality of the FQP. For the proof we refer to the proof of the equivalent lemma for the more general model studied in section 3.4. There it is shown how the optimality of the FQP follows from a more general result on an asymmetric customer assignment problem. Also the symmetric model of section 1.2 is a special case of that model.

1.3.2. Theorem. The FQP minimizes the costs at T (from 0 to T) for all cost functions satisfying (1.3.2) and (1.3.3).

Let us consider the cost functions satisfying the conditions. As in the previous section, v^0_{(x,i)} = i_1 + · · · + i_m = |i| is allowed. Again, each allowable cost function is also minimized stochastically. This gives us, if we take v^0_{(x,i)} = I{|i| ≥ m}, that the FQP minimizes the blocking probability at each T.

For the SQP we had a complete characterization of all allowable cost functions. Here something similar holds: the allowable cost functions are the set of functions increasing in an ordering, which is called the partial sum ordering in Chang et al. [12]. In appendix C the ordering is introduced and the equivalence is shown.

1.4. Discrete-time customer assignment model

So far we studied a continuous-time model by analyzing a discrete-time one. Of course, discrete-time models themselves are also interesting. Unfortunately, the model of lemma 1.2.1 is not very realistic in discrete time: arrivals and departures cannot happen simultaneously and therefore they are not independent. The optimality result for this model is more involved than for the model without simultaneous events. Therefore we analyze the following simple model. There are 2 identical parallel queues with infinite capacity, each with one server. When a customer is served during a time slot it leaves the system with probability μ, giving geometric service times with average 1/μ. The interarrival times are geometric with parameter λ. The state is denoted by (i, j), with i (j) the number of customers in queue 1 (2).

The dynamic programming equation becomes:

    v^{n+1}_{(i,j)} = λ min{ μ² v^n_{((i−1)^+ +1,(j−1)^+)} + μ(1−μ) v^n_{((i−1)^+ +1,j)}
                             + (1−μ)μ v^n_{(i+1,(j−1)^+)} + (1−μ)² v^n_{(i+1,j)} ,
                             μ² v^n_{((i−1)^+,(j−1)^+ +1)} + μ(1−μ) v^n_{((i−1)^+,j+1)}
                             + (1−μ)μ v^n_{(i,(j−1)^+ +1)} + (1−μ)² v^n_{(i,j+1)} }
                      + (1−λ) ( μ² v^n_{((i−1)^+,(j−1)^+)} + μ(1−μ) v^n_{((i−1)^+,j)}
                                + (1−μ)μ v^n_{(i,(j−1)^+)} + (1−μ)² v^n_{(i,j)} ).        (1.4.1)


Note that if a queue is empty there is no departure even if an arrival occurs at that queue. We have the same equations as in the model without simultaneous events:

1.4.1. Lemma. If

w(i+1,j) ≤ w(i,j+1) for i ≤ j, (1.4.2)

w(i,j) ≤ w(i+1,j), (1.4.3)

and

w(i,j) = w(j,i) (1.4.4)

hold for the cost function v0, then they hold for all vn.

The proof can be found in chapter 4. The optimal policy is not immediately clear from the equations. In the proof however it is shown that for i ≤ j we have

µ²vn((i−1)++1,(j−1)+) + µ(1−µ)vn((i−1)++1,j) + (1−µ)µvn(i+1,(j−1)+) + (1−µ)²vn(i+1,j) ≤
µ²vn((i−1)+,(j−1)++1) + µ(1−µ)vn((i−1)+,j+1) + (1−µ)µvn(i,(j−1)++1) + (1−µ)²vn(i,j+1),

which are the terms in the minimization of the dynamic programming equation. Thus the SQP is optimal.

1.4.2. Theorem. The SQP minimizes the costs at each n for all cost functions satisfying (1.4.2) to (1.4.4).

The equations derived here are equivalent to (1.2.2) to (1.2.4), for m = 2. Thus the same cost functions are allowed here.

The generalization to more than 2 queues seems to be straightforward, although we did not check that in full detail. When we introduce buffers however, problems arise. For example, when some queues are full we have to specify the allowable actions and the actual point in time at which the arrival occurs: before or after the departure. After the departure seems from a modeling point of view the most interesting; this results in a model where we decide on the assignment after the departure of the customers. For example, if all queues are full, this means assigning to a queue where a departure occurs. We conjecture that also in this case the SQP is optimal. If the assignment occurs before the departures take place, then the SQP might not be optimal.

In the next section we return to the study of continuous-time models.


1.5. Pathwise optimality

In this section we want to prove the pathwise optimality of the SQP for the continuous-time model of section 1.2. There we showed that the SQP is stochastically optimal at T for all allowable cost functions. This is equivalent to saying that, for an arbitrary policy R, we can couple the realizations such that the costs at T are lower under the SQP. To prove pathwise optimality, we have to show that for coupled realizations the SQP has lower costs jointly across time. Again, we want to use dynamic programming for our result. However, in the dynamic programming recursion we compute expected costs: vn are the expected costs after n transitions. We give a similar recursion with random variables.

In the previous sections it was sufficient to know the transition rates. Here however we need to know the stochastic behavior and specify the underlying probability spaces. In section 1.2 it is argued that our model is governed by two independent processes: one governing the jump times and one governing the transitions themselves. The jump process is the same as in section 1.2, we will not further specify it. The transitions are generated by independent uniformly distributed random variables. Assume the current state is (x, i). Let U be the r.v. generating the transition at the current jump time. Let (j) be the index of the jth smallest component of i. If i(j) = i(j+1), take (j) < (j + 1). For example, if i = (2, 1, 0, 1), then (1) = 3, (2) = 2, (3) = 4 and (4) = 1. Note that i(1) ≤ · · · ≤ i(m), the usual definition of i(·). Assume that the states of the MAP are numbered. The system moves to (y, i + ej) if U ∈ [∑_{z<y} λxz, ∑_{z<y} λxz + λxy qxy) and if action j was chosen in state (y, i). The system moves to (y, i) if U ∈ [∑_{z<y} λxz + λxy qxy, ∑_{z≤y} λxz), and to (x, (i − e(j))+) if U ∈ [γ + (j − 1)µ, γ + jµ). A dummy transition occurs if U ∈ [γ + mµ, 1]. Note that the actual coupling can be found in the term on departures: in different states, departures at the jth longest queue in both models are coupled. Although the method of proof is different, this is the same coupling as in Walrand [74] and Hordijk & Koole [22].

Let Un, n ≥ 1, be i.i.d. random variables, uniformly distributed on [0, 1]. Choose random variables V0(x,i), for all x and i, on the same probability space, and define Vn(x,i), n ≥ 1, by the following recursion:

Vn+1(x,i) =
  min_j Vn(y,i+ej)      if Un+1 ∈ [∑_{z<y} λxz, ∑_{z<y} λxz + λxy qxy), y ∈ Λ,
  Vn(y,i)               if Un+1 ∈ [∑_{z<y} λxz + λxy qxy, ∑_{z≤y} λxz), y ∈ Λ,
  Vn(x,(i−e(j))+)       if Un+1 ∈ [γ + (j − 1)µ, γ + jµ), j = 1, . . . , m,
  Vn(x,i)               if Un+1 ∈ [γ + mµ, 1].

The allowable actions are the same as in section 1.2.


The minimization in the recursion is taken on each sample path. In general this minimum need not be attained by a unique action. In the next lemma we show that, in this case, it is attained by the same action in each state, namely the action that assigns to the shortest queue, which gives the optimality of the SQP for the recursion.

1.5.1. Lemma. If

W(x,i+ej1) ≤ W(x,i+ej2) for ij1 ≤ ij2, i + ej1 + ej2 ≤ B, (1.5.1)
W(x,i) ≤ W(x,i+ej1) for i + ej1 ≤ B, (1.5.2)

and

W(x,i) = W(x,i∗) for i∗ a permutation of i, i∗ ≤ B (1.5.3)

hold for the cost function W = V 0, then they hold for all V n.

The proof of lemma 1.5.1 can be found in chapter 4. To understand the meaning of this lemma, condition on a realization of the jump times. Number the r.v.'s governing the transitions in reverse and condition also on them. Then the lemma tells us that the costs are minimized by the SQP. Note that the coupling is implicit in the recursion; for all policies the same Un are used. Thus the lemma shows that the costs are lowest under the SQP for each realization. This gives of course the optimality at each T but also the optimality over the whole path.

1.5.2. Theorem. The SQP minimizes the costs pathwise for all cost functions satisfying (1.5.1) to (1.5.3).

The costs are allowed to be random variables. Apart from that the conditions are similar to the conditions of the previous sections.

Note that from the pathwise optimality it also follows that the sum of the waiting times of the first n customers is minimized stochastically by the SQP.
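The coupling can be made concrete with a small forward simulation (an illustration, not from the thesis). We take the simplest instance: a one-state arrival process (Poisson arrivals, uniformized probability λ per event), m = 2 queues with infinite buffers, and the rank-based departure coupling described above; the parameter values are arbitrary. The same sequence of uniforms drives every policy.

```python
import random

def simulate(policy, us, m=2, lam=0.3, mu=0.2):
    """Forward version of the coupled recursion: the same uniforms us drive
    every policy.  u in [0, lam): an arrival, assigned by `policy`;
    u in [lam + (j-1)*mu, lam + j*mu): a departure at the queue holding the
    j-th smallest number of customers (ties broken by index, as in the text);
    u in [lam + m*mu, 1]: a dummy transition."""
    q = [0] * m
    totals = []
    for u in us:
        if u < lam:
            q[policy(q)] += 1
        else:
            j = int((u - lam) / mu)      # rank of the queue that may lose one
            if j < m:
                k = sorted(range(m), key=lambda a: (q[a], a))[j]
                q[k] = max(q[k] - 1, 0)
        totals.append(sum(q))
    return totals

sqp = lambda q: min(range(len(q)), key=lambda a: (q[a], a))   # shortest queue
lqp = lambda q: max(range(len(q)), key=lambda a: (q[a], a))   # longest queue

rng = random.Random(1)
us = [rng.random() for _ in range(5000)]
t_sqp, t_lqp = simulate(sqp, us), simulate(lqp, us)
```

Since the total number of customers is an allowable cost function, theorem 1.5.2 gives t_sqp[n] ≤ t_lqp[n] for every n on every coupled sample path.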


1.6. Customer assignment model with rejection

Here we study a model which is similar to that of section 1.2. However, the policies and the type of cost functions studied are different. Concerning the policies, it is allowed to send a customer to a full queue, meaning that it is rejected. By introducing an extra queue without waiting room in the buffer, we can add a rejection option in each state. The type of cost functions studied here is concerned with the number of customers that have already departed. This is the model studied in Hordijk & Koole [21], but we choose to prove it a little differently. In view of the objective it would be appropriate to have a model with rewards, but in order to agree with the other models we study costs. We add an extra variable to the state space (x, i), which counts the number of departed customers, i.e. if a departure occurs at queue j the system moves from (x, i, k) to (x, i − ej, k + 1). The dynamic programming equation is

vn+1(x,i,k) = ∑_y λxy ( qxy min_j vn(y,i+ej∧B,k) + (1 − qxy) vn(y,i,k) ) +
µ ∑_{j=1}^{m} ( δij vn(x,i−ej,k+1) + (1 − δij) vn(x,i,k) ) + (1 − γ − mµ) vn(x,i,k).   (1.6.1)

Because of the rejection option the minimization ranges over all j. Of course, instead of adding the variable k, we could have taken immediate costs. This however would only have given results in expectation instead of stochastic results.

The analysis continues as usual:

1.6.1. Lemma. If

w(x,i+ej1,k) ≤ w(x,i+ej2,k) for ij1 ≤ ij2, i + ej1 + ej2 ≤ B, (1.6.2)
w(x,i,k+1) ≤ w(x,i+ej1,k) for i + ej1 ≤ B, (1.6.3)
w(x,i+ej1,k) ≤ w(x,i,k) for i + ej1 ≤ B (1.6.4)

and

w(x,i,k) = w(x,i∗,k) for i∗ a permutation of i, i∗ ≤ B (1.6.5)

hold for the cost function v0, then they hold for all vn.

The present model is a special case of the model of section 3.5. Thus for the proof of the lemma we refer to the derivation in the beginning of that section. Equation (1.6.2) is by now well known; we should assign to the shortest queue. Equation (1.6.3) states that the costs are smaller when customers leave quickly. Equation (1.6.4) says that a full system is better. Note that it is the reverse of (1.2.3); it allows us to include rejection as an action without losing the optimality of the SQP. Equation (1.6.5) is again symmetry.

Of course all cost functions satisfying (1.6.2) to (1.6.5) are allowed, but cost functions depending only on i are not of interest here, because (1.6.3) and (1.6.4) would give that the costs are constant in each state. Of interest here is v0(x,i,k) = −k, meaning that, when starting in (x, i, 0), the SQP maximizes the expected number of departed customers. Also I{k ≤ s} is allowable for all s, giving the following theorem.

1.6.2. Theorem. The SQP maximizes the number of departed customers between 0 and T stochastically.
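Theorem 1.6.2 can be checked numerically with a small sketch (not from the thesis; a one-state MAP, i.e. Poisson arrivals, two queues with buffer size B, and arbitrary uniformized rates). Since v0(x,i,k) = −k and k increases by 1 at each departure, we can eliminate k and iterate directly on the minimal expected value of minus the number of departures:

```python
B = 2                    # buffer size of each of the two queues (assumption)
lam, mu = 0.3, 0.3       # uniformized rates, lam + 2*mu <= 1

def step(w):
    """One step of (1.6.1) with the counter k eliminated: a departure
    contributes -1 immediately instead of raising k by one."""
    new = {}
    for (i, j) in w:
        a1 = w[min(i + 1, B), j]     # assign to queue 1 (rejection if full)
        a2 = w[i, min(j + 1, B)]     # assign to queue 2 (rejection if full)
        d1 = w[i - 1, j] - 1 if i > 0 else w[i, j]
        d2 = w[i, j - 1] - 1 if j > 0 else w[i, j]
        new[i, j] = (lam * min(a1, a2) + mu * (d1 + d2)
                     + (1 - lam - 2 * mu) * w[i, j])
    return new

w = {(i, j): 0.0 for i in range(B + 1) for j in range(B + 1)}
for _ in range(50):
    w = step(w)
```

Assigning to the shorter queue attains the minimum in every state, and fuller states have lower values, in line with (1.6.2) and (1.6.4); in particular, rejection is never strictly better than joining a non-full queue.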

1.7. Series of parallel processors

A type of model related to the symmetric customer assignment model is the following, introduced by Katehakis & Melolidakis [33]. We have a series of m groups of components, group j consisting of Bj components. The system is up when at least one component in each group is functioning. New components arrive according to an MAP. The problem is how to assign the arriving components to the groups. Assigning a component to a group in which all components are functioning means that the component is lost. First we study a model in which all components are subject to failure, all with the same intensity. This is the model studied by Katehakis & Melolidakis. Then we consider the case where only the working components can fail.

For the first model we assume that Bj is finite for each j. Let γ + (B1 + · · · + Bm)µ ≤ 1. The dynamic programming equation is:

vn+1(x,i) = ∑_y λxy ( qxy min_j vn(y,i+ej∧B) + (1 − qxy) vn(y,i) ) +
µ ∑_{j=1}^{m} ij vn(x,i−ej) + (1 − γ − (i1 + · · · + im)µ) vn(x,i).   (1.7.1)

As in the last section the minimization ranges over all j.

1.7.1. Lemma. If

w(x,i+ej1) ≤ w(x,i+ej2) for ij1 ≤ ij2, i + ej1 + ej2 ≤ B, (1.7.2)
w(x,i+ej1) ≤ w(x,i) for i + ej1 ≤ B (1.7.3)

and

w(x,i) = w(x,i∗) for i∗ a permutation of i, i∗ ≤ B (1.7.4)

hold for the cost function v0, then they hold for all vn.

The proof can be found in chapter 4. It is interesting to note that for the proof of (1.7.2) we do not need (1.7.3), because of the fact that each component is handled in exactly the same way. Therefore we need (1.7.3) only to see that the optimal policy does not reject arriving components. If sending a component to a full group were not allowed, as in section 1.2, we could omit (1.7.3). In lemma 1.2.1 this cannot be done, as we need (1.2.3) in the proof of (1.2.2).

Equation (1.7.3) is the reverse of (1.2.3), and is again due to the service mechanism.


1.7.2. Theorem. The SQP minimizes the costs at T (from 0 to T) for all cost functions satisfying (1.7.2) to (1.7.4).

One function we are interested in is v0(x,i) = I{∃j with ij = 0}. This cost function is indeed allowable, giving that the SQP minimizes the probability that the system is down. If the system is up when there are k out of n groups functioning, rather than all n groups, we can take v0(x,i) = 1 if there are more than k non-empty groups in i, and 0 otherwise. This cost function is also allowable, thus the SQP also maximizes in this k-out-of-n system the probability that the system is working. A related cost function is I{∃j with ij < k}. This is also an allowable choice, corresponding to a system in which each group must have at least k working components. These results were also obtained by Katehakis & Melolidakis [33].
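For the first model, theorem 1.7.2 is easy to verify numerically. The sketch below (not from the thesis; a one-state arrival process, m = 2 groups with B1 = B2 = 3, and arbitrary rates) iterates (1.7.1) with v0(x,i) = I{∃j : ij = 0}, so that vn(i) is the probability that the system is down at the horizon, and checks that assigning to the group with the fewest working components is optimal.

```python
B = 3                    # components per group (assumption)
lam, mu = 0.2, 0.1       # arrival and failure rates, lam + 2*B*mu <= 1

def step(v):
    """One step of (1.7.1): each of the i + j working components fails
    with rate mu, and an arriving component joins the better group."""
    new = {}
    for (i, j) in v:
        assign = min(v[min(i + 1, B), j], v[i, min(j + 1, B)])
        fail = (mu * i * v[max(i - 1, 0), j]
                + mu * j * v[i, max(j - 1, 0)])
        new[i, j] = lam * assign + fail + (1 - lam - (i + j) * mu) * v[i, j]
    return new

# v0(i) = 1 when some group has no working component (system down)
v = {(i, j): float(i == 0 or j == 0)
     for i in range(B + 1) for j in range(B + 1)}
for _ in range(60):
    v = step(v)
```

After any number of iterations v[i, j] is a probability, and v[min(i+1,B), j] ≤ v[i, min(j+1,B)] whenever i ≤ j: the smaller group should receive the new component.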

Now we consider a similar model, not studied in [33], in which only the m components required for the system to function can fail. This means that no component fails if the system is down. If we want to maximize the probability that the system is up at T, the SQP might not be optimal, as the following example shows. Take m = 2, µ = λ = 1 and T = 2. With the computational method described in appendix D, which amounts to computing the dynamic programming equations for a large uniformization parameter, we computed the optimal policy. It followed that it is optimal in state (0, 1) to assign new components to group 2. Customers arriving after 0.967 are assigned to group 1.

However, if we look at the expected time the system is up from 0 to T, the SQP is optimal. To show this, we have to introduce immediate costs. We prefer to incur all costs together at T, in a way similar to the model of the previous section. Therefore we add an extra component to the state space, which is raised by 1 each time a component fails. The dynamic programming equation is:

vn+1(x,i,k) = ∑_y λxy ( qxy min_j vn(y,i+ej∧B,k) + (1 − qxy) vn(y,i,k) ) +
µ ∑_{j=1}^{m} vn(x,i−ej,k+1) + (1 − γ − mµ) vn(x,i,k)   if ij > 0 for all j,

vn+1(x,i,k) = ∑_y λxy ( qxy min_j vn(y,i+ej∧B,k) + (1 − qxy) vn(y,i,k) ) +
(1 − γ) vn(x,i,k)   if ij = 0 for some j.

Again, the minimization ranges over all j.

1.7.3. Lemma. If

w(x,i+ej1,k) ≤ w(x,i+ej2,k) for ij1 ≤ ij2, i + ej1 + ej2 ≤ B, (1.7.5)
∑_{j1=1}^{m} w(x,i−ej1,k+1) ≤ mw(x,i,k) for i ≥ e, (1.7.6)
w(x,i+ej1,k) ≤ w(x,i,k) for i + ej1 ≤ B, (1.7.7)
w(x,i,k+1) ≤ w(x,i,k) (1.7.8)

and

w(x,i,k) = w(x,i∗,k) for i∗ a permutation of i, i∗ ≤ B (1.7.9)

hold for the cost function v0, then they hold for all vn.

The proof can be found in chapter 4. As in the first model of this section, we only need (1.7.7) to know that we should not use the rejection option. As in the result of the previous section, we can take v0(x,i,k) = −k, giving that the SQP maximizes the expected number of failed components. However, we are interested in the time the system is up. But components only fail if the system is up, with rate mµ. Thus the policy that maximizes the number of failures also maximizes the time that the system is up.

1.7.4. Theorem. The SQP maximizes the expected time that the system is up between 0 and T.

As in section 1.6, the second and third equation give, for cost functions only depending on i, constant costs. Thus v0(x,i,k) = −k is the only cost function of interest.

In Koole [39] the same model is studied, but there the queue to which an arrival is assigned is determined at the time of the previous arrival. This models the repair at the spot by a repairman, and results in a model with a specific form of delayed information. Similar results as for the current model are derived.

1.8. Customer assignment model with workloads

The information available to the controller in the model of section 1.2 is the numbers of customers in the queues. Here we study a model in which the amount of work in the queues, the workload, is known. The characteristics of the model are as follows. The service times of all customers are independently and identically distributed, the controller assigns without knowing the actual service times, and the servers all work at the same constant speed c. Daley [15] showed that a variant of the SQP, the Shortest Workload Policy (SWP), minimizes the total workload at each T. In fact, he shows with forward induction that the workload under the SWP is weakly submajorized by the workload of each policy, giving the stochastic optimality for each Schur convex cost function. (Appendix C deals with majorization.) Foss [18] obtains the same result. Also Wolff [84] shows that the SWP minimizes the workload, although he only compares the SWP with policies that are not allowed to depend on the workload.

We also prove the optimality of the SWP, again with dynamic programming. However, as decision points we do not take the jumps of a Poisson process but the actual arrival instants. Thus, technically speaking, we condition on the arrival process. We do this to avoid technical problems: when the sojourn time between two events is constant, the amount of work done in each queue is also constant, which simplifies the analysis. In general, many models with arrivals according to MAP's can also be handled by taking arrivals at deterministic times. Exceptions are the second model of the previous section and several models of chapter 3, the reason being that even for arrivals according to MAP's the optimal policies are not myopic. We chose to use the MAP as much as possible, to connect with the forthcoming chapters.

Now we prove the optimality of the SWP at T. Let sn be the sojourn time between the nth and (n + 1)th arrival, counted backward from the time horizon, let the amount of work done by a busy server in this time be un = csn, and assume that P is the distribution function of the service times. With i we denote the vector of workloads, i ∈ IR^m_+. We have

vn+1(i) = min_j ∫_0^∞ vn((i + t ej − un e)+) dP(t).   (1.8.1)

1.8.1. Lemma. If

∫ w(i + t ej1) dP(t) ≤ ∫ w(i + t ej2) dP(t) for ij1 ≤ ij2, (1.8.2)
w(i) ≤ w(i + t ej1) for t ≥ 0, (1.8.3)

and

w(i) = w(i∗) for i∗ a permutation of i (1.8.4)

hold for the cost function w = v0, then they hold for all vn.

Note the resemblance to lemma 1.2.1. The proof can be found in chapter 4. In section 3.3 we give a different proof of the optimality of the SWP; there we see it as the limiting case of the SQP model with batch arrivals.

Equation (1.8.2) without the integration, i.e. w(i + t ej1) ≤ w(i + t ej2) for all t, is not true; this means that it is essential that the controller does not know the actual service times of the arriving customers. To construct an example illustrating this, take m = 2, u0 = 2 and v0(i1,i2) = i1 + i2, which indeed satisfies the conditions of lemma 1.8.1. Let the service time be equal to 2 a.s. Then it is easily seen that, if we take i = (0, 1), t = 1, j1 = 1 and j2 = 2, then v1(i + t ej1) = v1(1,1) = 1 > 0 = v1(0,2) = v1(i + t ej2).
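This computation is easily verified mechanically; the sketch below uses the example's own assumptions (two queues, work u0 = 2 done by each busy server between arrivals, service time 2 a.s., and cost v0(i1,i2) = i1 + i2):

```python
def v0(i):
    return sum(i)                        # cost: total workload

def v1(i, b=2.0, u=2.0):
    """One step of (1.8.1) with service time b a.s. and work u done per
    busy server between arrivals; the arriving customer joins queue j."""
    values = []
    for j in range(2):
        s = list(i)
        s[j] += b                        # arrival joins queue j
        values.append(v0([max(x - u, 0.0) for x in s]))
    return min(values)
```

Indeed v1((1, 1)) = 1 while v1((0, 2)) = 0: without the integration over the service time distribution P, the pointwise inequality fails.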

1.8.2. Theorem. The SWP minimizes the costs (stochastically) at T for all cost functions satisfying (1.8.2) to (1.8.4).

The cost functions considered here are functions on IR^m_+. It follows directly that again all Schur convex functions satisfy the inequalities. See appendix C for an overview of these functions. If we require the inequalities to hold for all service time distributions P, then the Schur convex functions are exactly the allowable cost functions, which can be shown in the same way as theorem C.1. Note that the statement in the penultimate paragraph of p. 304 in Daley [15], on the functions that respect weak majorization, is not correct: for example, indicator functions of allowable cost functions are in general not convex.

For the SQP we were able to prove pathwise optimality. Here however, as stated in Wolff [84], we have the striking result that the SWP minimizes the total workload stochastically but not pathwise. To construct a counterexample to the pathwise optimality, take a model with initial workload i = (1, 2) and speed c = 1. For the service time B we have IP(B = 1) = IP(B = 2) = 1/2. The first customer arrives at t = 0, the second at t = 1. No more arrivals occur before t = 4. When we fix the policy used, there are four different realizations up to t = 3, each with probability 1/4. To get a pathwise ordering, we have to combine the realizations for the SWP and an arbitrary policy R such that the SWP is better for all t. Take R such that we start with assigning to the longest queue, but the second customer is assigned to the shortest. Denote with bi (b̄i) the service time of the ith arriving customer in the model that uses the SWP (R). At t = 1 the amount of work is 1 + b1 + b2 (1 + b̄1 + b̄2). Therefore we have to couple b1 = b2 = 1 with b̄1 = b̄2 = 1. Now we show that if b1 = 1 and b2 = 2, then there is no choice of b̄1 and b̄2 which is pathwise better. Take first b̄1 = 1 and b̄2 = 2. Then, at t = 3, the system ruled by R is empty, but not the model under the SWP. For both eventualities with b̄1 = 2 we have that the amount of work just after the first arrival is larger under the SWP.

Note that if we are allowed to let the coupling depend on t in this example, we find the optimality of the SWP. This is equivalent to saying that the SWP is stochastically optimal in this example, which follows also from theorem 1.8.2.

In the models of the next chapter, where customers move through a network, it is of interest to consider the number of departed customers instead of the workloads. This model was studied by Wolff [83]. First he remarks that the SWP is stochastically equivalent to a single M|M|m queue with FCFS discipline. Then he shows that FCFS is better than any policy in the model with parallel queues, using a coupling argument. In the coupling argument service times are given to the customers the moment they start service. This means that the controller is allowed to assign knowing the number of customers in each queue, and the remaining service times of the customers presently in service. A policy in this class is the SQP, but not the SWP or other policies depending on the workloads. This result is generalized to the class of all policies which do not depend on the service time of the arriving customer in Koole [36]. These results are all pathwise.


1.9. Customer assignment model without information

In the previous section we have shown that the SWP minimizes the total amount of work in the system stochastically, at any T. For exponential service times, we have seen in section 1.2 that the number of customers is minimized by the SQP. In the latter model the workloads are not known to the controller, i.e. in the model where the SQP is optimal, the queue-length model, the controller has to decide based on different information than in the workload model where the SWP is optimal. Note that because the SQP minimizes the number of customers stochastically it also stochastically minimizes the amount of work still to be done in the class of allowable policies. An interesting question is if either the SWP or the SQP is better with respect to minimizing the number of customers in the system. This question is answered by Wolff [83]. As mentioned in the previous section, he shows that the SWP is better than all policies that do not depend on the workload, amongst which is the SQP.

Besides the number of customers or the workload we have two more obvious models with a different amount of information. The first is where you have no information at all. For exponential service times and an initially empty system we show at the end of this section that each arriving customer should be assigned to each queue with probability 1/m to minimize the number of customers, and thus the total workload. We call this policy the Equal Splitting Policy (ESP). When we know the previous assignments but not the state of the system, the Cyclic Assignment Policy (CAP) minimizes the number of customers; proposition 8.3.4 of Walrand [74] has a simple proof for the case with exponential service times, and a proof for IFR service times (see appendix B for a definition of IFR) can be found in Liu & Towsley [42].

From standard results in Markov Decision Theory, we know that even if the class of policies in the models depending on the queue lengths are allowed to depend on the whole history, the SQP remains optimal. This means that the workload under the SQP is smaller than under the CAP (and the ESP). It is clear that the ESP is worse than the CAP. Thus if we list the policies in increasing order of expected workload, we have: SWP, SQP, CAP, and ESP.

We end this section with showing that the ESP minimizes the number of customers in the system, when there is no information available. For results for cost functions related to the workloads, we refer to Chang et al. [11] and Chang [10]. A full proof of the result is given in Koole [38]. As this proof is based on forward instead of backward recursion, we will only sketch it.

We confine ourselves to two queues. Consider first a single model, with assignment vector (p, 1 − p) with p ≥ 1/2. Let Q^p(n) = (Q^p_1(n), Q^p_2(n)) be the queue lengths directly after the nth event (which can be an arrival or a (potential) departure from one of the queues), and initial state Q^p(0). Define for all i, j, s ∈ IN0

A(i, j, s) = {(x, y) ∈ IN0^2 | x ≤ i, y ≤ j, x + y ≤ s}.


Now let

P^p_n(i, j, s) = IP((Q^p_1(n), Q^p_2(n)) ∈ A(i, j, s)).

Take P^p_0(i, j, s) = 1 for all i, j, s, which corresponds to starting with an empty system. Then it can be proven, with forward induction, that

P^p_n(i + j, i + k, s) ≥ P^p_n(i, i + j + k, s) (1.9.1)

for all i, j, k, s, n ≥ 0. In a way this shows that if queue 1 has a higher assignment probability than queue 2, then this will result in a stochastically larger queue length. This interpretation becomes clear if we take k = 0. Having shown (1.9.1), we can compare two systems with assignment probabilities q ≥ p ≥ 1/2. Again with forward induction it can be shown that, for all i, j, s, n ≥ 0,

P^p_n(i, i + j, s) ≥ P^q_n(i, i + j, s).

For i ≥ s this states that the probability of having less than s customers in the system at any time is maximized by the ESP. In [38] it is shown how this result can be made pathwise, and how it can be generalized to an arbitrary number of queues.

Let us compare the method of proof for the above result with dynamic programming. In general, dp determines the optimal action in each state. In the current setting, due to the information structure, distributions on states would serve as states. Equation (1.9.1) shows that certain distributions do not occur, and in those that can occur it is advantageous to have a more balanced assignment.

If we were to apply dp to the model without state information (i.e., with distributions as states) then we would find the CAP as the optimal policy. Although the CAP uses no state information, it uses the previous assignments to determine the current one. Note that Bernoulli policies use no information at all.
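The ordering SQP, CAP, ESP (in increasing expected number of customers, and hence expected workload for exponential services) can be illustrated by simulation. The sketch below is not from the thesis: it uses a uniformized discrete-time chain with two exponential queues and arbitrary rates, and estimates the long-run average number of customers under each policy.

```python
import random

def avg_customers(policy, steps=200000, lam=0.4, mu=0.25, seed=7):
    """Uniformized chain with two queues: with probability lam an arrival
    (assigned by `policy`), with probability mu a potential departure at
    each queue; returns the long-run average number of customers."""
    rng = random.Random(seed)
    q, state, total = [0, 0], {"next": 0}, 0
    for _ in range(steps):
        u = rng.random()
        if u < lam:
            q[policy(q, state, rng)] += 1
        elif u < lam + mu:
            q[0] = max(q[0] - 1, 0)
        elif u < lam + 2 * mu:
            q[1] = max(q[1] - 1, 0)
        total += q[0] + q[1]
    return total / steps

def sqp(q, state, rng):        # shortest queue: full state information
    return 0 if q[0] <= q[1] else 1

def cap(q, state, rng):        # cyclic: only the previous assignments known
    k = state["next"]
    state["next"] = 1 - k
    return k

def esp(q, state, rng):        # equal splitting: no information at all
    return rng.randrange(2)
```

With these (arbitrary) rates the three estimates come out in the order given in the text; the SWP, which needs the workloads, is not simulated here.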

1.10. MAP’s with multiple customer classes and server vacations

In the models we study after this section, we have multiple customer classes and server vacations. Therefore we add a mark to each arrival generated by the MAP to model the class of an arriving customer or the availability of a server. Let q^k_xy be the probability of an arrival in class k, given a transition from x to y. Then an arrival with mark k, 1 ≤ k ≤ m, denotes the arrival of a customer in class k. In some of our models servers can go on vacation at random times. There are s servers. With an arrival in class k, m + 1 ≤ k ≤ m + s, an event for server k − m is meant; if the server is working he goes on vacation and vice versa. We assume ∑_{k=1}^{s+m} q^k_xy ≤ 1. Simultaneous arrivals cannot occur. To give a complete description of the current state of the system we have to specify the state of the arrival process, of the servers and of the queues. Thus, besides the state of the arrival process x and the state of the queues i we have to add a variable to the state of the system denoting the availability of the servers. Because we are interested in optimally assigning the available servers, but not in controlling the number of servers, it is convenient to make this variable part of the arrival process. Thus, add a vector z = (z1, . . . , zs) of 0-1 variables to the state of the MAP. Server k is available if and only if zk = 1. Concerning the arrivals of customers, we want to address questions like: when is the first time that the system becomes empty after N arrivals? To deal with this type of question, we also would like to identify the state of the arrival process with the numbers of arrived customers. To do so, also add a variable n = (n1, . . . , nm) to the state of the MAP, where nk is the number of customers that have arrived in class k. Assume we have an MAP (Λ, λ, q). The transition intensities of the new arrival process (Λ, λ, q) with state space Λ = {(x, z, n)} become:

λ(x,z,n)(y,z,n+ek) = λxy q^k_xy for 1 ≤ k ≤ m,
q^l_(x,z,n)(y,z,n+ek) = I{l = k} for 1 ≤ k ≤ m,
λ(x,z,n)(y,z∗,n) = λxy q^{m+k}_xy, where z∗_j = z_j for j ≠ k and z∗_k = (1 − z_k)+,
q^{m+k}_(x,z,n)(y,z∗,n) = 1 if z∗_j = z_j for j ≠ k and z∗_k = (1 − z_k)+, and 0 otherwise,
λ(x,z,n)(y,z,n) = λxy (1 − ∑_{k=1}^{m+s} q^k_xy),
q^k_(x,z,n)(y,z,n) = 0.

The arrival process just defined is again an MAP. Thus we have the following equivalent definition:

1.10.1. Definition. (Markov Arrival Process) Let Λ be the countable state space of a Markov process with transition intensities λxy with x, y ∈ Λ. When this process moves from x to y, with probability q^k_xy an arrival in class 1 ≤ k ≤ m occurs, and with probability q^{m+k}_xy an event with server 1 ≤ k ≤ s occurs. There are sets Λ^s_1, . . . , Λ^s_s ⊂ Λ such that server k is available if and only if x ∈ Λ^s_k, and sets Λ^a_{1n}, . . . , Λ^a_{mn} ⊂ Λ, n ∈ IN, such that if x ∈ Λ^a_{kn} then there have been n or more arrivals of class k. We call the triple (Λ, λ, q) an MAP.

Section 1.1 handled MAP's with only one customer class and without server vacations. We showed there how to model various types of arrival processes. If the arrival streams in different classes are independent of each other we can take the superposition of the m processes (i.e., the process with as state space the product space, in which each component is independent of the others), with the arrivals in process j having marks j. This is again an MAP.

The result in appendix A on the approximation of arrival processes is on marked arrival streams, thus the weak convergence of MAP's to general arrival processes holds for the present model too.


1.11. Server assignment model with a single server

In this section we study a model in which a single server is to be assigned to one of m customer classes. Each customer in class j has an exponential service time with intensity µ_j. At each decision epoch the server can be reassigned. Customers arrive in the m classes according to an MAP. Server vacations are not interesting because we have only one server, therefore we do not model them. This model has been studied extensively, mainly for linear costs, i.e. a cost function in which every customer of class j adds c_j to the costs. It is well known that the customers should be served in decreasing order of µ_j c_j, according to the µc-rule. This result can be found in Baras et al. [4] and Buyukkoc et al. [8], the latter paper using a very simple interchange argument. Here we also show that the µc-rule is optimal, using dynamic programming. The µc-rule minimizes the costs stochastically only in the special case that the service rates and the costs are both decreasing. An interesting related model is that of Righter & Shanthikumar [57]. They have DFR service time distributions and consider the number of successful departures. With p_j the probability that a departure in queue j is successful, they show that the µp-rule is optimal. This result holds stochastically, in all cases. Later on in this section we also consider DFR service times, showing that the µc-rule, with µ_j the current failure rate of a customer, is still optimal.

Take µ = max_j µ_j. We uniformize, and we assume therefore that γ + µ ≤ 1. We consider two models, one in which idleness of the server is allowed and one in which it is not allowed. We have as the dynamic programming equation:

v^{n+1}_{(x,i)} = \min_l \Bigl\{ \sum_y \lambda_{xy} \Bigl( \sum_{j=1}^m q^j_{xy} v^n_{(y,i+e_j)} + \Bigl(1 - \sum_{j=1}^m q^j_{xy}\Bigr) v^n_{(y,i)} \Bigr) + \mu_l v^n_{(x,i-e_l)} + (1 - \gamma - \mu_l) v^n_{(x,i)} \Bigr\}

= \sum_y \lambda_{xy} \Bigl( \sum_{j=1}^m q^j_{xy} v^n_{(y,i+e_j)} + \Bigl(1 - \sum_{j=1}^m q^j_{xy}\Bigr) v^n_{(y,i)} \Bigr) + \min_l \bigl\{ \mu_l v^n_{(x,i-e_l)} + (\mu - \mu_l) v^n_{(x,i)} \bigr\} + (1 - \gamma - \mu) v^n_{(x,i)}.

The minimization ranges over all l with i_l > 0. If idleness is allowed, action 0 (with µ_0 = 0) has to be added to the actions. Now we have the following lemma:

1.11.1. Lemma. If idleness is not allowed or is suboptimal in each state and

\mu_{j_1} w_{(x,i-e_{j_1})} + (\mu - \mu_{j_1}) w_{(x,i)} \le \mu_{j_2} w_{(x,i-e_{j_2})} + (\mu - \mu_{j_2}) w_{(x,i)}   (1.11.1)
for j_1 < j_2 and i_{j_1}, i_{j_2} > 0

hold for the cost function v^0, then they hold for all v^n.



Thus, if idleness is not allowed, we have the optimality of the policy that assigns to the non-empty queue with smallest index. We call this policy the Smallest Index Policy (SIP). Later on in this section we study a more general model; lemma 1.11.1 follows as a special case of lemma 1.11.5. When idleness is allowed we have to add monotonicity to obtain the suboptimality of idleness.

1.11.2. Lemma. If

w_{(x,i-e_{j_1})} \le w_{(x,i)} \quad \text{for } i_{j_1} > 0   (1.11.2)

holds for the cost function v^0, then it holds for all v^n.

Note that (1.11.2) is a special case of (1.11.1), for the cases that |i| ≥ 2, by giving action 0 the lowest priority. For the proof, we refer to the proof of lemma 1.11.6. We can have two separate lemmas because we do not need (1.11.2) in the proof of (1.11.1). The same approach does not work in most customer assignment models because monotonicity is needed to prove the inequality giving the structure of the optimal policy. The same holds for the multiple server model of the next section. We summarize our results for the continuous-time model.

1.11.3. Theorem. The SIP minimizes the costs at T for all cost functions satisfying (1.11.1), when idleness is not allowed.

1.11.4. Theorem. The SIP minimizes the costs at T for all cost functions satisfying (1.11.1) and (1.11.2), when idleness is allowed.
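Theorem 1.11.3 can be checked numerically by value iteration on the dynamic programming equation. The sketch below is our own illustration, not from the thesis: it takes no arrivals (γ = 0) and linear terminal costs with µ_1 c_1 ≥ µ_2 c_2 ≥ µ_3 c_3 (so that the costs satisfy (1.11.1), anticipating the characterization given later in this section), and verifies that the minimizing action is always the non-empty queue with smallest index.

```python
from functools import lru_cache

# Illustrative parameters (our own): mu_j * c_j is decreasing, so the SIP
# coincides with the mu-c rule; no arrivals (gamma = 0), mu = max_j mu_j.
mu = (0.5, 0.4, 0.1)
c = (1.0, 0.6, 1.0)            # mu*c = 0.50, 0.24, 0.10
M = max(mu)

def dec(i, l):
    """State i with one class-l customer removed."""
    return i[:l] + (i[l] - 1,) + i[l + 1:]

@lru_cache(maxsize=None)
def v(n, i):
    """v^n of the single-server recursion without arrivals;
    terminal cost v^0(i) = sum_j c_j i_j."""
    if n == 0:
        return sum(cj * ij for cj, ij in zip(c, i))
    if not any(i):               # empty system: nothing to serve
        return v(n - 1, i)
    served = min(mu[l] * v(n - 1, dec(i, l)) + (M - mu[l]) * v(n - 1, i)
                 for l in range(3) if i[l] > 0)
    return served + (1 - M) * v(n - 1, i)

def best_action(n, i):
    """A class attaining the minimum in the recursion for v^n."""
    return min((l for l in range(3) if i[l] > 0),
               key=lambda l: mu[l] * v(n - 1, dec(i, l)) + (M - mu[l]) * v(n - 1, i))
```

For every horizon and state tried, `best_action` returns the lowest-numbered non-empty class, as the theorem predicts.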

Remark. In section 1.2 we assumed that all cost functions are bounded. In the model of theorem 1.11.3 however, it is natural to consider cost functions of which both the positive and negative parts are unbounded. For example, if m = 2, v^0_{(x,i)} = i_1 - i_2 is an allowable cost function. In this case finiteness of the costs at T can be shown when the costs are ν-bounded, as is proved in the first part of the proof of theorem 5.3.2.

Also in the case of an infinite planning horizon, there are complications. Due to the unboundedness of the costs we cannot use the results for negative dynamic programming, as suggested in chapter 5. In the case of Poisson arrivals ν-geometric recurrence can be shown, giving average and Blackwell optimality of the SIP. The ν-geometric recurrence of the discrete-time model is shown by Spieksma [70]. Her results are used in Dekker & Hordijk [16] to verify their conditions for Blackwell optimality in the continuous-time semi-Markov model.

Now we study the cost functions. In general we have the following characterization. Define \Delta_j v^0_{(x,i)} = v^0_{(x,i+e_j)} - v^0_{(x,i)}. Then (1.11.1) is equivalent to

\mu_{j_1} \Delta_{j_1} v^0_{(x,i-e_{j_1})} \ge \mu_{j_2} \Delta_{j_2} v^0_{(x,i-e_{j_2})} \quad \text{if } j_1 < j_2 \text{ and } i_{j_1}, i_{j_2} > 0   (1.11.3)



and (1.11.2) to

\Delta_j v^0_{(x,i)} \ge 0 \quad \text{for all } j.   (1.11.4)

A simple cost function satisfying both conditions is the following: v^0_{(x,i)} = I{|i| > 0}. As we minimize the expected cost for this function, we minimize the probability that there are any customers present at time n. When there are no arrivals, this coincides with minimizing the makespan stochastically. Of course this is of no interest here, as all work conserving policies minimize the makespan. The analysis of this type of cost function is of interest for the multiple server case.

A cost function of interest here is the following: v^0_{(x,i)} = \sum_{j=1}^m c_j i_j. It is easy to see that this function satisfies (1.11.3) if and only if µ_1 c_1 ≥ ··· ≥ µ_m c_m. This means that the µc-rule minimizes the costs in expectation. If idleness is allowed we have to add c_j ≥ 0 to make the cost function satisfy (1.11.4). In the customer assignment models studied previously in this chapter every cost function that was minimized in expectation was also minimized stochastically. Here this is not the case. To analyze the stochastic optimality, first assume c_j ≥ 0. We distinguish three cases.

1. µ_1 ≥ ··· ≥ µ_m, c_1 ≥ ··· ≥ c_m ≥ 0. Now I{\sum_{j=1}^m c_j i_j > k} satisfies the conditions too. Indeed, we have v^0_{(x,i-e_{j_1})} \le v^0_{(x,i-e_{j_2})} \le v^0_{(x,i)}. Therefore \Delta_{j_1} v^0_{(x,i-e_{j_1})} \ge \Delta_{j_2} v^0_{(x,i-e_{j_2})}. Together with µ_{j_1} ≥ µ_{j_2} we have (1.11.3).

2. There are j_1 and j_2 such that j_1 < j_2 and µ_{j_1} < µ_{j_2}. For example, take m = 2, no arrivals, i = (1, 1), µ_1 = 1, c_1 = 5, µ_2 = 2 and c_2 = 2. The µc-rule prescribes class 1, however, if we want to minimize IP(i_1 c_1 + i_2 c_2 ≥ 6) for some T, we should start with class 2.

3. There are j_1 and j_2 such that j_1 < j_2 and c_{j_1} < c_{j_2}. For example, take m = 2, no arrivals, i = (1, 1), µ_1 = 4, c_1 = 1, µ_2 = 1 and c_2 = 3. Again the µc-rule prescribes class 1, but we should choose class 2 to minimize IP(i_1 c_1 + i_2 c_2 ≥ 2) for some T.

Thus the µc-rule is stochastically optimal only if µ_1 ≥ ··· ≥ µ_m and c_1 ≥ ··· ≥ c_m. We call the service rates and costs in this case agreeable. When there are no arrivals, the stochastic optimality also follows from Righter & Shanthikumar [58], by taking, in their notation, f_j(C_j) = c_j I{C_j > T}.
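Cases 2 and 3 can be verified with a few lines of closed-form computation. This sketch is our own; the expressions follow directly from the exponential service times.

```python
from math import exp

# Case 2: mu = (1, 2), c = (5, 2), threshold 6. Since 5 + 2 >= 6 but 5 < 6 and
# 2 < 6, the cost exceeds 6 at T exactly when both customers are still present,
# i.e. when the first scheduled service is unfinished.
def p_case2(first_class, T):
    mu = {1: 1.0, 2: 2.0}
    return exp(-mu[first_class] * T)

# Case 3: mu = (4, 1), c = (1, 3), threshold 2. The cost exceeds 2 when both
# customers are present, or when only the class 2 customer is present.
def p_case3(first_class, T):
    if first_class == 2:
        # "only class 2 left" cannot occur when class 2 is served first,
        # so the event reduces to "class 2 still in service"
        return exp(-T)
    # serve class 1 first: both unfinished at T, or class 1 finished at some
    # s < T and class 2 (served from s on) unfinished at T
    return exp(-4 * T) + (4.0 / 3.0) * (exp(-T) - exp(-4 * T))
```

In both cases the µc-rule prescribes class 1, yet serving class 2 first gives the smaller exceedance probability for every T > 0.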

If we do not allow idleness, i.e. when the holding costs can be negative, the condition for stochastic optimality is as follows. Take m_1 such that c_1 ≥ ··· ≥ c_{m_1} ≥ 0 ≥ c_{m_1+1} ≥ ··· ≥ c_m. If µ_1 ≥ ··· ≥ µ_{m_1} and if µ_{m_1+1} ≤ ··· ≤ µ_m, then I{\sum_{j=1}^m c_j i_j > k} satisfies (1.11.3).

Several other interesting cost functions, like the expected weighted number of late customers and the expected weighted sum of customer tardiness, also satisfy the conditions on the cost functions. See Chang et al. [12] for details.

We change the model as follows. When a customer in queue j is served it leaves the system with rate µ_j, and joins queue f(j) with rate µ - µ_j. We assume that f(j) ≥ j - 1. If f(j) = j for all j we have the same model as above. This service mechanism can be formulated in terms of successful departures. If p_j = µ_j/µ, then p_j is the probability that a departure is successful. The value function becomes

v^{n+1}_{(x,i)} = \sum_y \lambda_{xy} \Bigl( \sum_{j=1}^m q^j_{xy} v^n_{(y,i+e_j)} + \Bigl(1 - \sum_{j=1}^m q^j_{xy}\Bigr) v^n_{(y,i)} \Bigr) +

\min_l \bigl\{ \mu_l v^n_{(x,i-e_l)} + (\mu - \mu_l) v^n_{(x,i-e_l+e_{f(l)})} \bigr\} + (1 - \gamma - \mu) v^n_{(x,i)}.   (1.11.5)

Again, the minimization ranges over all l with i_l > 0. If idleness is allowed, action 0 (with µ_0 = 0, f(0) = 0 and e_0 = 0) has to be added to the actions.

1.11.5. Lemma. If idleness is not allowed or not optimal in each state and

\mu_{j_1} w_{(x,i-e_{j_1})} + (\mu - \mu_{j_1}) w_{(x,i-e_{j_1}+e_{f(j_1)})} \le   (1.11.6)
\mu_{j_2} w_{(x,i-e_{j_2})} + (\mu - \mu_{j_2}) w_{(x,i-e_{j_2}+e_{f(j_2)})} \quad \text{for } j_1 < j_2 \text{ and } i_{j_1}, i_{j_2} > 0

hold for the cost function v^0, then they hold for all v^n.

The proof can be found in chapter 4. Similar results for monotonicity hold:

1.11.6. Lemma. If

\mu_{j_1} w_{(x,i-e_{j_1})} + (\mu - \mu_{j_1}) w_{(x,i-e_{j_1}+e_{f(j_1)})} \le \mu w_{(x,i)} \quad \text{for } i_{j_1} > 0   (1.11.7)

holds for the cost function v^0, then it holds for all v^n.

The proof can be found in chapter 4. Again we have:

1.11.7. Theorem. The SIP minimizes the costs at T for all cost functions satisfying (1.11.6), when idleness is not allowed. When idleness is allowed, (1.11.7) should be added.

Similar results are obtained in section 3 of Nain [49]. Actually, he allows random routing of unsuccessfully served customers, but, as in the present model, only to higher numbered queues. We chose not to model random routing so as to keep the notation simple.

An interesting case we can model is that of a single class of DFR service times. We use the characterization of DFR distributions by phase-type distributions as shown in appendix B. There the transition intensity in each phase is taken to be equal. After k phases of service a customer finishes service with probability α_k or receives one or more additional phases of service with probability 1 - α_k. If a DFR distribution is approximated by phase-type distributions in this way, then the α_k are non-increasing. It does not restrict generality to take α_k = α_l for k > l, with l a constant (in appendix B l = m_2).

Consider the following server assignment model. Take m = l, p_j = α_j, f(j) = j + 1 if j < m, and f(m) = m. The costs are linear with c_j = 1 for all j. This cost function satisfies (1.11.6), because the α_j are decreasing. Thus the number of customers in the system is minimized in expectation by serving the customer with the highest failure rate. Using the same argument as for the model without routing of customers, it follows that this result holds also stochastically. Using a limiting argument, this gives that the number of customers in a G|DFR|1 queue is minimized at T by the policy that serves the customer with the least attained service time (the LAST policy).

Note that customers are generally not served until they leave the system, but only until they change phase. For the limiting case this gives processor sharing as the service discipline for all customers who have received the same amount of service.

Taking p_j = α_{m-j}, f(j) = j - 1 and c_j = -1 shows that MAST (most attained service time) maximizes the number of customers in the system. Although the p_k are increasing, also v^0_{(x,i)} = I{|i| ≤ s} satisfies the conditions, and thus the result holds also stochastically. Note that MAST is equivalent to FCFS.

For IFR service times it is shown in appendix B that the α_k are non-decreasing. In this case the above results are reversed.

1.11.8. Theorem. LAST (FCFS) stochastically minimizes the number of customers at T in a G|G|1 queue in the case of DFR (IFR) service times; LAST (FCFS) stochastically maximizes the number of customers at T in a G|G|1 queue in the case of IFR (DFR) service times.

All these results can also be found in Righter & Shanthikumar [57].

We continue with generalizing the above results to models with multiple customer classes. To avoid certain technicalities we assume that each class has either positive holding costs and a DFR service time distribution, or negative holding costs and an IFR service time distribution. From the construction it will be clear how to deal with the other two cases; in these cases however the condition that f(j) ≥ j - 1 can easily be violated.

Thus assume first that each class has its own DFR service times, class n having l_n phases, n = 1, ..., r, r being the number of classes. The success probability of phase k of class n is α^n_k, the holding costs are c^*_n > 0, independent of the phase. We make a distinction between classes and queues. Now take for each class and possible phase a queue, i.e. l_1 + ··· + l_r queues, with j_{nk} the number of customers in the queue corresponding to the nth customer class and the kth phase. Of course, we take f(j_{nk}) = j_{n,k+1} if k < l_n and f(j_{n l_n}) = j_{n l_n}, p_{nk} = α^n_k, and c_{nk} = c^*_n. Order the queues in decreasing value of p_{nk} c_{nk}. Then the µc-rule is optimal. We use a limiting argument to get the result for the case of general DFR service times. The optimal policy serves the customer with the highest product of holding cost and failure rate. We call this policy again the µc-rule.

As indicated, it is also possible to have customer classes with IFR service time distributions and negative holding costs (requiring that idling is not allowed). In this case all DFR customers are first served, possibly using processor sharing. Then (as long as there are no arrivals) the customers with IFR service times are served.

1.11.9. Theorem. The µc-rule minimizes the expected holding costs at T if customers have either DFR or IFR service time distributions, provided the holding costs for customers with DFR (IFR) service time distributions are positive (negative). Idleness is allowed if there are no customers with IFR service times.

For a stochastic result we need that α^{n_1}_{k_1} ≥ α^{n_2}_{k_2} for all k_1 and k_2 if c_{n_1} > c_{n_2}. In the limiting case this means that the failure rate of class n_1 is always higher than the failure rate of class n_2. This is the case if we have a family of random processing times with decreasing failure rate (see for example section 4.2 of Weiss [79]).

The results of this section are a superset of those in Koole [37]: there it is assumed that f(j) ≥ j instead of f(j) ≥ j - 1.

Remark. Equation (1.11.5) can be written as v^{n+1}_{(x,i)} = \gamma T_1 v^n_{(x,i)} + \mu T_2 v^n_{(x,i)}, with T_1 v^n_{(x,i)} = \bigl(\sum_y \lambda_{xy}(\cdots)\bigr)/\gamma and T_2 v^n_{(x,i)} = \bigl(\min_l \cdots\bigr)/\mu, if we assume that γ + µ = 1. Here T_1 and T_2 themselves can be seen as dp operators. In chapter 5 it is shown how convex combinations of dp operators result from a continuous time model.

A discrete time model however would typically consist of a departure and an arrival event in succession, resulting in a dp equation of the form w^{n+1}_{(x,i)} = T_2 T_1 w^n_{(x,i)}.

The proof of the lemmas 1.11.5 and 1.11.6 basically consists of showing that the equations propagate for T_1 and T_2. Of course, this implies that the lemmas hold as well for w^n, proving the optimality of the SIP for the discrete time model. A direct proof of this result can be found in Weishaupt [78].

The generalization to other models, such as the one with multiple servers (studied in the next section) or the customer assignment models studied earlier, is less direct because in these models there are events which have to be dealt with simultaneously. A more systematic study of different types of value functions based on operators such as T_1 and T_2 here can be found in Altman & Koole [2].



1.12. Server assignment model with multiple servers

In this section we study again the first model of the previous section, but with multiple servers. Server vacations are interesting here and therefore we model them as well. It was shown by Bruno et al. [7] that the policy that assigns the available servers to the jobs with lowest service intensities, i.e. the jobs with the largest expected processing times, minimizes the expected makespan. The optimal policy is called LEPT. Weber [76] generalized this to stochastic optimality. Giving conditions on the cost functions for LEPT to be optimal, for arrivals according to an MAP and arbitrary server vacations, is the main subject of this section. The cost function corresponding to the makespan will indeed appear to be allowable. Independently, similar results were derived by Chang et al. [12].

We assume that µ_1 ≤ ··· ≤ µ_m. As in section 1.10, s is the number of servers, and s(x) is the number of servers currently available, i.e. s(x) is determined by x, the state of the MAP. We assume γ + sµ_m ≤ 1, i.e. we have uniformized the model. Take µ = µ_m. An assignment action in (x, i) consists of the s(x) class numbers to which the available servers are assigned. We introduce again class 0, with µ_0 = 0. If a server is assigned to class 0 it idles. Now we can assume that no more servers are assigned to a class than there are customers in that class. These actions are called admissible. The dynamic programming equation is:

v^{n+1}_{(x,i)} = \min_{l_1,...,l_{s(x)}} \Bigl\{ \sum_y \lambda_{xy} \Bigl( \sum_{j=1}^m q^j_{xy} v^n_{(y,i+e_j)} + \Bigl(1 - \sum_{j=1}^m q^j_{xy}\Bigr) v^n_{(y,i)} \Bigr) +

\sum_{k=1}^{s(x)} \bigl( \mu_{l_k} v^n_{(x,i-e_{l_k})} + (\mu - \mu_{l_k}) v^n_{(x,i)} \bigr) + (1 - \gamma - s(x)\mu) v^n_{(x,i)} \Bigr\}

= \sum_y \lambda_{xy} \Bigl( \sum_{j=1}^m q^j_{xy} v^n_{(y,i+e_j)} + \Bigl(1 - \sum_{j=1}^m q^j_{xy}\Bigr) v^n_{(y,i)} \Bigr) +

\min_{l_1,...,l_{s(x)}} \sum_{k=1}^{s(x)} \bigl( \mu_{l_k} v^n_{(x,i-e_{l_k})} + (\mu - \mu_{l_k}) v^n_{(x,i)} \bigr) + (1 - \gamma - s(x)\mu) v^n_{(x,i)}.   (1.12.1)

To make the action unique we can assume l_1 ≤ ··· ≤ l_{s(x)}.

1.12.1. Lemma. If

\mu_{j_1} w_{(x,i-e_{j_1})} + (\mu - \mu_{j_1}) w_{(x,i)} \le \mu_{j_2} w_{(x,i-e_{j_2})} + (\mu - \mu_{j_2}) w_{(x,i)}   (1.12.2)
for j_1 < j_2 both admissible and |i| \ge 2

and

w_{(x,i-e_{j_1})} \le w_{(x,i)} \quad \text{for } i_{j_1} > 0   (1.12.3)



hold for the cost function v^0, then they hold for all v^n.

If the number of customers is s(x) + 1 or more and there are admissible actions j_1, ..., j_{s(x)} and j^*_1, j_2, ..., j_{s(x)} with j_1 < j^*_1, then

\sum_{k=1}^{s(x)} \bigl( \mu_{j_k} v^n_{(x,i-e_{j_k})} + (\mu - \mu_{j_k}) v^n_{(x,i)} \bigr) \le

\mu_{j^*_1} v^n_{(x,i-e_{j^*_1})} + (\mu - \mu_{j^*_1}) v^n_{(x,i)} + \sum_{k=2}^{s(x)} \bigl( \mu_{j_k} v^n_{(x,i-e_{j_k})} + (\mu - \mu_{j_k}) v^n_{(x,i)} \bigr)

is equivalent to (1.12.2). This means that (1.12.2) and (1.12.3) give us the optimal policy. Equation (1.12.3) says that, if possible, no server should idle. By (1.12.2) we know that, when there are more than s customers, we should serve the group of customers with indices as small as possible. Thus the SIP, which is here equal to LEPT, is optimal.

As contrasted with the single server case, we need (1.12.3) in the proof of (1.12.2). The model here is a special case of the model of section 3.6, thus for the proof we refer to the proof of lemma 3.6.1.

1.12.2. Theorem. The SIP minimizes the costs at T for all cost functions satisfying (1.12.2) and (1.12.3).

Note that the inequalities (1.12.2) and (1.12.3) are the same as (1.11.1) and (1.11.2). Thus, (1.11.3) and (1.11.4) characterize again the allowable cost functions. However, we have the extra condition µ_1 ≤ ··· ≤ µ_m. This means, in the case of linear costs, that the µc-rule is optimal in the multiple server model if µ_1 ≤ ··· ≤ µ_m and µ_1 c_1 ≥ ··· ≥ µ_m c_m. To satisfy the monotonicity we assume c_m ≥ 0. Note that if µ_1 = 0 then (1.12.2) and (1.12.3) give that the costs in each state must be equal.

Now we go into the details of cost functions of the type v^0_{(x,i)} = I{|i| > 0}. As said in the previous section, we conclude that the probability that there are any customers present at T is minimized by LEPT.

We can modify the system such that it remains empty once it becomes empty, by taking v^{n+1}_{(x,0)} = \sum_y \lambda_{xy} v^n_{(y,0)} + \bigl(1 - \sum_y \lambda_{xy}\bigr) v^n_{(x,0)}. Lemma 1.12.1 still holds for this model. In section 3.6 another approach with the same result is taken. Now we can study the probability that the system becomes empty before T. This means that the SIP minimizes the length of the busy period.

As shown in section 1.1, we can model the departure process of most queueing systems with an MAP. This way we can model tandem systems, of which the center with state i is the last in line, although we cannot let the actions taken in the first centers depend on the state of the last center, as this would introduce a dependence of the arrival process on i. Tandem models with this type of dependence are the subject of the next chapter. For tandem systems without this dependence, we might be interested in the moment the whole system becomes empty. Now we have to take v^0_{(x,i)} = I{|i| > 0 or |x| > 0}, where x is the vector denoting the state of all centers but the last one. This gives similar results as above, but now for emptiness of the whole system.

As argued in section 1.10, we can take sets Λ_k ⊂ Λ, denoting the set of states for which the number of arrivals in all classes at reaching that state is k or more. By taking v^0_{(x,i)} = I{|i| > 0 or x ∉ Λ_k} we can study the first time after the kth arrival at which the system becomes empty. If there are no arrivals after the kth we have the makespan in the release date model of Weber [76] and Chang et al. [12]. Note that the conditions on the cost functions in Chang et al. [12] are the same as the conditions here. The generalization of this section consists of a more general arrival process.

When considering linear costs, we cannot take c_1 = ··· = c_m = 1 unless µ_1 = ··· = µ_m. This is not strange, because it is intuitively clear that LEPT does not minimize the number of customers at any T. The perhaps more logical candidate for optimality, the policy that serves customers with high service rates first (the SEPT policy), is not optimal either. This we show with the following example.

Take the following model: s = 2, m = 2, µ_1 = 2 and µ_2 = 1. There are no arrivals, and we start with i_1 = 2 and i_2 = 1. The objective function is the expected number of customers at T. The possible work-conserving policies are LEPT, which starts serving a class 1 and a class 2 customer at time 0, and SEPT, which starts with both class 1 customers. In the continuous-time model it is easy to compute the expected number of departed customers L at T using the following formula, with α_1 and α_2 the service rates of the customers served first, and α_3 the rate of the other customer (note that α_3 < α_1 + α_2):

L = \int_0^T (\alpha_1 + \alpha_2) e^{-(\alpha_1+\alpha_2)t} \, dt +

\frac{\alpha_1}{\alpha_1+\alpha_2} \int_0^T (\alpha_1+\alpha_2) e^{-(\alpha_1+\alpha_2)t} \bigl(1 - e^{-\alpha_2(T-t)}\bigr) \, dt +

\frac{\alpha_2}{\alpha_1+\alpha_2} \int_0^T (\alpha_1+\alpha_2) e^{-(\alpha_1+\alpha_2)t} \bigl(1 - e^{-\alpha_1(T-t)}\bigr) \, dt +

\int_0^T (\alpha_1+\alpha_2) e^{-(\alpha_1+\alpha_2)t} \bigl(1 - e^{-\alpha_3(T-t)}\bigr) \, dt =

1 - e^{-(\alpha_1+\alpha_2)T} +

1 + e^{-(\alpha_1+\alpha_2)T} - e^{-\alpha_1 T} - e^{-\alpha_2 T} +

1 - e^{-(\alpha_1+\alpha_2)T} - \frac{\alpha_1+\alpha_2}{\alpha_1+\alpha_2-\alpha_3} e^{-\alpha_3 T} \bigl(1 - e^{-(\alpha_1+\alpha_2-\alpha_3)T}\bigr).

The first line in the last expression is the probability that the first departure takes place before T. The second line is equal to (1 - e^{-\alpha_1 T})(1 - e^{-\alpha_2 T}), the probability that both first scheduled jobs finish. The last term is concerned with the customer scheduled last.

Using a small computer program we computed L for LEPT (α_1 = 1, α_2 = α_3 = 2) and SEPT (α_1 = α_2 = 2, α_3 = 1). For T small SEPT is better (for T = 0.1 we have L = 0.380 ≈ 4T for SEPT and L = 0.302 ≈ 3T for LEPT, as can be expected from the infinitesimal properties). However, for T larger, LEPT is better (for T = 3 we have 2.929 for SEPT vs. 2.941 for LEPT). Thus there is no myopically optimal policy. Typically, the optimal policy is equal to LEPT at time 0 (if T is large enough), and changes to SEPT as time goes on: if we are at T - ε with ε small and still no customers have left, SEPT is optimal. It is well known, see e.g. Weber [76] and Chang et al. [13], that if we replace the number of customers at T by the integral from 0 to T of the number of customers, i.e. if we consider flowtime, then SEPT is stochastically optimal.
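These numbers can be reproduced directly from the closed-form expression for L; the following sketch (ours, not part of the thesis) evaluates it:

```python
from math import exp

def L(a1, a2, a3, T):
    """Expected number of departures by T: the customers with rates a1 and a2
    are served first, the remaining customer has rate a3 (requires a1+a2 != a3)."""
    s = a1 + a2
    return ((1 - exp(-s * T))
            + (1 + exp(-s * T) - exp(-a1 * T) - exp(-a2 * T))
            + (1 - exp(-s * T) - s / (s - a3) * exp(-a3 * T) * (1 - exp(-(s - a3) * T))))

# LEPT serves the rate-1 customer and one rate-2 customer first;
# SEPT serves both rate-2 customers first.
lept = lambda T: L(1, 2, 2, T)
sept = lambda T: L(2, 2, 1, T)
```

At T = 0.1 this gives L ≈ 0.302 (LEPT) and L ≈ 0.380 (SEPT); at T = 3 it gives L ≈ 2.941 (LEPT) and L ≈ 2.929 (SEPT), matching the values in the text.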


Chapter 2

Models with Markov Decision Arrival Processes

2.1. Markov Decision Arrival Processes

In the previous chapter we studied models with arrivals which were modeled by an MAP. In an MAP the arrival times depend only on x, the state of the MAP, and not on i, the state of the queues. In this chapter we generalize the MAP to allow for a certain type of dependency on the state of the queues. Of course, this dependency cannot be completely general. Take for example a customer assignment model in which arrivals occur more frequently if the queues are balanced. Then it is clear that it might be optimal to assign an arriving customer to the longest queue to suppress future arrivals. Therefore we model the dependence using actions in the arrival process, while keeping, for a fixed action, the transition intensities independent of the state of the queues. This leads to the following definitions. First we describe the arrival process without multiple customer classes or server vacations.

2.1.1. Definition. (Markov Decision Arrival Process) Let Λ be the countable state space of a Markov decision process with transition intensities λ_{xay} with x, y ∈ Λ and a ∈ A(x), the set of actions in x. When this process moves from x to y, while action a was chosen, then with probability q_{xay} an arrival occurs. We call the quadruple (Λ, A, λ, q) a Markov Decision Arrival Process (MDAP).

Note the similarity with the definition of the MAP in section 1.1: if we take |A(x)| = 1 for all x, we have an MAP. We use definition 2.1.1 in the sections on the customer assignment models. In the server assignment models we need again arrivals in multiple classes and server vacations. The equivalent of definition 1.10.1 is:

2.1.2. Definition. (Markov Decision Arrival Process) Let Λ be the countable state space of a Markov decision process with transition intensities λ_{xay} with x, y ∈ Λ and a ∈ A(x), the set of actions in x. When this process moves from x to y, while action a was chosen, then with probability q^k_{xay} an arrival in class 1 ≤ k ≤ m occurs, and with probability q^{m+k}_{xay} an event with server 1 ≤ k ≤ s occurs. There are sets Λ^s_1, ..., Λ^s_s such that server k is available if and only if x ∈ Λ^s_k, and sets Λ^a_{1n}, ..., Λ^a_{mn}, n ∈ IN, such that if x ∈ Λ^a_{kn} then there have been n or more arrivals of class k. We call the quadruple (Λ, A, λ, q) a Markov Decision Arrival Process (MDAP).



In the next section we start by illustrating the use of an MDAP for customer assignment models. Again we assume that the transition intensities in each state are equal, i.e. \sum_y \lambda_{xay} = \gamma for all x ∈ Λ and a ∈ A(x).

2.2. Symmetric customer assignment model

We consider the model of section 1.2, but with arrivals according to an MDAP. Thus, we have m queues, with buffer sizes B = (B_1, ..., B_m) and service intensity µ. The results for this model are quite similar to the results of section 1.2, the only difference being the arrival process. To obtain optimality results both at T and from 0 to T we now need to introduce immediate costs c_{(x,i)}. See section 5.3 for more details. Before illustrating the use of the MDAP, we give the dynamic programming equation. Note the similarity with (1.2.1).

v^{n+1}_{(x,i)} = c_{(x,i)} + \min_a \sum_y \lambda_{xay} \bigl( q_{xay} \min_j v^n_{(y,i+e_j)} + (1 - q_{xay}) v^n_{(y,i)} \bigr) +

\sum_{j=1}^m \mu v^n_{(x,(i-e_j)^+)} + (1 - \gamma - m\mu) v^n_{(x,i)}.   (2.2.1)

The second minimization ranges again over all j for which the queues are not full, i.e. for which i_j < B_j.

The MDAP is especially designed to model the arrivals at the last center of a tandem network. To show this, assume there are \bar m (m) queues in the first (second) center, with state (ı_1, ..., ı_{\bar m}) ((i_1, ..., i_m)), service intensities \bar\mu (µ) and buffer sizes \bar B (B). The arrival process at the first center is Poisson with rate λ. Assignment actions are taken in both centers, and these actions are allowed to depend on the whole state of the system. Then the dynamic programming recursion is:

v^{n+1}_{(ı,i)} = c_{(ı,i)} + \min_a \lambda v^n_{((ı+e_a) \wedge \bar B, i)} +

\sum_{j_1=1}^{\bar m} \bigl( \delta_{ı_{j_1}} \bar\mu \min_j v^n_{(ı-e_{j_1}, (i+e_j) \wedge B)} + (1 - \delta_{ı_{j_1}}) \bar\mu v^n_{(ı,i)} \bigr) +

\sum_{j=1}^m \mu v^n_{(ı,(i-e_j)^+)} + (1 - \lambda - \bar m \bar\mu - m\mu) v^n_{(ı,i)}.

Now, if we take \lambda_{ı,a,ı+e_a} = \lambda and q_{ı,a,ı+e_a} = 0, \lambda_{ı,a,ı-e_j} = \bar\mu and q_{ı,a,ı-e_j} = 1 if ı_j > 0, \lambda_{ıaı} = \gamma - \lambda - \bar\mu \sum_{j=1}^{\bar m} \delta_{ı_j} and q_{ıaı} = 0, and all other transition rates 0, then this recursive equation has the form of (2.2.1). Thus we have modeled the first center as an MDAP.
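A concrete encoding can be sketched as follows (our own illustration; the truncation of the first-center queues at CAP and the chosen numerical rates are assumptions made only to obtain a finite example):

```python
# MDAP encoding of a two-queue first center: for each state (i1, i2) and routing
# action a in {1, 2}, list the transitions (next_state, rate, arrival_prob).
# A departure from the first center is an arrival at the second center (mark 1);
# an arrival routed within the first center carries mark 0.
lam, mubar, gamma = 1.0, 1.0, 3.5   # gamma >= lam + 2*mubar (uniformization)
CAP = 4                             # truncation of the state space, for illustration

def mdap_transitions(state, a):
    i1, i2 = state
    busy = (i1 > 0) + (i2 > 0)
    # external arrival, routed to queue a (clamped at the buffer)
    trans = [((min(i1 + (a == 1), CAP), min(i2 + (a == 2), CAP)), lam, 0.0)]
    # first-center departures: arrivals at the second center
    if i1 > 0:
        trans.append(((i1 - 1, i2), mubar, 1.0))
    if i2 > 0:
        trans.append(((i1, i2 - 1), mubar, 1.0))
    # dummy self-loop padding the total rate to gamma
    trans.append((state, gamma - lam - mubar * busy, 0.0))
    return trans
```

For every state and action the rates sum to γ, as the standing assumption on MDAPs requires.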

It is easy to see that, instead of a tandem system, we can model any network in which i is the state of a center without feedback to the network.

We return to the general model with an MDAP. As in the case with an MAP, we have the following result:



2.2.1. Lemma. If

w_{(x,i+e_{j_1})} \le w_{(x,i+e_{j_2})} \quad \text{for } i_{j_1} \le i_{j_2}, \; i + e_{j_1} + e_{j_2} \le B,   (2.2.2)
w_{(x,i)} \le w_{(x,i+e_{j_1})} \quad \text{for } i + e_{j_1} \le B   (2.2.3)

and

w_{(x,i)} = w_{(x,i^*)} \quad \text{for } i^* \text{ a permutation of } i, \; i^* \le B   (2.2.4)

hold for the cost functions c and v^0, then they hold for all v^n.

For the proof we refer to the more general model of section 3.3. The result says again that an SQP is optimal, for suitable cost functions. We say an SQP because SQP refers only to the assignment of the customers to the queues, and not to the action in the MDAP. For the tandem model described above it follows that it is optimal in the second center to employ the SQP, if the first center is also controlled optimally. How this first center should be controlled is the subject of the next section.

Because the optimal actions in the MDAP can depend on n, we cannot use the method of section 5.2, but we need a limiting argument. To use this, we have some minor restrictions on the cost functions. All cost functions considered here satisfy these conditions. Note that some of our cost functions, like |i|, are unbounded. Still they satisfy the conditions, which are given in assumption 5.3.1. Throughout this chapter we assume that this assumption holds.

2.2.2. Theorem. For all T, an SQP minimizes the costs at T (and from 0 to T) for all cost functions satisfying (2.2.2) to (2.2.4).

The conditions on the cost functions are exactly the same as in section 1.2, thus we refer to that section for a discussion of the allowable cost functions. Regarding stochastic optimality however, results are not as easy, as the optimal policy depends on the horizon. Of course, if v^0 is allowable, an SQP minimizes I{v^0_{(x,i)} > s} for each value of s, but for different values of s different SQP's can be optimal. Examples showing this are easily given. Thus there is no single policy that is better than all policies R for all values of s. It is an open question if there is for every fixed policy R an SQP which is better for all s.


2.3. Tandems of customer assignment models

In this section we consider the tandem system introduced in the previous section, with µ̃ = µ and B̃ = B = ∞. Customers arrive at the first center according to a Poisson process with intensity λ. In the previous section we saw that in the second center the SQP should be used. Here we study the optimal assignment in the first center. We use the same notation. First we show that the SQP is not optimal in the first center.

Consider a system in which there is only one arrival at time 0. We compute the expected flowtime (which is the sum of the departure times). As the initial state we take ı = (1, 0), i = (5, 5). Thus we have to decide whether to route the arriving customer to queue 2 or to queue 1 at the first center. Take µ = 1. Let us denote the expected flowtime if we start with (ı, i) by f(ı1, ı2, i1, i2), where δx = 1 if x > 0 and δx = 0 otherwise. These numbers can be calculated with the recursive formulae

f(0, 0, 0, 0) = 0,

f(ı1, ı2, i1, i2) = (ı1 + ı2 + i1 + i2 + δı1 f(ı1−1, ı2, i1+1, i2) + δı2 f(ı1, ı2−1, i1+1, i2)
    + δi1 f(ı1, ı2, i1−1, i2) + δi2 f(ı1, ı2, i1, i2−1)) / (δı1 + δı2 + δi1 + δi2) if i1 ≤ i2,

f(ı1, ı2, i1, i2) = (ı1 + ı2 + i1 + i2 + δı1 f(ı1−1, ı2, i1, i2+1) + δı2 f(ı1, ı2−1, i1, i2+1)
    + δi1 f(ı1, ı2, i1−1, i2) + δi2 f(ı1, ı2, i1, i2−1)) / (δı1 + δı2 + δi1 + δi2) if i1 ≥ i2.

We found that f(2, 0, 5, 5) = 41.63 < 41.67 = f(1, 1, 5, 5). Because the flowtime is the integral of the number of customers over time, there are T's for which the number of customers at T is not minimized by the SQP in both centers. This is because if it were, the expected number of customers would be smaller under the SQP for all T, and so the flowtime would also be smaller.
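The recursion is straightforward to evaluate by memoization. Below is a small sketch (ours; the function and variable names are our own, and δ is implemented as the indicator that the corresponding queue is nonempty) that reproduces the two flowtimes:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def f(j1, j2, i1, i2):
    # (j1, j2): queue lengths at center 1; (i1, i2): queue lengths at center 2
    if j1 == j2 == i1 == i2 == 0:
        return 0.0
    rate = sum(x > 0 for x in (j1, j2, i1, i2))  # busy servers, mu = 1 each
    # SQP at the second center: a center-1 departure joins the shorter queue
    a1, a2 = (i1 + 1, i2) if i1 <= i2 else (i1, i2 + 1)
    acc = j1 + j2 + i1 + i2          # holding cost accumulated per unit time
    if j1 > 0:
        acc += f(j1 - 1, j2, a1, a2)
    if j2 > 0:
        acc += f(j1, j2 - 1, a1, a2)
    if i1 > 0:
        acc += f(j1, j2, i1 - 1, i2)
    if i2 > 0:
        acc += f(j1, j2, i1, i2 - 1)
    return acc / rate

print(round(f(2, 0, 5, 5), 2), round(f(1, 1, 5, 5), 2))
```

Routing the arrival to the longer first-center queue, i.e. starting state (2, 0, 5, 5), indeed gives the smaller expected flowtime.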

Define fT(ı1, ı2, i1, i2) as the expected flowtime up to T, i.e. the expected number of customers integrated from 0 to T. Add an extra superscript A to denote the model with Poisson(λ) arrivals. It is easily seen that f(ı1, ı2, i1, i2) − fT(ı1, ı2, i1, i2) → 0 as T increases. Take T such that this difference is smaller than 0.01. Take λ small enough such that the expected flowtime of the arrivals before T is smaller than 0.01. Then we have

fT,A(2, 0, 5, 5) ≤ fT(2, 0, 5, 5) + 0.01 ≤ f(2, 0, 5, 5) + 0.01 < f(1, 1, 5, 5) − 0.01 ≤ fT(1, 1, 5, 5) ≤ fT,A(1, 1, 5, 5),

where the first inequality follows by the choice of λ, and the fourth follows by the choice of T. This shows that, for λ sufficiently small, there are states in which routing according to the SQP is suboptimal.

An intuitive explanation of this phenomenon is easily given. When both queues of the second center are heavily loaded, it pays to delay arriving customers, which allows one to see how the center evolves in time. This can be done by assigning customers arriving at the first center to the longest queue.

To study the optimal policy for more realistic values of λ than considered above, we did various numerical calculations on the two center model.


Again we fixed the service rate µ to 1 and varied the arrival rate λ. Because we used successive approximation (see appendix D for a practical discussion of computational algorithms) we had to introduce buffers (equal in each queue) to make the state space finite. We also varied these buffer sizes to study the influence of the finite buffers on our model. To minimize blocking influence we assumed that no service takes place at the first center if the second center is full. Note that this type of arrival process cannot be modeled with an MDAP, due to the blocking protocol. We computed the optimal policy in the first center for discounted and average costs. Our results are summarized in the two tables below. First we consider discounted costs.

In Hordijk & Koole [22] we took as immediate reward the expected number of departed customers. The advantage of taking this reward is that the optimal policy does not seem to depend on the buffer sizes. Because we have worked so far with the total number of customers we do the same in the present calculations. However, this means a stronger buffer influence, especially when λ ≥ 2. Therefore we only considered λ < 2 here. Because µ = 1, it follows from Kingman [34] that a single center model operated by the SQP has a stationary distribution under this assumption. (See Adan et al. [1] for a recent result and references on computational issues regarding the SQP.) Thus, if we take B large enough, we expect to have little buffer influence.

The results for B varying from 20 to 45 are shown in table 2.3.1. For each combination of β, the discount factor, and λ the table contains the maximum relative difference between the optimal policy and the SQP, and the state where this maximum is attained. These numbers are calculated with the formula max_i (vβ_i(SQP) − vβ_i)/vβ_i, where vβ_i (vβ_i(SQP)) are the costs under the optimal policy (the SQP) and the maximum is taken over all possible states i. It is clear from the table that the SQP is nearly optimal. In some cases the difference decreases as B increases, thus in these cases the SQP might be optimal for B = ∞. In these cases we increased B, if possible, until the relative difference was smaller than 10^-15. In the cases with λ = 1.5 and β = 0.75, and λ = 1.9 and β = 0.5 and 0.75 we were, due to computational difficulties, not able to increase B any further.

          β = 0.01   β = 0.1                   β = 0.25                  β = 0.5                   β = 0.75

λ = 0.1   <10^-15    1.5·10^-13, (0,1,14,14)   1.7·10^-9, (0,1,10,10)    5.9·10^-7, (0,1,8,8)      1.8·10^-5, (0,2,9,9)
λ = 0.25  <10^-15    1.6·10^-13, (0,1,14,14)   2.2·10^-9, (0,1,10,10)    8.6·10^-7, (0,1,9,9)      2.6·10^-5, (0,2,9,9)
λ = 0.5   <10^-15    4.5·10^-14, (0,1,15,15)   1.1·10^-9, (0,1,11,11)    6.3·10^-7, (0,1,9,9)      2.2·10^-5, (0,2,10,10)
λ = 1     <10^-15    <10^-15                   4.2·10^-11, (0,1,13,13)   5.1·10^-8, (0,1,11,11)    1.9·10^-6, (0,1,11,11)
λ = 1.5   <10^-15    <10^-15                   2.2·10^-14, (0,1,18,18)   8.4·10^-11, (0,1,15,15)   <10^-11
λ = 1.9   <10^-15    <10^-15                   <10^-15                   <10^-14                   <10^-7

Table 2.3.1. Discounted costs (states given as (ı1, ı2, i1, i2))


In the average cost case we again compared the optimal policy and the SQP. The relative difference between their average costs can be found in table 2.3.2. Once again the SQP is nearly optimal. When comparing both tables, we see that the average cost case does not behave as the limiting case for the discounted cost case. A possible explanation is the following. The difference in the assignments in state (0, 1, k, k), with k ≈ 10, manifests itself if the second center becomes empty. As there are many customers in the second center, this requires that β is close to 1. However, in the limiting average case the dependence on the starting state has disappeared, and the effect of the few states where the optimal action is not according to the SQP is small. It is interesting to note that the states where the SQP is not optimal all have the same form, with few customers in the first center, and heavily loaded, balanced queues in the second center, as in the example at the beginning of this section.

λ    0.1       0.25         0.5          1        1.5       1.9

     <10^-15   1.3·10^-12   2.1·10^-10   <10^-9   <10^-15   <10^-4

Table 2.3.2. Average costs

Having seen that the SQP is not optimal in the case where the policies depend on the whole state of the system, we continue with studying the case where the policies only depend on local information, i.e. where the policy at a certain center depends only on the state of that center. We call this the partial information case, as contrasted to the full information case.

At the end of this section we study the general partial information case. We will see there that also in this case a counterexample to the optimality of the SQP can be constructed, for discounted costs. If we restrict the class of admissible policies even more, namely to static policies at the second center, we can prove the optimality of the SQP. A policy R is called a static policy if it is defined by a sequence of random variables Πn, n ∈ IN, where Πn = j corresponds to routing the nth arriving customer to queue j. The routing probabilities are stochastically independent of the queue lengths and the arrival times. Both the Equal Splitting Policy and the Cyclic Assignment Policy of section 1.9 are static, as can be shown by taking all Πn independent with IP(Πn = 1) = 1/2 for all n for the ESP, and IP(Πn+1 = j + 1 (mod 2) | Πn = j) = 1 for all n for the CAP. The SQP is not static. We prove that, for partial information, the SQP in both centers gives an earlier departure process than the two center policy which uses a static policy in the second center. Because we use coupling this means also that the number of customers is minimized by the SQP in both centers.
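To make the definition concrete, the ESP and the CAP can be sketched as routing sequences Πn over queue labels 1 and 2 (an illustration of ours, not from the thesis):

```python
import random
from itertools import islice

def esp(seed=0):
    # Equal Splitting Policy: i.i.d. fair coin tosses over the two queues
    rng = random.Random(seed)
    while True:
        yield rng.choice((1, 2))

def cap(first=1):
    # Cyclic Assignment Policy: deterministic alternation between the queues
    j = first
    while True:
        yield j
        j = 3 - j

print(list(islice(cap(), 6)))  # [1, 2, 1, 2, 1, 2]
```

Neither sequence looks at the queue lengths or the arrival times, which is exactly what makes these policies static; the SQP cannot be written this way.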

To show our result we need two theorems. The first states that the SQP gives a pathwise earlier departure process. The second theorem says that for a static policy an earlier arrival process gives an earlier departure process. Combining these theorems gives indeed the result on static policies. We see an arrival process as a sequence of arrival times. That is, the arrival process V = (Vn, n ∈ IN) has Vn as the time of the nth arrival. For arrival processes


V = (Vn, n ∈ IN) and W = (Wn, n ∈ IN) we say that V is pathwise earlier than W, and we write V ≤p W, if there are arrival processes V* and W* with V* =d V and W* =d W such that V*n(ω) ≤ W*n(ω) for all ω in some probability space and all n ∈ IN. We use a similar definition and notation for departure processes. With =d we mean that the processes on either side have the same distribution. By theorem 1.5.2 we have:

2.3.1. Corollary. Consider a center with two parallel queues, arrival process U and policy SQP, and a similar center with an arbitrary policy R. If V and V̄ are the respective departure processes, then V ≤p V̄.

2.3.2. Theorem. Consider one center with two parallel queues and a static policy R. For arrival processes T and T̄ the departure processes are denoted by V and V̄, respectively. If T ≤p T̄, then V ≤p V̄.

Proof. Because T ≤p T̄ there are arrival processes T* and T̄* with T =d T* and T̄ =d T̄* such that T*n(ω) ≤ T̄*n(ω) for all n and ω. Fix ω ∈ Ω. We use the following notation: T*n(ω) = tn, T̄*n(ω) = t̄n. Let Sn (S̄n) be the service time of the nth customer and Un (Ūn) the queue to which the nth customer is routed. Of course Sn =d S̄n. Because R is static we also have Un =d Ūn. Hence by coupling arguments we may assume that Sn = S̄n and Un = Ūn for all n. Denote an arbitrary realization of (Sn, Un), n > 0, with (sn, un), n > 0. We omit the superscript *. Let ξT(t) (ξV(t)) be the number of arrived (served) customers at time t. A subscript j denotes a specific queue. Then

ξVj(t) = Σ_{n=1}^{ξT(t)} I{un = j} I{tk + Σ_{l=k}^{n} I{ul = j} sl ≤ t, k = 1, . . . , n}

≥ Σ_{n=1}^{ξT̄(t)} I{un = j} I{tk + Σ_{l=k}^{n} I{ul = j} sl ≤ t, k = 1, . . . , n}

≥ Σ_{n=1}^{ξT̄(t)} I{un = j} I{t̄k + Σ_{l=k}^{n} I{ul = j} sl ≤ t, k = 1, . . . , n} = ξV̄j(t), j = 1, 2.

Thus ξV(t) = ξV1(t) + ξV2(t) ≥ ξV̄(t) for all t, and hence the nth departure epoch satisfies Vn ≤ V̄n for all n.
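The inequality chain in the proof can be checked on a fixed realization: couple the service times sn and the routing decisions un, and compute the per-queue FCFS departure epochs for two arrival sequences with tn ≤ t̄n. The numbers below are our own illustrative choices:

```python
def departures(t, s, u):
    # FCFS within each queue: customer n starts service once it has arrived
    # and the previous customer routed to its queue has departed
    busy_until = {}
    d = []
    for tn, sn, un in zip(t, s, u):
        start = max(tn, busy_until.get(un, 0.0))
        d.append(start + sn)
        busy_until[un] = d[-1]
    return sorted(d)  # ordered departure epochs of the center

s, u = [1.0, 2.0, 1.0, 1.0], [1, 2, 1, 2]     # coupled across both models
d = departures([0.0, 0.0, 0.5, 1.0], s, u)    # arrival times t_n
db = departures([0.0, 0.2, 1.0, 1.5], s, u)   # later arrival times
assert all(x <= y for x, y in zip(d, db))     # earlier arrivals, earlier departures
```

Since the routing u is fixed in advance, this realization mimics a static policy; the same check fails for the SQP, as the counterexample below shows.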

This theorem is also true for general service times. Unfortunately, it does not hold for the SQP, as the following counterexample shows.

Take T1 = T2 = T3 = T̄1 = T̄2 = 0 and T̄3 = h; Tn = T̄n > 1 + h for all n ≥ 4. Thus T ≤p T̄. Compare the probabilities that 2 customers have left at t = 1 + h. Condition on the number of departures in [0, h]. If no departures occur in [0, h], the two systems are the same.


On the other hand, if exactly one departure occurs in [0, h], the time until the next departure in the T-model has with probability 1/2 an exponential distribution with parameter µ and with probability 1/2 an exponential distribution with parameter 2µ. Indeed, the customer departing in [0, h] leaves the queue with one customer with probability 1/2 and the queue with two customers with probability 1/2 as well. In the T̄-model the customer arriving at h chooses the empty queue, therefore the time until the next departure is exponentially distributed with parameter 2µ. The difference between these two probabilities, say c, does not depend on h, but only on µ. The probability that one customer leaves in [0, h] is equal to 2µh + o(h). The probability that two customers leave in [0, h] is o(h). Now we have:

IP_T̄(2 customers leave in [0, 1 + h]) − IP_T(2 customers leave in [0, 1 + h]) = 2µhc + o(h) > 0,

if h is small enough. Note that the idea behind this counterexample is similar to the counterexample in the full information case; there, by sending to the longer queue, the arrivals of the customers at the second center were delayed.
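To see that c is a genuine positive constant, note that (in a back-of-the-envelope reading of ours, with roughly one unit of time left after the departure in [0, h]) the second departure occurs before t = 1 + h with probability about P(Exp(2µ) ≤ 1) in the T̄-model versus the 50/50 mixture ½P(Exp(µ) ≤ 1) + ½P(Exp(2µ) ≤ 1) in the T-model, which gives c = ½(e^(−µ) − e^(−2µ)) > 0:

```python
import math

mu = 1.0
p_tbar = 1 - math.exp(-2 * mu)                   # T-bar model: Exp(2*mu)
p_t = 0.5 * (1 - math.exp(-mu)) + 0.5 * p_tbar   # T-model: the 50/50 mixture
c = p_tbar - p_t                                 # = (exp(-mu) - exp(-2*mu)) / 2
assert c > 0
```

For µ = 1 this evaluates to roughly 0.12, positive for every µ > 0.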

Combining corollary 2.3.1 and theorem 2.3.2 gives the following result for the two centers in tandem.

2.3.3. Theorem. Let R = (R1, R2) be the two center policy with static policy R2 in center 2. Let R* = (SQP, SQP) be the two center policy which uses the SQP in both centers. For a general arrival process T let W (W̄) be the departure process of the second center under R (R*). Then W̄ ≤p W.

Proof. The proof follows easily from corollary 2.3.1 and theorem 2.3.2. As depicted in figure 2.3.1, let V (V̄) denote the departure process of the first center under policy R (R*). The departure process of the second center for the policy (SQP, R2) is denoted by Ŵ.

T --R1--> V --R2--> W

T --SQP--> V̄ --R2--> Ŵ

T --SQP--> V̄ --SQP--> W̄

Figure 2.3.1.

From corollary 2.3.1 we have V̄ ≤p V. Hence by theorem 2.3.2, Ŵ ≤p W. Corollary 2.3.1 also gives W̄ ≤p Ŵ. Combining the last two inequalities yields W̄ ≤p W.

This result can also be found in [22].


It is straightforward to generalize theorem 2.3.3 to a network of centers in tandem. The proof goes by induction on the number of centers. Suppose it is true for k centers. Assume that V (V̄) is the departure process of the kth center when using (R1, . . . , Rk) with Ri static for 2 ≤ i ≤ k, respectively the SQP in each center. Then by the induction hypothesis V̄ ≤p V and we can use again the same arguments as in the proof of theorem 2.3.3.

In the partial information case the policy in each center is not allowed to depend on the state of the other center. But the fact that the state of the other center is unknown does not mean that there is no information on the other center. For example, in the discounted cost case, decisions taken early in time weigh more heavily than decisions taken later. This means that there is a dependence on the starting state. The same phenomenon occurs for average costs, for multichain models. Of course the model studied here is unichain, but for discounted costs we were able to construct a policy R*, which uses partial information, that is better for certain starting states than the SQP. In center 2 R* uses the SQP. Also in center 1 the SQP is used, except in states (0, 1) and (1, 0). With the counterexample from the beginning of this section in mind, we might expect that R* performs better than the SQP for starting states of the form (0, 1, k, k) and suitable choices of parameters. Indeed, take µ = 1, λ = 0.01 and discount factor 0.9 in the discrete-time normalized model. Then the infinite horizon expected discounted number of customers is smaller under the SQP for starting states like (0, 1, 0, 0), but R* is better for starting state (0, 1, 10, 10). The relative differences are respectively 1.9·10^-3 and −1.3·10^-7. Whether the SQP is optimal for average costs remains an open question. On the one hand, the model is unichain, and therefore the optimal policy is independent of the starting state; on the other hand, the numbers of customers in the centers are not independent, meaning that some information on the state of the other center can be obtained from the state of the present center.

In Hordijk & Koole [22] we conjectured that the SQP is optimal in the partial information case. This is clearly falsified by the present results. Our conjecture was based on numerical results obtained by Loeve & Pols [44], who used an algorithm derived by Kulkarni & Serin [41] to find local optima or saddle points in the class of policies that use partial information. In all problem instances they considered, the SQP is optimal.


2.4. Networks of customer assignment models with workloads

The model studied in section 1.8 is concerned with the workloads and not with the actual departures. This means that we cannot distinguish the departure epochs of the customers. Therefore it is not of interest to the network models studied in this chapter to generalize the dynamic programming result of theorem 1.8.2 on the optimality of the SWP to arrivals according to an MDAP. For this reason we leave this generalization to chapter 3. Of interest here is to try to obtain similar results as in section 2.3 for tandem systems and other networks.

In section 1.5 we saw that the SQP is pathwise optimal in the queue length model. The same holds for the workload model, as stated in section 1.8. However, we showed in the previous section that the SQP is not monotone in the sense that earlier arrivals give earlier departures. The SWP, by contrast, is monotone. Because the SWP is equivalent to a single multi-server queue with FCFS discipline, the monotonicity is easily shown by a coupling argument. This gives the following result.
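Since the SWP makes a center behave as one two-server FCFS queue, its monotonicity can be illustrated on a fixed realization (the numbers below are our own illustrative choices):

```python
import heapq

def fcfs_departures(t, s, servers=2):
    # One FCFS queue with `servers` identical servers: customer n starts
    # service as soon as it has arrived and some server becomes free
    free = [0.0] * servers               # heap of epochs at which servers free up
    d = []
    for tn, sn in zip(t, s):
        start = max(tn, heapq.heappop(free))
        d.append(start + sn)
        heapq.heappush(free, d[-1])
    return sorted(d)                     # ordered departure epochs

s = [1.0, 1.0, 2.0, 1.0]                        # coupled service times
d = fcfs_departures([0.0, 0.0, 0.1, 0.2], s)    # arrivals t_n
db = fcfs_departures([0.0, 0.5, 0.5, 0.6], s)   # later arrivals
assert all(x <= y for x, y in zip(d, db))       # earlier arrivals, earlier departures
```

Because each customer enters service as soon as possible and the service times are coupled, delaying any arrival can only delay every subsequent start of service, which is the coupling argument in miniature.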

2.4.1. Theorem. Let R = (R1, R2) be a two center policy with R1 and R2 not depending on the workload of the other center (thus R uses partial information). Denote with R* = (SWP, SWP) the two center policy which uses the SWP in both centers. For a general arrival process T let W (W̄) be the departure process of the second center under R (R*). Then W̄ ≤p W.

Proof. Again, the result follows easily by considering a picture.

T --R1--> V --R2--> W

T --R1--> V --SWP--> Ŵ

T --SWP--> V̄ --SWP--> W̄

Figure 2.4.1.

By the pathwise optimality of the SWP we have Ŵ ≤p W and V̄ ≤p V. By the monotonicity of the SWP we have W̄ ≤p Ŵ. Combining the inequalities yields W̄ ≤p W.

Due to the fact that the SWP is monotone we can generalize the results to networks of centers with feedback to the network, and to policies using full information. Related to this are the results of Righter & Shanthikumar [59]. They also consider networks of centers, each with one server, and show that, in the case of a service time distribution with an increasing likelihood ratio, the departures are earlier if the customers are served non-preemptively. Monotonicity plays an important role there too.


Consider c centers, where routing between the centers is according to static rules. Remember that static policies were introduced in the previous section for the assignment of customers to parallel queues within a center. Here they are used for routing between different centers. The model is either open or closed. Let R be an arbitrary policy, that possibly uses information on all centers. In the case of random service times, the assignment decisions are not allowed to depend on the workloads. Now, let V(i, j) (V̄(i, j)) be the stream of customers going from center i to center j, using the SWP (R). Outside arrivals are assumed to be coming from center 0.

2.4.2. Theorem. V(i, j) ≤p V̄(i, j) for all i and j.

Proof. Due to the (possible) feedback in the network, arrival times depend on prior departure times. Therefore we cannot use arguments similar to those in the proof of theorem 2.4.1. We couple the networks, one using the SWP and one using R, by constructing V*(i, j) and V̄*(i, j) with V*(i, j) =d V(i, j) and V̄*(i, j) =d V̄(i, j) for all i and j. The routing is coupled by letting the nth customer that leaves center i go to the same center in both networks. Note that, by taking i = 0, we have V*(0, j) = V̄*(0, j). In the case of deterministic service times the models are completely coupled now. The service times are coupled for each queue separately, such that the departures are earlier under the SWP. Now consider a realization.

Events in the networks with streams V* and V̄* occur at points v1 < v2 < · · · and v̄1 < v̄2 < · · ·. Each event consists of a transition of a customer from one center to another. Transitions from center i to center j occur at v1(i, j) < v2(i, j) < · · · and v̄1(i, j) < v̄2(i, j) < · · ·. (If 2 or more events occur at the same time, we assume that they are logically ordered. For example, if a customer arrives at a center, receives 0 processing time and leaves again, we assume that the arrival occurs before the departure.) We use the fact that if the arrivals up to T at a certain center are earlier in the SWP model, then the departures up to T are also earlier. The proof uses induction on the number of events in the network operated by R. Choose n*. Define n*ij as follows: v̄n*ij(i, j) ≤ v̄n* < v̄n*ij+1(i, j). Suppose

vl(i, j) ≤ v̄l(i, j) for all l = 1, . . . , n*ij, and all i and j.

Consider transition n* + 1 in the network operated by R. Suppose that a customer moves from center i* to center j* at this transition. Consider center i*. By the induction hypothesis for j = i*, the arrivals at i* before v̄n* are earlier under the SWP. Because there are no arrivals at center i* between v̄n* and v̄n*+1 in the network operated by R, also the arrivals before v̄n*+1 are earlier under the SWP. By the optimality and monotonicity of the SWP, the departures are also earlier, and thus vn*i*j*+1(i*, j*) ≤ v̄n*i*j*+1(i*, j*), completing the induction step.


Note that we can also have non-controllable centers in the network, as long as they are monotone. More generally, we can also insert controllable centers of the type considered in Righter & Shanthikumar [59]. Another possibility is the inclusion of centers with Bernoulli routing, as long as the assignment in the center is more balanced for R* than for R. This follows from the monotonicity of this type of center (theorem 2.3.2), and from the pathwise optimality (as shown in section 1.9). The next corollary follows easily.

2.4.3. Corollary. In a closed network, the SWP maximizes the throughput. In an open network, the SWP minimizes the number of customers in the system.

For their model Righter & Shanthikumar [59] formulate a similar corollary.

2.5. Server assignment model with multiple servers

In section 2.2 the dynamic programming results of section 1.2 were easily generalized to arrivals according to an MDAP. The generalization is possible in most customer assignment models. For the server assignment models it is more complicated. In this section we show that lemma 1.12.1, which shows the optimality of LEPT, can be generalized to arrivals according to an MDAP. This means that, as in the customer assignment models, LEPT is optimal in the last center of a tandem system, where each center has its own servers and customers keep their class. Lemma 1.11.1 however, which deals with single server models, cannot be generalized in its full generality, as two counterexamples show.

We follow the analysis of section 1.12. Again assume µ1 ≤ · · · ≤ µm. The other remarks made there are also valid here. The analogue of (1.12.1) is:

vn+1(x, i) = c(x, i) + min_a Σy λxay ( Σ_{j=1}^m qjxay vn(y, i + ej) + (1 − Σ_{j=1}^m qjxay) vn(y, i) )
    + min_{l1,...,ls(x)} Σ_{k=1}^{s(x)} ( µlk vn(x, i − elk) + (µ − µlk) vn(x, i) )
    + (1 − γ − s(x)µ) vn(x, i).

The lemma which gives the optimal policy is also the same:

2.5.1. Lemma. If

µj1 w(x, i − ej1) + (µ − µj1) w(x, i) ≤ µj2 w(x, i − ej2) + (µ − µj2) w(x, i) (2.5.1)
for ij1, ij2 > 0 and j1 < j2

and

w(x, i − ej1) ≤ w(x, i) for ij1 > 0 (2.5.2)

hold for the cost functions c and v0, then they hold for all vn.

The model studied in section 3.6 is a generalization of the present model, for example with partial availability of the servers. For a proof of the lemma we refer to the proof of lemma 3.6.1.


2.5.2. Theorem. An SIP minimizes the costs at T (and from 0 to T) for all cost functions satisfying (2.5.1) and (2.5.2).

The same cost functions as in section 1.12 are allowable here. Of course, this includes the optimality of the SIP in the single server case if µ1 ≤ · · · ≤ µm. If the queues are not ordered this way, we have seen in section 1.12 that the SIP is in general not optimal in the multiple server case. Because the MDAP is a generalization of the MAP, this also holds for the present model. However, in the single server case the SIP was optimal, independent of the ordering. This does not hold in the case of MDAP's, as the following counterexamples show. Summarizing, in the case of dependent arrivals we need µ1 ≤ · · · ≤ µm both in the multiple and in the single server case. This result can also be found in [25].

We consider a system of two centers in tandem, each with two queues, where each center has one server. There are no arrivals, and when a customer leaves queue j at the first center, it enters queue j at the second center. We show, for certain choices of the service parameters, holding costs and starting states, that the µc-rule in the second center is not optimal. This contradicts the results in section 4 of Nain [49] and in section 2 of Nain et al. [50]. We show that the expected total costs over the infinite horizon are not minimized by a policy that uses the µc-rule in the second center. This means that there are T's for which the µc-rule does not minimize the expected costs at T. The first example (which can be found in [23]) is the simplest, although we must assume that the policies allow idling in the first center. In the second example this is not the case.

We use a similar notation for the tandem system as in section 2.2, i.e. we add a tilde to denote the first center. The parameters of the first example are given in figure 2.5.1.

Figure 2.5.1. (Two centers in tandem; center 1: µ̃ = 2, c̃ = .65; center 2: µ = (2, 1), c = (1.05, 2); the dots indicate the initial customers.)

Denote by Kijk the total expected holding cost when at time 0 there are i customers in the first queue of center 1, j customers in the first queue of center 2 and k customers in the second queue of center 2. It follows from the optimality of the µc-rule for a single center that the optimal policy in center 2 is the µc-rule when center 1 is empty. Because the holding costs are positive, idleness in center 2 is not optimal. Hence the total expected holding costs for the optimal policy in starting states with the first center empty are:


K001 = 2/1 = 2;
K010 = 1.05/2 = 0.525;
K011 = 3.05/2 + K001 = 3.525;
K020 = 2.1/2 + K010 = 1.575;
K021 = 4.1/2 + K011 = 5.575.

In starting states with customers in both centers the total expected holding cost is the minimum of terms corresponding to different actions. Denote by (i, j) the possible actions: i (j) is the queue served in center 1 (2). The successive terms in the computation below correspond to the action pairs (1, 1), (1, 2), (2, 1) and (2, 2) respectively, where terms belonging to actions corresponding to idleness in center 2 are deleted. The optimal action pair in starting state ijk is denoted by aijk.

K100 = 0.65/2 + K010 = 0.85;
K101 = min{2.65/3 + (1/3)K100 + (2/3)K011; 2.65/1 + K100}
     = min{3.51667; 3.5} = 3.5;  a101 = (2, 2);
K110 = min{1.7/4 + (1/2)K020 + (1/2)K100; 1.7/2 + K100}
     = min{1.6375; 1.7} = 1.6375;  a110 = (1, 1);
K111 = min{3.7/4 + (1/2)K021 + (1/2)K101; 3.7/3 + (1/3)K110 + (2/3)K021;
           3.7/2 + K101; 3.7/1 + K110}
     = min{5.4625; 5.49583; 5.35; 5.3375} = 5.3375;  a111 = (2, 2).

From a110 = (1, 1), a101 = (2, 2) and a111 = (2, 2) we conclude that the server in center 1 starts serving the job in queue 1 after the job in queue 2 of center 2 has finished its service. Hence the optimal action in center 1 depends on the state in center 2. Since a111 = (2, 2) the server in center 2 serves the job in queue 2 before the job in queue 1, thus the µc-rule is not optimal at center 2. Note that the optimal action in center 2 depends on the state in center 1.
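The backward computation above is easy to mechanize. A small sketch (not part of the thesis) that reproduces these values, with the parameters of figure 2.5.1; the only ingredient is the expected cost of one exponential race:

```python
# Reproduce the expected holding costs K_ijk of the first counterexample.
# Parameters of figure 2.5.1: center 1: rate 2, cost 0.65;
# center 2: rates (2, 1), costs (1.05, 2).
def step(cost_rate, *branches):
    """Expected cost of one exponential race; branches = (rate, next value)."""
    total = sum(r for r, _ in branches)
    return cost_rate / total + sum(r / total * v for r, v in branches)

K = {}
K[0, 0, 1] = step(2.0, (1.0, 0.0))                 # = 2
K[0, 1, 0] = step(1.05, (2.0, 0.0))                # = 0.525
K[0, 1, 1] = step(3.05, (2.0, K[0, 0, 1]))         # mu-c rule: serve queue 1
K[0, 2, 0] = step(2.1, (2.0, K[0, 1, 0]))
K[0, 2, 1] = step(4.1, (2.0, K[0, 1, 1]))
K[1, 0, 0] = step(0.65, (2.0, K[0, 1, 0]))
K[1, 0, 1] = min(step(2.65, (2.0, K[0, 1, 1]), (1.0, K[1, 0, 0])),  # (1,2)
                 step(2.65, (1.0, K[1, 0, 0])))                     # (2,2)
K[1, 1, 0] = min(step(1.7, (2.0, K[0, 2, 0]), (2.0, K[1, 0, 0])),   # (1,1)
                 step(1.7, (2.0, K[1, 0, 0])))                      # (2,1)
terms = [step(3.7, (2.0, K[0, 2, 1]), (2.0, K[1, 0, 1])),           # (1,1)
         step(3.7, (2.0, K[0, 2, 1]), (1.0, K[1, 1, 0])),           # (1,2)
         step(3.7, (2.0, K[1, 0, 1])),                              # (2,1)
         step(3.7, (1.0, K[1, 1, 0]))]                              # (2,2)
K[1, 1, 1] = min(terms)
print([round(t, 5) for t in terms])  # [5.4625, 5.49583, 5.35, 5.3375]
```

The minimum, 5.3375, is attained by the last term, action pair (2, 2): in state 111 the server in center 2 serves queue 2 first, so the µc-rule is not optimal there.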

The error in Nain [49] and Nain et al. [50] can best be explained with the help of the example. Basically, in both articles, the authors try to improve an arbitrary policy by keeping the behavior of the first center the same and changing the policy in the second center to the µc-rule. In the example, the customer in center 1 (customer 1) is not served until the customer in center 2, queue 2 (customer 3) has departed. We change the policy by serving the customer in center 2, queue 1 (customer 2) first, but now we cannot let the server in center 1 be idle for the service time of customer 3, because we do not know its service time yet.

Another possibility, by which we keep the stochastic behavior of center 1 the same, is taking the idle time at center 1 independent of the service time of customer 3, but with the same distribution. In the example, the server at center 1 idles during an exponentially distributed time with parameter 1, while customer 2 is served at center 2. Clearly this does not improve on the optimal policy, but we calculate it anyway.

Let K∗1jk denote the total expected holding cost when the customer, initially in center 1, is still there and the server is idling. With K̄ijk we denote the same cost if the customer in center 1 has already departed or if the server at center 1 is serving the customer. Since the policy is fixed there is no minimization in the computation. The total expected holding costs for states with i = 0 and for state 100 are equal to those of the optimal policy. The computation of the other values is as follows:

K∗100 = 0.65/1 + K100 = 1.5;
K̄101 = 2.65/3 + (2/3)K011 + (1/3)K100 = 3.51667;
K∗101 = 2.65/2 + (1/2)K̄101 + (1/2)K∗100 = 3.83333;
K̄111 = 3.7/4 + (1/2)K021 + (1/2)K̄101 = 5.47083;
K∗111 = 3.7/3 + (1/3)K̄111 + (2/3)K∗101 = 5.6125.

Indeed we see, when comparing K∗111 with the K111 obtained previously, that K∗111 is larger.
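The same one-step race computation reproduces the modified policy's costs; a sketch, with the constants K100, K011 and K021 taken from the computation above:

```python
# Costs of the modified policy: center 2 serves customer 2 first while
# center 1 idles for an independent exp(1) time (figure 2.5.1 parameters).
K100, K011, K021 = 0.85, 3.525, 5.575   # values obtained earlier

def step(cost_rate, *branches):
    """Expected cost of one exponential race; branches = (rate, next value)."""
    total = sum(r for r, _ in branches)
    return cost_rate / total + sum(r / total * v for r, v in branches)

Ks100 = step(0.65, (1.0, K100))                # idle clock fires, then serve
Kb101 = step(2.65, (2.0, K011), (1.0, K100))   # both servers working
Ks101 = step(2.65, (1.0, Kb101), (1.0, Ks100))
Kb111 = step(3.7, (2.0, K021), (2.0, Kb101))
Ks111 = step(3.7, (1.0, Kb111), (2.0, Ks101))
print(round(Ks111, 4))  # 5.6125, larger than the optimal 5.3375
```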

The second example has 4 customers present, one in each of the 4 queues. The parameters of the exponential distributions and the holding costs are given in figure 2.5.2.

Figure 2.5.2. Parameters of the second example: center 1: µ̃ = (0.5, 2), c̃ = (10, 4); center 2: µ = (3, 1), c = (1.05, 3).


Straightforward calculation gives the following values and optimal decisions in this model:

K1100 = 30.683; a1100 = (2, 1);
K1101 = 35.839; a1101 = (1, 2);
K1110 = 31.491; a1110 = (2, 1);
K1111 = 37.938; a1111 = (1, 2).

The µc-rule in center 2 gives priority to queue 1. However, the optimal policy serves queue 2 first if center 1 is occupied. Hence, in this model also the optimal decision rule in center 2 depends on the state in center 1. Note that the optimal policy never idles. In the next section we will see that this is a consequence of the fact that c̃1 ≥ c1 and c̃2 ≥ c2.
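The straightforward calculation can be carried out by a small memoized recursion over the reachable states; a sketch (not part of the thesis), with the parameters of figure 2.5.2:

```python
from functools import lru_cache

# Reproduce the second counterexample (figure 2.5.2) by memoized recursion.
# State (a, b, x, y): customers in center-1 queues 1, 2 and center-2 queues 1, 2.
MU1, C1 = (0.5, 2.0), (10.0, 4.0)   # center 1: rates and holding costs
MU2, C2 = (3.0, 1.0), (1.05, 3.0)   # center 2

@lru_cache(maxsize=None)
def V(a, b, x, y):
    """Minimal expected holding cost until the system is empty (no arrivals)."""
    if (a, b, x, y) == (0, 0, 0, 0):
        return 0.0
    cost_rate = a*C1[0] + b*C1[1] + x*C2[0] + y*C2[1]
    best = float("inf")
    for a1 in [0] + [j for j in (1, 2) if (a, b)[j - 1] > 0]:       # center 1
        for a2 in [0] + [j for j in (1, 2) if (x, y)[j - 1] > 0]:   # center 2
            branches = []
            if a1 == 1: branches.append((MU1[0], (a - 1, b, x + 1, y)))
            if a1 == 2: branches.append((MU1[1], (a, b - 1, x, y + 1)))
            if a2 == 1: branches.append((MU2[0], (a, b, x - 1, y)))
            if a2 == 2: branches.append((MU2[1], (a, b, x, y - 1)))
            if not branches:
                continue
            total = sum(r for r, _ in branches)
            best = min(best, cost_rate/total
                       + sum(r/total * V(*nxt) for r, nxt in branches))
    return best

for st in [(1, 1, 0, 0), (1, 1, 0, 1), (1, 1, 1, 0), (1, 1, 1, 1)]:
    print(st, round(V(*st), 3))  # 30.683, 35.839, 31.491, 37.938
```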

2.6. Tandems of server assignment models with a single server

In this section we consider a tandem of two centers, each with m queues, a single server, and with arrivals according to an MAP at the first center. The service rate in queue j is µ̃j in the first center and µj in the second. Thus the counterexamples of the previous section are special cases of this model, with m = 2 and no arrivals. In the previous section it was shown that (for suitable cost functions) the SIP is optimal in the second center if µ1 ≤ ··· ≤ µm. No results were obtained on the optimal policy at the first center. In general the optimal policy in the first center depends on the state of the second center, even if the SIP is optimal in the second center, and is therefore hard to characterize.

In this section we first show monotonicity in both centers. In the case of linear costs, the condition for the first center is that the cost of each class must be higher than the cost of the same class in the second center; for the second center the costs must be positive. In the linear case this leads to an adaptation of the µc-rule, for which we show optimality in the heavy traffic case. With the help of calculations we investigate how this policy behaves for other values of the parameters.

We assume Σ_y λ_{xy} = γ for all x and that γ + µ̃ + µ ≤ 1, where again µ̃ = max_j µ̃_j and µ = max_j µ_j. The dynamic programming equation is:

v^{n+1}_{(x,ı,i)} = min_{l̃,l} { Σ_y λ_{xy} ( Σ_{j=1}^{m} q^j_{xy} v^n_{(y,ı+e_j,i)} + (1 − Σ_{j=1}^{m} q^j_{xy}) v^n_{(y,ı,i)} )
        + µ̃_{l̃} v^n_{(x,ı−e_{l̃},i+e_{l̃})} + (µ̃ − µ̃_{l̃}) v^n_{(x,ı,i)}
        + µ_l v^n_{(x,ı,i−e_l)} + (µ − µ_l) v^n_{(x,ı,i)} + (1 − γ − µ̃ − µ) v^n_{(x,ı,i)} }

    = Σ_y λ_{xy} ( Σ_{j=1}^{m} q^j_{xy} v^n_{(y,ı+e_j,i)} + (1 − Σ_{j=1}^{m} q^j_{xy}) v^n_{(y,ı,i)} )
        + min_j { µ̃_j v^n_{(x,ı−e_j,i+e_j)} + (µ̃ − µ̃_j) v^n_{(x,ı,i)} }          (2.6.1)
        + min_j { µ_j v^n_{(x,ı,i−e_j)} + (µ − µ_j) v^n_{(x,ı,i)} }
        + (1 − γ − µ̃ − µ) v^n_{(x,ı,i)}.

The minimization ranges over all non-empty queues. Idleness corresponds to action 0 with µ̃_0 = µ_0 = 0. Now we prove monotonicity in both centers. It is easily seen that the monotonicity in the second center can also be proven in the more general case of an MDAP.

2.6.1. Lemma. If

w_{(x,ı−e_{j1},i+e_{j1})} ≤ w_{(x,ı,i)}  for ı_{j1} > 0          (2.6.2)

and

w_{(x,ı,i−e_{j1})} ≤ w_{(x,ı,i)}  for i_{j1} > 0                 (2.6.3)

hold for the cost function v^0, then they hold for all v^n.

The proof of this lemma can be found in chapter 4. We have the following:

2.6.2. Theorem. The optimal policy at T is non-idling in both centers for all cost functions satisfying (2.6.2) and (2.6.3).
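Lemma 2.6.1 can be illustrated numerically by iterating (2.6.1). A minimal sketch (not from the thesis), with hypothetical parameters: no arrivals (γ = 0), so the state space is finite, and linear terminal costs with c̃_j ≥ c_j ≥ 0 so that v^0 satisfies (2.6.2) and (2.6.3):

```python
from itertools import product

# Value iteration for (2.6.1) with no arrivals (gamma = 0), m = 2 queues
# per center; rates and costs are hypothetical, with ct[j] >= c[j] >= 0.
mt, mu = (0.3, 0.2), (0.25, 0.15)   # center-1 (tilde) and center-2 rates
ct, c = (4.0, 2.0), (1.1, 2.0)      # linear holding costs at the horizon
Mt, M, L = max(mt), max(mu), 3      # uniformization: Mt + M <= 1

def bump(t, j, d):
    return t[:j] + (t[j] + d,) + t[j+1:]

# a[j] + b[j] <= L is preserved by the dynamics, so no truncation is needed
states = [(a, b) for a in product(range(L + 1), repeat=2)
                 for b in product(range(L + 1), repeat=2)
                 if all(a[j] + b[j] <= L for j in range(2))]
v = {(a, b): sum(ct[j]*a[j] + c[j]*b[j] for j in range(2)) for a, b in states}

for _ in range(30):
    w = {}
    for a, b in states:
        old = v[(a, b)]
        s1 = min([Mt*old] + [mt[j]*v[(bump(a, j, -1), bump(b, j, 1))]
                             + (Mt - mt[j])*old for j in range(2) if a[j] > 0])
        s2 = min([M*old] + [mu[j]*v[(a, bump(b, j, -1))]
                            + (M - mu[j])*old for j in range(2) if b[j] > 0])
        w[(a, b)] = s1 + s2 + (1 - Mt - M)*old
    v = w

# (2.6.2)/(2.6.3) propagate: a forward move or a departure never increases vn
assert all(v[(bump(a, j, -1), bump(b, j, 1))] <= v[(a, b)] + 1e-9
           for a, b in states for j in range(2) if a[j] > 0)
assert all(v[(a, bump(b, j, -1))] <= v[(a, b)] + 1e-9
           for a, b in states for j in range(2) if b[j] > 0)
print("monotonicity holds on this instance")
```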

Let us see what the inequalities mean for linear costs. Equation (2.6.3) requires that, as in the analysis in section 1.11, c_j ≥ 0 for all j. It is easily seen that (2.6.2) requires c̃_j − c_j ≥ 0 for all j. This is not surprising, as this number is the cost reduction when a class j customer moves from center 1 to center 2. This suggests a conjecture on what an optimal policy might look like: in center 1 serve the queue with the highest µ̃_j(c̃_j − c_j), and in center 2 use the µc-rule, i.e. serve the queue with the highest µ_j c_j. We call this policy the tandem µc-rule. However, this policy is not optimal due to problems when the second center is almost empty, meaning that not only cost reduction is important, but so is keeping the second server busy. Therefore we have the following lemma, in which it is assumed that there are enough customers in the second center.

2.6.3. Lemma. Assume idleness is not allowed. If

µ̃_{j1} w_{(x,ı−e_{j1},i+e_{j1})} + (µ̃ − µ̃_{j1}) w_{(x,ı,i)} ≤ µ̃_{j2} w_{(x,ı−e_{j2},i+e_{j2})} + (µ̃ − µ̃_{j2}) w_{(x,ı,i)}
    for j1 < j2, ı_{j1}, ı_{j2} > 0 and n ≤ i_ℓ          (2.6.4)

and, for some ℓ,

µ_ℓ w_{(x,ı,i−e_ℓ)} + (µ − µ_ℓ) w_{(x,ı,i)} ≤ µ_{j1} w_{(x,ı,i−e_{j1})} + (µ − µ_{j1}) w_{(x,ı,i)}
    for i_{j1} > 0 and n ≤ i_ℓ          (2.6.5)

hold for the cost function v^0, then they hold for all v^n.

The proof can be found in chapter 4. Note that because queue ℓ in center 2 is never empty, (2.6.5) is weaker than usual. Thus the SIP is optimal in the first center, if i_ℓ is large enough. In the second center, queue ℓ has highest priority, and is always served because of the number of customers in the queue. The lemma is the basis of our heavy traffic theorem. The proof is included as the theorem does not follow from uniformization. We assume that no idleness is allowed.

2.6.4. Theorem. For all T, cost functions satisfying (2.6.4) and (2.6.5) and ε > 0 there is a number N such that the tandem µc-rule in both centers is ε-optimal at T, if there are more than N customers in queue ℓ at time 0.

Proof. Let N1 denote the fixed number of customers in the first center at time 0. We compare the costs of two policies: the tandem µc-rule and the optimal policy R∗. Let the r.v. Φ^T_x(µc) and Φ^T_x(R∗) denote their costs, where x is the starting state of the whole system. We can use uniformization, which gives us the possibility of conditioning on the number of jumps. If this number is smaller than N, then the expected costs under R∗ are larger, by lemma 2.6.3. Let A_N denote the event that there are more than N jumps in [0, T]. Thus IE(Φ^T_x(µc) | A^c_N) − IE(Φ^T_x(R∗) | A^c_N) ≤ 0. Then

IEΦ^T_x(µc) − IEΦ^T_x(R∗) = ( IE(Φ^T_x(µc) | A_N) − IE(Φ^T_x(R∗) | A_N) ) IP(A_N)
    + ( IE(Φ^T_x(µc) | A^c_N) − IE(Φ^T_x(R∗) | A^c_N) ) IP(A^c_N)
  ≤ ( IE(Φ^T_x(µc) | A_N) − IE(Φ^T_x(R∗) | A_N) ) IP(A_N).

The expected number of arrivals, conditioned on A_N, is smaller than N + T/γ. Thus the expected number of customers available at T, conditioned on A_N, for both the tandem µc-rule and R∗, is smaller than N1 + 2N + T/γ. The expected costs are bounded by (N1 + 2N + T/γ)c for some c. It remains to show that there is an N such that IP(A_N)(N1 + 2N + T/γ)c ≤ ε/2. This follows easily as IP(A_N) and N IP(A_N) ↓ 0 as N → ∞.

Indeed, it is easily checked in the case of linear costs that the tandem µc-rule is optimal if µ̃_1(c̃_1 − c_1) ≥ ··· ≥ µ̃_m(c̃_m − c_m) and µ_ℓ c_ℓ ≥ µ_j c_j for all j. If idleness is allowed, we can, as usual, combine lemmas 2.6.3 and 2.6.1:

2.6.5. Theorem. For all T, cost functions satisfying (2.6.4), (2.6.5), (2.6.2) and (2.6.3) and ε > 0 there is a number N such that the tandem µc-rule in both centers is ε-optimal at T, if there are more than N customers in queue ℓ at time 0.

Now we restrict ourselves to m = 2 queues, and we assume that the tandem µc-rule has the same priority in both centers, i.e. serving queue 1 is optimal in both centers if there are enough customers. Then we do not need to assume that there are more than n customers in the first queue of the second center; instead it suffices to assume that there are, in total, more than n customers in the second center.

2.6.6. Lemma. Assume idleness is not allowed. If

µ̃_1 w_{(x,ı−e_1,i+e_1)} + (µ̃ − µ̃_1) w_{(x,ı,i)} ≤ µ̃_2 w_{(x,ı−e_2,i+e_2)} + (µ̃ − µ̃_2) w_{(x,ı,i)}
    for ı_1, ı_2 > 0 and n ≤ i_1 + i_2          (2.6.6)

and

µ_1 w_{(x,ı,i−e_1)} + (µ − µ_1) w_{(x,ı,i)} ≤ µ_2 w_{(x,ı,i−e_2)} + (µ − µ_2) w_{(x,ı,i)}
    for i_1, i_2 > 0 and n ≤ i_1 + i_2          (2.6.7)

hold for the cost function w = v^0, then they hold for all v^n.

The proof can be found in chapter 4. As in the previous case, we can show the following.

2.6.7. Theorem. For all T, cost functions satisfying (2.6.6) and (2.6.7) (and (2.6.2) and (2.6.3) if idleness is allowed) and ε > 0 there is a number N such that the tandem µc-rule in both centers is ε-optimal at T, if there are more than N customers in the second center at time 0.

An interesting question is how well the tandem µc-rule performs for other traffic than heavy traffic. We did some computations on the model of figure 2.6.1. The arrivals at both queues are Poisson with the same rate.

Figure 2.6.1. Poisson arrivals with rate λ∗ at each queue of center 1; center 1: µ̃ = (1, 2), c̃ = (4, 2); center 2: µ = (2, 1), c = (1.1, 2).
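For the parameters of figure 2.6.1 the two priority indices of the tandem µc-rule are quickly computed; a small sketch:

```python
# Priority indices of the tandem mu-c rule for figure 2.6.1's parameters.
mt, ct = (1.0, 2.0), (4.0, 2.0)   # center 1 (tilde) rates and costs
mu, c = (2.0, 1.0), (1.1, 2.0)    # center 2 rates and costs
idx1 = [m * (a - b) for m, a, b in zip(mt, ct, c)]   # center 1: mu~_j (c~_j - c_j)
idx2 = [m * b for m, b in zip(mu, c)]                # center 2: mu_j c_j
print([round(x, 2) for x in idx1])  # [2.9, 0.0] -> queue 1 first in center 1
print([round(x, 2) for x in idx2])  # [2.2, 2.0] -> queue 1 first in center 2
```

Both indices give queue 1 the highest priority, so this instance is one where the tandem µc-rule has the same priority in both centers.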

In table 2.6.1 the results for the discounted cost case are summarized. For all combinations we computed the relative difference between the costs under the optimal policy and under the µc-rule, for the starting states with each queue empty (shown above) and with 5 customers in each queue of both centers (shown below). Of course we had to make the state space finite. We did this by imposing an upper bound on the total number of customers in the system. By doing it this way, buffer influences are relatively small. Note that the average load is equal to (3/2)λ∗, and thus λ∗ = 0.6 gives an average load of 0.9.

λ∗ \ β    0.01    0.1     0.25    0.5           0.75
0.1       0, 0    0, 0    0, 0    0, 0          2.6·10^−8, 4.3·10^−6
0.2       0, 0    0, 0    0, 0    0, 0          4.0·10^−7, 4.2·10^−6
0.3       0, 0    0, 0    0, 0    0, 0          4.5·10^−7, 2.1·10^−6
0.4       0, 0    0, 0    0, 0    0, 0          0, <10^−15
0.5       0, 0    0, 0    0, 0    0, 0          0, <10^−15
0.6       0, 0    0, 0    0, 0    0, <10^−14    <10^−14, <10^−13

Table 2.6.1. Discounted costs (in each cell: empty starting state, then 5 customers per queue)

The results for the average cost case in table 2.6.2 indicate that theorem 2.6.7 does not hold for average costs. The results for high traffic intensities are less accurate (indicated with ≈) due to the finite state space, although we had a model with a maximum of 60 customers, giving more than 6·10^5 states. Note that not only the buffer influence, but also the relative differences in these models are larger than in the customer assignment models.

λ∗     R∗         µc         rel. diff.
0.1    0.886      0.889      3.4·10^−3
0.2    2.134      2.171      1.7·10^−2
0.3    4.024      4.202      4.4·10^−2
0.4    7.248      ≈ 7.939    9.5·10^−2
0.5    ≈ 14.092   ≈ 16.862   2.0·10^−1
0.6    ≈ 36.6     ≈ 48.5     3.2·10^−1

Table 2.6.2. Average costs

The results of this section are also published in [26].


2.7. Tandems of server assignment models with a single server and identical centers

Here we consider the special case µ̃ = µ. Consider recurrence relation (2.6.1). Instead of inequalities we consider equalities here.

2.7.1. Lemma. Assume idleness is not allowed. If

µ_{j1} w_{(x,ı−e_{j1},i+e_{j1})} + (µ − µ_{j1}) w_{(x,ı,i)} = µ_{j2} w_{(x,ı−e_{j2},i+e_{j2})} + (µ − µ_{j2}) w_{(x,ı,i)}
    for j1 < j2 and ı_{j1}, ı_{j2} > 0,          (2.7.1)

µ_{j1} w_{(x,ı,i−e_{j1})} + (µ − µ_{j1}) w_{(x,ı,i)} = µ_{j2} w_{(x,ı,i−e_{j2})} + (µ − µ_{j2}) w_{(x,ı,i)}
    for j1 < j2 and i_{j1}, i_{j2} > 0,          (2.7.2)

and

µ^2_{j1} w_{(x,ı−e_{j1},0)} + µ_{j1}(µ − µ_{j1}) w_{(x,ı−e_{j1},e_{j1})} + (µ − µ_{j1})µ w_{(x,ı,0)} =
µ^2_{j2} w_{(x,ı−e_{j2},0)} + µ_{j2}(µ − µ_{j2}) w_{(x,ı−e_{j2},e_{j2})} + (µ − µ_{j2})µ w_{(x,ı,0)}
    for j1 < j2 and ı_{j1}, ı_{j2} > 0          (2.7.3)

hold for the cost function v^0, then they hold for all v^n.

Equations (2.7.1) and (2.7.2) give, for allowable cost functions, the optimality of all possible policies: (2.7.1) shows that serving queue j1 or queue j2 in center 1 makes no difference. Similarly, (2.7.2) shows that serving any queue in center 2 is optimal. Equation (2.7.3) is needed in the proof of (2.7.1). The proof of the lemma can be found in chapter 4.

Now consider allowable cost functions. The only interesting ones we could find are I{|ı| + |i| = 0} and I{|ı| + |i| > 0}. This means the following.

2.7.2. Theorem. For every non-idling policy the probability that there are customers present at T is the same.

Changing the system as in section 1.12 gives:

2.7.3. Corollary. The distribution of the length of the busy period is equal for all non-idling policies.

Heuristically, we can say the following. If the policy in center 1 does not depend on center 2, the arrivals at center 2 are independent (according to an MAP) and it is clear that every policy in center 2 minimizes the makespan. Thus, a possible explanation of theorem 2.7.3 centers around the first center. If there is enough work at center 2, again the policy does not matter. However, in case the server at center 2 has little work, there are two possibilities. The first is to serve a fast customer in center 1, giving the server at center 2 work as soon as possible. However, the amount of work is small. When a slow customer is served in center 1 the situation is reversed: it takes a long time for the work to arrive, but the amount of work is large. Apparently, the two phenomena balance each other.
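This balance can be made concrete in the smallest non-trivial case: no arrivals, one customer in each queue of center 1, and hypothetical rates µ = (3, 1) in both centers. A sketch showing that the expected makespan is the same whether queue 1 or queue 2 gets priority:

```python
from fractions import Fraction as F

# Expected makespan for identical centers (mu~ = mu), two queues, no
# arrivals, one customer in each queue of center 1. Rates are hypothetical.
MU = (F(3), F(1))

def serve(state, prio):
    """Non-idling fixed-priority action; prio = queue tried first (1 or 2)."""
    a, b, x, y = state
    branches = []
    if a or b:  # center 1 serves its priority queue if non-empty
        j = prio if (a, b)[prio - 1] > 0 else 3 - prio
        nxt = (a - 1, b, x + 1, y) if j == 1 else (a, b - 1, x, y + 1)
        branches.append((MU[j - 1], nxt))
    if x or y:  # center 2 likewise
        j = prio if (x, y)[prio - 1] > 0 else 3 - prio
        nxt = (a, b, x - 1, y) if j == 1 else (a, b, x, y - 1)
        branches.append((MU[j - 1], nxt))
    return branches

def makespan(state, prio):
    """Expected time until both centers are empty under the given priority."""
    if state == (0, 0, 0, 0):
        return F(0)
    branches = serve(state, prio)
    total = sum(r for r, _ in branches)
    return 1 / total + sum(r / total * makespan(s, prio) for r, s in branches)

print(makespan((1, 1, 0, 0), 1), makespan((1, 1, 0, 0), 2))  # 29/12 29/12
```

Serving the fast customer first (priority 1) and serving the slow customer first (priority 2) give exactly the same expected makespan, 29/12, in line with theorem 2.7.2.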

As usual in single server models, we can combine equation (2.7.1) with (2.7.3), (2.6.2) and (2.6.3) when idleness is allowed. Note that, by (2.6.3), I{|ı| + |i| = 0} is not a valid cost function anymore. Therefore we have:

2.7.4. Corollary. The length of the busy period is stochastically identical under all work-conserving policies, if both centers have equal service rates and idleness is allowed.

Remark. When each queue in center one has one customer initially present, and no arrivals occur, and if the SIP is employed in both centers, then we can think of the servers as going from queue to queue instead of the customers going from center to center. Using this equivalence (which was pointed out to me by Rhonda Righter, and which can be found in Pinedo & Schrage [55, p. 190]), and using corollary 2.7.4, we see that reordering of the queues in a system where the service rate depends only on the server has no effect on the makespan. This interchangeability of ·|M|1 queues is well known; see Weber [77] for references. Note that the equivalence is only valid under certain restrictions on the model. Similarly, the interchangeability is proven for a more general model. Therefore the results on both models are of independent interest.

It appears that lemma 2.7.1 cannot be generalized easily to inequalities, although we conjecture that a similar lemma with equalities replaced by inequalities holds. We give some numerical results supporting this conjecture, with m = 2, Poisson arrivals and c̃ = c. By scaling we can fix µ2 = 1 and c1 = 1, giving the parameters as in figure 2.7.1.

Figure 2.7.1. Poisson arrivals with rate λ∗ at each queue of center 1; both centers have µ = (µ∗, 1) and c = (1, c∗).

With value iteration we computed the average costs for the optimal policy, the policy that gives priority to queue 1 in both centers (R1), and the policy that gives priority to queue 2 in both centers (R2). It appeared that for low values of λ∗ the differences are most significant. Because of the computational method we had to introduce a number B equal to the maximum number of customers in the system. For B = 25 there was no influence from the buffer (when we took λ∗ small), meaning that B = 30, 35 and 40 gave the same results. Taking µ∗ = 2 appeared to be satisfactory. We took λ∗ = 0.25, giving an average workload of 0.375. In figure 2.7.2 the values of the different policies can be seen for various values of c∗.

For c∗ ≤ 2, R1 appeared to be optimal. If c∗ ≥ 2.66, then R2 is optimal. For 2 < c∗ ≤ 2.66 the optimal policy is neither R1 nor R2. The number 2 can easily be explained: below 2, R1 is both faster and costs less. The value 2.66 is explained as follows. When there are no arrivals, the total costs can be computed. It appears that, for general µ∗, the optimal action in (1, 1, 0, 0) is queue 1 if c∗ ≤ 2µ∗²/(1 + µ∗) and queue 2 if c∗ ≥ 2µ∗²/(1 + µ∗). For µ∗ = 2 this number is indeed equal to 8/3. Computations show that 2µ∗²/(1 + µ∗) is the turn-over point for various µ∗.

Figure 2.7.2. Average costs of the optimal policy, R1 and R2 as functions of c∗ (1.75 ≤ c∗ ≤ 3.25).

This indicates that (1, 1, 0, 0), the only state with 2 customers in which the action is non-trivial, plays an important role in this model.


Chapter 3

Models with Dependent Markov Decision Arrival Processes

3.1. Dependent Markov Decision Arrival Processes

In some models we can generalize the arrival process even more, by letting the arrival probabilities depend on the state of the queues.

3.1.1. Definition. (Dependent Markov Decision Arrival Process) Let Λ be the countable state space of a Markov decision process with transition intensities λ_{xay} with x, y ∈ Λ and a ∈ A(x), the set of actions in x. When this process moves from x to y, while action a was chosen and the state of the queues is i, then with probability q^k_{xay;i} an arrival in class 1 ≤ k ≤ m occurs, and with probability q^{m+k}_{xay} an event with server 1 ≤ k ≤ s occurs. There are sets Λ^s_1, ..., Λ^s_s such that server k is available if and only if x ∈ Λ^s_k, and sets Λ^a_{1n}, ..., Λ^a_{mn}, n ∈ IN, such that if x ∈ Λ^a_{kn} then there have been n or more arrivals of class k. We call the quadruple (Λ, A, λ, q) a Dependent Markov Decision Arrival Process (DMDAP).

Naturally, we could also let the transition rates and the probabilities of server events depend on the state of the queues i. Because we do not study models where this is the case, we did not allow this type of dependency.

How the arrival probabilities are allowed to depend on i will be specified for each model. Note that if there is no dependency, we have an MDAP. It is clear that conditions on the DMDAP must be given; to give an example where the optimality result does not hold, assume that in the customer assignment model of section 1.2 the arrival probabilities are higher in more balanced states than in unbalanced states. Then assigning to the longer queue might be more favorable.

Page 62: Stochastic Scheduling and Dynamic Programming …koole/publications/thesis/thesis.pdf2 Introduction The previous results are only interesting in continuous time, due to the way of

58 Models with Dependent Markov Decision Arrival Processes

3.2. Asymmetric customer assignment model

In this section we deal with a customer assignment model with asymmetric service rates. In sections 3.3 and 3.4 we will see that the results of sections 1.2 and 1.3 are special cases of the result proved here. The present result gives only a partial characterization of the optimal policy in the general model, even if the arrivals are non-controlled. We discuss some computational results at the end of the section.

The model is as follows. Customers arrive according to a DMDAP, all in one class (write q instead of q^1). There are no server vacations. Arriving customers have to be assigned to one of the non-full queues, where B gives the buffer sizes. In state (x, i), a customer in queue j is served with rate µ_{ji}. We end the description by giving conditions on the arrival probabilities and the service rates.

To make the notation shorter (although it abuses notational conventions a bit), let i∗ be the permutation of i with i_{j1} and i_{j2} switched, that is, i∗_j = i_j if j ≠ j1, j2, i∗_{j1} = i_{j2} and i∗_{j2} = i_{j1}. Assume all vectors considered are componentwise smaller than B. Now we formulate the conditions on the arrival probabilities and the departure rates.

The q must satisfy the following conditions:

q_{xay;i+e_{j1}} ≤ q_{xay;i+e_{j2}}  if i_{j1} ≤ i_{j2} and j1 < j2          (3.2.1)

q_{xay;i} ≤ q_{xay;i∗}  if i_{j1} > i_{j2} and j1 < j2          (3.2.2)

An interesting example which satisfies the conditions is a DMDAP with Λ = {1}, A(1) = {1}, λ_{111} = λ and q_{111;i} = (N − |i|)/N, the well known finite source model. In fact, if q_{xay;i} only depends on |i| (and x, a and y), every other dependency is allowed.
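That the finite source model satisfies (3.2.1) and (3.2.2) can be checked exhaustively on a small instance; a sketch with hypothetical m, B and N:

```python
from itertools import product

# Check (3.2.1)-(3.2.2) for the finite-source example q_i = (N - |i|)/N
# (m = 3 queues, buffer B = 2 per queue, N = 10 sources; all hypothetical).
m, B, N = 3, 2, 10

def q(i):
    return (N - sum(i)) / N

def add(i, j):           # i + e_j
    return i[:j] + (i[j] + 1,) + i[j+1:]

def swap(i, j1, j2):     # the permutation i* with i_{j1} and i_{j2} switched
    t = list(i); t[j1], t[j2] = t[j2], t[j1]; return tuple(t)

ok = True
for i in product(range(B + 1), repeat=m):
    for j1 in range(m):
        for j2 in range(j1 + 1, m):
            if i[j1] <= i[j2] and i[j1] < B and i[j2] < B:
                ok &= q(add(i, j1)) <= q(add(i, j2))   # (3.2.1)
            if i[j1] > i[j2]:
                ok &= q(i) <= q(swap(i, j1, j2))       # (3.2.2)
print(ok)  # True: q depends on i only through |i|
```

Both conditions hold with equality here, which reflects the remark that any dependency through |i| alone is allowed.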

The µ satisfy the following:

µ_{ji} = 0  if i_j = 0,
µ_{j,i+e_{j1}} ≥ µ_{j,i+e_{j2}}  if j ≠ j1, j2, i_{j1} ≤ i_{j2} and j1 < j2,
µ_{j1,i+e_{j1}} + µ_{j2,i+e_{j1}} ≥ µ_{j1,i+e_{j2}} + µ_{j2,i+e_{j2}}  if i_{j1} ≤ i_{j2} and j1 < j2,
µ_{ji} ≥ µ_{j,i+e_{j1}}  if j ≠ j1,
µ_{ji} ≥ µ_{ji∗}  if j ≠ j1, j2, i_{j1} > i_{j2} and j1 < j2,
µ_{j1,i} ≥ µ_{j2,i∗}  if i_{j1} > i_{j2} and j1 < j2,
µ_{j1,i} + µ_{j2,i} ≥ µ_{j1,i∗} + µ_{j2,i∗}  if i_{j1} > i_{j2} and j1 < j2.

We also assume that µ_{ji} ≤ µ for some constant µ. An interesting example is µ_{ji} = min{i_j, s_j}µ_j, with s1 ≥ ··· ≥ sm and µ1 ≥ ··· ≥ µm. Thus lower numbered queues have more and faster working servers.

Another example is the following. Assume that customers which are not served require a certain amount of attention, which decreases the service rate of the customer being served. The amount of attention needed depends on the queue. This results in µ_{ji} = (1 − p_j i_j)µ, with p1 ≤ ··· ≤ pm. We assume that B is such that p_j B_j ≤ 1 for all j. Note that the service rate at queue j is decreasing in i_j, an at first sight counterintuitive fact. This µ_{ji} satisfies the conditions.

Again we assume that γ + mµ ≤ 1, where ∑_y λ_{xay} = γ for all x and a.

The dynamic programming equation is:

v^{n+1}_{(x,i)} = c_{(x,i)} + min_a ∑_y λ_{xay} ( q_{xay;i} min_j { v^n_{(y,(i+e_j)∧B)} } + (1 − q_{xay;i}) v^n_{(y,i)} )

    + ∑_{j=1}^m µ_{ji} v^n_{(x,i−e_j)} + (1 − ∑_{j=1}^m µ_{ji} − γ) v^n_{(x,i)}.   (3.2.3)
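To make (3.2.3) concrete, here is a small value-iteration sketch of our own, specialised (purely as an illustration) to a one-state arrival process: an arrival occurs with probability γ per period and is always accepted (q = 1), with m = 2 queues and the Schur convex cost c_{(x,i)} = |i|. All numbers are hypothetical; the minimizing index in the arrival term is the assignment decision.

```python
import itertools

m, B = 2, (3, 3)
gamma = 0.2
rates = (0.5, 0.3)                                  # mu_1 >= mu_2
mu = lambda j, i: rates[j] if i[j] > 0 else 0.0     # mu_{ji}, zero on an empty queue
cost = lambda i: sum(i)                             # c_{(x,i)} = |i|

states = list(itertools.product(range(B[0] + 1), range(B[1] + 1)))
v = {i: float(cost(i)) for i in states}             # v^0 = c

def step(v):
    """One application of (3.2.3); returns v^{n+1} and the minimizing queue."""
    vn, policy = {}, {}
    for i in states:
        free = [j for j in range(m) if i[j] < B[j]]
        if free:
            arr, j_opt = min((v[i[:j] + (i[j] + 1,) + i[j+1:]], j) for j in free)
        else:
            arr, j_opt = v[i], None                 # full system: arrival is lost
        dep = sum(mu(j, i) * v[i[:j] + (i[j] - 1,) + i[j+1:]]
                  for j in range(m) if i[j] > 0)
        rest = 1 - gamma - sum(mu(j, i) for j in range(m))
        vn[i] = cost(i) + gamma * arr + dep + rest * v[i]
        policy[i] = j_opt
    return vn, policy

for _ in range(20):
    v, policy = step(v)

print(policy[(0, 2)])   # the shorter, faster queue 0, as (3.2.4) predicts
```

Running the sketch, the computed assignment in state (0, 2) is queue 0, and the value function satisfies the inequality (3.2.4) between v(1, 2) and v(0, 3).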

3.2.1. Lemma. If

w_{(x,i+e_{j1})} ≤ w_{(x,i+e_{j2})} for i_{j1} ≤ i_{j2} and j1 < j2,   (3.2.4)

w_{(x,i)} ≤ w_{(x,i+e_{j1})}   (3.2.5)

and

w_{(x,i)} ≤ w_{(x,i*)} for i_{j1} > i_{j2} and j1 < j2   (3.2.6)

hold for the cost functions c and v^0, then they hold for all v^n.

The proof of lemma 3.2.1 can be found in chapter 4. Recall that we assumed that all vectors considered are smaller than B. Equation (3.2.4) gives a partial characterization of the optimal policy. It says that an arriving customer should be assigned to queue j1 instead of queue j2 if there are fewer customers in queue j1 and if j1 < j2. Usually, this does not specify the optimal policy completely. Therefore we called the characterization partial. Note that sending the customer to queue j1 gives a higher total service rate, and in the case µ_{ji} = µ_j with µ_1 ≥ · · · ≥ µ_m, the customer is sent to the faster queue. Therefore we call such a policy a Shorter Faster Queue Policy (SFQP). Equation (3.2.6) is needed to prove equation (3.2.4). Equation (3.2.5) is the well known monotonicity. Using corollary 5.3.4, we have the following.

3.2.2. Theorem. For all T, an SFQP minimizes the costs at T (from 0 to T) for all cost functions satisfying (3.2.4) to (3.2.6).

A special case of this result is proven in [24].

The conditions (3.2.4) to (3.2.6) are weaker than (1.2.2) to (1.2.4), meaning that all Schur convex cost functions are allowable. It is easy to give non-Schur convex functions that are allowable (for example, v^0_{(x,i)} = ∑_{j=1}^m c_j i_j with 0 ≤ c_1 < · · · < c_m), meaning that the class of allowable functions is strictly bigger. In the present case however, we were not able to give a complete characterization of all allowable cost functions, although we have a conjecture, stated in appendix C. Note that, for reasons explained in section 2.2, there are no stochastic results.


For the class of non-symmetric additive cost functions we have a sufficient condition. We consider cost functions c which only depend on i, because the dependence on x can be arbitrary. We consider c_{(x,i)} = f_1(i_1) + · · · + f_m(i_m). Define ∆f_j(i) = f_j(i + 1) − f_j(i). Then the following conditions are sufficient: f_j increasing, ∆f_1(i) ≤ · · · ≤ ∆f_m(i) for all i, and m − 1 of the m functions convex. Since of any two functions one is convex, either ∆f_{j1}(i_{j1}) ≤ ∆f_{j1}(i_{j2}) ≤ ∆f_{j2}(i_{j2}) or ∆f_{j1}(i_{j1}) ≤ ∆f_{j2}(i_{j1}) ≤ ∆f_{j2}(i_{j2}) holds if j1 < j2 and i_{j1} ≤ i_{j2}, and (3.2.4) follows. Equation (3.2.5) is immediate, and (3.2.6) follows because ∆f_{j1}(i) ≤ ∆f_{j2}(i) for all i, and thus f_{j1}(i_{j1}) − f_{j1}(i_{j2}) ≤ f_{j2}(i_{j1}) − f_{j2}(i_{j2}).

Even in the case of an MAP, the optimal policy is not myopic. Consider the following simple model with Poisson arrivals, m = 2, µ_2 ≪ µ_1, B = (∞,∞) and v^0_{(i_1,i_2)} = i_1 + i_2. Now consider v^2_{(1,0)} and v^3_{(1,0)}. No matter how small µ_2 is, if n = 2 action 2 is optimal because, if there is an arrival, there is at most 1 service completion before the planning horizon. If n = 3 however, it is possible that queue 1 is served twice before n = 0, and we can choose the parameters such that action 1 is optimal.
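This horizon effect can be verified numerically. The sketch below (our own; the parameters are hypothetical, chosen with µ_2 much smaller than µ_1) runs backward induction on a truncated grid and confirms that the arrival in state (1, 0) goes to queue 2 at horizon n = 2 but to queue 1 at n = 3.

```python
import itertools

# m = 2, single server per queue, Poisson arrivals with rate lam,
# terminal cost v0(i) = i1 + i2, no running cost.
lam, mu1, mu2 = 0.1, 0.4, 0.01
B = 8                                   # truncation, far from the states of interest
states = list(itertools.product(range(B + 1), repeat=2))
v = {i: float(i[0] + i[1]) for i in states}   # v^0

def step(v):
    vn = {}
    for (i1, i2) in states:
        succ = [v[(i1 + 1, i2)] if i1 < B else v[(i1, i2)],
                v[(i1, i2 + 1)] if i2 < B else v[(i1, i2)]]
        arr = min(succ)                 # assign the arrival optimally
        dep = (mu1 if i1 > 0 else 0) * v[(max(i1 - 1, 0), i2)] \
            + (mu2 if i2 > 0 else 0) * v[(i1, max(i2 - 1, 0))]
        rest = 1 - lam - (mu1 if i1 > 0 else 0) - (mu2 if i2 > 0 else 0)
        vn[(i1, i2)] = lam * arr + dep + rest * v[(i1, i2)]
    return vn

v1 = step(v)
v2 = step(v1)
# Horizon n = 2: compare the v^1 values of the two successor states of (1,0)
print(v1[(1, 1)] < v1[(2, 0)])   # True: action 2 is optimal
# Horizon n = 3: with these rates the comparison reverses
print(v2[(2, 0)] < v2[(1, 1)])   # True: action 1 is optimal
```

Working out the recursion by hand gives v^2_{(1,1)} − v^2_{(2,0)} = µ_1² − µ_2(2 − λ − µ_2), which is positive for these parameters, in agreement with the output.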

Also in the continuous-time case (and again independent arrivals), there is no unique optimal policy. However, for the model with Poisson arrivals and a single server at each queue (i.e., µ_{ji} = min{i_j, 1}µ_j) attempts have been made to describe the optimal policy in more detail. Theoretically it has been shown by Hajek [19] that the optimal policy is monotone, meaning that there is an increasing switching curve, and it has been shown by Katehakis & Levine [32] that for an arrival rate which is sufficiently small the policy that assigns to the queue with smallest expected workload is optimal.

In the papers Van Moorsel & De Vries [47], Nobel & Tijms [54], Houck [30] and Shenker & Weinreb [65] computational results are obtained, mostly for m = 2 and B_1 = B_2. Van Moorsel & De Vries [47] and Nobel & Tijms [54] use successive approximation, in the other two papers simulation is used. Nearly optimal policies are proposed, for example the policy that assigns each arriving customer to the queue where its expected delay is minimal. It is clear that successive approximation is a better method than simulation, because with simulation a policy cannot be compared with the optimal one, and because simulation is computationally less attractive. (Note that this contradicts a remark by Shenker & Weinreb [65], where it is stated that, using methods from Markov decision theory, it is difficult to find the optimal policy "even in the smallest non-trivial case of just two non-identical servers". In the previous chapter we had no problems finding optimal policies in models with 4 queues, with an accuracy which is hard to obtain with simulation.) All policies studied in the cited papers are SFQP's. Nobel & Tijms [54] also consider the case where there is more than one server in each queue, i.e. the case µ_{ji} = min{i_j, s_j}µ_j.


3.3. Symmetric customer assignment models

In this section we first analyze the symmetric model using lemma 3.2.1. Then we generalize this model by introducing batch arrivals. By a limiting argument we obtain results for models with workloads. Unfortunately we cannot allow finite buffers in this model. In the third model of this section we allow non-routable arrivals at the queues, and we introduce an extra movable server, as Menich & Serfozo [46] did. All parameters are allowed to depend on the whole state of the system, with conditions as general as possible. If we take an arrival process without actions and with one state, we have the model of Menich & Serfozo [46].

To start with the first model, we modify the conditions of the previous section as follows. Assume again that all vectors considered are componentwise smaller than B.

The q satisfy the following:

q_{xay;i+e_{j1}} ≤ q_{xay;i+e_{j2}}  if i_{j1} ≤ i_{j2}   (3.3.1)

q_{xay;i} = q_{xay;i*}  for all j1, j2

Recall that i* agrees with i except for j1 and j2 being interchanged. The last condition is called symmetry. Note that the finite source model satisfies these conditions.

Also µ is made symmetric:

µ_{ji} = 0 if i_j = 0

µ_{j,i+e_{j1}} ≥ µ_{j,i+e_{j2}} if j ≠ j1, j2 and i_{j1} ≤ i_{j2}

µ_{j1,i+e_{j1}} + µ_{j2,i+e_{j1}} ≥ µ_{j1,i+e_{j2}} + µ_{j2,i+e_{j2}} if i_{j1} ≤ i_{j2}

µ_{ji} ≥ µ_{j,i+e_{j1}} if j ≠ j1

µ_{ji} = µ_{j,i*} if j ≠ j1, j2

µ_{j1,i} = µ_{j2,i*} and µ_{j2,i} = µ_{j1,i*}

The symmetric versions of the examples of the previous section, µ_{ji} = min{i_j, s}µ and µ_{ji} = (1 − p i_j)µ for suitable B, are allowed here.

The present model is general enough to capture that of Johri [31]. There Poisson arrivals are taken, together with the following assumptions on the service rates: µ_{ji} = µ_{jī} if i_j = ī_j, µ_{ji} ≤ µ_{j,i+e_j} and µ_{j,i+2e_j} − µ_{j,i+e_j} ≤ µ_{j,i+e_j} − µ_{ji}, i.e. the service rate in a queue depends only on the number of customers in that queue, and is both increasing and concave. For example, the model with multiple servers at each queue conforms to this description.

The dynamic programming equation remains the same as in the previous section. The conditions are stronger than those in section 3.2, giving the validity of lemma 3.2.1 for the model studied here.

We can obtain the optimality result for the symmetric case from lemma 3.2.1. Let Π be a permutation matrix. Assume that v^0 and c are symmetric in i, i.e. v^0_{(x,i)} = v^0_{(x,iΠ)} and c_{(x,i)} = c_{(x,iΠ)}.


3.3.1. Lemma. Assume we have vectors B and B̄ = BΠ, a permutation of B. Let v^n and v̄^n be the value functions for identical models, except for the buffer sizes, being B and B̄. Then v^n_{(x,i)} = v̄^n_{(x,ī)} with ī = iΠ for all n.

As the arrival and departure rates are symmetric, the inductive proof is trivial.

Now consider equation (3.2.4). By exchanging queue j1 and queue j2 in the ordering we have the reversed inequality. By doing the same with (3.2.6) we have rewritten the set of inequalities, giving the following.

3.3.2. Corollary. If

w_{(x,i+e_{j1})} ≤ w_{(x,i+e_{j2})} for i_{j1} ≤ i_{j2},   (3.3.2)

w_{(x,i)} ≤ w_{(x,i+e_j)}   (3.3.3)

and

w_{(x,i)} = w_{(x,i*)}   (3.3.4)

hold for the cost functions c and v0, then they hold for all vn.

The equations (3.3.2) to (3.3.4) are the same as (1.2.2) to (1.2.4) and (2.2.2) to (2.2.4). Because the MAP and the MDAP are both special cases of the DMDAP, and because µ_{ji} = min{i_j, 1}µ satisfies the conditions, lemmas 1.2.1 and 2.2.1 follow.

3.3.3. Theorem. For all T, an SQP minimizes the costs at T (from 0 to T) for all cost functions satisfying (3.3.2) to (3.3.4).

Again, all Schur convex functions are allowable cost functions.

In the second model of this section we want to generalize the results of section 1.8 to arrivals according to a DMDAP. If we want to do this straightforwardly, then we would have to generalize the uniformization results of chapter 5 to include the model here, for example generalizing the countable state space to IR^m. Instead of this, we show that the workload model is the limiting case of a queue length model with batch arrivals, for which the SQP is optimal. Assume that each batch consists with probability β_k of k customers. It is essential that the whole batch is assigned to the same queue. If we want to model batch arrivals where each member of a batch can be assigned to another queue, we can simply use the model without batch arrivals and regard the model with batch arrivals as a limiting case. We consider the simple case of each queue having a single server. Because the size of the batch can be arbitrarily large, blocking can always occur in the case of finite buffers, therefore we do not model them. The DMDAP has the same conditions as in the previous model.


The dynamic programming equation is:

v^{n+1}_{(x,i)} = c_{(x,i)} + min_a ∑_y λ_{xay} ( q_{xay;i} min_j { ∑_k β_k v^n_{(y,i+ke_j)} } + (1 − q_{xay;i}) v^n_{(y,i)} )

    + ∑_{j=1}^m µ v^n_{(x,(i−e_j)^+)} + (1 − mµ − γ) v^n_{(x,i)}.

3.3.4. Lemma. If

∑_k β_k w_{(x,i+ke_{j1})} ≤ ∑_k β_k w_{(x,i+ke_{j2})} for i_{j1} ≤ i_{j2}   (3.3.5)

w_{(x,i)} ≤ w_{(x,i+e_j)}   (3.3.6)

and

w_{(x,i)} = w_{(x,i*)}   (3.3.7)

hold for the cost functions c and v^0, then they hold for all v^n.

The proof can be found in chapter 4. Note that equation (3.3.5) is not valid without the summation, for the same reason that (1.8.2) was not valid without the integration. Using corollary 5.3.4, we get the following.

3.3.5. Theorem. For all T, an SQP minimizes the costs at T (from 0 to T) for all cost functions satisfying (3.3.5) to (3.3.7).

Again, all Schur convex cost functions satisfy the conditions.

By lemma A.2 we can approximate any service time distribution arbitrarily closely by phase-type distributions. Assigning to the shortest queue is, in the limit, equivalent to assigning to the queue with the shortest workload. This gives the following result.

3.3.6. Theorem. For all T, an SWP minimizes the costs at T (from 0 to T) for all Schur convex cost functions.

Although the arrival process is a DMDAP with which we can model a finite source, we cannot model a finite source in the workload model, because we do not know the actual number of customers in the system. Note however that we already proved in theorem 2.4.2 that the SWP is optimal in a finite source model.

Now we look at the model that has additional non-routable arrival streams, and an extra movable processor. The combination of finite buffers and additional arrivals is not allowed as the SQP might not be optimal anymore. This can be seen from the following example: take m = 2, B = (3,∞). In state (2, 1) it may be optimal to assign an arriving customer in the assignable stream to queue 1, because then future non-routable arrivals at queue 1 are blocked.

The extra arrival streams can easily be modeled with the DMDAP: arrivals in class 0 are routable, customers arriving in class k, 1 ≤ k ≤ m, join queue k. The arrival probabilities of class 0 are allowed to depend on the assignment action, i.e. we have arrival probabilities q^0_{xay;ij}, where j is the assignment. Assume that there are numbers q_{xay} such that q^0_{xay;ij} ≤ q_{xay} for all i and j. We let the service probabilities depend also on x. Denote the service rate of the movable processor with µ̄_{ji;x}, if it serves queue j in state i. Assume also that µ̄_{ji;x} ≤ µ̄ for all i, j and x. The dynamic programming equation is:

v^{n+1}_{(x,i)} = c_{(x,i)} + min_a ∑_y λ_{xay} ( min_j { q^0_{xay;ij} v^n_{(y,i+e_j)} + (q_{xay} − q^0_{xay;ij}) v^n_{(y,i)} }

    + ∑_{j=1}^m q^j_{xay;i} v^n_{(y,i+e_j)} + (1 − ∑_{j=1}^m q^j_{xay;i} − q_{xay}) v^n_{(y,i)} )

    + min_j { µ̄_{ji;x} v^n_{(x,i−e_j)} + (µ̄ − µ̄_{ji;x}) v^n_{(x,i)} }   (3.3.8)

    + ∑_{j=1}^m µ_{ji;x} v^n_{(x,i−e_j)} + (1 − γ − µ̄ − ∑_{j=1}^m µ_{ji;x}) v^n_{(x,i)}.

Now we give the conditions. Recall that i* is the vector equal to i, but with queues j1 and j2 interchanged. First we have symmetry of all parameters involved (called interchangeability in Menich & Serfozo [46]):

q^0_{xay;ij} = q^0_{xay;i*j} if j ≠ j1, j2, and q^0_{xay;ij1} = q^0_{xay;i*j2}

q^j_{xay;i} = q^j_{xay;i*} if j ≠ j1, j2, and q^{j1}_{xay;i} = q^{j2}_{xay;i*}

µ_{ji;x} = µ_{ji*;x} if j ≠ j1, j2, and µ_{j1,i;x} = µ_{j2,i*;x}

µ̄_{ji;x} = µ̄_{ji*;x} if j ≠ j1, j2, and µ̄_{j1,i;x} = µ̄_{j2,i*;x}

We also assume the following on q^0 (with, as in section 1.5, (j) the index of the jth smallest component of i):

q^0_{xay;i,(1)} ≤ q^0_{xay;ij}   (3.3.9)

q^0_{xay;i+e_{j1},(1)} ≤ q^0_{xay;i+e_{j2},(1)} if i_{j1} ≤ i_{j2}   (3.3.10)

Assume l1 < l2. The conditions on the non-routable arrival probabilities are:

∑_{j=k}^m q^{(j)}_{xay;i+e_{(l1)}} ≤ ∑_{j=k}^m q^{(j)}_{xay;i+e_{(l2)}} for k = 1, . . . , l1, l2 + 1, . . . , m   (3.3.11)

∑_{j=k}^m q^{(j)}_{xay;i} ≤ ∑_{j=k}^m q^{(j)}_{xay;i+e_{(l1)}} for k = l1 + 1, . . . , m   (3.3.12)

Concerning the servers, we assume, besides the interchangeability:

µ_{ji;x} = µ̄_{ji;x} = 0 if i_j = 0

The assumptions on the service rate of the routable server are the reverse of those on q^0:

µ̄_{(m),i;x} ≥ µ̄_{ji;x}   (3.3.13)

µ̄_{(m),i+e_{j1};x} ≥ µ̄_{(m),i+e_{j2};x} if i_{j1} ≤ i_{j2}

The assumptions on the fixed servers are much like those on the non-routable arrivals:

∑_{j=k}^m µ_{(j),i+e_{(l1)};x} ≥ ∑_{j=k}^m µ_{(j),i+e_{(l2)};x} for k = 1, . . . , l1, l2 + 1, . . . , m

∑_{j=k}^m µ_{(j),i;x} ≥ ∑_{j=k}^m µ_{(j),i+e_{(l1)};x} for k = l1 + 1, . . . , m

Note that the conditions are the same as in Menich & Serfozo [46]. Now we can formulate our inductive result.

3.3.7. Lemma. If

w_{(x,i+e_{j1})} ≤ w_{(x,i+e_{j2})} for i_{j1} ≤ i_{j2},   (3.3.14)

w_{(x,i)} ≤ w_{(x,i+e_{j1})},   (3.3.15)

and

w_{(x,i)} = w_{(x,i*)}   (3.3.16)

hold for the cost functions c and v0, then they hold for all vn.

3.3.8. Theorem. For all T, an SQP minimizes the costs at T (from 0 to T) for all cost functions satisfying (3.3.14) to (3.3.16).

By appendix C the class of allowable cost functions is the class of weak Schur convex functions.

Remark. As we saw, the arrival probabilities and service rates are allowed to depend both on the state of the arrival process and the state of the queues. Therefore the term environment instead of arrival process would be more appropriate. Typically, in an environment the arrivals are according to a Markov Modulated Poisson Process. Here however, we kept the arrivals occurring at the transitions of the environment, in order to maintain the generality of the arrivals. In most other models studied in this thesis we can allow the service rates to depend on the state of the arrival process. Because the generalization is only minor and because of notational simplicity we refrained from doing so.


3.4. Customer assignment models without waiting room

The results of section 3.2 can also be used to obtain results in the model without waiting room, much like the results on the symmetric model were obtained in the previous section. We have the same conditions on the arrival probabilities and the service rates, but because B = (1, . . . , 1) the inequalities of lemma 3.2.1 simplify to

w_{(x,i+e_{j1})} ≤ w_{(x,i+e_{j2})} for j1 < j2 and i_{j1} = i_{j2} = 0   (3.4.1)

and

w_{(x,i)} ≤ w_{(x,i+e_{j1})} for i_{j1} = 0.   (3.4.2)

These are the same inequalities as (1.3.2) and (1.3.3). Lemma 1.3.1 follows because µ_{ji} = min{i_j, 1}µ_j satisfies the conditions of lemma 3.2.1.

3.4.1. Theorem. For all T, an FQP minimizes the costs at T (from 0 to T) for all cost functions satisfying (3.4.1) and (3.4.2).

The class of allowable cost functions consists of the functions that respect the partial sum ordering, as introduced and discussed in appendix C.

In Sobel [66] a model is studied in which customers of m classes arrive according to independent Poisson processes. Besides that, the model is similar to the model studied here. The analysis of that paper appears to be erroneous. (The basic theorem 1 does not hold as the derivation of B ≥ 1 is incorrect.) Sobel & Srivastava [67] wrote a revision. The model is essentially a single class model. The optimal policy does not depend on the class, and the only place the class of a customer plays a role is in the cost function. However, no example is given of a cost function that indeed depends on the customer classes. Here we prefer to study a more complex model in which rejection is allowed.

Consider m exponential servers with decreasing service rates µ_1 ≥ · · · ≥ µ_m and arrivals according to an MAP. (At the end of this section we show that the results cannot be generalized to (D)MDAPs.) Arrivals occur in m classes. When a class k customer arrives, it can either be rejected or sent to one of the free servers. The service times depend only on the server, not on the customer class. When a class k customer is rejected, blocking costs b_k are incurred, b_1 ≥ · · · ≥ b_m ≥ 0. (It can be shown that if a class has negative blocking costs, it will always be blocked.) At the servers, an action has to be chosen for each class of customers. We denote with a_k the free server to which an arrival in class k is assigned, with action 0 corresponding to blocking. The dynamic programming equation becomes (assume e_0 = 0):

v^{n+1}_{(x,i)} = ∑_y λ_{xy} ( ∑_{k=1}^m q^k_{xy} min_{a_k} { I{a_k = 0} b_k + v^n_{(y,i+e_{a_k})} } + (1 − ∑_{k=1}^m q^k_{xy}) v^n_{(y,i)} )

    + ∑_{j=1}^m µ_j v^n_{(x,(i−e_j)^+)} + (1 − γ − ∑_{j=1}^m µ_j) v^n_{(x,i)}.

3.4.2. Lemma. If v^0_{(x,i)} = 0 for all x and i, then the following equations hold for all n:

v^n_{(x,i+e_{j1})} ≤ v^n_{(x,i+e_{j2})} for j1 < j2 and i_{j1} = i_{j2} = 0   (3.4.3)

v^n_{(x,i)} ≤ v^n_{(x,i+e_{j1})} for i_{j1} = 0   (3.4.4)

v^n_{(x,i+e_{j1})} ≤ b_1 + v^n_{(x,i)} for i_{j1} = 0   (3.4.5)

v^n_{(x,i+e_{j1})} − v^n_{(x,i)} ≤ v^n_{(x,i+e_{j2}+e_{j1})} − v^n_{(x,i+e_{j2})}   (3.4.6)
    for j1 = min{j | (i + e_{j2})_j = 0} and i_{j2} = 0

The proof can be found in chapter 4. Let us consider the consequences of the lemma. As can be deduced from the dynamic programming equation, when considering assigning an arbitrary customer to one of the free servers, we have to compare v^n_{(x,i+e_j)} for various j. By (3.4.3), v^n_{(x,i+e_j)} is minimal for the j corresponding to the fastest free server. Equation (3.4.4) is the well known monotonicity. (Because we did not use b_k ≥ 0 in its proof, it follows from the monotonicity that blocking is always optimal if b_k < 0.) Equation (3.4.5) is concerned with the assignment of customers with the highest blocking costs. It says that assigning such a customer to an arbitrary server is better than blocking, i.e. a class 1 customer should never be blocked unless the system is full. Equation (3.4.6) says that when a class k customer is blocked in state (x, i), i.e. v^n_{(x,i+e_{j1})} − b_k − v^n_{(x,i)} ≥ 0, it is also blocked when there are more customers present (and the state of the MAP is the same). On the other hand, when a customer is admitted, it is admitted as well in states with fewer customers. Another monotonicity property is the following. If v^n_{(x,i+e_{j1})} − b_{k1} − v^n_{(x,i)} ≥ 0, then also v^n_{(x,i+e_{j1})} − b_{k2} − v^n_{(x,i)} ≥ 0, if k1 < k2. Thus, when blocking is favorable for class k1, then blocking is also favorable for class k2. Similarly, when customers of a certain class are admitted, then all customer classes with higher blocking costs are admitted as well. This gives the following.

Theorem. For all T, an optimal policy minimizing the blocking costs from 0 to T exists and has the following properties:

If a customer is admitted it should be sent to the fastest free server;
Class 1 customers are never blocked, unless the system is full;
If a class k customer is blocked in (x, i1), it is blocked in (x, i1 + i2);
If a class k customer is admitted in (x, i1 + i2), it is admitted in (x, i1);
If a class k customer is blocked in (x, i), all classes with indices higher than k are blocked as well in (x, i);
If a class k customer is admitted in (x, i), all classes with indices lower than k are admitted as well in (x, i).
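The structure described in the theorem is easy to observe numerically. The following sketch of our own (a one-state MAP, i.e. Poisson arrivals; all numbers are illustrative) iterates the dynamic programming equation for m = 3 servers and two customer classes, records the minimizing action per state and class (with the value m encoding rejection), and recovers the fastest-free-server property.

```python
import itertools

mu = (0.4, 0.2, 0.1)            # mu_1 >= mu_2 >= mu_3
lam = (0.15, 0.15)              # class arrival rates
b = (5.0, 1.0)                  # blocking costs, b_1 >= b_2 >= 0
m = len(mu)
states = list(itertools.product((0, 1), repeat=m))   # servers free (0) or busy (1)
v = {i: 0.0 for i in states}    # v^0 = 0

def step(v):
    vn, act = {}, {}
    for i in states:
        free = [j for j in range(m) if i[j] == 0]
        arr = 0.0
        for k in range(len(lam)):
            opts = [(v[i[:j] + (1,) + i[j+1:]], j) for j in free]
            opts.append((b[k] + v[i], m))            # action m encodes rejection
            val, a = min(opts)                       # ties go to the lowest index
            arr += lam[k] * val
            act[(i, k)] = a
        dep = sum(mu[j] * v[i[:j] + (0,) + i[j+1:]] for j in range(m) if i[j])
        rest = 1 - sum(lam) - sum(mu[j] for j in range(m) if i[j])
        vn[i] = arr + dep + rest * v[i]
    return vn, act

for _ in range(40):
    v, act = step(v)

print(act[((0, 0, 1), 0)])      # class 1, servers 1 and 2 free: fastest server 0
print(act[((1, 0, 0), 0)])      # class 1, servers 2 and 3 free: fastest free is 1
```

In both printed states the class 1 customer is admitted to the fastest free server, as (3.4.3) and (3.4.5) predict.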


This result comes from [35].

If the arrival stream is an MDAP, then (3.4.3), (3.4.4) and (3.4.5) still hold, but (3.4.6) fails. We demonstrate this with the following example. Take m = 2 and µ_1 = µ_2 = 1. The arrival process is as follows. There is an arrival in class 2 at t = 0, after which the arrival process moves to one of 2 states. If action 1 is chosen, the arrival process moves to state 1, where customers of class 1 arrive according to a Poisson process with rate λ_1. There are no class 2 arrivals. If action 2 is chosen, the arrival process moves to state 2, in which there are no class 1 arrivals, but where there are Poisson arrivals in class 2 with rate λ_2. This arrival process can easily be approximated by MDAP's. It appears that, for suitable values of b_1, b_2, λ_1, λ_2 and T, it is optimal to block the class 2 customer and choose action 1 if the system is empty, but to admit the class 2 customer and choose action 2 if there is one customer available. This means that (3.4.6) does not hold. Using the uniformization method the different strategies can easily be compared. Equation (3.4.6) fails for example for b_1 = 10, b_2 = 1, λ_1 = 1, λ_2 = 3.5 and T = 5. It is straightforward to give an intuitive explanation.

3.5. Customer assignment models with rejections

Here we want to generalize the model of section 1.6 to asymmetric servers. Section 1.6 deals with a symmetric customer assignment model, for which it is shown that the SQP maximizes the number of departures from the system. Thus, we analyze the model of section 3.2, but with a different objective function. We will see however, that the conditions on the arrival probabilities and the service rates need to be different. The model we study has as dynamic programming equation:

v^{n+1}_{(x,i,k)} = c_{(x,i,k)} + min_a ∑_y λ_{xay} ( q_{xay;i} min_j { v^n_{(y,(i+e_j)∧B,k)} } + (1 − q_{xay;i}) v^n_{(y,i,k)} )   (3.5.1)

    + ∑_{j=1}^m µ_{ji} ( δ_{i_j} v^n_{(x,i−e_j,k+1)} + (1 − δ_{i_j}) v^n_{(x,i,k)} ) + (1 − γ − ∑_{j=1}^m µ_{ji}) v^n_{(x,i,k)}.

The extra component k of the state space counts the number of departures. As in the server assignment models with a single server we study both the case in which rejection is allowed and the case in which it is not allowed.

We start with the model in which rejection is not allowed, meaning that the minimization ranges over all j for which i_j < B_j. To make the notation shorter, let i* again be the permutation of i with i_{j1} and i_{j2} interchanged. Assume all vectors considered are componentwise smaller than B. The conditions for q are:

q_{xay;i} = q_{xay;ī} if |i| = |ī|

q_{xay;i} ≥ q_{xay;i+e_{j1}}

Although the conditions are more restrictive than in section 3.2, the finite source model still satisfies them.

For µ, we take the same conditions as in section 3.2. Now we have:

3.5.1. Lemma. If

w_{(x,i+e_{j1},k)} ≤ w_{(x,i+e_{j2},k)} for i_{j1} ≤ i_{j2} and j1 < j2,   (3.5.2)

w_{(x,i,k+1)} ≤ w_{(x,i+e_{j1},k)},   (3.5.3)

w_{(x,i,k+1)} ≤ w_{(x,i,k)}   (3.5.4)

and

w_{(x,i,k)} ≤ w_{(x,i*,k)} for i_{j1} > i_{j2} and j1 < j2   (3.5.5)

hold for the cost functions c and v^0, then they hold for all v^n.

The proof can be found in chapter 4. The only meaningful cost function is again v^0_{(x,i,k)} = −k (and c_{(x,i,k)} = 0).

3.5.2. Theorem. In the case of a DMDAP, an SFQP maximizes the expected number of departed customers between 0 and T, if rejection is not allowed.

If we want to allow rejections, we have to assume that q_{xay;i} is independent of i, i.e. the arrival process is an MDAP, and we cannot model a finite source. This is intuitively clear: if the system is relatively full it might be better to reject a customer in order to make a better choice when the customer comes again.

Concerning the service rates, we need the extra condition µ_{j,i+e_{j1}} ≥ µ_{ji} for all j. As we already assumed the reverse for j ≠ j1, it amounts to:

µ_{ji} = µ_{jī} if i_j = ī_j

and

µ_{j1,i+e_{j1}} ≥ µ_{j1,i}.

This is not surprising. On the one hand, if we assign to the shortest queue, customers leave the system fast. To agree with this, service rates must be high in states with few customers. On the other hand, states with few customers can be reached by rejecting customers, and to agree with this, service rates should be high in states with many customers. This reasoning intuitively explains the fact that the service rates must be constant.

For completeness, we give the other conditions as well.

µ_{ji} = 0 if i_j = 0

µ_{j1,i+e_{j1}} + µ_{j2,i+e_{j1}} ≥ µ_{j1,i+e_{j2}} + µ_{j2,i+e_{j2}} if i_{j1} ≤ i_{j2} and j1 < j2

µ_{j1,i} ≥ µ_{j2,i*} if i_{j1} > i_{j2} and j1 < j2

µ_{j1,i} + µ_{j2,i} ≥ µ_{j1,i*} + µ_{j2,i*} if i_{j1} > i_{j2} and j1 < j2


3.5.3. Lemma. If

w_{(x,i+e_{j1},k)} ≤ w_{(x,i+e_{j2},k)} for i_{j1} ≤ i_{j2} and j1 < j2,   (3.5.6)

w_{(x,i,k+1)} ≤ w_{(x,i+e_{j1},k)},   (3.5.7)

w_{(x,i+e_{j1},k)} ≤ w_{(x,i,k)}   (3.5.8)

and

w_{(x,i,k)} ≤ w_{(x,i*,k)} for i_{j1} > i_{j2} and j1 < j2   (3.5.9)

hold for the cost functions c and v^0, then they hold for all v^n.

The proof can be found in chapter 4. Equation (3.5.8) shows that there exists an optimal policy that does not reject customers. Because of the arbitrary buffers we need (3.5.8) in the proof of (3.5.7). Note that lemma 1.6.1 is a special case of lemma 3.5.3, using the equivalent of lemma 3.3.1.

3.5.4. Theorem. In the case of an MDAP, an SFQP maximizes the expected number of departed customers between 0 and T, if rejection is allowed.

Also the model with B = e gives a myopic optimal policy, as did the first model studied in the previous section. Because we did not handle this model in chapter 1, we do it here. As is easily seen, equations (3.5.6) to (3.5.9) simplify to

w_{(x,i+e_{j1},k)} ≤ w_{(x,i+e_{j2},k)} if j1 < j2,

w_{(x,i,k+1)} ≤ w_{(x,i+e_{j1},k)}

and

w_{(x,i+e_{j1},k)} ≤ w_{(x,i,k)}.

Then we have, in the case of an MAP:

3.5.5. Theorem. For all T, the FQP stochastically maximizes the number of departed customers between 0 and T.

Remark. We end this section by considering the differences between the models of this section. In the second model, in which rejection is allowed, we need q_{xay;i+e_{j1}} ≥ q_{xay;i} to prove (3.5.8), and q_{xay;i} ≥ q_{xay;i+e_{j1}} to prove (3.5.7). Thus q_{xay;i} must be independent of i. In the first model, in which rejection is not allowed, we do not have (3.5.8), and thus we only assume q_{xay;i} ≥ q_{xay;i+e_{j1}}.


3.6. Server assignment model with multiple servers

In this section we generalize the result for the server assignment model with multiple servers of section 2.5 to servers which are partially available, and to arrival processes that stop producing customers if i = 0. Let us first describe the model of section 2.5 again.

Customers arrive in m different classes. The service times of customers in class j are exponentially distributed with rate µ_j, µ_1 ≤ · · · ≤ µ_m. In the case of arrivals according to an MDAP and multiple servers it is shown in section 2.5 that the SIP is optimal.

Here we have arrivals according to a DMDAP, with the following condition. There are numbers q^k_{xay} such that q^k_{xay;0} ≤ q^k_{xay} and q^k_{xay;i} = q^k_{xay} if i ≠ 0. If we take q^k_{xay;0} = 0 the system stays empty once it becomes empty. This way we can study the length of the busy period for the model with an MDAP, even in the case that there are arrivals (in the system with q^k_{xay;0} = q^k_{xay}) after the first emptiness.

Perhaps more interesting is the following generalization. In the model of section 1.12 we modeled server vacations, i.e. a server is either working at full speed or not working at all. Here we introduce more possibilities, by assuming that server k is working at speed p_k(x), 0 ≤ p_k(x) ≤ 1. Note that this can also be modeled with the arrival process. The dynamic programming equation is:

v^{n+1}_{(x,i)} = c_{(x,i)} + min_a ∑_y λ_{xay} ( ∑_{j=1}^m q^j_{xay;i} v^n_{(y,i+e_j)} + (1 − ∑_{j=1}^m q^j_{xay;i}) v^n_{(y,i)} )

    + min_{l1,...,ls} ∑_{k=1}^s p_k(x) ( µ_{l_k} v^n_{(x,i−e_{l_k})} + (µ − µ_{l_k}) v^n_{(x,i)} )

    + (1 − γ − ∑_{k=1}^s p_k(x)µ) v^n_{(x,i)},

with l_k the queue to which server k is assigned, and the second minimization taken over all allowable actions (with possibly l_k = 0, meaning that server k idles). We have again:

3.6.1. Lemma. If

µ_{j1} w_{(x,i−e_{j1})} + (µ − µ_{j1}) w_{(x,i)} ≤ µ_{j2} w_{(x,i−e_{j2})} + (µ − µ_{j2}) w_{(x,i)}   (3.6.1)
    for i_{j1}, i_{j2} > 0 and j1 < j2

and

w_{(x,i−e_{j1})} ≤ w_{(x,i)} for i_{j1} > 0   (3.6.2)

hold for the cost functions c and v^0, then they hold for all v^n.

The proof can be found in chapter 4. There it is also shown that the policy that assigns the servers with the highest speed to the customers in low numbered classes is optimal. We call such a policy a Fastest Server Smallest Index Policy (FSSIP). The first to obtain this type of optimality result (without arrivals) were Weiss & Pinedo [80]. As a special case they showed that the FSSIP minimizes the expected makespan.

3.6.2. Theorem. An FSSIP minimizes the costs at T (from 0 to T) for all cost functions satisfying (3.6.1) and (3.6.2).

See section 1.12 for a discussion of the allowable cost functions.

Remark. In the proof of (3.6.2) we used neither (3.6.1) nor µ_1 ≤ · · · ≤ µ_m, meaning that lemma 3.6.1 not only gives the optimality of LEPT, but also the monotonicity in the server assignment models of the sections 1.12, 2.5 and 2.6.

3.7. Server assignment model with a single server and a finite source

In this chapter we introduced the DMDAP. The main motivation for doing so was to model a finite source in the customer assignment models. In the server assignment models this cannot be done in general, due to the multiple customer classes. In this section we handle a special model with m customer classes, all of finite source type, and with a single server. The service parameters are as usual; λ_j is the rate at which each customer of class j enters queue j, and N_j is the total number of customers of class j. We show that, for certain cost functions, the SIP is optimal if λ_1 ≤ · · · ≤ λ_m. The case λ_1 ≤ · · · ≤ λ_m and μ_1 ≥ · · · ≥ μ_m is studied in Righter [56]. We formulate the dynamic programming equation. The direct costs are not modeled because the optimal policy appears to be myopic, and thus we can use corollary 5.2.2 or 5.2.3.

    v^{n+1}_i = Σ_{j=1}^m (N_j − i_j) λ_j v^n_{i+e_j} + min_l { μ_l v^n_{i−e_l} + (μ − μ_l) v^n_i } + (1 − Σ_{j=1}^m (N_j − i_j) λ_j − μ) v^n_i.

As contrasted with the other single server models, we need monotonicity here to prove the structure of the optimal policy, giving the following lemma.

3.7.1. Lemma. If

    μ_{j_1} w_{i−e_{j_1}} + (μ − μ_{j_1}) w_i ≤ μ_{j_2} w_{i−e_{j_2}} + (μ − μ_{j_2}) w_i   for i_{j_1}, i_{j_2} > 0 and j_1 < j_2     (3.7.1)

and

    w_{i−e_{j_1}} ≤ w_i   for i_{j_1} > 0     (3.7.2)

hold for the cost function v^0, then they hold for all v^n.

3.7.2. Theorem. If λ_1 ≤ · · · ≤ λ_m, then the SIP minimizes the costs at T (from 0 to T) for all cost functions satisfying (3.7.1) and (3.7.2).

Equations (3.7.1) and (3.7.2) are the same as equations (1.11.1) and (1.11.2); thus the same cost functions are allowable.
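Since the state space is finite here, theorem 3.7.2 is easy to check by value iteration. The sketch below uses illustrative parameters of our own choosing, with λ_1 ≤ λ_2 and linear costs for which μ_1 c_1 ≥ μ_2 c_2, so that (3.7.1) and (3.7.2) hold for v^0; it verifies at every stage that serving the smallest-indexed non-empty queue attains the minimum.

```python
import itertools

# Illustrative data (assumed, not from the thesis): two finite-source
# customer classes and a single server, uniformized to a discrete chain.
lam = [0.1, 0.2]     # lam_1 <= lam_2, the condition of theorem 3.7.2
mus = [0.3, 0.2]     # with c = (1, 1) this gives mu_1 c_1 >= mu_2 c_2,
cost = [1.0, 1.0]    # so (3.7.1) and (3.7.2) hold for v^0
N = [2, 2]           # population sizes N_j
mu = max(mus)        # uniformization constant for the server

states = list(itertools.product(range(N[0] + 1), range(N[1] + 1)))
v = {i: cost[0] * i[0] + cost[1] * i[1] for i in states}   # v^0 linear

def serve(i, l, v):
    """One-step server term for serving queue l (None = idle)."""
    if l is None:
        return mu * v[i]
    dn = list(i)
    dn[l] -= 1
    return mus[l] * v[tuple(dn)] + (mu - mus[l]) * v[i]

for n in range(25):
    new = {}
    for i in states:
        arr = 0.0
        for j in range(2):
            if i[j] < N[j]:                     # a class-j failure occurs
                up = (i[0] + (j == 0), i[1] + (j == 1))
                arr += (N[j] - i[j]) * lam[j] * v[up]
        acts = [None] + [l for l in range(2) if i[l] > 0]
        dep = min(serve(i, l, v) for l in acts)
        slack = 1 - sum((N[j] - i[j]) * lam[j] for j in range(2)) - mu
        new[i] = arr + dep + slack * v[i]
        # SIP: whenever queue 1 is non-empty, serving it attains the min.
        if i[0] > 0:
            assert serve(i, 0, v) <= dep + 1e-12
    v = new
```

The inner assertion is the statement of lemma 3.7.1 in operational form: (3.7.1) makes serving class 1 preferable to serving class 2, and (3.7.2) makes serving preferable to idling, at every stage n.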

An interesting interpretation for linear costs is given in Chakka & Mitrani [9]. In this model the customers are the servers of a multi-server queue. They are subject to failure (with rates λ_j), and are repaired by a single repairman (with rates μ_j), who is the server in our model. If we assume that a class j server has a service rate c_j, then minimizing Σ_j i_j c_j corresponds to maximizing the total service capacity.

The condition λ_1 ≤ · · · ≤ λ_m is essential; to illustrate this, we give an example with linear costs where no list policy is optimal. (A list policy is a policy in which all customers are ordered (the list) and served according to this order.) For our example we choose a model with three customers and the following parameters: λ_1 = 2.00, λ_2 = 1.00, λ_3 = 0.10, μ_1 = 3.15, μ_2 = 2.00, μ_3 = 1.00, c_1 = 1.00, c_2 = 1.00 and c_3 = 0.05. We see that μ_1 c_1 ≥ μ_2 c_2 ≥ μ_3 c_3 and λ_1 ≥ λ_2 ≥ λ_3, making this model fall outside the scope of theorem 3.7.2. For each of the 24 different policies we computed the average holding costs. For the six list policies the values are given below. Each list policy is characterized by its list; thus policy a, b, c indicates the policy which gives highest priority to customer a and lowest priority to c, and its value is denoted by v(a, b, c). The values are as follows: v(1, 2, 3) = 0.8803, v(1, 3, 2) = 0.9338, v(2, 1, 3) = 0.8806, v(2, 3, 1) = 0.9285, v(3, 1, 2) = 0.9569, and v(3, 2, 1) = 0.9559. Thus (1, 2, 3) is the best list policy. However, let us consider the policy that gives lowest priority to the third customer, that serves customer 1 in state (1, 1, 0), but serves customer 2 in state (1, 1, 1). Computations show that this policy is optimal, with value 0.8800. This shows that there need not be an optimal list policy.
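The computations behind this example can be reproduced by building, for each of the 24 stationary policies, the generator of the induced continuous-time Markov chain and solving for its stationary distribution. The sketch below follows the parameters given above; the policy encoding and the small linear solver are our own illustrative scaffolding.

```python
import itertools

# Parameters of the counterexample above.
lam = [2.00, 1.00, 0.10]     # failure rates
mus = [3.15, 2.00, 1.00]     # repair rates
cost = [1.00, 1.00, 0.05]    # holding cost per failed customer

states = list(itertools.product((0, 1), repeat=3))  # i_j = 1: j has failed

def solve(A, b):
    """Small Gauss-Jordan solver with partial pivoting (A square)."""
    n = len(A)
    M = [row[:] + [b[k]] for k, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col and M[r][col] != 0.0:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * p for a, p in zip(M[r], M[col])]
    return [M[r][n] / M[r][r] for r in range(n)]

def average_cost(policy):
    """policy[s] is the customer being repaired in state s (None if idle)."""
    n = len(states)
    idx = {s: k for k, s in enumerate(states)}
    Q = [[0.0] * n for _ in range(n)]
    for s in states:
        k = idx[s]
        for j in range(3):                        # failures
            if s[j] == 0:
                t = list(s); t[j] = 1
                Q[k][idx[tuple(t)]] += lam[j]; Q[k][k] -= lam[j]
        if any(s):                                # one repair in progress
            j = policy[s]
            t = list(s); t[j] = 0
            Q[k][idx[tuple(t)]] += mus[j]; Q[k][k] -= mus[j]
    # Stationary distribution: pi Q = 0, one equation replaced by sum = 1.
    A = [[Q[r][c] for r in range(n)] for c in range(n)]
    A[0] = [1.0] * n
    pi = solve(A, [1.0] + [0.0] * (n - 1))
    return sum(p * sum(cj * sj for cj, sj in zip(cost, s))
               for p, s in zip(pi, states))

# One repair choice per state with failed customers: 24 policies in all.
choice_sets = [[j for j in range(3) if s[j]] or [None] for s in states]
policies = [dict(zip(states, c)) for c in itertools.product(*choice_sets)]
assert len(policies) == 24

def list_policy(order):
    return {s: next((j for j in order if s[j]), None) for s in states}

best_overall = min(average_cost(p) for p in policies)
list_vals = [average_cost(list_policy(o))
             for o in itertools.permutations(range(3))]
# Every list policy is itself a stationary policy, so:
assert best_overall <= min(list_vals) + 1e-9
```

If the chain is modeled as intended, `best_overall` comes out slightly below `min(list_vals)`, matching the values 0.8800 versus 0.8803 quoted above.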

We could leave it at that, but let us try to gain some more insight into the model by giving a heuristic explanation for this phenomenon. Customer 3 plays a role of little importance. It fails seldom (as λ_3 = 0.10), and if it has failed, it has the lowest repair priority (as c_3 = 0.05). The parameters are chosen such that if only the customers 1 and 2 are available for repair, then customer 1 gets served first. However, if customer 3 is also at the queue, the time it takes to repair customers 1 and 2 plays a more important role, as this determines the instant at which the repair of customer 3 begins. To start repair early on customer 3, service should start with customer 2 (cf. theorem 3.7.2, as λ_2 < λ_1). The parameters for customer 3 are chosen such that the availability of customers changes the order in which customers 1 and 2 should be served.

In Koole & Vrijenhoek [40] these results are also derived, and additional references are given. Furthermore, we derive policies which are asymptotically optimal. For the case that the server idles most of the time, the μc-rule is optimal; for the heavy traffic case the SIP is optimal if μ_1 c_1/λ_1 ≥ · · · ≥ μ_m c_m/λ_m.


Chapter 4

Proofs of dynamic programming results

4.1. Proofs of chapter 1

Proof of lemma 1.4.1. By induction on n. The case n = 0 is the condition on the cost function. Now assume that (1.4.2) to (1.4.4) hold up to n. First we determine the optimal action at v^{n+1} in (i, j). Consider the dynamic programming equation (1.4.1). If i = j then both terms in the minimization are equal by (1.4.4), symmetry. Thus, again by symmetry, it is enough to consider i < j.

It is easily seen that (i − 1)^+ ≤ (j − 1)^+, (i − 1)^+ ≤ j, i ≤ (j − 1)^+ and i ≤ j, from which it follows, by (1.4.2), that the 4 terms in v^{n+1}_{(i,j)} corresponding to assigning the arriving customer to queue 1 are one by one smaller than the terms corresponding to assigning to queue 2. This gives us that sending an arriving customer to the first queue is better, even if we knew where departures would take place.

Note that combining (1.4.3), monotonicity in the first queue, and symmetry gives v^n_{(i,j)} ≤ v^n_{(i,j+1)}, monotonicity in the second queue, and that (1.4.2) and symmetry give v^n_{(i,j+1)} ≤ v^n_{(i+1,j)} if i ≥ j.

Now we prove (1.4.2) for v^{n+1}. The case i = j follows from symmetry. Thus assume i < j. Because of this, assignment to the first queue is not only optimal in (i, j), but also in states like (i, j − 1). We have

    λμ^2 v^n_{(i+1,j−1)} ≤ λμ^2 v^n_{((i−1)^+ +1,j)}

by (1.4.2) if i > 0 and by monotonicity in the second queue if i = 0;

    λμ(1 − μ) v^n_{(i+1,j)} ≤ λμ(1 − μ) v^n_{((i−1)^+ +1,j+1)}

and

    λ(1 − μ)μ v^n_{(i+2,j−1)} ≤ λ(1 − μ)μ v^n_{(i+1,j)}

by (1.4.2) if i < j − 1; in case i = j − 1 we have

    λμ(1 − μ) v^n_{(i+1,i+1)} + λ(1 − μ)μ v^n_{(i+2,i)} ≤ λμ(1 − μ) v^n_{((i−1)^+ +1,i+2)} + λ(1 − μ)μ v^n_{(i+1,i+1)}

by symmetry if i > 0 and monotonicity if i = 0; we have

    λ(1 − μ)^2 v^n_{(i+2,j)} ≤ λ(1 − μ)^2 v^n_{(i+1,j+1)}

by (1.4.2);

    (1 − λ)μ^2 v^n_{(i,j−1)} ≤ (1 − λ)μ^2 v^n_{((i−1)^+,j)}

and

    (1 − λ)μ(1 − μ) v^n_{(i,j)} ≤ (1 − λ)μ(1 − μ) v^n_{((i−1)^+,j+1)}

by (1.4.2) if i > 0 and monotonicity if i = 0;

    (1 − λ)(1 − μ)μ v^n_{(i+1,j−1)} ≤ (1 − λ)(1 − μ)μ v^n_{(i,j)}

and

    (1 − λ)(1 − μ)^2 v^n_{(i+1,j)} ≤ (1 − λ)(1 − μ)^2 v^n_{(i,j+1)}

by (1.4.2). Summing all terms gives v^{n+1}_{(i+1,j)} ≤ v^{n+1}_{(i,j+1)}.

We continue with (1.4.3). If i + 1 < j, then also i < j and assignment to the first queue is optimal in both (i, j) and (i + 1, j); if i + 1 > j, then assignment to the second queue is optimal in both (i, j) and (i + 1, j). Choose action 1 in (i + 1, j) if i + 1 = j. Then the optimal action in (i, j) is the same as in (i + 1, j). Showing v^{n+1}_{(i,j)} ≤ v^{n+1}_{(i+1,j)} can now be done by using (1.4.3) on all corresponding terms, unless i = 0; then we have equality in all terms corresponding to departures in queue 1.

The last equation, v^{n+1}_{(i,j)} = v^{n+1}_{(j,i)}, follows easily.
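The induction in this proof can be mirrored numerically. The sketch below runs the recursion as reconstructed from the terms above (two independent Bernoulli(μ) departures are applied before an arriving customer, present with probability λ, is routed; parameters, grid size, and truncation are our own illustrative choices), starting from the linear cost v^0(i, j) = i + j, and checks (1.4.2)-(1.4.4) on states far enough from the truncation boundary that the boundary cannot influence them within the number of iterations used.

```python
# Discrete-time recursion for the symmetric two-queue assignment model,
# reconstructed from the terms in the proof above.  Illustrative values.
lam, mu = 0.4, 0.3
N, STEPS, CHK = 30, 10, 15       # grid size, iterations, checked region

v = [[float(i + j) for j in range(N)] for i in range(N)]  # v^0(i,j) = i+j

def step(v):
    new = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            best = None
            for a in (0, 1):                     # route to queue 1 or 2
                tot = 0.0
                for d1 in (0, 1):                # departures first
                    for d2 in (0, 1):
                        p = (mu if d1 else 1 - mu) * (mu if d2 else 1 - mu)
                        i2, j2 = max(i - d1, 0), max(j - d2, 0)
                        ia = min(i2 + (a == 0), N - 1)   # then the arrival
                        ja = min(j2 + (a == 1), N - 1)
                        tot += p * (lam * v[ia][ja] + (1 - lam) * v[i2][j2])
                best = tot if best is None else min(best, tot)
            new[i][j] = best
    return new

for _ in range(STEPS):
    v = step(v)

# States in the checked region cannot feel the truncation at N within
# STEPS iterations, so (1.4.2)-(1.4.4) should hold there exactly.
eps = 1e-9
for i in range(CHK):
    for j in range(CHK):
        assert abs(v[i][j] - v[j][i]) < eps            # (1.4.4), symmetry
        assert v[i][j] <= v[i + 1][j] + eps            # (1.4.3), monotone
        if i <= j:
            assert v[i + 1][j] <= v[i][j + 1] + eps    # (1.4.2)
```

The event probabilities λμ^2, λμ(1 − μ), …, (1 − λ)(1 − μ)^2 generated by the two nested departure loops are exactly the coefficients appearing in the term-by-term comparison of the proof.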

Proof of lemma 1.5.1. By induction. We will check (1.5.1) to (1.5.3) for all possible realizations of U_{n+1}. From the induction hypothesis we have that all relations given below hold for each realization of U_1, …, U_n. We start with (1.5.1). Assume i_{j_1} < i_{j_2}. The case i_{j_1} = i_{j_2} is a special case of (1.5.3). If U_{n+1} ∈ [Σ_{z<y} λ_{xz}, Σ_{z<y} λ_{xz} + λ_{xy} q_{xy}) an arrival occurs. Let j* be the shortest queue in i + e_{j_2}, i.e. the optimal action in (y, i + e_{j_2}). Then, if j* ≠ j_1,

    V^{n+1}_{(x,i+e_{j_1})} = min_j V^n_{(y,i+e_{j_1}+e_j)}
      ≤ V^n_{(y,i+e_{j_1}+e_{j*})}
      ≤ V^n_{(y,i+e_{j_2}+e_{j*})}        (by (1.5.1))
      = min_j V^n_{(y,i+e_{j_2}+e_j)} = V^{n+1}_{(x,i+e_{j_2})}.

If j* = j_1 then (we omit the terms with V^{n+1})

    min_j V^n_{(y,i+e_{j_1}+e_j)} ≤ V^n_{(y,i+e_{j_1}+e_{j_2})} = min_j V^n_{(y,i+e_{j_2}+e_j)}.

If U_{n+1} ∈ [Σ_{z<y} λ_{xz} + λ_{xy} q_{xy}, Σ_{z≤y} λ_{xz}), then trivially

    V^n_{(y,i+e_{j_1})} ≤ V^n_{(y,i+e_{j_2})}.

We took (j) < (j + 1) if i_{(j)} = i_{(j+1)}. However, by (1.5.3), the ordering in case of ties can be taken arbitrarily. Now we can make sure that queue j_1 is served in i + e_{j_1} and i + e_{j_2} for the same value of U_{n+1}, by taking queue j_1 first in i + e_{j_1} amongst the queues with i_{j_1} + 1 customers and by taking queue j_1 last in i + e_{j_2} amongst the queues with i_{j_1} customers. Similarly, we can assure that queue j_2 is served for the same values of U_{n+1} in i + e_{j_1} and i + e_{j_2}. Now, if U_{n+1} ∈ [γ + (j − 1)μ, γ + jμ) with (j) ≠ j_1 or j_2,

    V^n_{(x,(i+e_{j_1}−e_{(j)})^+)} ≤ V^n_{(x,(i+e_{j_2}−e_{(j)})^+)}.        (by (1.5.1))

If (j) = j_1 then, if i_{j_1} > 0,

    V^n_{(x,i)} ≤ V^n_{(x,i+e_{j_2}−e_{j_1})}        (by (1.5.1))

and, if i_{j_1} = 0,

    V^n_{(x,i)} ≤ V^n_{(x,i+e_{j_2})}.        (by (1.5.2))

If (j) = j_2 then

    V^n_{(x,i+e_{j_1}−e_{j_2})} ≤ V^n_{(x,i)}.        (by (1.5.1))

If U_{n+1} ≥ γ + mμ, then

    V^n_{(x,i+e_{j_1})} ≤ V^n_{(x,i+e_{j_2})}.        (by (1.5.1))

We continue with (1.5.2). If U_{n+1} ∈ [Σ_{z<y} λ_{xz}, Σ_{z<y} λ_{xz} + λ_{xy} q_{xy}) and i + e_{j_1} = B, we have

    min_j V^n_{(y,i+e_j)} = V^n_{(y,i+e_{j_1})};

if i + e_{j_1} ≠ B then

    min_j V^n_{(y,i+e_j)} ≤ V^n_{(y,i+e_{j_1})} ≤ min_j V^n_{(y,i+e_{j_1}+e_j)}.        (by (1.5.2))

The cases U_{n+1} ∈ [Σ_{z<y} λ_{xz} + λ_{xy} q_{xy}, Σ_{z≤y} λ_{xz}) and U_{n+1} ∈ [γ + mμ, 1] follow easily. With respect to the departures we can again reorder the ties such that all queues in i and i + e_{j_1} are served for the same U_{n+1}. Now look at the departures at queue j. If j ≠ j_1,

    V^n_{(x,(i−e_j)^+)} ≤ V^n_{(x,(i+e_{j_1}−e_j)^+)}.        (by (1.5.2))

If j = j_1 and i_{j_1} > 0, then

    V^n_{(x,i−e_{j_1})} ≤ V^n_{(x,i)}        (by (1.5.2))

and if j = j_1 and i_{j_1} = 0 then

    V^n_{(x,i)} ≤ V^n_{(x,i)}.

As for (1.5.3), the only non-trivial eventuality is when a customer arrives, because the buffers might give problems. However, it is easily checked that the smallest non-full queues in i and i* have the same number of customers.


Proof of lemma 1.7.1. By induction. Assume the lemma holds up to n. We start with (1.7.2). Assume i_{j_1} < i_{j_2}. The case i_{j_1} = i_{j_2} can be done with (1.7.4). Let j* be the optimal action in (y, i + e_{j_2}) at stage n + 1. Then j* ≠ j_2. If j* = j_1, we have

    q_{xy} min_j v^n_{(y,i+e_{j_1}+e_j∧B)} + (1 − q_{xy}) v^n_{(y,i+e_{j_1})}
      ≤ q_{xy} v^n_{(y,i+e_{j_1}+e_{j_2})} + (1 − q_{xy}) v^n_{(y,i+e_{j_1})}
      ≤ q_{xy} v^n_{(y,i+e_{j_1}+e_{j_2})} + (1 − q_{xy}) v^n_{(y,i+e_{j_2})}        (by (1.7.2))
      = q_{xy} min_j v^n_{(y,i+e_{j_2}+e_j∧B)} + (1 − q_{xy}) v^n_{(y,i+e_{j_2})}.

If j* ≠ j_1 we have

    q_{xy} min_j v^n_{(y,i+e_{j_1}+e_j∧B)} + (1 − q_{xy}) v^n_{(y,i+e_{j_1})}
      ≤ q_{xy} v^n_{(y,i+e_{j_1}+e_{j*})} + (1 − q_{xy}) v^n_{(y,i+e_{j_1})}
      ≤ q_{xy} v^n_{(y,i+e_{j_2}+e_{j*})} + (1 − q_{xy}) v^n_{(y,i+e_{j_2})}        (by (1.7.2))
      = q_{xy} min_j v^n_{(y,i+e_{j_2}+e_j∧B)} + (1 − q_{xy}) v^n_{(y,i+e_{j_2})}.

Now it follows that

    Σ_y λ_{xy} ( q_{xy} min_j v^n_{(y,i+e_{j_1}+e_j∧B)} + (1 − q_{xy}) v^n_{(y,i+e_{j_1})} )
      ≤ Σ_y λ_{xy} ( q_{xy} min_j v^n_{(y,i+e_{j_2}+e_j∧B)} + (1 − q_{xy}) v^n_{(y,i+e_{j_2})} ).

Concerning the departures, note that each customer in (x, i + e_{j_1}) and (x, i + e_{j_2}) is served. We have

    μ v^n_{(x,i+e_{j_1}−e_j)} ≤ μ v^n_{(x,i+e_{j_2}−e_j)},        (by (1.7.2))

if i_j > 0. Summing this for all customers in state i gives all terms, except those corresponding to the extra customers in queues j_1 and j_2. However, their term is easy:

    μ v^n_{(x,i+e_{j_1}−e_{j_1})} = μ v^n_{(x,i+e_{j_2}−e_{j_2})}.

The dummy term follows easily from (1.7.2). Summing the terms gives

    v^{n+1}_{(x,i+e_{j_1})} ≤ v^{n+1}_{(x,i+e_{j_2})}.

We continue with (1.7.3). Let j* be the optimal action in (y, i). If j* ≠ j_1, then j* is also optimal in (y, i + e_{j_1}), and

    q_{xy} min_j v^n_{(y,i+e_{j_1}+e_j∧B)} + (1 − q_{xy}) v^n_{(y,i+e_{j_1})}
      = q_{xy} v^n_{(y,i+e_{j_1}+e_{j*})} + (1 − q_{xy}) v^n_{(y,i+e_{j_1})}
      ≤ q_{xy} v^n_{(y,i+e_{j*})} + (1 − q_{xy}) v^n_{(y,i)}        (by (1.7.3))
      = q_{xy} min_j v^n_{(y,i+e_j∧B)} + (1 − q_{xy}) v^n_{(y,i)}.

If j* = j_1 we have

    q_{xy} min_j v^n_{(y,i+e_{j_1}+e_j∧B)} + (1 − q_{xy}) v^n_{(y,i+e_{j_1})}
      ≤ v^n_{(y,i+e_{j_1})}
      ≤ q_{xy} v^n_{(y,i+e_{j_1})} + (1 − q_{xy}) v^n_{(y,i)}        (by (1.7.3))
      = q_{xy} min_j v^n_{(y,i+e_j∧B)} + (1 − q_{xy}) v^n_{(y,i)}.

Note that this derivation also holds in case i + e_{j_1} = B. Now we have

    Σ_y λ_{xy} ( q_{xy} min_j v^n_{(y,i+e_{j_1}+e_j∧B)} + (1 − q_{xy}) v^n_{(y,i+e_{j_1})} )
      ≤ Σ_y λ_{xy} ( q_{xy} min_j v^n_{(y,i+e_j∧B)} + (1 − q_{xy}) v^n_{(y,i)} ).

For all customers except for the extra customer in class j_1 we have

    μ v^n_{(x,i+e_{j_1}−e_j)} ≤ μ v^n_{(x,i−e_j)}.        (by (1.7.3))

The extra customer is considered together with a dummy term with coefficient μ:

    μ v^n_{(x,i+e_{j_1}−e_{j_1})} = μ v^n_{(x,i)}.

The coefficients of the remaining dummy terms are equal and the inequalities follow easily.

Equation (1.7.4) follows easily.


Proof of lemma 1.7.3. By induction. We start with (1.7.5). The arrival terms go the same as in lemma 1.7.1. Now consider the departures. If i + e_{j_1} and i + e_{j_2} both have an empty group, there is only the dummy term. If i + e_{j_2}, but not i + e_{j_1}, has an empty group, (1.7.6) can be used, and what remains are dummy terms with equal coefficients. If the system is up in both i + e_{j_1} and i + e_{j_2}, we have

    v^n_{(x,i+e_{j_1}−e_j,k+1)} ≤ v^n_{(x,i+e_{j_2}−e_j,k+1)}

for each j, by (1.7.5).

We continue with (1.7.6). Let j* be the optimal assignment in (y, i, k) at step n + 1. Then, if i ≠ B,

    Σ_{j_1=1}^m ( q_{xy} min_j v^n_{(y,i−e_{j_1}+e_j∧B,k)} + (1 − q_{xy}) v^n_{(y,i−e_{j_1},k)} )
      ≤ Σ_{j_1=1}^m ( q_{xy} v^n_{(y,i−e_{j_1}+e_{j*},k)} + (1 − q_{xy}) v^n_{(y,i−e_{j_1},k)} )
      ≤ m ( q_{xy} v^n_{(y,i+e_{j*},k)} + (1 − q_{xy}) v^n_{(y,i,k)} )        (by (1.7.6))
      = m ( q_{xy} min_j v^n_{(y,i+e_j∧B,k)} + (1 − q_{xy}) v^n_{(y,i,k)} ).

If i = B then in each state (y, i − e_{j_1}, k) we can send an arrival to a full group:

    Σ_{j_1=1}^m ( q_{xy} min_j v^n_{(y,i−e_{j_1}+e_j∧B,k)} + (1 − q_{xy}) v^n_{(y,i−e_{j_1},k)} )
      ≤ Σ_{j_1=1}^m ( q_{xy} v^n_{(y,i−e_{j_1},k)} + (1 − q_{xy}) v^n_{(y,i−e_{j_1},k)} )
      ≤ m ( q_{xy} v^n_{(y,i,k)} + (1 − q_{xy}) v^n_{(y,i,k)} )        (by (1.7.6))
      = m ( q_{xy} min_j v^n_{(y,i+e_j∧B,k)} + (1 − q_{xy}) v^n_{(y,i,k)} ).

This gives the inequalities for the arrival terms. Concerning the departures, if i_{j_1} > 1 we have

    μ Σ_{j_2=1}^m v^n_{(x,i−e_{j_1}−e_{j_2},k+2)} ≤ μ m v^n_{(x,i−e_{j_1},k+1)};        (by (1.7.6))

if i_{j_1} = 1 we have

    μ m v^n_{(x,i−e_{j_1},k+1)} = μ m v^n_{(x,i−e_{j_1},k+1)}.

Summation of these terms for j_1 = 1, …, m gives the terms concerning departures, leaving dummy terms with the same coefficients.

We continue with (1.7.7). The arrival term can be shown similarly to the arrival term of (1.7.3). When the system is up or down in both i + e_{j_1} and i, the departure terms follow easily by induction. If i + e_{j_1} ≥ e and i_{j_1} = 1, then first (1.7.6) should be used.

Equation (1.7.8) follows easily by induction. Also (1.7.9) can be proved easily.

Proof of lemma 1.8.1. By induction. First we will show that

    ∫ v^n_{(i+te_{j_1}−se)^+} dP(t) ≤ ∫ v^n_{(i+te_{j_2}−se)^+} dP(t)   if i_{j_1} ≤ i_{j_2}

holds for all s, i.e. that it is optimal to assign to the queue with the smallest workload. First assume that i_{j_1} − s ≥ 0. This means that (i + te_j − se)^+ = (i − se)^+ + te_j for j = j_1 and j = j_2. Then we have

    ∫ v^n_{(i+te_{j_1}−se)^+} dP(t) = ∫ v^n_{(i−se)^+ + te_{j_1}} dP(t)
      ≤ ∫ v^n_{(i−se)^+ + te_{j_2}} dP(t)        (by (1.8.2))
      = ∫ v^n_{(i+te_{j_2}−se)^+} dP(t).

Now assume that i_{j_1} − s < 0, but i_{j_2} − s ≥ 0. By (1.8.3), monotonicity, we have v^n_{(i+te_{j_1}−se)^+} ≤ v^n_{(i−se)^+ + te_{j_1}}. This gives

    ∫ v^n_{(i+te_{j_1}−se)^+} dP(t) ≤ ∫ v^n_{(i−se)^+ + te_{j_1}} dP(t)
      ≤ ∫ v^n_{(i−se)^+ + te_{j_2}} dP(t)        (by (1.8.2))
      = ∫ v^n_{(i+te_{j_2}−se)^+} dP(t).

Finally assume that i_{j_2} − s < 0. We can rewrite (i + te_{j_2} − se)^+ as (i − se)^+ + t*e_{j_2} with t* = (t − s + i_{j_2})^+. Note that t* < t. Because (i + te_{j_1} − se)^+ ≤ (i − se)^+ + t*e_{j_1} we have, by (1.8.3), v^n_{(i+te_{j_1}−se)^+} ≤ v^n_{(i−se)^+ + t*e_{j_1}}. Thus

    ∫ v^n_{(i+te_{j_1}−se)^+} dP(t) ≤ ∫ v^n_{(i−se)^+ + t*e_{j_1}} dP(t)
      = ∫ v^n_{(i−se)^+ + t*e_{j_2}} dP(t) = ∫ v^n_{(i+te_{j_2}−se)^+} dP(t).

Having shown that assigning to the smallest queue is optimal, the inequalities will follow quite easily.

Consider (1.8.2). Let j* be the optimal assignment in i + te_{j_2}. If j* = j_1, then

    ∫ min_j { ∫ v^n_{(i+te_{j_1}+se_j−u_ne)^+} dP(s) } dP(t)
      ≤ ∫∫ v^n_{(i+te_{j_1}+se_{j_2}−u_ne)^+} dP(s) dP(t)
      = ∫ min_j { ∫ v^n_{(i+te_{j_2}+se_j−u_ne)^+} dP(s) } dP(t).

If j* ≠ j_1, then

    ∫ min_j { ∫ v^n_{(i+te_{j_1}+se_j−u_ne)^+} dP(s) } dP(t)
      ≤ ∫∫ v^n_{(i+te_{j_1}+se_{j*}−u_ne)^+} dP(s) dP(t)
      ≤ ∫∫ v^n_{(i+te_{j_2}+se_{j*}−u_ne)^+} dP(s) dP(t)
      = ∫ min_j { ∫ v^n_{(i+te_{j_2}+se_j−u_ne)^+} dP(s) } dP(t),

the second inequality by the optimality of the SWP as shown above.

Concerning (1.8.3), if j* is the optimal action in i + te_{j_1}, we have

    min_j { ∫ v^n_{(i+se_j−u_ne)^+} dP(s) } ≤ ∫ v^n_{(i+se_{j*}−u_ne)^+} dP(s)
      ≤ ∫ v^n_{(i+te_{j_1}+se_{j*}−u_ne)^+} dP(s)        (by (1.8.3))
      = min_j { ∫ v^n_{(i+te_{j_1}+se_j−u_ne)^+} dP(s) }.

Equation (1.8.4), symmetry, is as usual trivial to prove.
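The smallest-workload claim at the start of this proof can be illustrated with a discretized special case (our own simplifications, not the thesis model: m = 2 queues, unit interarrival times, a two-point service-time distribution P, and v^0(i) = max(i), which satisfies balancedness (1.8.2), monotonicity (1.8.3) and symmetry (1.8.4)).

```python
# Workload-routing recursion, a discretized special case of lemma 1.8.1.
services = [(0.5, 1), (0.5, 3)]      # (probability, service time)
N, STEPS, CHK = 40, 8, 12            # grid, iterations, checked region

v = {(a, b): float(max(a, b)) for a in range(N) for b in range(N)}

def assign(i, j, v):
    """Expected next value when the arriving work is routed to queue j."""
    tot = 0.0
    for p, s in services:
        w = [i[0], i[1]]
        w[j] += s                                  # add the arriving work
        nxt = tuple(min(max(x - 1, 0), N - 1) for x in w)  # one time unit
        tot += p * v[nxt]
    return tot

for _ in range(STEPS):
    v = {i: min(assign(i, 0, v), assign(i, 1, v)) for i in v}

# Smallest Workload Policy: routing to the shorter queue attains the min;
# checked states are too far from the truncation to be affected by it.
for a in range(CHK):
    for b in range(a, CHK):                  # queue 1 has less workload
        assert assign((a, b), 0, v) <= assign((a, b), 1, v) + 1e-9
        assert abs(v[(a, b)] - v[(b, a)]) < 1e-9          # (1.8.4)
```

The assertions restate the first claim of the proof: given (1.8.2)-(1.8.4) for v^n, assigning to the queue with the smallest workload minimizes the expected next value.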

Proof of lemma 1.11.5. By induction. Assume the lemma holds up to n. We start with the arrivals. Because

    μ_{j_1} v^n_{(y,i−e_{j_1}+e_j)} + (μ − μ_{j_1}) v^n_{(y,i−e_{j_1}+e_{f(j_1)}+e_j)}
      ≤ μ_{j_2} v^n_{(y,i−e_{j_2}+e_j)} + (μ − μ_{j_2}) v^n_{(y,i−e_{j_2}+e_{f(j_2)}+e_j)}        (by (1.11.6))

the arrival term follows easily.

Consider the terms concerning departures. Let j* be the optimal action in (x, i − e_{j_2}). Because (i − e_{j_2})_{j_1} > 0, j* ≤ j_1. Because f(j_2) ≥ j_2 − 1 we see that j* is also optimal in (x, i − e_{j_2} + e_{f(j_2)}). We distinguish two cases, j* < j_1 and j* = j_1. Assume j* < j_1. Then j* is also optimal in (x, i − e_{j_1}) and (x, i − e_{j_1} + e_{f(j_1)}). We have

    μ_{j_1} min_l { μ_l v^n_{(x,i−e_{j_1}−e_l)} + (μ − μ_l) v^n_{(x,i−e_{j_1}−e_l+e_{f(l)})} } +
    (μ − μ_{j_1}) min_l { μ_l v^n_{(x,i−e_{j_1}+e_{f(j_1)}−e_l)} + (μ − μ_l) v^n_{(x,i−e_{j_1}+e_{f(j_1)}−e_l+e_{f(l)})} }
      = μ_{j_1} ( μ_{j*} v^n_{(x,i−e_{j_1}−e_{j*})} + (μ − μ_{j*}) v^n_{(x,i−e_{j_1}−e_{j*}+e_{f(j*)})} ) +
        (μ − μ_{j_1}) ( μ_{j*} v^n_{(x,i−e_{j_1}+e_{f(j_1)}−e_{j*})} + (μ − μ_{j*}) v^n_{(x,i−e_{j_1}+e_{f(j_1)}−e_{j*}+e_{f(j*)})} )
      ≤ μ_{j_2} ( μ_{j*} v^n_{(x,i−e_{j_2}−e_{j*})} + (μ − μ_{j*}) v^n_{(x,i−e_{j_2}−e_{j*}+e_{f(j*)})} ) +
        (μ − μ_{j_2}) ( μ_{j*} v^n_{(x,i−e_{j_2}+e_{f(j_2)}−e_{j*})} + (μ − μ_{j*}) v^n_{(x,i−e_{j_2}+e_{f(j_2)}−e_{j*}+e_{f(j*)})} )        (by (1.11.6))
      = μ_{j_2} min_l { μ_l v^n_{(x,i−e_{j_2}−e_l)} + (μ − μ_l) v^n_{(x,i−e_{j_2}−e_l+e_{f(l)})} } +
        (μ − μ_{j_2}) min_l { μ_l v^n_{(x,i−e_{j_2}+e_{f(j_2)}−e_l)} + (μ − μ_l) v^n_{(x,i−e_{j_2}+e_{f(j_2)}−e_l+e_{f(l)})} }.

Now consider j* = j_1. Then j_2 is an allowable action in (x, i − e_{j_1}) and (x, i − e_{j_1} + e_{f(j_1)}). Then we have

    μ_{j_1} min_l { μ_l v^n_{(x,i−e_{j_1}−e_l)} + (μ − μ_l) v^n_{(x,i−e_{j_1}−e_l+e_{f(l)})} } +
    (μ − μ_{j_1}) min_l { μ_l v^n_{(x,i−e_{j_1}+e_{f(j_1)}−e_l)} + (μ − μ_l) v^n_{(x,i−e_{j_1}+e_{f(j_1)}−e_l+e_{f(l)})} }
      ≤ μ_{j_1} ( μ_{j_2} v^n_{(x,i−e_{j_1}−e_{j_2})} + (μ − μ_{j_2}) v^n_{(x,i−e_{j_1}−e_{j_2}+e_{f(j_2)})} ) +
        (μ − μ_{j_1}) ( μ_{j_2} v^n_{(x,i−e_{j_1}+e_{f(j_1)}−e_{j_2})} + (μ − μ_{j_2}) v^n_{(x,i−e_{j_1}+e_{f(j_1)}−e_{j_2}+e_{f(j_2)})} )
      = μ_{j_2} min_l { μ_l v^n_{(x,i−e_{j_2}−e_l)} + (μ − μ_l) v^n_{(x,i−e_{j_2}−e_l+e_{f(l)})} } +
        (μ − μ_{j_2}) min_l { μ_l v^n_{(x,i−e_{j_2}+e_{f(j_2)}−e_l)} + (μ − μ_l) v^n_{(x,i−e_{j_2}+e_{f(j_2)}−e_l+e_{f(l)})} }.

The dummy transition follows easily by induction.

Proof of lemma 1.11.6. By induction. Assume the lemma holds up to n. The arrival term follows easily, as in the proof of lemma 1.11.5. Let j* be the optimal action in state (x, i). If j* ≠ j_1, then

    μ_{j_1} min_l { μ_l v^n_{(x,i−e_{j_1}−e_l)} + (μ − μ_l) v^n_{(x,i−e_{j_1}−e_l+e_{f(l)})} } +
    (μ − μ_{j_1}) min_l { μ_l v^n_{(x,i−e_{j_1}+e_{f(j_1)}−e_l)} + (μ − μ_l) v^n_{(x,i−e_{j_1}+e_{f(j_1)}−e_l+e_{f(l)})} }
      ≤ μ_{j_1} ( μ_{j*} v^n_{(x,i−e_{j_1}−e_{j*})} + (μ − μ_{j*}) v^n_{(x,i−e_{j_1}−e_{j*}+e_{f(j*)})} ) +
        (μ − μ_{j_1}) ( μ_{j*} v^n_{(x,i−e_{j_1}+e_{f(j_1)}−e_{j*})} + (μ − μ_{j*}) v^n_{(x,i−e_{j_1}+e_{f(j_1)}−e_{j*}+e_{f(j*)})} )
      ≤ μ μ_{j*} v^n_{(x,i−e_{j*})} + μ (μ − μ_{j*}) v^n_{(x,i−e_{j*}+e_{f(j*)})}        (by (1.11.7))
      = μ min_l { μ_l v^n_{(x,i−e_l)} + (μ − μ_l) v^n_{(x,i−e_l+e_{f(l)})} }.

If j* = j_1, then, because idling is allowed now,

    μ_{j_1} min_l { μ_l v^n_{(x,i−e_{j_1}−e_l)} + (μ − μ_l) v^n_{(x,i−e_{j_1}−e_l+e_{f(l)})} } +
    (μ − μ_{j_1}) min_l { μ_l v^n_{(x,i−e_{j_1}+e_{f(j_1)}−e_l)} + (μ − μ_l) v^n_{(x,i−e_{j_1}+e_{f(j_1)}−e_l+e_{f(l)})} }
      ≤ μ_{j_1} μ v^n_{(x,i−e_{j_1})} + (μ − μ_{j_1}) μ v^n_{(x,i−e_{j_1}+e_{f(j_1)})}
      = μ min_l { μ_l v^n_{(x,i−e_l)} + (μ − μ_l) v^n_{(x,i−e_l+e_{f(l)})} }.

The dummy transition follows easily by induction.


4.2. Proofs of chapter 2

Proof of lemma 2.6.1. By induction. Assume the lemma holds up to n. We start with (2.6.2). The terms regarding arrivals at the first center follow easily.

Consider the departures from the first center. Let j* be the optimal action in the first center in state (x, ı, i). If j* ≠ j_1, j* is allowable in state (x, ı − e_{j_1}, i + e_{j_1}) and the term follows by induction. If j* = j_1 and ı = e_{j_1}, then idling is the only action in state (x, ı − e_{j_1}, i + e_{j_1}) and the term follows by induction. If there is at least one more customer available, say in queue j_2, and j* = j_1, then

    min_j { μ_j v^n_{(x,ı−e_{j_1}−e_j,i+e_{j_1}+e_j)} + (μ − μ_j) v^n_{(x,ı−e_{j_1},i+e_{j_1})} }
      ≤ μ_{j_2} v^n_{(x,ı−e_{j_1}−e_{j_2},i+e_{j_1}+e_{j_2})} + (μ − μ_{j_2}) v^n_{(x,ı−e_{j_1},i+e_{j_1})}
      ≤ μ_{j_2} v^n_{(x,ı−e_{j_1},i+e_{j_1})} + (μ − μ_{j_2}) v^n_{(x,ı−e_{j_1},i+e_{j_1})}        (by (2.6.2))
      ≤ μ_{j_1} v^n_{(x,ı−e_{j_1},i+e_{j_1})} + (μ − μ_{j_1}) v^n_{(x,ı,i)}        (by (2.6.2))
      = min_j { μ_j v^n_{(x,ı−e_j,i+e_j)} + (μ − μ_j) v^n_{(x,ı,i)} }.

Consider the departures from the second center. The optimal action in (x, ı, i) is allowable in (x, ı − e_{j_1}, i + e_{j_1}). Therefore the term follows easily by induction.

Equation (2.6.3) follows from a result in section 3.6.

Proof of lemma 2.6.3. By induction. Assume the lemma holds up to n. We start with (2.6.4). Assume n + 1 ≤ i. The terms concerning arrivals follow immediately, using induction, because n < i. Consider the terms corresponding to departures from center 1. Let j* be the optimal action in state ı. Because ı_{j_1} > 0, j* is also optimal in ı − e_{j_2}. If j* ≠ j_1, then the terms follow easily by induction. If j* = j_1, then

    μ_{j_1} min_j { μ_j v^n_{(x,ı−e_{j_1}−e_j,i+e_{j_1}+e_j)} + (μ − μ_j) v^n_{(x,ı−e_{j_1},i+e_{j_1})} } +
    (μ − μ_{j_1}) min_j { μ_j v^n_{(x,ı−e_j,i+e_j)} + (μ − μ_j) v^n_{(x,ı,i)} }
      ≤ μ_{j_1} μ_{j_2} v^n_{(x,ı−e_{j_1}−e_{j_2},i+e_{j_1}+e_{j_2})} + μ_{j_1} (μ − μ_{j_2}) v^n_{(x,ı−e_{j_1},i+e_{j_1})} +
        (μ − μ_{j_1}) μ_{j_2} v^n_{(x,ı−e_{j_2},i+e_{j_2})} + (μ − μ_{j_1})(μ − μ_{j_2}) v^n_{(x,ı,i)}
      = μ_{j_2} min_j { μ_j v^n_{(x,ı−e_{j_2}−e_j,i+e_{j_2}+e_j)} + (μ − μ_j) v^n_{(x,ı−e_{j_2},i+e_{j_2})} } +
        (μ − μ_{j_2}) min_j { μ_j v^n_{(x,ı−e_j,i+e_j)} + (μ − μ_j) v^n_{(x,ı,i)} }.

Consider the second center. By (2.6.5), serving queue  is always optimal. The terms follow by induction. Note that we used (2.6.5) at step n with at least n + 1 customers in queue . Also the dummy term follows easily.

Consider (2.6.5). Again the terms concerning arrivals and the dummy transition follow easily. The optimal action in the first center of (x, ı, i) depends only on ı. Because the number of customers in queue  in state (x, ı, i − e_{j_1}), (x, ı, i − e) and (x, ı, i) is i − 1 or more, there are at least n customers available, meaning that, by (2.6.4), the same action is optimal in each state. Therefore also the terms concerning departures from the first center follow easily. Concerning the second center, serving queue  is optimal in each state. Also these terms follow easily by induction.

Proof of lemma 2.6.6. By induction. Assume the lemma holds up to n. We start with (2.6.6). Assume n + 1 ≤ i_1 + i_2. The terms concerning arrivals follow immediately, using induction, because n < i_1 + i_2. Consider the terms corresponding to departures from center 1. In ı and ı − e_2 it is optimal to serve queue 1. Thus

    μ_1 min_j { μ_j v^n_{(x,ı−e_1−e_j,i+e_1+e_j)} + (μ − μ_j) v^n_{(x,ı−e_1,i+e_1)} } +
    (μ − μ_1) min_j { μ_j v^n_{(x,ı−e_j,i+e_j)} + (μ − μ_j) v^n_{(x,ı,i)} }
      ≤ μ_1 μ_2 v^n_{(x,ı−e_1−e_2,i+e_1+e_2)} + μ_1 (μ − μ_2) v^n_{(x,ı−e_1,i+e_1)} +
        (μ − μ_1) μ_2 v^n_{(x,ı−e_2,i+e_2)} + (μ − μ_1)(μ − μ_2) v^n_{(x,ı,i)}
      = μ_2 min_j { μ_j v^n_{(x,ı−e_2−e_j,i+e_2+e_j)} + (μ − μ_j) v^n_{(x,ı−e_2,i+e_2)} } +
        (μ − μ_2) min_j { μ_j v^n_{(x,ı−e_j,i+e_j)} + (μ − μ_j) v^n_{(x,ı,i)} }.

Consider the second center. If i_1 > 0, serving queue 1 is optimal in i + e_1, i and i + e_2, using that (2.6.7) holds for i_1 + i_2 ≥ n + 1 at stage n. Then

    μ_1 min_j { μ_j v^n_{(x,ı−e_1,i+e_1−e_j)} + (μ − μ_j) v^n_{(x,ı−e_1,i+e_1)} } +
    (μ − μ_1) min_j { μ_j v^n_{(x,ı,i−e_j)} + (μ − μ_j) v^n_{(x,ı,i)} }
      = μ_1 μ_1 v^n_{(x,ı−e_1,i+e_1−e_1)} + μ_1 (μ − μ_1) v^n_{(x,ı−e_1,i+e_1)} +
        (μ − μ_1) μ_1 v^n_{(x,ı,i−e_1)} + (μ − μ_1)(μ − μ_1) v^n_{(x,ı,i)}
      ≤ μ_2 μ_1 v^n_{(x,ı−e_2,i+e_2−e_1)} + μ_2 (μ − μ_1) v^n_{(x,ı−e_2,i+e_2)} +
        (μ − μ_2) μ_1 v^n_{(x,ı,i−e_1)} + (μ − μ_2)(μ − μ_1) v^n_{(x,ı,i)}        (by (2.6.6))
      = μ_2 min_j { μ_j v^n_{(x,ı−e_2,i+e_2−e_j)} + (μ − μ_j) v^n_{(x,ı−e_2,i+e_2)} } +
        (μ − μ_2) min_j { μ_j v^n_{(x,ı,i−e_j)} + (μ − μ_j) v^n_{(x,ı,i)} }.

We wanted to prove the inequality for all i with n + 1 ≤ i_1 + i_2. We used at stage n (2.6.6) with i_1 + i_2 + 1 > n customers in the second center.

If i_1 = 0, then i_2 > 0. Thus serving queue 2 is optimal in i and i + e_2. Then

    μ_1 min_j { μ_j v^n_{(x,ı−e_1,i+e_1−e_j)} + (μ − μ_j) v^n_{(x,ı−e_1,i+e_1)} } +
    (μ − μ_1) min_j { μ_j v^n_{(x,ı,i−e_j)} + (μ − μ_j) v^n_{(x,ı,i)} }
      ≤ μ_1 μ_2 v^n_{(x,ı−e_1,i+e_1−e_2)} + μ_1 (μ − μ_2) v^n_{(x,ı−e_1,i+e_1)} +
        (μ − μ_1) μ_2 v^n_{(x,ı,i−e_2)} + (μ − μ_1)(μ − μ_2) v^n_{(x,ı,i)}
      ≤ μ_2 μ_2 v^n_{(x,ı−e_2,i+e_2−e_2)} + μ_2 (μ − μ_2) v^n_{(x,ı−e_2,i+e_2)} +
        (μ − μ_2) μ_2 v^n_{(x,ı,i−e_2)} + (μ − μ_2)(μ − μ_2) v^n_{(x,ı,i)}        (by (2.6.6))
      = μ_2 min_j { μ_j v^n_{(x,ı−e_2,i+e_2−e_j)} + (μ − μ_j) v^n_{(x,ı−e_2,i+e_2)} } +
        (μ − μ_2) min_j { μ_j v^n_{(x,ı,i−e_j)} + (μ − μ_j) v^n_{(x,ı,i)} }.

Also the dummy term follows easily.

Consider (2.6.7). Again the terms concerning arrivals and the dummy transition follow easily. The optimal action in the first center of (x, ı, i) depends only on ı. Because the number of customers in center 2 in state (x, ı, i − e_1), (x, ı, i − e_2) and (x, ı, i) is i_1 + i_2 − 1 or more, there are at least n customers available, meaning that, by (2.6.6), the same action is optimal in each state. Therefore also the terms concerning departures from the first center follow easily. Concerning the second center, we have

    μ_1 min_j { μ_j v^n_{(x,ı,i−e_1−e_j)} + (μ − μ_j) v^n_{(x,ı,i−e_1)} } +
    (μ − μ_1) min_j { μ_j v^n_{(x,ı,i−e_j)} + (μ − μ_j) v^n_{(x,ı,i)} }
      ≤ μ_1 μ_2 v^n_{(x,ı,i−e_1−e_2)} + μ_1 (μ − μ_2) v^n_{(x,ı,i−e_1)} +
        (μ − μ_1) μ_2 v^n_{(x,ı,i−e_2)} + (μ − μ_1)(μ − μ_2) v^n_{(x,ı,i)}
      = μ_2 min_j { μ_j v^n_{(x,ı,i−e_2−e_j)} + (μ − μ_j) v^n_{(x,ı,i−e_2)} } +
        (μ − μ_2) min_j { μ_j v^n_{(x,ı,i−e_j)} + (μ − μ_j) v^n_{(x,ı,i)} }.


Proof of lemma 2.7.1. By induction. Assume the lemma holds up to n. In all three equations the terms corresponding to arrivals and the dummy term go easily with induction, as in the proof of lemma 1.11.5. Therefore we only consider the terms regarding departures at the first and the last center. We start with equation (2.7.1). Serving queue j_2 in (x, ı − e_{j_1}, i + e_{j_1}) is optimal, thus the terms corresponding to departures from the first center of the l.h.s. are:

    μ_{j_1} μ_{j_2} v^n_{(x,ı−e_{j_1}−e_{j_2},i+e_{j_1}+e_{j_2})} + μ_{j_1} (μ − μ_{j_2}) v^n_{(x,ı−e_{j_1},i+e_{j_1})} +
    (μ − μ_{j_1}) μ_{j_2} v^n_{(x,ı−e_{j_2},i+e_{j_2})} + (μ − μ_{j_1})(μ − μ_{j_2}) v^n_{(x,ı,i)}.

We have to show that this expression is equal to the one in which j_1 and j_2 are exchanged. Number the 4 terms consecutively. The first and the fourth term are both symmetric in j_1 and j_2. Term 2 with j_1 and j_2 exchanged is term 3.

We continue with the second center. First assume i ≠ 0, thus there is a j_3 such that i_{j_3} > 0. Then we have:

    μ_{j_1} μ_{j_3} v^n_{(x,ı−e_{j_1},i+e_{j_1}−e_{j_3})} + μ_{j_1} (μ − μ_{j_3}) v^n_{(x,ı−e_{j_1},i+e_{j_1})} +
    (μ − μ_{j_1}) μ_{j_3} v^n_{(x,ı,i−e_{j_3})} + (μ − μ_{j_1})(μ − μ_{j_3}) v^n_{(x,ı,i)}.

We use (2.7.1) twice, once with ı, i − e_{j_3} for terms 1 and 3 and once for terms 2 and 4.

When i = 0, we need (2.7.3) to prove (2.7.1). The terms corresponding to departures in the second center of the l.h.s. are:

    μ_{j_1} μ_{j_1} v^n_{(x,ı−e_{j_1},0)} + μ_{j_1} (μ − μ_{j_1}) v^n_{(x,ı−e_{j_1},e_{j_1})} + (μ − μ_{j_1}) μ v^n_{(x,ı,0)}.

Equation (2.7.3) immediately gives the expression wanted.

Now we prove (2.7.2). The terms corresponding to departures from the first center go directly with induction. Thus the following terms remain:

    μ_{j_1} μ_{j_2} v^n_{(x,ı,i−e_{j_1}−e_{j_2})} + μ_{j_1} (μ − μ_{j_2}) v^n_{(x,ı,i−e_{j_1})} +
    (μ − μ_{j_1}) μ_{j_2} v^n_{(x,ı,i−e_{j_2})} + (μ − μ_{j_1})(μ − μ_{j_2}) v^n_{(x,ı,i)}.

This expression is symmetric in j_1 and j_2.

We continue with (2.7.3). The terms concerning departures at both centers are:

    μ^2_{j_1} μ_{j_2} v^n_{(x,ı−e_{j_1}−e_{j_2},e_{j_2})} + μ^2_{j_1} (μ − μ_{j_2}) v^n_{(x,ı−e_{j_1},0)} + μ^2_{j_1} μ v^n_{(x,ı−e_{j_1},0)} +
    μ_{j_1} (μ − μ_{j_1}) μ_{j_2} v^n_{(x,ı−e_{j_1}−e_{j_2},e_{j_1}+e_{j_2})} + μ_{j_1} (μ − μ_{j_1})(μ − μ_{j_2}) v^n_{(x,ı−e_{j_1},e_{j_1})} +
    μ_{j_1} (μ − μ_{j_1}) μ_{j_1} v^n_{(x,ı−e_{j_1},0)} + μ_{j_1} (μ − μ_{j_1})(μ − μ_{j_1}) v^n_{(x,ı−e_{j_1},e_{j_1})} +
    (μ − μ_{j_1}) μ μ_{j_1} v^n_{(x,ı−e_{j_1},e_{j_1})} + (μ − μ_{j_1}) μ (μ − μ_{j_1}) v^n_{(x,ı,0)} + (μ − μ_{j_1}) μ^2 v^n_{(x,ı,0)}
      = μ^2_{j_1} μ_{j_2} v^n_{(x,ı−e_{j_1}−e_{j_2},e_{j_2})} + μ_{j_1} (μ − μ_{j_1}) μ_{j_2} v^n_{(x,ı−e_{j_1}−e_{j_2},e_{j_1}+e_{j_2})} +
        μ^2_{j_1} (3μ − μ_{j_1} − μ_{j_2}) v^n_{(x,ı−e_{j_1},0)} + μ_{j_1} (μ − μ_{j_1})(3μ − μ_{j_1} − μ_{j_2}) v^n_{(x,ı−e_{j_1},e_{j_1})} +
        (μ − μ_{j_1}) μ (3μ − μ_{j_1} − μ_{j_2}) v^n_{(x,ı,0)} − (μ − μ_{j_1}) μ (μ − μ_{j_2}) v^n_{(x,ı,0)}.

Number the terms consecutively. For terms 1 and 2 we use (2.7.2) for ı − e_{j_1} − e_{j_2}, e_{j_1} + e_{j_2}; for terms 3, 4 and 5 we use (2.7.3). Term 6 is symmetric.


4.3. Proofs of chapter 3

Proof of lemma 3.2.1. By induction. We start with (3.2.4). Assume ij1 <ij2 . The case ij1 = ij2 is a special case of (3.2.6). We start with the termcorresponding to arrivals. Let j∗ be the optimal assignment in (y, i + ej2).Then we have

qxay;i+ej1minjvn(y,i+ej1+ej∧B)+ (1− qxay;i+ej1

)vn(y,i+ej1 ) ≤

qxay;i+ej1vn(y,i+ej1+ej∗ ) + (1− qxay;i+ej1

)vn(y,i+ej1 )

(3.2.4)

qxay;i+ej1vn(y,i+ej2+ej∗ ) + (1− qxay;i+ej1

)vn(y,i+ej2 )

(3.2.1)+(3.2.5)

qxay;i+ej2vn(y,i+ej2+ej∗ ) + (1− qxay;i+ej2

)vn(y,i+ej2 ) =

qxay;i+ej2minjvn(y,i+ej2+ej∧B)+ (1− qxay;i+ej2

)vn(y,i+ej2 )

if j∗ 6= j1 and

qxay;i+ej1minjvn(y,i+ej1+ej∧B)+ (1− qxay;i+ej1

)vn(y,i+ej1 ) ≤

qxay;i+ej1vn(y,i+ej1+ej2 ) + (1− qxay;i+ej1

)vn(y,i+ej1 )

(3.2.4)

qxay;i+ej1vn(y,i+ej1+ej2 ) + (1− qxay;i+ej1

)vn(y,i+ej2 )

(3.2.1)+(3.2.5)

qxay;i+ej2vn(y,i+ej2+ej1 ) + (1− qxay;i+ej2

)vn(y,i+ej2 ) =

qxay;i+ej2minjvn(y,i+ej2+ej∧B)+ (1− qxay;i+ej2

)vn(y,i+ej2 )

if j∗ = j1. Note that j∗ cannot be equal to j2. Now let a∗ be the optimalaction in (x, i+ ej2). We have

mina

∑y

λxay

(qxay;i+ej1

minjvn(y,i+ej1+ej∧B)+ (1− qxay;i+ej1

)vn(y,i+ej1 )

)≤

∑y

λxa∗y

(qxa∗y;i+ej1

minjvn(y,i+ej1+ej∧B)+ (1− qxa∗y;i+ej1

)vn(y,i+ej1 )

)≤

∑y

λxa∗y

(qxa∗y;i+ej2

minjvn(y,i+ej2+ej∧B)+ (1− qxa∗y;i+ej2

)vn(y,i+ej2 )

)=

mina

∑y

λxay

(qxay;i+ej2

minjvn(y,i+ej2+ej∧B)+ (1− qxay;i+ej2

)vn(y,i+ej2 )

).

Page 93: Stochastic Scheduling and Dynamic Programming …koole/publications/thesis/thesis.pdf2 Introduction The previous results are only interesting in continuous time, due to the way of

Proofs of chapter 3 89

Consider a departure at queue $j$, $j \neq j_1, j_2$:

\[
\mu_{j,i+e_{j_1}} v^n_{(x,i+e_{j_1}-e_j)} \overset{(3.2.4)}{\le} \mu_{j,i+e_{j_1}} v^n_{(x,i+e_{j_2}-e_j)} \overset{(3.2.5)}{\le} \mu_{j,i+e_{j_2}} v^n_{(x,i+e_{j_2}-e_j)} + (\mu_{j,i+e_{j_1}} - \mu_{j,i+e_{j_2}})\,v^n_{(x,i+e_{j_2})}.
\]

The terms corresponding to a departure from queue $j_1$ and $j_2$ will be considered together. We have $v^n_{(x,i+e_{j_1}-e_{j_2})} \le v^n_{(x,i+e_{j_1}-e_{j_1})} = v^n_{(x,i+e_{j_2}-e_{j_2})} \le v^n_{(x,i+e_{j_2}-e_{j_1})}$, by (3.2.4). As $\mu_{j_1,i+e_{j_1}} + \mu_{j_2,i+e_{j_1}} \ge \mu_{j_1,i+e_{j_2}} + \mu_{j_2,i+e_{j_2}}$, we have, together with (3.2.5),

\[
\begin{aligned}
& \mu_{j_1,i+e_{j_1}} v^n_{(x,i+e_{j_1}-e_{j_1})} + \mu_{j_2,i+e_{j_1}} v^n_{(x,i+e_{j_1}-e_{j_2})} \le \\
&\quad \mu_{j_1,i+e_{j_2}} v^n_{(x,i+e_{j_2}-e_{j_1})} + \mu_{j_2,i+e_{j_2}} v^n_{(x,i+e_{j_2}-e_{j_2})} \\
&\qquad + (\mu_{j_1,i+e_{j_1}} + \mu_{j_2,i+e_{j_1}} - \mu_{j_1,i+e_{j_2}} - \mu_{j_2,i+e_{j_2}})\,v^n_{(x,i+e_{j_2})}.
\end{aligned}
\]

Note that we did not use (3.2.6) in the above proof. However, we used it for the case $i_{j_1} = i_{j_2}$.

Now we prove (3.2.5), monotonicity. The arrival term is easy. Let $a^*$ be the optimal action in $(x, i+e_{j_1})$.

\[
\begin{aligned}
& \min_a \sum_y \lambda_{xay}\Bigl(q_{xay;i} \min_j v^n_{(y,(i+e_j)\wedge B)} + (1-q_{xay;i})\,v^n_{(y,i)}\Bigr) \\
&\quad \le \sum_y \lambda_{xa^*y}\Bigl(q_{xa^*y;i}\,v^n_{(y,i+e_{j_1})} + (1-q_{xa^*y;i})\,v^n_{(y,i)}\Bigr) \\
&\quad \overset{(3.2.5)}{\le} \sum_y \lambda_{xa^*y}\,v^n_{(y,i+e_{j_1})} \\
&\quad \overset{(3.2.5)}{\le} \sum_y \lambda_{xa^*y}\Bigl(q_{xa^*y;i+e_{j_1}} \min_j v^n_{(y,(i+e_{j_1}+e_j)\wedge B)} + (1-q_{xa^*y;i+e_{j_1}})\,v^n_{(y,i+e_{j_1})}\Bigr) \\
&\quad = \min_a \sum_y \lambda_{xay}\Bigl(q_{xay;i+e_{j_1}} \min_j v^n_{(y,(i+e_{j_1}+e_j)\wedge B)} + (1-q_{xay;i+e_{j_1}})\,v^n_{(y,i+e_{j_1})}\Bigr).
\end{aligned}
\]

Consider a departure at queue $j$, $j \neq j_1$. Then

\[
\mu_{ji} v^n_{(x,i-e_j)} \overset{(3.2.5)}{\le} \mu_{ji} v^n_{(x,i+e_{j_1}-e_j)} \overset{(3.2.5)}{\le} \mu_{j,i+e_{j_1}} v^n_{(x,i+e_{j_1}-e_j)} + (\mu_{ji} - \mu_{j,i+e_{j_1}})\,v^n_{(x,i+e_{j_1})},
\]

because $\mu_{ji} \ge \mu_{j,i+e_{j_1}}$. By $v^n_{(x,i-e_{j_1})} \le v^n_{(x,i)} \le v^n_{(x,i+e_{j_1})}$ we have for the term corresponding to a departure from queue $j_1$

\[
\mu_{j_1 i} v^n_{(x,i-e_{j_1})} \le \mu_{j_1,i+e_{j_1}} v^n_{(x,i)} + (\mu_{j_1 i} - \mu_{j_1,i+e_{j_1}})\,v^n_{(x,i+e_{j_1})}
\]


90 Proofs of dynamic programming results

if $\mu_{j_1 i} - \mu_{j_1,i+e_{j_1}} \ge 0$, and

\[
\mu_{j_1 i} v^n_{(x,i-e_{j_1})} + (\mu_{j_1,i+e_{j_1}} - \mu_{j_1 i})\,v^n_{(x,i)} \le \mu_{j_1,i+e_{j_1}} v^n_{(x,i)}
\]

if $\mu_{j_1,i+e_{j_1}} - \mu_{j_1 i} \ge 0$.

We continue with (3.2.6). Let $j^*$ be the optimal assignment in $(y, i^*)$. We know that $j^* \neq j_2$. We have

\[
\begin{aligned}
& q_{xay;i} \min_j v^n_{(y,(i+e_j)\wedge B)} + (1-q_{xay;i})\,v^n_{(y,i)} \\
&\quad \le q_{xay;i}\,v^n_{(y,i+e_{j^*})} + (1-q_{xay;i})\,v^n_{(y,i)} \\
&\quad \overset{(3.2.6)}{\le} q_{xay;i}\,v^n_{(y,i^*+e_{j^*})} + (1-q_{xay;i})\,v^n_{(y,i^*)} \\
&\quad \overset{(3.2.2)+(3.2.5)}{\le} q_{xay;i^*}\,v^n_{(y,i^*+e_{j^*})} + (1-q_{xay;i^*})\,v^n_{(y,i^*)} \\
&\quad = q_{xay;i^*} \min_j v^n_{(y,(i^*+e_j)\wedge B)} + (1-q_{xay;i^*})\,v^n_{(y,i^*)}
\end{aligned}
\]

if $j^* \neq j_1$, and

\[
\begin{aligned}
& q_{xay;i} \min_j v^n_{(y,(i+e_j)\wedge B)} + (1-q_{xay;i})\,v^n_{(y,i)} \\
&\quad \le q_{xay;i}\,v^n_{(y,i+e_{j_2})} + (1-q_{xay;i})\,v^n_{(y,i)} \\
&\quad \overset{(3.2.6)}{\le} q_{xay;i}\,v^n_{(y,i^*+e_{j_1})} + (1-q_{xay;i})\,v^n_{(y,i^*)} \\
&\quad \overset{(3.2.2)+(3.2.5)}{\le} q_{xay;i^*}\,v^n_{(y,i^*+e_{j_1})} + (1-q_{xay;i^*})\,v^n_{(y,i^*)} \\
&\quad = q_{xay;i^*} \min_j v^n_{(y,(i^*+e_j)\wedge B)} + (1-q_{xay;i^*})\,v^n_{(y,i^*)}
\end{aligned}
\]

if $j^* = j_1$. The term wanted is derived in the same way as for (3.2.4). The departures at queue $j$, $j \neq j_1, j_2$, go similarly to those in (3.2.4). The terms corresponding to a departure from queue $j_1$ and $j_2$ will be considered together. Note that $v^n_{(x,i^*-e_{j_2})} \le v^n_{(x,i^*-e_{j_1})}$, by (3.2.4). Thus, by $\mu_{j_1 i} \ge \mu_{j_2 i^*}$ and $\mu_{j_1 i} + \mu_{j_2 i} \ge \mu_{j_1 i^*} + \mu_{j_2 i^*}$, we have

\[
\begin{aligned}
& \mu_{j_1 i} v^n_{(x,i-e_{j_1})} + \mu_{j_2 i} v^n_{(x,i-e_{j_2})} \\
&\quad \overset{(3.2.6)}{\le} \mu_{j_1 i} v^n_{(x,i^*-e_{j_2})} + \mu_{j_2 i} v^n_{(x,i^*-e_{j_1})} \\
&\quad \le \mu_{j_2 i^*} v^n_{(x,i^*-e_{j_2})} + (\mu_{j_1 i} - \mu_{j_2 i^*} + \mu_{j_2 i})\,v^n_{(x,i^*-e_{j_1})} \\
&\quad \overset{(3.2.5)}{\le} \mu_{j_2 i^*} v^n_{(x,i^*-e_{j_2})} + \mu_{j_1 i^*} v^n_{(x,i^*-e_{j_1})} + (\mu_{j_1 i} - \mu_{j_2 i^*} + \mu_{j_2 i} - \mu_{j_1 i^*})\,v^n_{(x,i^*)}.
\end{aligned}
\]


Proof of lemma 3.3.4. By induction. We start with (3.3.5). Let $j^*$ be the shortest queue in state $i+ke_{j_2}$. Because $i_{j_1} \le i_{j_2}$, $j^*$ does not depend on $k$. It is easily seen that $q_{xay;i+ke_{j_1}} \le q_{xay;i+ke_{j_2}}$. If $j^* \neq j_1$, we have

\[
\begin{aligned}
& \sum_k \beta_k \Bigl(q_{xay;i+ke_{j_1}} \min_j \sum_l \beta_l\,v^n_{(y,i+ke_{j_1}+le_j)} + (1-q_{xay;i+ke_{j_1}})\,v^n_{(y,i+ke_{j_1})}\Bigr) \\
&\quad = \sum_k \beta_k \Bigl(q_{xay;i+ke_{j_1}} \sum_l \beta_l\,v^n_{(y,i+ke_{j_1}+le_{j^*})} + (1-q_{xay;i+ke_{j_1}})\,v^n_{(y,i+ke_{j_1})}\Bigr) \\
&\quad \overset{(3.3.5)}{\le} \sum_k \beta_k \Bigl(q_{xay;i+ke_{j_1}} \sum_l \beta_l\,v^n_{(y,i+ke_{j_2}+le_{j^*})} + (1-q_{xay;i+ke_{j_1}})\,v^n_{(y,i+ke_{j_2})}\Bigr) \\
&\quad \overset{(3.3.1)+(3.3.6)}{\le} \sum_k \beta_k \Bigl(q_{xay;i+ke_{j_2}} \sum_l \beta_l\,v^n_{(y,i+ke_{j_2}+le_{j^*})} + (1-q_{xay;i+ke_{j_2}})\,v^n_{(y,i+ke_{j_2})}\Bigr) \\
&\quad = \sum_k \beta_k \Bigl(q_{xay;i+ke_{j_2}} \min_j \sum_l \beta_l\,v^n_{(y,i+ke_{j_2}+le_j)} + (1-q_{xay;i+ke_{j_2}})\,v^n_{(y,i+ke_{j_2})}\Bigr).
\end{aligned}
\]

If $j^* = j_1$, then

\[
\begin{aligned}
& \sum_k \beta_k \Bigl(q_{xay;i+ke_{j_1}} \min_j \sum_l \beta_l\,v^n_{(y,i+ke_{j_1}+le_j)} + (1-q_{xay;i+ke_{j_1}})\,v^n_{(y,i+ke_{j_1})}\Bigr) \\
&\quad \le \sum_k \beta_k \Bigl(q_{xay;i+ke_{j_1}} \sum_l \beta_l\,v^n_{(y,i+ke_{j_1}+le_{j_2})} + (1-q_{xay;i+ke_{j_1}})\,v^n_{(y,i+ke_{j_1})}\Bigr) \\
&\quad \overset{(3.3.5)}{\le} \sum_k \beta_k \Bigl(q_{xay;i+ke_{j_1}} \sum_l \beta_l\,v^n_{(y,i+ke_{j_1}+le_{j_2})} + (1-q_{xay;i+ke_{j_1}})\,v^n_{(y,i+ke_{j_2})}\Bigr) \\
&\quad \overset{(3.3.1)+(3.3.6)}{\le} \sum_k \beta_k \Bigl(q_{xay;i+ke_{j_2}} \sum_l \beta_l\,v^n_{(y,i+ke_{j_1}+le_{j_2})} + (1-q_{xay;i+ke_{j_2}})\,v^n_{(y,i+ke_{j_2})}\Bigr) \\
&\quad = \sum_k \beta_k \Bigl(q_{xay;i+ke_{j_2}} \min_j \sum_l \beta_l\,v^n_{(y,i+ke_{j_2}+le_j)} + (1-q_{xay;i+ke_{j_2}})\,v^n_{(y,i+ke_{j_2})}\Bigr).
\end{aligned}
\]

The departure term follows as in the proof of lemma 3.2.1.

Consider the departures. Note that, by (3.3.7), we can assume $i_{j_1} < i_{j_2}$. If $i_{j_1} > 0$, the term follows easily by induction. If $i_{j_1} = 0$, the terms on all servers except $j_1$ also follow by induction. For server $j_1$ we have

\[
\sum_k \beta_k\,w_{(x,(i+ke_{j_1}-e_{j_1})^+)} \overset{(3.3.6)}{\le} \sum_k \beta_k\,w_{(x,i+ke_{j_1})} \overset{(3.3.5)}{\le} \sum_k \beta_k\,w_{(x,i+ke_{j_2})}.
\]

The terms corresponding to the dummy transition and the immediate costs follow easily.

Also (3.3.6) and (3.3.7) follow with induction.


Proof of lemma 3.3.7. By induction. We start with (3.3.14). Assume $i_{j_1} < i_{j_2}$; the case $i_{j_1} = i_{j_2}$ is a special case of (3.3.16). We start with the term corresponding to the routable arrivals. Let $j^*$ be the index of the shortest queue in $(y, i+e_{j_2})$. First note that assigning to the shortest queue is still optimal: if a customer arrives, it is favorable by (3.3.14), and the probability that a customer arrives is by (3.3.9) smallest, which is favorable by (3.3.15). Then, if $j^* \neq j_1$,

\[
\begin{aligned}
& \min_j \bigl\{q^0_{xay;i+e_{j_1}j}\,v^n_{(y,i+e_{j_1}+e_j)} + (q_{xay} - q^0_{xay;i+e_{j_1}j})\,v^n_{(y,i+e_{j_1})}\bigr\} \\
&\quad \le q^0_{xay;i+e_{j_1}j^*}\,v^n_{(y,i+e_{j_1}+e_{j^*})} + (q_{xay} - q^0_{xay;i+e_{j_1}j^*})\,v^n_{(y,i+e_{j_1})} \\
&\quad \overset{(3.3.14)}{\le} q^0_{xay;i+e_{j_1}j^*}\,v^n_{(y,i+e_{j_2}+e_{j^*})} + (q_{xay} - q^0_{xay;i+e_{j_1}j^*})\,v^n_{(y,i+e_{j_2})} \\
&\quad \overset{(3.3.10)+(3.3.15)}{\le} q^0_{xay;i+e_{j_2}j^*}\,v^n_{(y,i+e_{j_2}+e_{j^*})} + (q_{xay} - q^0_{xay;i+e_{j_2}j^*})\,v^n_{(y,i+e_{j_2})} \\
&\quad = \min_j \bigl\{q^0_{xay;i+e_{j_2}j}\,v^n_{(y,i+e_{j_2}+e_j)} + (q_{xay} - q^0_{xay;i+e_{j_2}j})\,v^n_{(y,i+e_{j_2})}\bigr\}.
\end{aligned}
\]

If $j^* = j_1$, let $\hat\jmath$ be the shortest queue in $i + e_{j_1}$. We will use that $q^0_{xay;i+e_{j_1}\hat\jmath} \le q^0_{xay;i+e_{j_1}j_1} \le q^0_{xay;i+e_{j_2}j_1}$, the last inequality because $j^*$ is also the shortest queue in $i$.

\[
\begin{aligned}
& \min_j \bigl\{q^0_{xay;i+e_{j_1}j}\,v^n_{(y,i+e_{j_1}+e_j)} + (q_{xay} - q^0_{xay;i+e_{j_1}j})\,v^n_{(y,i+e_{j_1})}\bigr\} \\
&\quad = q^0_{xay;i+e_{j_1}\hat\jmath}\,v^n_{(y,i+e_{j_1}+e_{\hat\jmath})} + (q_{xay} - q^0_{xay;i+e_{j_1}\hat\jmath})\,v^n_{(y,i+e_{j_1})} \\
&\quad \overset{(3.3.14)}{\le} q^0_{xay;i+e_{j_1}\hat\jmath}\,v^n_{(y,i+e_{j_1}+e_{j_2})} + (q_{xay} - q^0_{xay;i+e_{j_1}\hat\jmath})\,v^n_{(y,i+e_{j_2})} \\
&\quad \overset{(3.3.15)}{\le} q^0_{xay;i+e_{j_2}j_1}\,v^n_{(y,i+e_{j_2}+e_{j_1})} + (q_{xay} - q^0_{xay;i+e_{j_2}j_1})\,v^n_{(y,i+e_{j_2})} \\
&\quad = \min_j \bigl\{q^0_{xay;i+e_{j_2}j}\,v^n_{(y,i+e_{j_2}+e_j)} + (q_{xay} - q^0_{xay;i+e_{j_2}j})\,v^n_{(y,i+e_{j_2})}\bigr\}.
\end{aligned}
\]

Note that $j^*$ cannot be equal to $j_2$.

Concerning the non-routable arrivals, we have the following. We will show that if there are numbers $q^j_{l_1}, q^j_{l_2}$ such that $\sum_{j=k}^m q^j_{l_1} \le \sum_{j=k}^m q^j_{l_2}$ for $1 \le k \le m$ and $l_1 < l_2$, then

\[
\sum_{j=k}^m q^j_{l_1}\,v^n_{(x,i+e_{(l_1)}+e_{(j)})} + \sum_{j=k}^m (q^j_{l_2} - q^j_{l_1})\,v^n_{(x,i+e_{(l_1)})} \le \sum_{j=k}^m q^j_{l_2}\,v^n_{(x,i+e_{(l_2)}+e_{(j)})}. \qquad (4.3.1)
\]

Suppose the relation holds for fixed $k$. Consider $k-1$. If $q^{k-1}_{l_1} \le q^{k-1}_{l_2}$, we have by (3.3.14) and (3.3.15),

\[
q^{k-1}_{l_1}\,v^n_{(x,i+e_{(l_1)}+e_{(k-1)})} + (q^{k-1}_{l_2} - q^{k-1}_{l_1})\,v^n_{(x,i+e_{(l_1)})} \le q^{k-1}_{l_2}\,v^n_{(x,i+e_{(l_2)}+e_{(k-1)})}
\]


and the result follows easily. If $q^{k-1}_{l_1} > q^{k-1}_{l_2}$, we have by (3.3.14)

\[
q^{k-1}_{l_1}\,v^n_{(x,i+e_{(l_1)}+e_{(k-1)})} \le (q^{k-1}_{l_1} - q^{k-1}_{l_2})\,v^n_{(x,i+e_{(l_1)}+e_{(k)})} + q^{k-1}_{l_2}\,v^n_{(x,i+e_{(l_2)}+e_{(k-1)})}.
\]

Thus it remains to show that (4.3.1) holds with $q$ replaced by $\bar q$, with $\bar q^j_{l_1} = q^j_{l_1}$ for $j > k$, $\bar q^j_{l_2} = q^j_{l_2}$ for $j \ge k$, and $\bar q^k_{l_1} = q^k_{l_1} + q^{k-1}_{l_1} - q^{k-1}_{l_2}$. It is easily seen that $\sum_{j=k_1}^m \bar q^j_{l_1} \le \sum_{j=k_1}^m \bar q^j_{l_2}$ for $k_1 = k, \ldots, m$, completing the induction step.
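The mass-shifting step above is easy to check numerically. The sketch below uses hypothetical probabilities (not taken from any model in the thesis) and verifies that, after moving the excess $q^{k-1}_{l_1}-q^{k-1}_{l_2}$ onto index $k$, the tail sums from $k$ onward are still dominated.

```python
# Numeric check of the mass-shifting step: with the tail sums of q_l1
# dominated by those of q_l2, moving the excess q_l1[k-1] - q_l2[k-1]
# onto q_l1[k] keeps the tail sums from k onward dominated.
# All numbers are illustrative only.

def tail(q, k):
    # sum of q[j] for j = k..m (lists stored 0-indexed)
    return sum(q[k - 1:])

q_l1 = [0.1, 0.2, 0.1]   # hypothetical arrival probabilities
q_l2 = [0.3, 0.1, 0.3]
m = len(q_l1)
assert all(tail(q_l1, k) <= tail(q_l2, k) for k in range(1, m + 1))

k = 3                    # induction step where q_l1[k-1] > q_l2[k-1]
assert q_l1[k - 2] > q_l2[k - 2]

q_bar = q_l1[:]          # shift the excess mass from index k-1 onto index k
q_bar[k - 1] += q_l1[k - 2] - q_l2[k - 2]

# the tail-sum ordering needed to continue the induction still holds
assert all(tail(q_bar, k1) <= tail(q_l2, k1) for k1 in range(k, m + 1))
print("tail-sum ordering preserved")
```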

By taking $q^j_l = q^{(j)}_{xay;i+e_{(l)}}$ and $l_1$ and $l_2$ such that $(l_1) = j_1$ and $(l_2) = j_2$ we are finished with the term concerning the non-routable arrivals, in case (3.3.11) holds for all $k$. If $\sum_{j=k}^m q^j_{l_1} \le \sum_{j=k}^m q^j_{l_2}$ holds for $k = l_1$ and $k > l_2$ only, we can show (4.3.1) for $k = l_1$, using the fact that $v^n_{(x,i+e_{(l_1)}+e_{(j_1)})} \le v^n_{(x,i+e_{(l_2)}+e_{(j_2)})}$ for all $j_1$ and $j_2$ with $l_1 \le j_1 \le l_2$ and $l_1 \le j_2 \le l_2$, in much the same way as the induction step above. Now, by adding a dummy term we get the arrival term in a similar way as in the proof of lemma 3.2.1.

Consider the assignable server. Omit in the notation the dependence of $\mu$ on $x$. Let $j^*$ be the index of the longest queue in $(y, i+e_{j_2})$; note that it cannot be $j_1$. We can take $j^*$ such that it is also the longest queue in $i$ and $i+e_{j_1}$. By (3.3.13), (3.3.14) and (3.3.15), we see that assigning the server to the longest queue is optimal. We have

\[
\begin{aligned}
& \min_j \bigl\{\mu_{j,i+e_{j_1}}\,v^n_{(y,i+e_{j_1}-e_j)} + (\mu - \mu_{j,i+e_{j_1}})\,v^n_{(y,i+e_{j_1})}\bigr\} \\
&\quad \le \mu_{j^*,i+e_{j_1}}\,v^n_{(y,i+e_{j_1}-e_{j^*})} + (\mu - \mu_{j^*,i+e_{j_1}})\,v^n_{(y,i+e_{j_1})} \\
&\quad \overset{(3.3.14)}{\le} \mu_{j^*,i+e_{j_1}}\,v^n_{(y,i+e_{j_2}-e_{j^*})} + (\mu - \mu_{j^*,i+e_{j_1}})\,v^n_{(y,i+e_{j_2})} \\
&\quad \overset{(3.3.15)}{\le} \mu_{j^*,i+e_{j_2}}\,v^n_{(y,i+e_{j_2}-e_{j^*})} + (\mu - \mu_{j^*,i+e_{j_2}})\,v^n_{(y,i+e_{j_2})} \\
&\quad = \min_j \bigl\{\mu_{j,i+e_{j_2}}\,v^n_{(y,i+e_{j_2}-e_j)} + (\mu - \mu_{j,i+e_{j_2}})\,v^n_{(y,i+e_{j_2})}\bigr\}.
\end{aligned}
\]

Finally, consider the fixed server. Omit again the $x$ in the notation of $\mu$. (Note the similarity between what follows and the way the non-routable arrivals were treated.) We will show that if there are numbers $\mu^{*j}_{l_1}, \mu^{*j}_{l_2}$ such that $\sum_{j=k}^m \mu^{*j}_{l_1} \ge \sum_{j=k}^m \mu^{*j}_{l_2}$ for $1 \le k \le m$ and $l_1 < l_2$, then

\[
\sum_{j=k}^m \mu^{*j}_{l_1}\,v^n_{(x,i+e_{(l_1)}-e_{(j)})} \le \sum_{j=k}^m \mu^{*j}_{l_2}\,v^n_{(x,i+e_{(l_2)}-e_{(j)})} + \sum_{j=k}^m (\mu^{*j}_{l_1} - \mu^{*j}_{l_2})\,v^n_{(x,i+e_{(l_2)})}. \qquad (4.3.2)
\]

Suppose the relation holds for fixed $k$; consider $k-1$. If $\mu^{*k-1}_{l_1} \ge \mu^{*k-1}_{l_2}$, we have by (3.3.14) and (3.3.15),

\[
\mu^{*k-1}_{l_1}\,v^n_{(x,i+e_{(l_1)}-e_{(k-1)})} \le \mu^{*k-1}_{l_2}\,v^n_{(x,i+e_{(l_2)}-e_{(k-1)})} + (\mu^{*k-1}_{l_1} - \mu^{*k-1}_{l_2})\,v^n_{(x,i+e_{(l_2)})}.
\]


Equation (4.3.2) follows easily. If $\mu^{*k-1}_{l_1} < \mu^{*k-1}_{l_2}$, we have by (3.3.14)

\[
\mu^{*k-1}_{l_1}\,v^n_{(x,i+e_{(l_1)}-e_{(k-1)})} + (\mu^{*k-1}_{l_2} - \mu^{*k-1}_{l_1})\,v^n_{(x,i+e_{(l_1)}-e_{(k)})} \le \mu^{*k-1}_{l_2}\,v^n_{(x,i+e_{(l_2)}-e_{(k-1)})}.
\]

Now (4.3.2) follows by induction, with $\mu^{*k}_{l_2}$ replaced by $\mu^{*k}_{l_2} + \mu^{*k-1}_{l_2} - \mu^{*k-1}_{l_1}$. Using a similar reasoning as in the case of the non-routable arrivals we find that $\sum_{j=k}^m \mu^{*j}_{l_1} \ge \sum_{j=k}^m \mu^{*j}_{l_2}$ only needs to hold for $k = 1, \ldots, l_1, l_2+1, \ldots, m$.

Now we prove (3.3.15), monotonicity. The term concerning the routable arrivals is easy:

q0xay;ijv

n(y,i+ej)

+ (qxay − q0xay;ij)v

n(y,i)

=

q0xay;i(1)v

n(y,i+e(1))

+ (qxay − q0xay;i(1))v

n(y,i)

(3.3.15)

qxayvn(y,i+e(1))

(3.3.14)

≤ qxayvn(y,i+ej1 )

(3.3.15)

minj

q0xay;i+ej1 j

vn(y,i+ej1+ej)+ (qxay − q0

xay;i+ej1 j)vn(y,i+ej1 )

.

Using (3.3.12) we can show

m∑j=l1+1

q(j)xay;iv

n(x,i+e(j))

+m∑

j=l1+1

(q(j)xay;i+e(l1)

− q(j)xay;i)v

n(x,i) ≤

m∑j=l1+1

q(j)xay;i+e(l1)

vn(x,i+e(l1)+e(j)),

similar to the analysis of (3.3.14). Because

vn(x,i) ≤ vn(x,i+e(j))

≤ vn(x,i+e(l1))≤ vn(x,i+e(l1)+e(j))

for j ≤ l1, we have the term wanted.The terms corresponding to the assignable server are similar to the terms

corresponding to the routable arrivals, the terms corresponding to the fixedservers are similar to the term corresponding to the non-routable arrivals.

Equation (3.3.16) is trivial to prove.

Proof of lemma 3.4.2. By induction. It is easily seen that $v^0 = 0$ satisfies the inequalities. Assume the lemma holds up to $n$. We prove the inequalities for the terms on the $m$ classes and the terms on departures separately; multiplying with $q^k_{xy}$, summing, etc., give the complete inequalities. The terms on arrivals are proven by considering the optimal action on the r.h.s., and then finding an action on the l.h.s. for which the inequality holds. We start with (3.4.3).


Consider an arbitrary customer class $l$. Assume the optimal action in $(x, i+e_{j_2})$ is blocking. Then we have (take blocking in $(x, i+e_{j_1})$):

\[
b_l + v^n_{(x,i+e_{j_1})} \le b_l + v^n_{(x,i+e_{j_2})},
\]

by induction. If the optimal action in $(x, i+e_{j_2})$ is sending to server $j_1$, we take server $j_2$ in $(x, i+e_{j_1})$. Then we have:

\[
v^n_{(x,i+e_{j_1}+e_{j_2})} \le v^n_{(x,i+e_{j_2}+e_{j_1})}.
\]

If the optimal action is server $j^* \neq j_1$ in state $(x, i+e_{j_2})$, we take the same action in $(x, i+e_{j_1})$. We have by induction

\[
v^n_{(x,i+e_{j_1}+e_{j^*})} \le v^n_{(x,i+e_{j_2}+e_{j^*})}.
\]

Now we have

\[
\min_{a_l} \Bigl\{I\{a_l = 0\}\,(b_l + v^n_{(x,i+e_{j_1})}) + \sum_j I\{a_l = j\}\,v^n_{(x,i+e_{j_1}+e_j)}\Bigr\} \le
\min_{a_l} \Bigl\{I\{a_l = 0\}\,(b_l + v^n_{(x,i+e_{j_2})}) + \sum_j I\{a_l = j\}\,v^n_{(x,i+e_{j_2}+e_j)}\Bigr\}
\]

for all $l$. The term concerning the arrival process, but without arrivals, goes by induction. Consider the terms corresponding to departures. Terms for $j \neq j_1, j_2$ are done with induction. With the help of (3.4.4) we have, using $\mu_{j_1} \ge \mu_{j_2}$,

\[
\mu_{j_1} v^n_{(x,i)} + \mu_{j_2} v^n_{(x,i+e_{j_1})} \le \mu_{j_2} v^n_{(x,i)} + \mu_{j_1} v^n_{(x,i+e_{j_1})} \le \mu_{j_2} v^n_{(x,i)} + \mu_{j_1} v^n_{(x,i+e_{j_2})}.
\]

Combining these results gives $v^{n+1}_{(x,i+e_{j_1})} \le v^{n+1}_{(x,i+e_{j_2})}$.

We continue with (3.4.4). Take in $(x, i)$ the optimal action of $(x, i+e_{j_1})$. Then (3.4.4) follows immediately.

Consider (3.4.5). Let $j^*$ be the optimal action in $(x, i)$ for some customer class $l$. If $j^* \neq j_1$, take action $j^*$ on the l.h.s., and we have

\[
v^n_{(x,i+e_{j^*}+e_{j_1})} \le b_1 + v^n_{(x,i+e_{j^*})}
\]

by induction. If the optimal action is $j_1$, reject in $i+e_{j_1}$. Then

\[
b_l + v^n_{(x,i+e_{j_1})} \le b_1 + v^n_{(x,i+e_{j_1})}.
\]

If the optimal action is blocking, take blocking as action on the l.h.s. For departures at servers $j \neq j_1$ we have

\[
\mu_j v^n_{(x,(i+e_{j_1}-e_j)\vee 0)} \le \mu_j b_1 + \mu_j v^n_{(x,(i-e_j)\vee 0)}
\]


by induction. For server $j_1$ we have

\[
\mu_{j_1} v^n_{(x,i)} \le \mu_{j_1} b_1 + \mu_{j_1} v^n_{(x,i)}.
\]

This completes the proof of (3.4.5).

Rewrite (3.4.6):

\[
v^n_{(x,i+e_{j_1})} + v^n_{(x,i+e_{j_2})} \le v^n_{(x,i+e_{j_2}+e_{j_1})} + v^n_{(x,i)}.
\]

In the following table one can see the optimal actions of the r.h.s. in the left columns and the actions establishing the inequalities in the right columns. Let $j^* = \min\{j \mid (i+e_{j_1}+e_{j_2})_j = 0\}$. Note that $j^* \neq j_1, j_2$, and that $j^*$ cannot be optimal in $i$, due to the choice of $j_1$. The terms are identified by their states.

    i+e_{j_2}+e_{j_1}    i     |  i+e_{j_1}    i+e_{j_2}
    --------------------------------------------------------
    0                    0     |  0            0            induction
    0                    j_1   |  0            j_1          equality
    0                    j_2   |  j_2          0            equality
    j^*                  0     |  0            j^*          twice induction
    j^*                  j_1   |  j^*          j_1          induction
    j^*                  j_2   |  j_2          j^*          induction

For example, if, for a certain customer class $l$, rejection is optimal in $i$, and if sending a customer to queue $j^*$ is optimal in $i+e_{j_1}+e_{j_2}$, the inequality is established by taking rejection in $i+e_{j_1}$ and action $j^*$ in $i+e_{j_2}$, according to the fourth case in the table. Indeed,

\[
v^n_{(x,i+e_{j_1})} - v^n_{(x,i)} \le v^n_{(x,i+e_{j_1}+e_{j_2})} - v^n_{(x,i+e_{j_2})} \le v^n_{(x,i+e_{j_1}+e_{j_2}+e_{j^*})} - v^n_{(x,i+e_{j_2}+e_{j^*})}
\]

by using induction at both steps, giving

\[
b_l + v^n_{(x,i+e_{j_1})} + v^n_{(x,i+e_{j_2}+e_{j^*})} \le v^n_{(x,i+e_{j_1}+e_{j_2}+e_{j^*})} + b_l + v^n_{(x,i)}.
\]

If $i+e_{j_1}+e_{j_2} = e$, only the first three cases have to be considered.

Regarding the departures we have, concerning servers $j_1$ and $j_2$, the expression

\[
\mu_{j_1} v^n_{(x,i)} + \mu_{j_2} v^n_{(x,i+e_{j_1})} + \mu_{j_1} v^n_{(x,i+e_{j_2})} + \mu_{j_2} v^n_{(x,i)}
\]

at both sides. The other terms follow by induction.

Proof of lemma 3.5.1. The proof goes by induction. We start with (3.5.2). Let $j^*$ be the optimal action in $i+e_{j_2}$. The analysis goes as usual by differentiating between $j^* \neq j_1$ and $j^* = j_1$. If $j^* \neq j_1$, then

\[
v^n_{(y,i+e_{j_1}+e_{j^*},k)} \overset{(3.5.2)}{\le} v^n_{(y,i+e_{j_2}+e_{j^*},k)}.
\]


If $j^* = j_1$, then

\[
v^n_{(y,i+e_{j_1}+e_{j_2},k)} \le v^n_{(y,i+e_{j_2}+e_{j_1},k)}.
\]

Because $q_{xay;i+e_{j_1}} = q_{xay;i+e_{j_2}}$, the term on arrivals follows.

The terms concerning departures go the same as in the proof of lemma 3.2.1. Consider a departure at queue $j$, $j \neq j_1, j_2$, with $i_j > 0$:

\[
\mu_{j,i+e_{j_1}} v^n_{(x,i+e_{j_1}-e_j,k+1)} \overset{(3.5.2)}{\le} \mu_{j,i+e_{j_1}} v^n_{(x,i+e_{j_2}-e_j,k+1)} \overset{(3.5.3)}{\le} \mu_{j,i+e_{j_2}} v^n_{(x,i+e_{j_2}-e_j,k+1)} + (\mu_{j,i+e_{j_1}} - \mu_{j,i+e_{j_2}})\,v^n_{(x,i+e_{j_2},k)}.
\]

The terms corresponding to a departure from queue $j_1$ and $j_2$ will be considered together. We have, by (3.5.2), that both $v^n_{(x,i+e_{j_1}-e_{j_1},k+1)}$ and $v^n_{(x,i+e_{j_1}-e_{j_2},k+1)}$ are smaller than both $v^n_{(x,i+e_{j_2}-e_{j_1},k+1)}$ and $v^n_{(x,i+e_{j_2}-e_{j_2},k+1)}$. As $\mu_{j_1,i+e_{j_1}} + \mu_{j_2,i+e_{j_1}} \ge \mu_{j_1,i+e_{j_2}} + \mu_{j_2,i+e_{j_2}}$, we have, together with (3.5.3),

\[
\begin{aligned}
& \mu_{j_1,i+e_{j_1}} v^n_{(x,i+e_{j_1}-e_{j_1},k+1)} + \mu_{j_2,i+e_{j_1}} v^n_{(x,i+e_{j_1}-e_{j_2},k+1)} \le \\
&\quad \mu_{j_1,i+e_{j_2}} v^n_{(x,i+e_{j_2}-e_{j_1},k+1)} + \mu_{j_2,i+e_{j_2}} v^n_{(x,i+e_{j_2}-e_{j_2},k+1)} \\
&\qquad + (\mu_{j_1,i+e_{j_1}} + \mu_{j_2,i+e_{j_1}} - \mu_{j_1,i+e_{j_2}} - \mu_{j_2,i+e_{j_2}})\,v^n_{(x,i+e_{j_2},k)}.
\end{aligned}
\]

The terms concerning costs and the dummy transition follow easily. We continue with (3.5.3). Let $j^*$ be the optimal action in $i+e_{j_1}$; then $j^*$ is also allowed in $i$. Now we have, by (3.5.3) and (3.5.4),

\[
\begin{aligned}
& q_{xay;i} \min_j v^n_{(y,i+e_j,k+1)} \\
&\quad \le q_{xay;i+e_{j_1}}\,v^n_{(y,i+e_{j^*},k+1)} + (q_{xay;i} - q_{xay;i+e_{j_1}})\,v^n_{(y,i+e_{j_1},k+1)} \\
&\quad \le q_{xay;i+e_{j_1}}\,v^n_{(y,i+e_{j_1}+e_{j^*},k)} + (q_{xay;i} - q_{xay;i+e_{j_1}})\,v^n_{(y,i+e_{j_1},k)} \\
&\quad = q_{xay;i+e_{j_1}} \min_j v^n_{(y,i+e_{j_1}+e_j,k)} + (q_{xay;i} - q_{xay;i+e_{j_1}})\,v^n_{(y,i+e_{j_1},k)}.
\end{aligned}
\]

The arrival term follows as usual. Note that when $i+e_{j_1} = B$ and an arrival occurs, this customer is rejected; this is equivalent to taking $q_{xay;i} = 0$ if $|i| \ge |B|$.

By $\mu_{ji} \ge \mu_{j,i+e_{j_1}}$, we have for $j \neq j_1$ and $i_j > 0$

\[
\mu_{ji} v^n_{(x,i-e_j,k+2)} \le \mu_{j,i+e_{j_1}} v^n_{(x,i+e_{j_1}-e_j,k+1)} + (\mu_{ji} - \mu_{j,i+e_{j_1}})\,v^n_{(x,i+e_{j_1},k)},
\]

using (3.5.3) once or twice. Using (3.5.3) gives for queue $j_1$:

\[
\mu_{j_1 i} v^n_{(x,i-e_{j_1},k+2)} + (\mu - \mu_{j_1 i})\,v^n_{(x,i,k+1)} \le \mu\,v^n_{(x,i,k+1)} \le \mu_{j_1,i+e_{j_1}} v^n_{(x,i,k+1)} + (\mu - \mu_{j_1,i+e_{j_1}})\,v^n_{(x,i+e_{j_1},k)}.
\]

Again the terms concerning costs and the dummy transition follow easily. It is trivial to prove equation (3.5.4). The proofs of (3.5.5) and (3.5.2) are similar.


Proof of lemma 3.5.3. By induction. We follow the proof of lemma 3.5.1. First observe that

\[
v^n_{(x,i,k+1)} \le v^n_{(x,i+e_{j_1},k)} \le v^n_{(x,i,k)},
\]

or $v^n_{(x,i,k+1)} \le v^n_{(x,i-e_{j_1},k+1)} \le v^n_{(x,i,k)}$ if $i = B$. Thus also (3.5.4) holds. The conditions on the arrival process and service rates are stronger than in the previous model; thus the proof of (3.5.6), (3.5.7) and (3.5.9) is equal to that of lemma 3.5.1.

Now consider equation (3.5.8). Let $j^*$ be the optimal assignment in state $(y, i, k)$. If $j^* \neq j_1$, then

\[
\min_j v^n_{(y,i+e_{j_1}+e_j,k)} \le v^n_{(y,i+e_{j_1}+e_{j^*},k)} \overset{(3.5.8)}{\le} v^n_{(y,i+e_{j^*},k)} = \min_j v^n_{(y,i+e_j,k)};
\]

if $j^* = j_1$, then

\[
\min_j v^n_{(y,i+e_{j_1}+e_j,k)} \le v^n_{(y,i+e_{j_1},k)} = \min_j v^n_{(y,i+e_j,k)}.
\]

Because $q_{xay;i} = q_{xay}$, the terms on arrivals follow.

The departure terms from each queue, except queue $j_1$, follow easily with induction; note that we use here that $\mu_{j,i+e_{j_1}} = \mu_{ji}$ if $j \neq j_1$. Concerning queue $j_1$, we have

\[
\mu_{j_1,i+e_{j_1}} v^n_{(x,i,k+1)} \le \mu_{j_1 i} v^n_{(x,i-e_{j_1},k+1)} + (\mu_{j_1,i+e_{j_1}} - \mu_{j_1 i})\,v^n_{(x,i,k)},
\]

by (3.5.8) and (3.5.4).

Proof of lemma 3.6.1. The proof goes by induction. Assume that the lemma holds up to $n$. First we show that the SIP is optimal for $n+1$. Consider two server assignments which differ only in the assignment of two servers, say servers $k_1$ and $k_2$, which are assigned to queues $j_1$ and $j_2$. In one assignment server $k_1$ is assigned to queue $j_1$ and server $k_2$ to $j_2$; in the other assignment vice versa. The difference between the departure terms is

\[
(p_{k_1}(x) - p_{k_2}(x))\bigl(\mu_{j_1} v^n_{(x,i-e_{j_1})} + (\mu - \mu_{j_1})\,v^n_{(x,i)} - \mu_{j_2} v^n_{(x,i-e_{j_2})} - (\mu - \mu_{j_2})\,v^n_{(x,i)}\bigr),
\]

which is negative if $p_{k_1}(x) > p_{k_2}(x)$ and $j_1 < j_2$. Thus queue $j_1$ should be served by the faster server. By taking $p_{k_2}(x) = 0$ we see that serving queue $j_1$ is better than serving queue $j_2$. Repeating this gives the optimality of the SIP.
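The interchange argument can be checked with arbitrary numbers: swapping the two servers changes the sum of their departure terms by exactly the factored difference above. A sketch with hypothetical rates and value-function stand-ins (none of the numbers come from the thesis):

```python
# Interchange identity: two servers with speed factors p_k1, p_k2 are
# assigned to queues j1 and j2. Swapping them changes the combined
# departure term by
#   (p_k1 - p_k2) * (mu_j1*v[i-e_j1] + (mu-mu_j1)*v[i]
#                    - mu_j2*v[i-e_j2] - (mu-mu_j2)*v[i]).
# All numbers below are hypothetical, chosen only to test the identity.

p_k1, p_k2 = 0.9, 0.4                 # server speeds, p_k1 > p_k2
mu, mu_j1, mu_j2 = 5.0, 3.0, 2.0      # total and per-queue service rates
v_i, v_i_min_j1, v_i_min_j2 = 10.0, 7.0, 8.0   # stand-ins for v^n values

def term(p, mu_j, v_dep):
    # departure term of one server of speed p serving a queue with rate mu_j
    return p * (mu_j * v_dep + (mu - mu_j) * v_i)

assignment_a = term(p_k1, mu_j1, v_i_min_j1) + term(p_k2, mu_j2, v_i_min_j2)
assignment_b = term(p_k2, mu_j1, v_i_min_j1) + term(p_k1, mu_j2, v_i_min_j2)

diff = (p_k1 - p_k2) * (mu_j1 * v_i_min_j1 + (mu - mu_j1) * v_i
                        - mu_j2 * v_i_min_j2 - (mu - mu_j2) * v_i)
assert abs((assignment_a - assignment_b) - diff) < 1e-9
print("interchange identity verified, difference =", diff)
```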

We start with (3.6.1). Because $|i| \ge 2$ we only deal with states for which $q^k_{xay;i} = q^k_{xay}$; therefore we omit the $i$ in the notation. The fact that $q^k_{xay;0}$ need not be equal to $q^k_{xay}$ plays a role only in the proof of (3.6.2). Rewrite (3.6.1) as

\[
\mu_{j_1} v^n_{(x,i-e_{j_1})} + (\mu_{j_2} - \mu_{j_1})\,v^n_{(x,i)} \le \mu_{j_2} v^n_{(x,i-e_{j_2})}, \qquad \mu_{j_2} - \mu_{j_1} \ge 0.
\]


Consider the terms corresponding to arrivals. Assume $a^*$ is the optimal action in $(x, i-e_{j_2})$. Then we have

\[
\begin{aligned}
& \mu_{j_1} \min_a \sum_y \lambda_{xay}\Bigl(\sum_{j=1}^m q^j_{xay}\,v^n_{(y,i-e_{j_1}+e_j)} + \Bigl(1 - \sum_{j=1}^m q^j_{xay}\Bigr) v^n_{(y,i-e_{j_1})}\Bigr) \\
&\qquad + (\mu_{j_2} - \mu_{j_1}) \min_a \sum_y \lambda_{xay}\Bigl(\sum_{j=1}^m q^j_{xay}\,v^n_{(y,i+e_j)} + \Bigl(1 - \sum_{j=1}^m q^j_{xay}\Bigr) v^n_{(y,i)}\Bigr) \\
&\quad \le \mu_{j_1} \sum_y \lambda_{xa^*y}\Bigl(\sum_{j=1}^m q^j_{xa^*y}\,v^n_{(y,i-e_{j_1}+e_j)} + \Bigl(1 - \sum_{j=1}^m q^j_{xa^*y}\Bigr) v^n_{(y,i-e_{j_1})}\Bigr) \\
&\qquad + (\mu_{j_2} - \mu_{j_1}) \sum_y \lambda_{xa^*y}\Bigl(\sum_{j=1}^m q^j_{xa^*y}\,v^n_{(y,i+e_j)} + \Bigl(1 - \sum_{j=1}^m q^j_{xa^*y}\Bigr) v^n_{(y,i)}\Bigr) \\
&\quad \overset{(3.6.1)}{\le} \mu_{j_2} \sum_y \lambda_{xa^*y}\Bigl(\sum_{j=1}^m q^j_{xa^*y}\,v^n_{(y,i-e_{j_2}+e_j)} + \Bigl(1 - \sum_{j=1}^m q^j_{xa^*y}\Bigr) v^n_{(y,i-e_{j_2})}\Bigr) \\
&\quad = \mu_{j_2} \min_a \sum_y \lambda_{xay}\Bigl(\sum_{j=1}^m q^j_{xay}\,v^n_{(y,i-e_{j_2}+e_j)} + \Bigl(1 - \sum_{j=1}^m q^j_{xay}\Bigr) v^n_{(y,i-e_{j_2})}\Bigr).
\end{aligned}
\]

Note that we used $\mu_{j_1} \le \mu_{j_2}$ explicitly here; if $\mu_{j_1} > \mu_{j_2}$ there would have been two expressions (with positive coefficients) on the r.h.s., and there would not have been one minimizing action. We would not have this problem if there were no actions to choose, i.e. if the arrivals are independent.

Consider the terms concerning departures. We write $p_j$ instead of $p_j(x)$, and assume that $p_1 \ge \cdots \ge p_s$. We also assume $|i| \ge s+1$. We distinguish two cases. First assume that there are customers in queues $j^*_1 \le \cdots \le j^*_s$ present in state $i-e_{j_1}$ with $j^*_s \le j_1$. Because $j_1 < j_2$, the same action, say $j^*_1, \ldots, j^*_s$, is optimal in $i$, $i-e_{j_1}$ and $i-e_{j_2}$. We have for $1 \le k \le s$:

\[
\begin{aligned}
& \mu_{j_1} \mu_{j^*_k}\,v^n_{(x,i-e_{j_1}-e_{j^*_k})} + (\mu_{j_2} - \mu_{j_1})\,\mu_{j^*_k}\,v^n_{(x,i-e_{j^*_k})} \\
&\qquad + \mu_{j_1}(\mu - \mu_{j^*_k})\,v^n_{(x,i-e_{j_1})} + (\mu_{j_2} - \mu_{j_1})(\mu - \mu_{j^*_k})\,v^n_{(x,i)} \\
&\quad \overset{(3.6.1)}{\le} \mu_{j_2} \mu_{j^*_k}\,v^n_{(x,i-e_{j_2}-e_{j^*_k})} + \mu_{j_2}(\mu - \mu_{j^*_k})\,v^n_{(x,i-e_{j_2})}.
\end{aligned} \qquad (4.3.3)
\]

Now the departure terms of (3.6.1) follow easily:

\[
\begin{aligned}
& \mu_{j_1} \min_{l_1,\ldots,l_s} \sum_{k=1}^s \bigl(p_k \mu_{l_k}\,v^n_{(x,i-e_{j_1}-e_{l_k})} + p_k(\mu - \mu_{l_k})\,v^n_{(x,i-e_{j_1})}\bigr) \\
&\qquad + (\mu_{j_2} - \mu_{j_1}) \min_{l_1,\ldots,l_s} \sum_{k=1}^s \bigl(p_k \mu_{l_k}\,v^n_{(x,i-e_{l_k})} + p_k(\mu - \mu_{l_k})\,v^n_{(x,i)}\bigr) =
\end{aligned}
\]


\[
\begin{aligned}
& \sum_{k=1}^s \bigl(p_k \mu_{j_1} \mu_{j^*_k}\,v^n_{(x,i-e_{j_1}-e_{j^*_k})} + p_k(\mu_{j_2} - \mu_{j_1})\,\mu_{j^*_k}\,v^n_{(x,i-e_{j^*_k})}\bigr) \\
&\qquad + \sum_{k=1}^s \bigl(p_k \mu_{j_1}(\mu - \mu_{j^*_k})\,v^n_{(x,i-e_{j_1})} + p_k(\mu_{j_2} - \mu_{j_1})(\mu - \mu_{j^*_k})\,v^n_{(x,i)}\bigr) \\
&\quad \overset{(4.3.3)}{\le} \sum_{k=1}^s p_k \mu_{j_2} \mu_{j^*_k}\,v^n_{(x,i-e_{j_2}-e_{j^*_k})} + \sum_{k=1}^s p_k \mu_{j_2}(\mu - \mu_{j^*_k})\,v^n_{(x,i-e_{j_2})} \\
&\quad = \mu_{j_2} \min_{l_1,\ldots,l_s} \sum_{k=1}^s \bigl(p_k \mu_{l_k}\,v^n_{(x,i-e_{j_2}-e_{l_k})} + p_k(\mu - \mu_{l_k})\,v^n_{(x,i-e_{j_2})}\bigr).
\end{aligned}
\]

Concerning the second case, assume that all class $j_1$ customers are served in state $i$. Consider the optimal assignment in $i-e_{j_2}$, being $j^*_1 \le \cdots \le j^*_s$ with $j^*_{s_1} = j_1$. Assign server $s_1$ in both $i$ and $i-e_{j_1}$ to queue $j_2$, and all other servers to the same queues as in $i-e_{j_2}$. Then (4.3.3) holds for servers $1, \ldots, s_1-1, s_1+1, \ldots, s$. For server $s_1$ we have

\[
\begin{aligned}
& \mu_{j_1} \mu_{j_2}\,v^n_{(x,i-e_{j_1}-e_{j_2})} + \mu_{j_1}(\mu - \mu_{j_2})\,v^n_{(x,i-e_{j_1})} \\
&\qquad + (\mu_{j_2} - \mu_{j_1})\,\mu_{j_2}\,v^n_{(x,i-e_{j_2})} + (\mu_{j_2} - \mu_{j_1})(\mu - \mu_{j_2})\,v^n_{(x,i)} \\
&\quad \overset{(3.6.1)}{\le} \mu_{j_2} \mu_{j_1}\,v^n_{(x,i-e_{j_2}-e_{j_1})} + \mu_{j_2}(\mu - \mu_{j_1})\,v^n_{(x,i-e_{j_2})}.
\end{aligned}
\]

The terms concerning departures follow in the same way as in the first case.

Now assume $2 \le |i| \le s$. Suppose that server $s_1$ is assigned to queue $j_1$ in state $i-e_{j_2}$. The term concerning server $s_1$ is similar to the corresponding term in the previous case, and in state $i$ we keep one customer unserved. Again, the terms concerning departures follow easily.

It remains to study the dummy term, which goes by induction.

We continue with (3.6.2), which is much easier to prove. Let $a^*$ be the optimal action for the MDAP in $(x, i)$. Note that $q^j_{xay;i-e_{j_1}} \le q^j_{xay;i}$. Then we have

\[
\begin{aligned}
& \min_a \sum_y \lambda_{xay}\Bigl(\sum_{j=1}^m q^j_{xay;i-e_{j_1}}\,v^n_{(y,i-e_{j_1}+e_j)} + \Bigl(1 - \sum_{j=1}^m q^j_{xay;i-e_{j_1}}\Bigr) v^n_{(y,i-e_{j_1})}\Bigr) \\
&\quad \le \sum_y \lambda_{xa^*y}\Bigl(\sum_{j=1}^m q^j_{xa^*y;i-e_{j_1}}\,v^n_{(y,i-e_{j_1}+e_j)} + \Bigl(1 - \sum_{j=1}^m q^j_{xa^*y;i-e_{j_1}}\Bigr) v^n_{(y,i-e_{j_1})}\Bigr) \\
&\quad \overset{(3.6.2)}{\le} \sum_y \lambda_{xa^*y}\Bigl(\sum_{j=1}^m q^j_{xa^*y;i}\,v^n_{(y,i+e_j)} + \Bigl(1 - \sum_{j=1}^m q^j_{xa^*y;i}\Bigr) v^n_{(y,i)}\Bigr) \\
&\quad = \min_a \sum_y \lambda_{xay}\Bigl(\sum_{j=1}^m q^j_{xay;i}\,v^n_{(y,i+e_j)} + \Bigl(1 - \sum_{j=1}^m q^j_{xay;i}\Bigr) v^n_{(y,i)}\Bigr).
\end{aligned}
\]


Let $j^*_1, \ldots, j^*_s$ be the optimal assignment in $(x, i)$. If $j_1$ does not belong to this action, we have

\[
\mu_k\,v^n_{(x,i-e_{j_1}-e_k)} + (\mu - \mu_k)\,v^n_{(x,i-e_{j_1})} \overset{(3.6.2)}{\le} \mu_k\,v^n_{(x,i-e_k)} + (\mu - \mu_k)\,v^n_{(x,i)} \qquad (4.3.4)
\]

for $k = j^*_1, \ldots, j^*_s$. Summing gives the expression wanted. If $j_1$ does belong to the optimal action in $(x, i)$, say $j_1 = j^*_s$, take the suboptimal action $j^*_1, \ldots, j^*_{s-1}, 0$ in $(x, i-e_{j_1})$. We have (4.3.4) for $k = j^*_1, \ldots, j^*_{s-1}$. For the last server we have

\[
\mu\,v^n_{(x,i-e_{j_1})} \overset{(3.6.2)}{\le} \mu_{j_1}\,v^n_{(x,i-e_{j_1})} + (\mu - \mu_{j_1})\,v^n_{(x,i)}.
\]

Summing gives the expression for the suboptimal action. As the optimal action is even better, we have the inequality wanted.

Proof of lemma 3.7.1. By induction. Assume the lemma holds up to $n$. We start with the terms on arrivals of (3.7.1). We consider each customer separately, instead of each queue separately. All customers who are not in the queues in state $i$ can enter the system in the states considered in (3.7.1); their terms go directly with induction. Consider the extra customers in queue $j_1$ and $j_2$. We have

\[
\lambda_{j_2} \mu_{j_1}\,v^n_{i-e_{j_1}} + \lambda_{j_2}(\mu - \mu_{j_1})\,v^n_i \overset{(3.7.1)}{\le} \lambda_{j_2} \mu_{j_2}\,v^n_{i-e_{j_2}} + \lambda_{j_2}(\mu - \mu_{j_2})\,v^n_i \overset{(3.7.2)}{\le} \lambda_{j_1} \mu_{j_2}\,v^n_{i-e_{j_2}} + (\lambda_{j_2}\mu - \lambda_{j_1}\mu_{j_2})\,v^n_i.
\]

This gives

\[
\lambda_{j_1} \mu_{j_1}\,v^n_i + \lambda_{j_2} \mu_{j_1}\,v^n_{i-e_{j_1}} + \lambda_{j_1}(\mu - \mu_{j_1})\,v^n_i + \lambda_{j_2}(\mu - \mu_{j_1})\,v^n_i \le
\lambda_{j_1} \mu_{j_2}\,v^n_{i-e_{j_2}} + \lambda_{j_2} \mu_{j_2}\,v^n_i + \lambda_{j_1}(\mu - \mu_{j_2})\,v^n_i + \lambda_{j_2}(\mu - \mu_{j_2})\,v^n_i,
\]

which are the terms on the extra customers.

The departure and dummy terms can be proved in a similar way as in the proof of lemma 1.11.5. We continue with (3.7.2). Again all terms follow directly by induction, with an exception for the extra customer in queue $j_1$.


Chapter 5

Uniformization

5.1. Introduction

The dynamic programming results of the previous chapters are obtained for discrete-time models. Here we establish, for the policies optimal in the discrete-time models, optimality at $T$ in the continuous-time models, for all $T$. First we make a distinction between policies and decision rules. A decision rule is a function prescribing for each state which action to take (or, more generally, for each state it is a distribution on the actions). A policy $R$ is, in the discrete-time case, a sequence of decision rules $(f_1, f_2, \ldots)$, with $f_n$ the decision rule at time $n$. If the system is controlled continuously in $[0,\infty)$, $R$ is a family $\{f_t,\ t \in [0,\infty)\}$ with $f_t$ the decision rule at $t$.

We consider the following controllable model. We have a countable state space $E$. If action $a \in A(x)$ is chosen in $x \in E$, the system goes to $y$ with intensity $q_{xay}$. We assume that there is a constant $\alpha$ such that $\sum_y q_{xay} \le \alpha$ for all $a \in A(x)$, $x \in E$. A model satisfying this condition is called uniformizable. We will consider this model for various types of cost functions.
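As a small illustration of uniformization, the sketch below (hypothetical rates, not a model from the text) builds the transition matrix of the discrete-time chain embedded at the jumps of a Poisson($\alpha$) process; the unused intensity mass becomes a dummy self-transition.

```python
import numpy as np

# Uniformization sketch: given transition intensities Q[x][y] (x != y) with
# total outflow bounded by alpha, build the one-step matrix of the embedded
# discrete-time chain observed at the jumps of a Poisson(alpha) process.
# The rates are hypothetical, chosen only for illustration.

def uniformize(Q, alpha):
    """Q: (n,n) intensity matrix with zero diagonal; returns a DTMC matrix P."""
    n = Q.shape[0]
    out = Q.sum(axis=1)
    if np.any(out > alpha):
        raise ValueError("model not uniformizable with this alpha")
    P = Q / alpha
    # remaining probability mass becomes a dummy transition (self-loop)
    P[np.arange(n), np.arange(n)] = 1.0 - out / alpha
    return P

Q = np.array([[0.0, 2.0, 1.0],
              [0.5, 0.0, 0.5],
              [1.0, 1.0, 0.0]])
P = uniformize(Q, alpha=4.0)
assert np.allclose(P.sum(axis=1), 1.0)
print(P)
```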

Unfortunately not all models considered in the previous chapters conform to this description. In particular, in the customer assignment models we first choose an action in the arrival process; then, immediately after a transition in the arrival process, the assignment action has to be chosen, possibly depending on the state of the arrival process just reached. To be able to use the results of the forthcoming sections, we rewrite it in the standard form as follows. In state $(x, i)$ (with $x$ the state of the arrival process) we have as possible actions $(a, j_z;\ z \in \Lambda)$ with $a$ in the action set of the arrival process and $1 \le j_z \le m$ for all $z$, and $j_z$ an allowable action in $i$. Here $a$ is the action in the arrival process, and $j_z$ is the queue to assign the arriving customer to if the arrival process moves to $z$. Thus each action has $|\Lambda| + 1$ components, giving the action sets $A((x, i))$. The non-negative transition intensities are, for example in the model of section 2.2:

\[
\begin{aligned}
q_{(x,i)(a,j_z)(y,i+e_j)} &= \lambda_{xay} q_{xay} && \text{if } j_y = j \\
q_{(x,i)(a,j_z)(y,i)} &= \lambda_{xay}(1 - q_{xay}) \\
q_{(x,i)(a,j_z)(x,i-e_j)} &= \mu && \text{if } i_j > 0 \\
q_{(x,i)(a,j_z)(x,i)} &= 1 - \sum_y \lambda_{xay} - \mu \sum_j \delta_{i_j}
\end{aligned}
\]


If the model is uniformizable, we can rewrite the dynamic programming equation of the embedded discrete-time chain in the form of (2.2.1):

\[
\begin{aligned}
v^{n+1}_{(x,i)} &= \min_{(a,j_z;\,z\in\Lambda)} \Bigl\{\sum_y \bigl(q_{(x,i)(a,j_z)(y,i+e_{j_y})}\,v^n_{(y,i+e_{j_y})} + q_{(x,i)(a,j_z)(y,i)}\,v^n_{(y,i)}\bigr) \\
&\qquad\qquad + \sum_j \delta_{i_j}\,q_{(x,i)(a,j_z)(x,i-e_j)}\,v^n_{(x,i-e_j)} + q_{(x,i)(a,j_z)(x,i)}\,v^n_{(x,i)}\Bigr\} \\
&= \min_a \sum_y \min_{j_y} \lambda_{xay}\bigl(q_{xay}\,v^n_{(y,i+e_{j_y})} + (1-q_{xay})\,v^n_{(y,i)}\bigr) \\
&\qquad + \sum_j \delta_{i_j}\,\mu\,v^n_{(x,i-e_j)} + \Bigl(1 - \sum_y \lambda_{xay} - \mu \sum_j \delta_{i_j}\Bigr) v^n_{(x,i)} \\
&= \min_a \sum_y \lambda_{xay}\bigl(q_{xay} \min_j v^n_{(y,i+e_j)} + (1-q_{xay})\,v^n_{(y,i)}\bigr) \\
&\qquad + \sum_j \delta_{i_j}\,\mu\,v^n_{(x,i-e_j)} + \Bigl(1 - \sum_y \lambda_{xay} - \mu \sum_j \delta_{i_j}\Bigr) v^n_{(x,i)}.
\end{aligned}
\]

A disadvantage of this way of rewriting is the fact that models that originally had only finite action sets now have infinite action sets.
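With finitely many actions, one step of the embedded dynamic programming equation is just a pointwise minimum over actions of the expected next-step value. A minimal value-iteration sketch with made-up transition matrices (not the customer assignment model itself), with costs sitting only in the terminal value $v^0$ as in the costs-at-$T$ setup of section 5.2:

```python
import numpy as np

# One dynamic programming step v^{n+1} = min_a P_a v^n for the uniformized
# discrete-time chain. P_a are the one-step matrices of the (finitely many)
# actions; v0 carries the terminal costs. All data is hypothetical.

def dp_step(P_list, v):
    """Pointwise minimum over actions of the expected next-step value."""
    return np.min(np.stack([P @ v for P in P_list]), axis=0)

P0 = np.array([[0.6, 0.4], [0.3, 0.7]])   # action 0
P1 = np.array([[0.8, 0.2], [0.1, 0.9]])   # action 1
v0 = np.array([0.0, 1.0])                 # terminal costs

v1 = dp_step([P0, P1], v0)
v2 = dp_step([P0, P1], v1)
print(v1, v2)
```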

A way to get around this problem is to allow for two transitions immediately after each other at the jump times. We illustrate this idea again with the model of section 2.2. Let the jump times be exponentially distributed with rate $\alpha = \gamma + m\mu$. Assume that the process is in $(x, i)$. An action $a \in A(x)$ is selected and we have as transition probabilities $p$ for the first jump:

\[
\begin{aligned}
p_{(x,i)a(y,i,0)} &= \frac{\lambda_{xay}}{\alpha}\,q_{xay} \\
p_{(x,i)a(y,i,1)} &= \frac{\lambda_{xay}}{\alpha}\,(1 - q_{xay}) \\
p_{(x,i)a(x,i,2)} &= \frac{m\mu}{\alpha} \\
p_{(x,i)a(x,i,3)} &= 1 - \sum_y \frac{\lambda_{xay}}{\alpha} - \frac{m\mu}{\alpha}
\end{aligned}
\]

The third component of the state indicates the event at the immediate second transition, with probabilities denoted $\bar p$. Take $j$ as the assignment action.

\[
\begin{aligned}
\bar p_{(x,i,0)j(x,i+e_j)} &= 1 \\
\bar p_{(x,i,1)j(x,i)} &= 1 \\
\bar p_{(x,i,2)j(x,(i-e_{j_1})^+)} &= \frac{1}{m}
\end{aligned}
\]


\[
\bar p_{(x,i,3)j(x,i)} = 1
\]

Here we also show that equation (2.2.1) can be obtained. Let $w^n$ be the value function for the present model. We will show that if $w^{2n} = v^n$ in states of the form $(x, i)$, then $w^{2n+2} = v^{n+1}$. For ease of notation, assume $\alpha = 1$.

\[
\begin{aligned}
w^{2n+2}_{(x,i)} &= \min_a \Bigl\{\sum_y \lambda_{xay} q_{xay}\,w^{2n+1}_{(y,i,0)} + \sum_y \lambda_{xay}(1-q_{xay})\,w^{2n+1}_{(y,i,1)} \\
&\qquad\quad + m\mu\,w^{2n+1}_{(x,i,2)} + \Bigl(1 - \sum_y \lambda_{xay} - m\mu\Bigr) w^{2n+1}_{(x,i,3)}\Bigr\} \\
&= \min_a \Bigl\{\sum_y \lambda_{xay} q_{xay} \min_j w^{2n}_{(y,i+e_j)} + \sum_y \lambda_{xay}(1-q_{xay})\,w^{2n}_{(y,i)} \\
&\qquad\quad + \mu \sum_j w^{2n}_{(x,(i-e_j)^+)} + \Bigl(1 - \sum_y \lambda_{xay} - m\mu\Bigr) w^{2n}_{(x,i)}\Bigr\}
\end{aligned}
\]

This completes the induction step.

The results for the discrete-time models are of two types. First we have the models of chapter 1. The optimal policies obtained there are myopic, i.e. they have the same decision rule for all $n$. Continuous-time results for these models are obtained in the next section.

In section 5.3 models with horizon-dependent optimal policies are studied. Here extra conditions are necessary to obtain optimality at $T$.

5.2. Uniformization with fixed parameter

In this section we assume that we have a uniformizable problem, and that the optimal policy of the discrete-time model is myopic and independent of the uniformization parameter. Furthermore, assume that the costs are bounded, either from above or below. Consider a model in which there are only costs at $T$; thus the problem is how to control the model from 0 to $T$. We call the class of policies in the continuous-time model that can only change actions at the jump times the semi-Markov policies. Note that this is not a restriction for the customer assignment models, because there the only action of importance is the one taken at the jump times. In the server assignment models, however, it is a restriction. Denote with $\phi^T$ ($\phi^T(R)$) the minimal costs (the costs using policy $R$) at $T$. Let $R^*$ be the policy with $f^*_t = f^*$ for all $t$, with $f^*$ the optimal decision rule in the discrete-time model. Then we have the following.

5.2.1. Theorem. $\phi^T(R^*) \le \phi^T(R)$ for each semi-Markov policy $R$.

Proof. The evolution of the process is completely described by two independent random processes: the Poisson process generating the transition times and the embedded chain generating the actual transitions at the transition times. Note that the former process does not depend on the policy chosen,


the latter however does. We condition on the transition times, both for R andR∗. Let ω ∈ Ω be a realization of the transition time process with probabilityspace (Ω, P ), with ω = (t1, . . . , tu), 0 ≤ t1 ≤ · · · ≤ tu ≤ T , the t’s being thetransition times. Note that u is a realization of a Poisson distributed randomvariable. The decision rule used at t by R is completely determined by thejump times before t and the states at these moments. This induces a policy Rωin the embedded chain. Note that R∗ω does not depend on ω, therefore we useR∗ also for the discrete-time policy. Denote by vnx (Rω, ω) the value function ofthe embedded chain (not necessarily in the standard form). By the optimalityof R∗ we have vnx (R∗, ω) = vnx ≤ vnx (Rω, ω). Denote by φTx (R) the expectedcosts at T using R and starting in x. Then

φ^T_x(R∗) = ∫_Ω v^u_x dP(ω) ≤ ∫_Ω v^u_x(R_ω, ω) dP(ω) = φ^T_x(R).

Note that, although φ^T_x(R) can be infinite, it is well defined due to the boundedness of the costs and therefore of the v^n.

The optimal policy also minimizes the costs from 0 up to T, because these are the integral over the costs from 0 to T. Thus, we do not need to introduce immediate costs in the dynamic programming equation.

The process just described is called uniformization. It is essential that the rate out of each state is uniformly bounded (otherwise we cannot formulate the discrete-time dynamic programming equations) and that the policy R∗ is the same for each n. In the models of chapters 2 and 3 the latter condition is not satisfied, creating the need for a limiting argument, which is the subject of the next section.
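The identity underlying uniformization, e^{tG} = Σ_n e^{−αt}(αt)^n/n! P^n with P = I + G/α, can be checked numerically. A minimal sketch, in which the two-state generator and the horizon t are arbitrary illustrations:

```python
import numpy as np

# Generator of a two-state Markov process (illustrative rates).
G = np.array([[-2.0, 2.0],
              [1.0, -1.0]])
alpha = 2.0                      # uniform bound on the exit rates |G_xx|
P = np.eye(2) + G / alpha        # transition matrix of the embedded chain
t = 0.7

# e^{tG} via its power series sum_k (tG)^k / k!.
expG = np.zeros((2, 2))
term = np.eye(2)
for k in range(80):
    expG += term
    term = term @ (t * G) / (k + 1)

# Poisson mixture over the embedded chain: sum_n e^{-alpha t} (alpha t)^n / n! P^n.
mix = np.zeros((2, 2))
weight = np.exp(-alpha * t)      # Poisson(alpha t) probability of n = 0 jumps
Pn = np.eye(2)
for n in range(80):
    mix += weight * Pn
    Pn = Pn @ P
    weight *= alpha * t / (n + 1)
```

Both series agree, which is exactly why costs at T can be computed by conditioning on the number of Poisson jumps and evaluating the embedded chain.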

Note that if a policy is stochastically optimal in the discrete-time model, it is also stochastically optimal in the continuous-time model.

Summarizing, we have the following.

5.2.2. Corollary. The policies minimizing the costs in the discrete-time models considered in chapter 1 minimize the costs at T (from 0 to T) in the continuous-time models in the class of semi-Markov policies, if the costs are bounded, either from above or below.

The decisions are taken at the Poisson epochs, even if there is a dummy transition. By increasing the uniformization parameter we add decision epochs. In this way we can approximate continuous-time control. Roughly speaking, the class of limiting policies is called strongly regular in Hordijk & Van der Duyn Schouten [29]. More precisely, a policy is strongly regular if for almost all sample paths the set of time points at which the control is discontinuous has Lebesgue measure zero. See Hordijk & Van der Duyn Schouten [29] for details.

5.2.3. Corollary. The policies minimizing the costs in the discrete-time models considered in chapter 1 minimize the costs at T (from 0 to T) in the continuous-time models in the class of strongly regular policies, if the costs are bounded, either from above or below.

Results on discounted and average costs can also be obtained. It is clear from theorem 5.2.1 that R∗ minimizes

∫_0^T e^{−βt} φ^t_x(R) dt   and   (1/T) ∫_0^T φ^t_x(R) dt

and their limits for T → ∞, if they exist.

Usually, definitions for discounted and average optimality other than the ones given above, using semi-Markov Decision Processes, are used. We can translate the continuous-time problems into discrete-time ones, like the ones we studied in chapter 1. See for example Serfozo [63] for this equivalence. Then, under suitable conditions guaranteeing the convergence of the successive approximation scheme, optimality of R∗ for average and discounted costs follows. Convergence of successive approximation can be proved, for example, using negative dynamic programming (Ross [60]) or by showing ν-geometric recurrence (Spieksma [69]).

A complication in using successive approximation is that the analysis of chapter 1 only considers costs at the end of the horizon, as we took v^n of the form v^n = inf_f{P(f)v^{n−1}}, v^0 = c, with P(f) the transition matrix under decision rule f. However, we are interested in w^n = inf_f{c + βP(f)w^{n−1}} with w^0 = 0. By the assumption of this section that the optimal policy is myopic, we have w^n = v^0 + βv^1 + ··· + β^{n−1}v^{n−1}. If R∗ = (f, f, ...) is the optimal policy, this gives us for arbitrary R

w^n(R∗) = w^n ≤ w^n(R).
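For a fixed decision rule (so that the optimal policy is trivially myopic) the identity w^n = v^0 + βv^1 + ··· + β^{n−1}v^{n−1} is easy to verify numerically. A sketch; the transition matrix and cost vector are arbitrary illustrations:

```python
import numpy as np

beta = 0.9
P = np.array([[0.5, 0.5],        # transition matrix of the (single) decision rule
              [0.2, 0.8]])
c = np.array([1.0, 3.0])         # cost vector
N = 12

# v^n = P v^{n-1}, v^0 = c: expected costs at the end of an n-step horizon.
v = [c]
for _ in range(N - 1):
    v.append(P @ v[-1])

# w^n = c + beta P w^{n-1}, w^0 = 0: expected discounted costs over n steps.
w = np.zeros(2)
for _ in range(N):
    w = c + beta * (P @ w)

# The identity w^N = sum_{k=0}^{N-1} beta^k v^k.
w_from_v = sum(beta ** k * v[k] for k in range(N))
```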

5.3. Continuous-time Bellman equation

In this section we give another approach to continuous-time control. We show that under mild conditions on the cost functions the solutions of the dynamic programming equations converge to the solution of the continuous-time Bellman equation. Hence the structure of the optimal value functions carries over to the continuous-time model, and therefore so does the structure of the optimal policy.

We can use this method not only in the models of chapter 1, but also in the (non-myopic) models of chapters 2 and 3. The method is usually referred to as time-discretization. Our analysis is based on the results of Van Dijk [73], as he allows for both positive and negative unbounded costs. Besides this he considers salvage costs. We rewrite his results here for the simpler case of a uniformizable model. Denote with G(f) the infinitesimal generator of the process if decision rule f is used.

For the customer assignment model we would have the following generator. Assume the current state is (x, i) and f(x, i) = (a, j̄), where j̄ specifies for each new state y of the arrival process the queue j_y to which an arriving customer is routed. Then, for example,

G(f)_{(x,i)(y,i+e_j)} = λ_{xay} q_{xay}  if j = j_y,

G(f)_{(x,i)(x,i−e_j)} = μ  if i_j > 0

and

G(f)_{(x,i)(x,i)} = −( Σ_y λ_{xay} + μ Σ_{j=1}^m δ_{i_j} ),

with δ_{i_j} = 1 if i_j > 0 and 0 otherwise.

Basic to the analysis is the continuous-time Bellman equation. Heuristically, this equation can be derived as follows. Let φ^t again denote the expected costs for horizon t. We are interested in φ^T; thus φ^t are in fact the expected costs from T − t to T. Assume that costs with rate c are incurred continuously over time, and that φ^0 are the costs at the end. Then we have:

d/ds φ^s = inf_f { c + G(f) φ^s }.

Integrating from 0 to t gives

φ^t − φ^0 = ∫_0^t inf_f { c + G(f) φ^s } ds,

the Bellman equation. Note that we have a model with immediate costs, as contrasted with the model used in uniformization. We need to do it this way because we cannot introduce immediate costs afterwards, for the same reason that we cannot use uniformization with a fixed parameter here: for minimizing costs at different T we have different optimal policies.

Now we introduce our computational scheme. We assume that the model is uniformizable, i.e., there is a constant α such that for each state x and decision rule f we have |G(f)_{xx}| ≤ α. Let h be a positive number, h ≤ 1/α. Define P_h(f) = hG(f) + I. Thus P_h(f) is the transition matrix of the discrete-time model obtained by uniformization with parameter 1/h. Take hc as immediate costs. Now define

v^{h,n+1} = inf_f { hc + P_h(f) v^{h,n} },  h(n+1) ≤ T,

v^{h,0} = φ^0.

We will show that v^{h,k} with k = ⌊t/h⌋, t ≤ T, converges as h → 0 to the solution of the Bellman equation under certain conditions. Heuristically, when seen as uniformization, this can be explained by noting that the number of jumps before T converges to a constant as h decreases (which follows from lemma A.2). When seen as discretization, v^{h,n} is the first order approximation of the costs at hn. By the infinitesimal properties, the transition rates converge to their first order approximations as h → 0.
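The scheme can be implemented directly. The sketch below iterates v^{h,n+1} = min_f {hc + P_h(f)v^{h,n}} for a small controlled chain with two decision rules (the generators, cost rates and horizon are arbitrary illustrations) and halves h to see the values stabilize:

```python
import numpy as np

# Two decision rules, each given by its generator (illustrative rates).
generators = [np.array([[-1.0, 1.0, 0.0],
                        [0.0, -2.0, 2.0],
                        [0.0, 0.0, 0.0]]),
              np.array([[-3.0, 0.0, 3.0],
                        [1.0, -1.0, 0.0],
                        [0.0, 0.0, 0.0]])]
c = np.array([0.0, 1.0, 4.0])    # cost rate per state
phi0 = np.zeros(3)               # terminal costs
T = 1.0

def value(h):
    """Iterate v^{h,n+1} = min_f {hc + P_h(f) v^{h,n}} with P_h(f) = I + h G(f)."""
    v = phi0.copy()
    for _ in range(round(T / h)):
        v = np.min([h * c + (np.eye(3) + h * G) @ v for G in generators], axis=0)
    return v

v_coarse = value(2.0 ** -6)      # h must satisfy h <= 1/alpha, here alpha = 3
v_fine = value(2.0 ** -10)
```

Halving h repeatedly, the values change by O(h), in line with theorem 5.3.2.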

The conditions involve the weighted supremum norm, defined as follows: ‖b‖_ν = sup_x |b_x|/ν_x, with ν > 0 the bounding vector. For a matrix the norm is defined as ‖A‖_ν = sup_x Σ_y |A_{xy}| ν_y / ν_x. We will often use that ‖Ab‖_ν ≤ ‖A‖_ν ‖b‖_ν. We assume the following.
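These norms are straightforward to compute for finite matrices. A small sketch verifying the property ‖Ab‖_ν ≤ ‖A‖_ν ‖b‖_ν; the matrix, vector and bounding vector are random illustrations:

```python
import numpy as np

def norm_vec(b, nu):
    """Weighted supremum norm ||b||_nu = sup_x |b_x| / nu_x."""
    return float(np.max(np.abs(b) / nu))

def norm_mat(A, nu):
    """Induced norm ||A||_nu = sup_x sum_y |A_xy| nu_y / nu_x."""
    return float(np.max((np.abs(A) @ nu) / nu))

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 5))
b = rng.normal(size=5)
nu = 1.0 + rng.random(5)         # bounding vector nu > 0
```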


5.3.1. Assumption. There are ν ≥ 1 and constants K_1 and K_2 such that

‖G(f)‖_ν ≤ K_1 for all f,  ‖c‖_ν ≤ K_2  and  ‖φ^0‖_ν ≤ K_2.

We will check these conditions for various cost functions. If φ^0 is an indicator function and c = 0, we take ν = e. Then ‖G(f)‖_ν ≤ 2α and ‖φ^0‖_ν ≤ 1. In the customer assignment model with c_{(x,i)} or φ^0_{(x,i)} = i_1 + ··· + i_m, take ν_{(x,i)} = (i_1 + ··· + i_m) ∨ 1. Then ‖G(f)‖_ν ≤ 2α and ‖c‖_ν, ‖φ^0‖_ν ≤ 1. In the server assignment models with c_{(x,i)} or φ^0_{(x,i)} = i_1c_1 + ··· + i_mc_m, take ν_{(x,i)} = (i_1|c_1| + ··· + i_m|c_m|) ∨ 1. Because ν_{(x,i+e_j)}/ν_{(x,i)} and ν_{(x,i−e_j)}/ν_{(x,i)} are at most 1 + max_j |c_j|, we have ‖G(f)‖_ν ≤ 2α(1 + max_j |c_j|) and ‖c‖_ν, ‖φ^0‖_ν ≤ 1 + max_j |c_j|. Similarly, if c_{(x,i)} or φ^0_{(x,i)} = (i_1c_1 + ··· + i_mc_m)^n, take ν_{(x,i)} = (i_1|c_1| + ··· + i_m|c_m|)^n ∨ 1, giving ‖G(f)‖_ν ≤ 2α(1 + max_j |c_j|)^n and ‖c‖_ν, ‖φ^0‖_ν ≤ (1 + max_j |c_j|)^n.

5.3.2. Theorem. There are φ^t_x such that v^{h,⌊t/h⌋}_x → φ^t_x with h = 2^{−m} as m → ∞, for t ≤ T and all x.

Proof. First we show that all v^{h,n} with hn ≤ T are ν-bounded:

‖v^{h,m}‖_ν ≤ sup_f ‖hc + P_h(f)v^{h,m−1}‖_ν ≤ h sup_f ‖c‖_ν + sup_f ‖P_h(f)‖_ν ‖v^{h,m−1}‖_ν ≤ hK_2 + (1 + hK_1)‖v^{h,m−1}‖_ν.  (5.3.1)

Now we have, since 1 + c ≤ e^c,

‖v^{h,n}‖_ν ≤ Σ_{k=0}^{n−1} (1 + hK_1)^k hK_2 + (1 + hK_1)^n ‖v^{h,0}‖_ν ≤ T e^{TK_1} K_2 + e^{TK_1} K_2.  (5.3.2)

Let us denote the r.h.s. by C_1.

We will prove the convergence by first deriving a relation between v^{h,n} and v^{h/2,2n}. By induction on n we prove

v^{h/2,2n} ≤ v^{h,n} + ν(1 + hK_1)^n (nh² + h) C_2  (5.3.3)

for C_2 ≥ K_1K_2 + K_1²C_1. Assume the inequality holds up to k.

v^{h/2,2k+2} = inf_f { (h/2)c + P_{h/2}(f) v^{h/2,2k+1} } ≤

inf_f { (h/2)(I + P_{h/2}(f))c + P_{h/2}(f)P_{h/2}(f) v^{h/2,2k} } =

inf_f { (h + (h/2)²G(f))c + (P_h(f) + (h/2)²(G(f))²) v^{h/2,2k} } ≤

inf_f { hc + P_h(f) v^{h/2,2k} } + sup_f (h/2)²G(f)c + sup_f (h/2)²(G(f))² v^{h/2,2k} ≤

v^{h,k+1} + ν(1 + hK_1)^{k+1}(kh² + h)C_2 + sup_f (h/2)²G(f)c + sup_f (h/2)²(G(f))² v^{h/2,2k}.

We have

‖(h/2)²G(f)c‖_ν ≤ (h/2)²K_1K_2,

and, using (5.3.2),

‖(h/2)²(G(f))² v^{h/2,2k}‖_ν ≤ (h/2)²K_1² ‖v^{h/2,2k}‖_ν ≤ h²K_1²C_1,

giving

sup_f (h/2)²G(f)c + sup_f (h/2)²(G(f))² v^{h/2,2k} ≤ ν h²(K_1K_2 + K_1²C_1) ≤ ν h² C_2.

Thus, because (1 + hK_1)^k ≥ 1, the inequality holds.

Because ⌊2t/h⌋ = 2⌊t/h⌋ or 2⌊t/h⌋ + 1, we have

v^{h/2,⌊2t/h⌋} ≤ v^{h/2,2⌊t/h⌋} + νh(K_1C_1 + K_2),

by (5.3.1). Thus

v^{h/2,⌊2t/h⌋} ≤ v^{h,⌊t/h⌋} + ν(⌊t/h⌋h² + h)C,

if C ≥ e^{TK_1}C_2 + K_1C_1 + K_2.

Iterating this last inequality k times, for h of the form 2^{−m}, gives

v^{2^{−(m+k)},⌊t/2^{−(m+k)}⌋} ≤ v^{2^{−m},⌊t/2^{−m}⌋} + Σ_{l=0}^{k−1} ν(T + 1)C 2^{−m}2^{−l} ≤ v^{2^{−m},⌊t/2^{−m}⌋} + ν(T + 1)C 2^{−m+1}.

Because the space of vectors with bounded ν-norm is a Banach space, v^{h,⌊t/h⌋}_x has, for each x, at least one limit point. To show that there is a unique limit point, suppose that, for fixed x, v′_x and v″_x are limit points, with v′_x < v″_x. Take ε < (v″_x − v′_x)/3, and m such that |v^{2^{−m},⌊t/2^{−m}⌋}_x − v′_x| < ε and ν_x(T + 1)C2^{−m+1} < ε. Then v^{2^{−(m+k)},⌊t/2^{−(m+k)}⌋}_x < v″_x − ε for all k. Hence v″_x is not a limit point.

5.3.3. Theorem. The function φ^t is a solution of the Bellman equation.

Proof. We have, for h = 2^{−m},

v^{h,n+1} − v^{h,n} = h inf_f { c + G(f) v^{h,n} },

and thus

v^{h,⌊t/h⌋} − v^{h,0} = ∫_0^{h⌊t/h⌋} inf_f { c + G(f) v^{h,⌊s/h⌋} } ds.

The left hand side converges to φ^t − φ^0 for each x. By dominated convergence the r.h.s. converges to ∫_0^t inf_f { c + G(f) φ^s } ds for fixed x, giving the Bellman equation. For further details, we refer to Van Dijk [73].


We will not go into the details of showing that this solution is unique. Due to the finite action sets of the models we consider, the infima are always attained and optimal policies exist. And as the discrete-time value functions converge to the continuous-time value function, the inequalities we typically prove for the discrete-time models also hold for the continuous-time models. This means that the optimality results also hold for the continuous-time models. As we considered both terminal costs and costs over time, the results hold both for costs at T and for costs over time.

5.3.4. Corollary. The policies minimizing the costs in the discrete-time models considered in chapters 2 and 3 minimize the costs at T (from 0 to T) in the continuous-time models, if the transition rates and costs satisfy assumption 5.3.1.

Regarding discounted and average optimality, the same remarks as in theprevious section apply here.


Appendix A

The approximation of point processes

Here we show that any marked arrival stream can be approximated, in the sense of weak convergence, by a series of MAPs. For stationary point processes this has already been proven by Herrmann [20]. Note that we used the marks in the previous chapters to indicate the class of arrivals and server vacations. We use the following definition of an MAP:

A.1. Definition. (Markov Arrival Process) Let Λ be the, possibly countable, state space of a Markov process with transition rates λ_{xy}, x, y ∈ Λ. When this process moves from x to y, then with probability q^v_{xy} an arrival with mark v occurs, with Σ_{v∈B} q^v_{xy} ≤ 1 for all x, y ∈ Λ and B ⊂ IR₊. The triple (Λ, λ, q) is an MAP.

A series of random variables X^m = (X^m_n)_{n≤N} on IR^N converges weakly to X = (X_n)_{n≤N}, notated as X^m D−→ X, if IEf(X^m) −→ IEf(X) for all continuous and bounded f. Assume X^m (X) has distribution function F^m (F).

In Schassberger [62] it is shown, for N = 1, that to have X^m D−→ X, we can take X^m such that

F^m(x) = F(0) + Σ_{k=1}^∞ (F(k/m) − F((k−1)/m)) E^k_m(x),  (A.1)

where E^k_m(x) is the d.f. of a gamma distributed r.v. with k phases and intensity m, i.e. E^k_m(x) = Σ_{l=k}^∞ e^{−mx}(mx)^l/l!, the probability that a Poisson(mx) distributed r.v. has k or more successes. The result also holds if the mass at 0 is omitted and if the mixture is taken finite, e.g.

F^m(x) = F(1/m)E^1_m(x) + Σ_{k=2}^{m²−1} (F(k/m) − F((k−1)/m)) E^k_m(x) + (1 − F((m²−1)/m)) E^{m²}_m(x).

A heuristic explanation is easily given. The mass in a small interval, in the limit each point, is approximated by a series of gamma distributions with equal mean and increasing intensity. Such a series converges to its mean a.s.
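The approximation (A.1) is easy to test numerically. The sketch below approximates F(x) = 1 − e^{−x} (an arbitrary illustrative choice) by the finite gamma mixture given above, with E^k_m evaluated as a Poisson tail probability:

```python
import math

def E(k, m, x):
    """E^k_m(x) = P(Poisson(mx) >= k): d.f. of a gamma r.v. with k phases, intensity m."""
    p = math.exp(-m * x)         # P(Poisson(mx) = 0)
    cdf = p
    for l in range(1, k):
        p *= m * x / l           # P(Poisson(mx) = l)
        cdf += p
    return 1.0 - cdf

def F(x):                        # the distribution being approximated: Exp(1)
    return 1.0 - math.exp(-x)

def Fm(x, m):
    """Finite mixture: F(1/m)E^1_m + sum_{k=2}^{m^2-1}(F(k/m)-F((k-1)/m))E^k_m + tail."""
    s = F(1.0 / m) * E(1, m, x)
    for k in range(2, m * m):
        s += (F(k / m) - F((k - 1) / m)) * E(k, m, x)
    s += (1.0 - F((m * m - 1) / m)) * E(m * m, m, x)
    return s

err_coarse = abs(Fm(1.0, 4) - F(1.0))
err_fine = abs(Fm(1.0, 32) - F(1.0))
```

As m grows, the gamma components concentrate around their means k/m and the error at a fixed continuity point shrinks.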

A similar result can be obtained for finite-dimensional r.v.'s. This has already been shown in lemma 6.1 of Hordijk & Schassberger [28]. We give a different proof here. First we construct X^m. Define

C_m(k) = [0, 1/m] if k = 1,
C_m(k) = ((k−1)/m, k/m] if k ∈ {2, ..., m² − 1},
C_m(k) = ((m² − 1)/m, ∞) if k = m².

Now we have as approximation

F^m(x) = Σ_{1≤k_j≤m², j=1,...,N} IP(X_1 ∈ C_m(k_1), ..., X_N ∈ C_m(k_N)) Π_{j=1}^N E^{k_j}_m(x_j).  (A.2)

We see that the mass of each cube with sides of length 1/m is put on the upper corner, say x′. Then each component x′_j of this vector is approximated by an independent gamma distribution with parameter m and mx′_j phases, giving an expectation of x′_j.

A.2. Lemma. X^m D−→ X.

Proof. It is well known that weak convergence is equivalent to convergence of the d.f. in each continuity point of F. Take such a point x. Choose an ε > 0. By continuity there exists a δ > 0 such that |F(x + s) − F(x)| ≤ ε/(N + 4) if |s| < δ. Now assume the integer l, a power of 2, is large enough such that √N/l < δ and (l² − 1)/l > max_i x_i. The first condition guarantees there are vectors x̲ = (x̲_1, ..., x̲_N) and x̄ = (x̄_1, ..., x̄_N) such that x̲_j l and x̄_j l are integer, x̲_j < x_j < x̄_j, and the product set Π_{j=1}^N [x̲_j, x̄_j] is contained in the ball around x with radius δ. By the integer condition x̲ and x̄ lie at the top corners of the cubes Π_{j=1}^N C_l(x̲_j l) and Π_{j=1}^N C_l(x̄_j l). As we only consider powers of 2, x̲ and x̄ lie at corners as well if m > l. The second condition assures that C_m(x̄_j m) is bounded for all j if m ≥ l.

The sum in the definition of F^m can be split into N + 2 parts, namely {k | 1 ≤ k_j ≤ x̲_j m}, {k | 1 ≤ k_j ≤ x̄_j m, ∃j : k_j > x̲_j m}, {k | k_1 > x̄_1 m}, ..., {k | k_N > x̄_N m}. Note that these sets are not disjoint (the last N overlap), and that {k | k_j ≤ x̄_j m, ∃j : k_j > x̲_j m} = {k | 1 ≤ k_j ≤ x̄_j m} \ {k | 1 ≤ k_j ≤ x̲_j m}. Now we have, if m ≥ l:

|F^m(x) − F(x)| ≤ |F(x̲) − F(x)| +

|Σ_{1≤k_j≤x̲_jm} IP(X_1 ∈ C_m(k_1), ..., X_N ∈ C_m(k_N)) (1 − Π_{j=1}^N E^{k_j}_m(x_j))| +

|Σ_{1≤k_j≤x̄_jm, ∃j: k_j>x̲_jm} IP(X_1 ∈ C_m(k_1), ..., X_N ∈ C_m(k_N)) Π_{j=1}^N E^{k_j}_m(x_j)| +

|Σ_{k_1>x̄_1m} IP(X_1 ∈ C_m(k_1), ..., X_N ∈ C_m(k_N)) Π_{j=1}^N E^{k_j}_m(x_j)| + ··· +

|Σ_{k_N>x̄_Nm} IP(X_1 ∈ C_m(k_1), ..., X_N ∈ C_m(k_N)) Π_{j=1}^N E^{k_j}_m(x_j)|.

The inequality holds because the probabilities of the second term of the r.h.s. sum to F(x̲).

We give a bound for each term. It is easily seen that |F(x̲) − F(x)| ≤ ε/(N + 4) if m ≥ l. Consider the second term of the r.h.s. If k_j ≤ x̲_j m then E^{k_j}_m(x_j) ≥ E^{x̲_j m}_m(x_j). Because x̲_j < x_j, lim_{m→∞} E^{x̲_j m}_m(x_j) = 1. We choose m large enough such that F(x̲)(1 − Π_j E^{k_j}_m(x_j)) ≤ 1 − Π_j E^{x̲_j m}_m(x_j) ≤ ε/(N + 4). The probabilities in the next term summed give F(x̄) − F(x̲); therefore this term is bounded by 2ε/(N + 4). For the last N terms we have the following. If k_j > x̄_j m then E^{k_j}_m(x_j) ≤ E^{x̄_j m}_m(x_j). By choosing m large enough we have E^{x̄_j m}_m(x_j) ≤ ε/(N + 4), which gives the inequality wanted. All inequalities summed give |F^m(x) − F(x)| ≤ ε.

We are interested in the convergence of (X^m_n)_{n∈IN} to (X_n)_{n∈IN}. We work in the product topology. Then the following holds (e.g. by Billingsley [6]):

(X^m_n)_{n∈IN} D−→ (X_n)_{n∈IN}  if and only if  (X^m_n)_{n≤N} D−→ (X_n)_{n≤N} for all finite N.

However, first we have to check whether (X_n)_{n∈IN} is well defined. This is done by checking consistency of the finite-dimensional r.v.s (X^m_n)_{n≤N} (see Loève [43], p. 94).

A.3. Lemma. (X^m_n)_{n≤N} are consistent for all N.

Proof. By the symmetry of (A.2) it suffices to show that the projection of (X^m_n)_{n≤N} on IR^{N−1}_+ is distributed as (X^m_n)_{n≤N−1}. We have, as ∪_{k_N=1}^{m²} C_m(k_N) = IR₊,

IP(X^m_1 ≤ x_1, ..., X^m_{N−1} ≤ x_{N−1}, X^m_N ∈ IR₊) =

Σ_{1≤k_j≤m², j=1,...,N} IP(X_1 ∈ C_m(k_1), ..., X_N ∈ C_m(k_N)) Π_{j=1}^{N−1} E^{k_j}_m(x_j) =

Σ_{1≤k_j≤m², j=1,...,N−1} IP(X_1 ∈ C_m(k_1), ..., X_{N−1} ∈ C_m(k_{N−1})) Π_{j=1}^{N−1} E^{k_j}_m(x_j).


It is easily seen that the lemmas remain valid if we replace (A.2) by

F^m(x) = Σ_{1≤k_j≤m², j=1,...,N} IP(...) Π_{j≤N, j odd} E^{k_j}_m(x_j) Π_{j≤N, j even} I{x_j ≤ k_j/m}.  (A.3)

Now consider the ∞-dimensional r.v. (S_n, V_n)_{n∈IN}. Here S_n is the nth interarrival time and V_n is the mark belonging to the nth arrival. As (S_n, V_n)_{n≤N} is a 2N-dimensional r.v., we can apply the results obtained above by taking X^{(m)}_{2n−1} = S^{(m)}_n and X^{(m)}_{2n} = V^{(m)}_n. With the superscript (m) we mean that the expression holds both with and without the superscript m. Thus (S^m_n, V^m_n)_{n∈IN} is well defined by lemma A.3 and

(S^m_n, V^m_n)_{n∈IN} D−→ (S_n, V_n)_{n∈IN}

holds.

Note that if V_n ∈ {1, ..., l}, as in the server assignment model, then also X^{(m)}_{2n} ∈ IN, and X^m_{2n} = V_n for m large enough, thus avoiding non-integer class numbers.

We continue by constructing an MAP (Λ, λ, q) which generates the interarrival times and marks (S^m_n, V^m_n)_{n∈IN} for an arbitrary m. First we construct Λ. Take for each N ∈ IN all vectors of the form (β, s_1, ..., s_N, v_1, ..., v_{N−1}) with s_n, v_n ∈ {1, ..., m²}, 1 ≤ β ≤ s_N and

IP(S_n ∈ C_m(s_n), n ≤ N; V_n ∈ C_m(v_n), n ≤ N − 1) > 0.

Being in state (β, s_1, ..., s_N, v_1, ..., v_{N−1}), sojourn time N is produced. The integer β indicates the current phase of the gamma distribution. The transition rates and arrival probabilities are:

For β < s_N:

λ_{(β,s_1,...,s_N,v_1,...,v_{N−1})(β+1,s_1,...,s_N,v_1,...,v_{N−1})} = m,
q^v_{(β,s_1,...,s_N,v_1,...,v_{N−1})(β+1,s_1,...,s_N,v_1,...,v_{N−1})} = 0 for all v.

For β = s_N:

λ_{(β,s_1,...,s_N,v_1,...,v_{N−1})(1,s_1,...,s_{N+1},v_1,...,v_N)} = m · IP(S_n ∈ C_m(s_n), n ≤ N + 1; V_n ∈ C_m(v_n), n ≤ N) / IP(S_n ∈ C_m(s_n), n ≤ N; V_n ∈ C_m(v_n), n ≤ N − 1),
q^{v_N/m}_{(β,s_1,...,s_N,v_1,...,v_{N−1})(1,s_1,...,s_{N+1},v_1,...,v_N)} = 1.

All other transition intensities are 0. Note that the transition rate out of each state is equal to m. The transition mechanism is illustrated in figure A.1. The transition marked I (III) corresponds to an arrival of a customer with mark v_{N−1}/m (v_N/m); the next arrival will take place after s_N (s_{N+1}) phases. At transitions marked II no arrivals occur. The result proved above can easily be extended to multi-dimensional and not necessarily positive marks.


Figure A.1. [Diagram: the transition mechanism of the constructed MAP, with phase transitions marked II within a state sequence and arrival transitions marked I and III.]

The analysis so far has to do with interarrival times and is thus in the customer time scale. However, weak convergence of (marked) point processes is, in general, defined in the physical time scale. To complete the analysis we have to prove that weak convergence of the interarrival times entails weak convergence of the point process. This result can be found in Asmussen & Koole [3].


Appendix B

Phase-type distributions of DFR/IFR distributions

Consider phase-type distributions of the form (A.1). The objective of this appendix is to give a characterization of these distributions if the approximated distribution is DFR or IFR. We will see, in the case of a decreasing (increasing) failure rate distribution, that the probability that the phase-type distribution consists of k phases, conditional on it consisting of k or more phases, is decreasing (increasing) in k. (Decreasing and increasing are used in the non-strict sense.) For the DFR case our result is a special case of the characterization of Hordijk & Ridder [27].

Let F be a non-negative distribution function. For fixed m we define β_1 = F(1/m) and β_k = F(k/m) − F((k−1)/m) for k > 1. Again, let E^k_m(x) be the d.f. of the gamma distribution with k phases and intensity m. Now take

F^m(x) = Σ_{k=1}^∞ β_k E^k_m(x).

It is clear, by lemma A.2, that F^m converges weakly to F. Now F^m can be seen as the time until absorption of a Markov process with initial distribution (0, β_1, β_2, ...) and transitions depicted as follows:

Figure B.1. [Diagram: a pure-death chain · · · → 2 → 1 → 0, each transition at rate m.]

Consider the following Markov process, which starts in state 1:

Figure B.2. [Diagram: a chain 1 → 2 → · · ·, moving from state n to n + 1 at rate (1 − α_n)m and to the absorbing state 0 at rate α_n m.]


Now take

α_n = β_n / (1 − Σ_{k=1}^{n−1} β_k)  if Σ_{k=1}^{n−1} β_k < 1,
α_n = 1  if Σ_{k=1}^{n−1} β_k = 1.

Then it is easily seen that the time until absorption in both processes is equally distributed. Vice versa, β_n = (1 − α_1)···(1 − α_{n−1})α_n.
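The correspondence between the mixing weights β_n and the absorption probabilities α_n, and the monotonicity asserted in theorem B.2, can be checked numerically. A sketch; the hyperexponential (DFR) and Erlang (IFR) distributions are illustrative choices:

```python
import math

def betas(F, m, K):
    """beta_1 = F(1/m), beta_k = F(k/m) - F((k-1)/m) for k > 1 (first K weights)."""
    return [F(1.0 / m)] + [F(k / m) - F((k - 1) / m) for k in range(2, K + 1)]

def alphas(beta):
    """alpha_n = beta_n / (1 - sum_{k<n} beta_k): absorption probabilities of fig. B.2."""
    out, remaining = [], 1.0
    for b in beta:
        out.append(b / remaining if remaining > 0 else 1.0)
        remaining -= b
    return out

def F_dfr(t):   # hyperexponential: mixture of Exp(1) and Exp(5), a DFR distribution
    return 1.0 - 0.5 * math.exp(-t) - 0.5 * math.exp(-5.0 * t)

def F_ifr(t):   # Erlang with 2 phases of rate 1, an IFR distribution
    return 1.0 - math.exp(-t) * (1.0 + t)

a_dfr = alphas(betas(F_dfr, 10, 50))
a_ifr = alphas(betas(F_ifr, 10, 50))
```

For the DFR distribution the α_n come out decreasing, for the IFR distribution increasing, in agreement with theorem B.2.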

We can define a distribution to be DFR or IFR if the failure rate (defined as f(t)/(1 − F(t)), with f the density of F) is decreasing or increasing. However, then we implicitly assume that the failure rate, and thus the density, exists. To avoid this, we prefer to use the definition of Barlow & Proschan [5], which is only in terms of F̄(t) = 1 − F(t). It then follows, for example, that F with F(t) = I{t ≥ x}, the deterministic distribution, is also IFR, although its failure rate does not exist.

B.1. Definition. (DFR and IFR) A non-negative distribution function is: DFR if F̄(t + s)/F̄(t) is increasing in t ≥ 0 with F̄(t) > 0, for each s ≥ 0; IFR if F̄(t + s)/F̄(t) is decreasing in −∞ < t < ∞ with F̄(t) > 0, for each s ≥ 0.

Now we can formulate the main result of this appendix:

B.2. Theorem. If F is DFR (IFR) then αn is decreasing (increasing) in n,for all m.

Proof. First we consider the DFR case. Take s = 1/m and t = 1/m, 2/m, .... Then, according to the definition of DFR, F̄(n/m)/F̄((n−1)/m) is increasing. Therefore (F(n/m) − F((n−1)/m))/F̄((n−1)/m) is decreasing in n. By the definition of β_n, F̄((n−1)/m) = 1 − F((n−1)/m) = 1 − Σ_{k=1}^{n−1} β_k. Because β_n = F(n/m) − F((n−1)/m), α_n is decreasing in n if n ≥ 2. As β_1 = F(1/m) ≥ F(1/m) − F(0), we also have α_1 ≥ α_2.

Concerning IFR distributions, the analysis goes completely analogously, except for β_1. We show that F(0) = 0 or 1. Assume F(0) = a, 0 < a < 1. By the right-continuity of distribution functions we can find t_1 and ε such that F̄(t_1 + ε)/F̄(t_1) > 1 − a, and F̄(t_1) > 0. Because F̄(0)/F̄(−ε) = 1 − a, we have a contradiction with the IFR assumption. Thus F(0) = 0 or 1, in the former case giving β_1 = F(1/m) − F(0), and in the latter case α_n = 1 for all n.

A disadvantage of this method is that we need an infinite number of states. Therefore we change the process of figure B.2, making the state space finite, as shown in figure B.3.

This corresponds with changing the approximation into

F^m(x) = Σ_{k=1}^{m²} β_k E^k_m(x) + (1 − Σ_{k=1}^{m²} β_k) Σ_{k=1}^∞ (1 − β_{m²})^{k−1} β_{m²} E^{m²+k}_m(x).

It is easily checked that the approximation lemma A.2 also holds for this F^m.


Figure B.3. [Diagram: the finite-state version of the chain of figure B.2, truncated at state m²; from each state n absorption in 0 occurs at rate α_n m.]


Appendix C

Majorization

In the customer assignment models the class of all allowable cost functions can often be characterized with the help of majorization. For two types of orderings, originating from the symmetric case (see e.g. section 1.2) and the case B = 1 (see e.g. section 1.3), we have a complete characterization. For the more general model of section 3.2 we give a conjecture for the correct ordering.

In the first ordering, all vectors considered are componentwise smaller than the buffer vector B. Consider the ordering ≺, with i ≺ i∗ if there are i^1, . . . , i^n, i^0 = i and i^n = i∗, such that

i^k = i^{k−1} − e_{j1} + e_{j2} if 0 < i^{k−1}_{j1} < i^{k−1}_{j2} + 1 (C.1)

or

i^k = i^{k−1} + e_j (C.2)

or

i^k is a permutation of i^{k−1}. (C.3)

Now consider the weak submajorization ordering ≺w (see Marshall & Olkin [45]). We write i ≺w i∗ if

∑_{j=1}^k i_[j] ≤ ∑_{j=1}^k i∗_[j] for all k,

with i_[1] ≥ · · · ≥ i_[m] the decreasing rearrangement of i. Thus, the sum of the k largest components of i is smaller than that of i∗.
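As an illustration (not part of the thesis), checking i ≺w i∗ amounts to comparing the prefix sums of the decreasing rearrangements; a minimal Python sketch:

```python
def weakly_submajorized(i, i_star):
    """True iff i is weakly submajorized by i_star (i ≺_w i*): every partial
    sum of the decreasing rearrangement of i is dominated by that of i_star."""
    a = sorted(i, reverse=True)       # i_[1] >= ... >= i_[m]
    b = sorted(i_star, reverse=True)
    sum_a = sum_b = 0
    for x, y in zip(a, b):
        sum_a += x
        sum_b += y
        if sum_a > sum_b:
            return False
    return True

# e.g. (2, 1, 1) ≺_w (3, 1, 0), but (2, 2, 0) is not ≺_w (3, 0, 0)
```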

C.1. Theorem. The orderings ≺ and ≺w are equivalent.

Proof. i ≺ i∗ ⇒ i ≺w i∗. Take i^0, . . . , i^n as in (C.1), (C.2) or (C.3). It is easy to see that i^{k−1} ≺w i^k for all k. Because ≺w is a preordering, transitivity holds and i ≺w i∗.

i ≺w i∗ ⇒ i ≺ i∗. We construct i^0, . . . , i^n such that i = i^0 ≺ · · · ≺ i^n = i∗. Assume that the k largest components of i^k are equal to, and in the same place as, the k largest components of i∗, and i = i^0 ≺ · · · ≺ i^k ≺w i∗. We construct i^{k+1} with the property that either i^{k+1} has the k + 1 largest components equal to i∗ and i^k ≺ i^{k+1} ≺w i∗, or i^{k+1} = i∗. Repeating this gives the result. For simplicity of notation assume that k = 0.

Take the largest component of i^0, say queue j1, and interchange it with the component of i^0 with the index of the longest queue of i∗, say queue j2. Call the resulting vector i′. Then, as i^0_[1] ≤ i∗_[1], i^0_[1] fits in the buffer of queue j2. Because i^0_{j1} ≥ i^0_{j2}, i^0_{j2} fits in the buffer of queue j1, thus i′ ≤ B. We have by symmetry i^0 ≺ i′ and trivially i′ ≺w i∗.

If i′_[2] = 0, the result follows by (C.2), because all components except i′_[1] are 0. Thus, assume i′_[2] > 0.

Now we transfer a customer from i′_[2] to i′_[1]. Call the resulting vector i′′. By (C.1) we have i′ ≺ i′′. To show i′′ ≺w i∗ we distinguish the following two cases.

In case i′_[2] > i′_[3], then i′_[1] + i′_[2] = i′′_[1] + i′′_[2] and i′′ ≺w i∗ follows immediately.

In case i′_[2] = · · · = i′_[k] > i′_[k+1], then i′′_[1] + · · · + i′′_[l] = 1 + i′_[1] + · · · + i′_[l] for l < k. However, it is straightforward to see that i′_[1] + · · · + i′_[l] < i∗_[1] + · · · + i∗_[l]. Thus i′′ ≺w i∗.

Repeat this until either i′′_[1] = i∗_[1] (and call the resulting vector i^1, repeat the argument) or i′′_[2] = 0 (which case is already handled).

The equivalence of theorem C.1 gives that the class of functions satisfying for example (1.2.2), (1.2.3) and (1.2.4) is precisely the class of functions preserving weak submajorization. These functions are called the weak Schur convex functions, cf. Marshall & Olkin [45]. According to Marshall & Olkin [45], a similar result has been shown by Muirhead [48]. He shows (presumably for B = ∞) that the ordering obtained by transfers of the form (C.1) and (C.2) is equivalent to the majorization ordering. This ordering is like the weak majorization ordering, but with the additional constraint ∑_{j=1}^n i_[j] = ∑_{j=1}^n i∗_[j].

In much the same way we can give the generalization of the results of section 1.3. There we took B = 1. It appears that the result can easily be extended to arbitrary buffers. Then the ordering agrees with the partial sum ordering of Chang et al. [12], used there in the context of a server assignment model. Again, all vectors considered are smaller than B. Define the partial ordering ≺ as follows: i ≺ i∗ if there are i^0, . . . , i^n with i^0 = i and i^n = i∗ such that

i^k = i^{k−1} − e_{j1} + e_{j2} if i^{k−1}_{j1} > 0 and j1 < j2 (C.4)

or

i^k = i^{k−1} + e_j. (C.5)

It is easily seen that the class of cost functions satisfying (1.3.2) and (1.3.3) isprecisely the class of functions preserving the ordering ≺.

We show that the ordering ≺ is equivalent to an ordering ≺′, defined by: i ≺′ i∗ if

∑_{j=k}^m i_j ≤ ∑_{j=k}^m i∗_j for k = 1, . . . , m.
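A hypothetical helper (not from the thesis) makes the suffix-sum test concrete:

```python
def partial_sum_ordered(i, i_star):
    """True iff i ≺' i_star: for every k, the total number of customers in
    queues k, ..., m of i is at most that of i_star (suffix sums compared)."""
    suffix_i = suffix_star = 0
    # walk from the last queue backwards, accumulating suffix sums
    for x, y in zip(reversed(i), reversed(i_star)):
        suffix_i += x
        suffix_star += y
        if suffix_i > suffix_star:
            return False
    return True
```

For example, (2, 1, 0) ≺′ (1, 1, 1), since the suffix sums 0 ≤ 1, 1 ≤ 2 and 3 ≤ 3, while the converse fails.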

C.2. Theorem. The orderings ≺ and ≺′ are equivalent.

Proof. i ≺ i∗ ⇒ i ≺′ i∗. Take i = i^0 ≺ · · · ≺ i^n = i∗ as in the definition of ≺. It is easy to see that i^{k−1} ≺′ i^k for all k. Due to the transitivity of ≺′ we have i ≺′ i∗.


i ≺′ i∗ ⇒ i ≺ i∗. We construct i^0, . . . , i^n such that i = i^0 ≺ · · · ≺ i^n = i∗. First add ∑_{j=1}^m i∗_j − ∑_{j=1}^m i_j customers to state i^0, adding them to the queues with smallest indices, without exceeding the buffer sizes. Call this state i^1. Then i ≺ i^1 ≺′ i∗. Clearly, if i^1 = i∗, we are ready. If not, let j2 be the highest numbered queue with i^1_{j2} < i∗_{j2}. Now construct i′ = i^1 − e_{j1} + e_{j2}, with j1 the highest numbered queue with j1 < j2 and i^1_{j1} > 0. Since ∑_{j=k}^m i^1_j < ∑_{j=k}^m i∗_j for k = j1 + 1, . . . , j2, we have i′ ≺′ i∗. Repeat this construction until i′_{j2} = i∗_{j2}. Choose a new j2 and repeat.

Finally, i^1 ≺ · · · ≺ i^n and transitivity gives i ≺ i∗.

Let us look at the ≺′-preserving functions. Allowed cost functions are c_i = ∑_{j=k}^m i_j, for all k, the total number of customers in the m − k + 1 queues with the slowest servers. Hence the FQP minimizes the total number of customers in the m − k + 1 queues with the slowest servers stochastically, for all k = 1, . . . , m. It is clear that there are other interesting ≺′-preserving functions, e.g. the weighted total number of customers with increasing weights.

Finally, consider the model of section 3.2. To study the allowable cost functions, define the ordering ≺ as follows: i ≺ i∗ if there are i^1, . . . , i^n, i^0 = i and i^n = i∗, such that

i^k = i^{k−1} − e_{j1} + e_{j2} if 0 < i^{k−1}_{j1} < i^{k−1}_{j2} + 1 and j1 < j2 (C.6)

or

i^k = i^{k−1} + e_j (C.7)

or

i^k a permutation of i^{k−1} with j1 and j2 exchanged, i^{k−1}_{j1} > i^{k−1}_{j2} and j1 < j2. (C.8)

Now define an ordering as follows: i ≺′ i∗ if

∑_{j=k}^n (i_j − l)^+ ≤ ∑_{j=k}^n (i∗_j − l)^+ for all k and l.

C.3. Conjecture. The orderings ≺ and ≺′ are equivalent.


Appendix D

Computational issues

In this section we compare different computational methods, mostly from a practical point of view. We consider value iteration (also called successive approximation), both for discounted and average costs, and uniformization (with a large parameter).

We introduce some notation. Let λ_{xay} be the transition intensity from x to y using action a, and c_x the costs in state x. Assume that the state space is finite (which simplifies the analysis but is also necessary for the computations) and that ∑_y λ_{xay} = α for all x and a. Consider the following iteration scheme:

v^{n+1}_x = min_a { c_x + β ∑_y (λ_{xay}/α) v^n_y }.

It is well known that v^n_x converges to the minimal discounted costs if β ∈ [0, 1) and that, under an aperiodicity assumption, v^{n+1}_x − v^n_x converges to the minimal average costs if β = 1.
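A minimal sketch of this scheme (the two-state instance and its rates are hypothetical, chosen only to make the example self-contained):

```python
import numpy as np

def value_iteration(c, rates, beta, eps=1e-10):
    """Iterate v^{n+1}_x = min_a [ c_x + beta * sum_y (lambda_{xay}/alpha) v^n_y ].
    rates[a][x, y] = lambda_{xay}; every row sums to the same alpha."""
    alpha = rates[0].sum(axis=1)[0]
    P = [R / alpha for R in rates]           # uniformized transition matrices
    v = np.zeros(len(c))
    while True:
        w = np.min([c + beta * Pa @ v for Pa in P], axis=0)
        if np.max(np.abs(w - v)) < eps:      # for beta = 1, track w - v instead
            return w
        v = w

# toy instance: two states, one action, alpha = 1
c = np.array([1.0, 0.0])
rates = [np.array([[0.0, 1.0], [0.0, 1.0]])]
v = value_iteration(c, rates, beta=0.5)      # solves v_0 = 1 + 0.5 v_1, v_1 = 0.5 v_1
```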

We start by motivating the choices for the discount factor made in table 2.3.1. Assume that the costs are continuously incurred over time, meaning that the costs at t are multiplied by β^t. Then we take β̄ = α/(log(β^{−1}) + α) as discount factor in the discrete model. In table D.1 the values of β̄ are given for the choices of β taken in table 2.3.1, for the typical value α = 5. It is surprising to compare the values of β and β̄.

β      β̄
0.01   0.52
0.1    0.68
0.25   0.78
0.5    0.88
0.75   0.95

Table D.1. Discount factors
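The conversion can be checked directly; this sketch reproduces table D.1 for α = 5:

```python
import math

def discrete_discount(beta, alpha=5.0):
    # beta_bar = alpha / (log(1/beta) + alpha)
    return alpha / (math.log(1.0 / beta) + alpha)

for beta in (0.01, 0.1, 0.25, 0.5, 0.75):
    print(f"{beta:<5} {discrete_discount(beta):.2f}")
# prints 0.52, 0.68, 0.78, 0.88, 0.95, matching the table
```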

Our computations were done on workstations. For the model of section 2.3 we give in table D.2 the computer time in seconds and the number of iterations, for λ = 1, B = 20 and an accuracy of 10^{−10}, needed to calculate the discounted and average costs (β = 1) under the SQP with value iteration. Each iteration takes about 4 seconds. Note that the number of states is approximately 1.6·10^5. When λ (or B) is increased the number of iterations sharply increases. For example, if λ = 1.9 and B = 30, 43288 iterations were needed.


β      time   iterations
0.01    172       42
0.1     277       69
0.25    419      105
0.5     758      192
0.75   1708      436
1      5761     1224

Table D.2. Value iteration

Our experience is that it is easy to overlook some of the possible transitions. In the discounted case the value function still converges, but to the wrong solution. However, in the average cost case, in a model which is irreducible under each policy, the costs converge to 0. Therefore it is preferable also to program the average cost case, even if we are only interested in discounted costs.

We continue with uniformization. Consider the following iteration scheme:

v^{h,n+1}_x = min_a { h c_x + ∑_y h λ_{xay} v^{h,n}_y + (1 − hα) v^{h,n}_x }.
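A sketch of this finite-horizon iteration (the two-state instance is hypothetical; note that h must satisfy hα < 1):

```python
import numpy as np

def finite_horizon_costs(c, rates, T, h):
    """Iterate v^{h,n+1}_x = min_a [ h c_x + sum_y h lambda_{xay} v^{h,n}_y
    + (1 - h alpha) v^{h,n}_x ] for T/h steps, approximating the minimal
    costs over [0, T] as h -> 0."""
    alpha = rates[0].sum(axis=1)[0]
    assert h * alpha < 1, "step size too large for uniformization"
    v = np.zeros(len(c))
    for _ in range(int(round(T / h))):
        v = np.min([h * c + h * (R @ v) + (1 - h * alpha) * v for R in rates],
                   axis=0)
    return v

# cost rate 1 in state 0, which empties at rate 1: costs over [0, 1]
# approach 1 - exp(-1) ≈ 0.632 as h decreases
v = finite_horizon_costs(np.array([1.0, 0.0]),
                         [np.array([[0.0, 1.0], [0.0, 1.0]])], T=1.0, h=0.001)
```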

As we showed in section 5.3, v^{h,⌊T/h⌋} converges to the costs from 0 to T as h → 0. Little is known about the convergence of this method. Some bounds on the speed of convergence can be found in Van Dijk [72] and [73]. Our computational experience is summarized in table D.3. There, for various values of T and h, the total costs from 0 to T are given, again for the model of section 2.3, starting from the state with 10 customers in each queue. Other starting states give similar results. The number of iterations is h^{−1}. Each iteration takes a little less time than for discounted and average costs, because we do not have to check whether we are finished iterating. (Because we did computations on several computers, some of which were faster than others, we do not supply computer times.) Note that because α = 5, h needs to be smaller than 0.2. For T = 1, it seems that an accuracy of 10^{−10} is obtained for h = 10^{−10}, meaning 10^{10} iterations. If each iteration takes 3 seconds, this takes approximately 950 years. This explains why we computed, for several models, discounted and average costs, but not costs from 0 to T.

h          T = 1        T = 10       T = 100
0.1        39.550000    350.510743   1033.855844
0.01       39.505000    350.062022   1033.855578
0.001      39.500500    350.017155
0.0001     39.500050

Table D.3. Uniformization with α→∞


References

[1] I.J.B.F. Adan, J. Wessels & W.H.M. Zijm (1989). Analysis of the asymmetric shortest queue problem. Queueing Systems, Theory and Applications 8: 1–58.
[2] E. Altman & G.M. Koole (1995). On submodular value functions of dynamic programming. Technical report, INRIA Sophia Antipolis.
[3] S. Asmussen & G.M. Koole (1993). Marked point processes as limits of Markovian arrival streams. Journal of Applied Probability 30: 365–372.
[4] J.S. Baras, D.-J. Ma & A.M. Makowski (1985). K competing queues with geometric service requirements and linear costs: the µc-rule is always optimal. Systems & Control Letters 6: 173–180.
[5] R.E. Barlow & F. Proschan (1975). Statistical Theory of Reliability and Life Testing. Holt, Rinehart and Winston, New York.
[6] P. Billingsley (1968). Convergence of Probability Measures. Wiley, New York.
[7] J. Bruno, P. Downey & G.N. Frederickson (1981). Sequencing tasks with exponential service times to minimize the expected flow time or makespan. Journal of the ACM 28: 100–113.
[8] C. Buyukkoc, P. Varaiya & J. Walrand (1985). The cµ rule revisited. Advances in Applied Probability 17: 237–238.
[9] R. Chakka & I. Mitrani (1994). Heterogeneous multiprocessor systems with breakdowns: Performance and optimal repair strategies. Theoretical Computer Science 125: 91–109.
[10] C.S. Chang (1992). A new ordering for stochastic majorization: Theory and applications. Advances in Applied Probability 24: 604–634.
[11] C.S. Chang, X. Chao & M. Pinedo (1990). A note on queues with Bernoulli routing. Proceedings of the 29th Conference on Decision and Control, Hawaii, 897–902.
[12] C.S. Chang, X. Chao, M. Pinedo & R.R. Weber (1992). On the optimality of LEPT and cµ-rules for machines in parallel. Journal of Applied Probability 29: 667–681.
[13] C.S. Chang, A. Hordijk, R. Righter & G. Weiss (1994). The stochastic optimality of SEPT in parallel machine scheduling. Probability in the Engineering and Informational Sciences 8: 179–188.
[14] R.B. Cooper & S. Palakurthi (1989). Heterogeneous-server loss systems with ordered entry: an anomaly. Operations Research Letters 8: 347–349.


[15] D.J. Daley (1987). Certain optimality properties of the first-come first-served discipline for G|G|s queues. Stochastic Processes and their Applications 25: 301–308.
[16] R. Dekker & A. Hordijk (1991). Denumerable semi-Markov decision chains with small interest rates. Annals of Operations Research 28: 185–212.
[17] C. Derman, G.J. Lieberman & S.M. Ross (1980). On the optimal assignment of servers and a repairman. Journal of Applied Probability 17: 577–581.
[18] S.G. Foss (1982). Extremal problems in queueing theory. Ph.D. thesis, Novosibirsk State University (in Russian).
[19] B. Hajek (1984). Optimal control of two interacting service stations. IEEE Transactions on Automatic Control 29: 491–499.
[20] U. Herrmann (1965). Ein Approximationssatz für Verteilungen stationärer zufälliger Punktfolgen. Mathematische Nachrichten 30: 377–381.
[21] A. Hordijk & G.M. Koole (1990). On the optimality of the generalized shortest queue policy. Probability in the Engineering and Informational Sciences 4: 477–487.
[22] A. Hordijk & G.M. Koole (1992). On the shortest queue policy for the tandem parallel queue. Probability in the Engineering and Informational Sciences 6: 63–79.
[23] A. Hordijk & G.M. Koole (1992). The µc-rule is not optimal in the second node of the tandem queue: a counterexample. Advances in Applied Probability 24: 234–237.
[24] A. Hordijk & G.M. Koole (1992). On the assignment of customers to parallel queues. Probability in the Engineering and Informational Sciences 6: 495–511.
[25] A. Hordijk & G.M. Koole (1993). On the optimality of LEPT and µc rules for parallel processors and dependent arrival processes. Advances in Applied Probability 25: 979–996.
[26] A. Hordijk & G.M. Koole (1995). On suboptimal policies in multi-class tandem models. To appear in Probability in the Engineering and Informational Sciences.
[27] A. Hordijk & A. Ridder (1987). Stochastic inequalities for an overflow model. Journal of Applied Probability 24: 696–708.
[28] A. Hordijk & R. Schassberger (1982). Weak convergence for generalized semi-Markov processes. Stochastic Processes and their Applications 12: 271–291.
[29] A. Hordijk & F. van der Duyn Schouten (1985). Markov decision drift processes; conditions for optimality obtained by discretization. Mathematics of Operations Research 10: 160–173.
[30] D.J. Houck (1987). Comparison of policies for routing customers to parallel queueing systems. Operations Research 35: 306–310.
[31] P.K. Johri (1989). Optimality of the shortest line discipline with state-dependent service times. European Journal of Operational Research 41: 157–161.


[32] M.N. Katehakis & A. Levine (1985). A dynamic routing problem—Numerical procedures for light traffic conditions. Applied Mathematics and Computation 17: 267–276.
[33] M.N. Katehakis & C. Melolidakis (1989). On the optimal maintenance of systems and control of arrivals in queues. Technical report, Graduate School of Management, Rutgers University.
[34] J.F.C. Kingman (1961). Two similar queues in parallel. Annals of Mathematical Statistics 32: 1314–1323.
[35] G.M. Koole (1991). Assigning multiple customer classes to parallel servers. Technical report TW-91-07, Leiden University.
[36] G.M. Koole (1992). On the optimality of FCFS for networks of multi-server queues. Technical report BS-R923, CWI, Amsterdam.
[37] G.M. Koole (1993). Optimal server assignment in the case of service times with monotone failure rates. Systems and Control Letters 20: 233–238.
[38] G.M. Koole (1994). On the pathwise optimal Bernoulli routing policy for homogeneous parallel servers. Technical report 2443, INRIA Sophia Antipolis.
[39] G.M. Koole (1995). Optimal repairman assignment in two symmetric maintenance models. European Journal of Operational Research 82: 295–301.
[40] G.M. Koole & M. Vrijenhoek (1995). Scheduling a repairman in a finite source system. Technical report TW-95-03, Leiden University.
[41] V.G. Kulkarni & Y. Serin (1995). Optimal implementable policies: discounted cost case. In: W.J. Stewart (ed.), Computations with Markov Chains. Kluwer, Boston.
[42] Z. Liu & D. Towsley (1994). Optimality of the round robin routing policy. Journal of Applied Probability 31: 466–475.
[43] M. Loève (1977). Probability Theory I, 4th ed. Springer-Verlag, New York.
[44] A. Loeve & M. Pols (1990). Optimale toewijzing van klanten in wachtrijmodellen met twee bedieningscentra. Master thesis, Leiden University.
[45] A.W. Marshall & I. Olkin (1979). Inequalities: Theory of Majorization and its Applications. Academic Press, New York.
[46] R. Menich & R.F. Serfozo (1991). Monotonicity and optimality of symmetric parallel processing systems. Queueing Systems, Theory and Applications 9: 403–418.
[47] A. van Moorsel & E. de Vries (1989). Optimale toewijzingsstrategieën bij eenvoudige netwerken van wachtrijen. Master thesis, Leiden University.
[48] R.F. Muirhead (1903). Some methods applicable to identities and inequalities of symmetric algebraic functions of n letters. Proceedings Edinburgh Mathematical Society 21: 144–157.
[49] P. Nain (1989). Interchange arguments for classical scheduling problems in queues. Systems and Control Letters 12: 177–184.
[50] P. Nain, P. Tsoucas & J. Walrand (1989). Interchange arguments in stochastic scheduling. Journal of Applied Probability 27: 815–826.


[51] M.F. Neuts (1979). A versatile Markovian point process. Journal of Applied Probability 16: 764–779.
[52] M.F. Neuts (1981). Matrix-Geometric Solutions in Stochastic Models; an Algorithmic Approach. Johns Hopkins, Baltimore.
[53] M.F. Neuts (1989). Structured Stochastic Matrices of M|G|1 Type and their Applications. Marcel Dekker, New York.
[54] R. Nobel & H.C. Tijms (1990). Optimal routing of customers to parallel service groups. Memorandum 90-47, Vrije Universiteit Amsterdam.
[55] M. Pinedo & L. Schrage (1982). Stochastic shop scheduling: A survey. In: M.A.H. Dempster, J.K. Lenstra & A.H.G. Rinnooy Kan (eds.), Deterministic and Stochastic Scheduling. Reidel, Dordrecht.
[56] R. Righter (1995). Optimal policies for scheduling repairs and allocating heterogeneous servers. To appear in Journal of Applied Probability.
[57] R. Righter & J.G. Shanthikumar (1989). Scheduling multiclass single server queueing systems to stochastically maximize the number of successful departures. Probability in the Engineering and Informational Sciences 3: 323–333.
[58] R. Righter & J.G. Shanthikumar (1991). Extension of the bivariate characterization for stochastic orders. Advances in Applied Probability 24: 506–508.
[59] R. Righter & J.G. Shanthikumar (1991). Extremal properties of the FIFO discipline in queueing networks. Journal of Applied Probability 29: 967–978.
[60] S.M. Ross (1983). Introduction to Stochastic Dynamic Programming. Academic Press, New York.
[61] M. Rudemo (1973). Point processes generated by transitions of Markov chains. Advances in Applied Probability 5: 262–286.
[62] R. Schassberger (1973). Warteschlangen. Springer-Verlag, Wien.
[63] R.F. Serfozo (1979). An equivalence between continuous and discrete time Markov decision processes. Operations Research 27: 616–620.
[64] K. Seth (1977). Optimal service policies, just after idle periods, in two-server heterogeneous queueing systems. Operations Research 25: 356–360.
[65] S. Shenker & A. Weinrib (1989). The optimal control of heterogeneous queueing systems: a paradigm for load-sharing and routing. IEEE Transactions on Computers 38: 1724–1735.
[66] M.J. Sobel (1990). Throughput maximization in a loss queueing system with heterogeneous servers. Journal of Applied Probability 27: 693–700.
[67] M.J. Sobel & C. Srivastava (1991). Full-service policy optimality with heterogeneous servers. Working paper.
[68] P.D. Sparaggis & D. Towsley (1993). Optimal routing in systems with ILR service time distributions. CMPSCI Technical Report 93-13, University of Massachusetts at Amherst.
[69] F.M. Spieksma (1990). Geometrically ergodic Markov chains and the optimal control of queues. Ph.D. thesis, Leiden University.


[70] F.M. Spieksma (1991). The existence of sensitive optimal policies in two multi-dimensional queueing models. Annals of Operations Research 28: 273–296.
[71] D. Towsley, P.D. Sparaggis & C.G. Cassandras (1992). Optimal routing and buffer allocation for a class of finite capacity queueing systems. IEEE Transactions on Automatic Control 37: 1446–1451.
[72] N.M. van Dijk (1983). Controlled Markov processes; time-discretization — networks of queues. Ph.D. thesis, Leiden University.
[73] N.M. van Dijk (1988). On the finite horizon Bellman equation for controlled Markov jump models with unbounded characteristics: existence and approximations. Stochastic Processes and their Applications 28: 141–157.
[74] J. Walrand (1988). An Introduction to Queueing Networks. Prentice-Hall, Englewood Cliffs.
[75] R.R. Weber (1978). On the optimal assignment of customers to parallel queues. Journal of Applied Probability 15: 406–413.
[76] R.R. Weber (1982). Scheduling jobs with stochastic processing requirements on parallel machines to minimize makespan or flowtime. Journal of Applied Probability 19: 167–182.
[77] R.R. Weber (1992). The interchangeability of tandem queues with heterogeneous customers and dependent service times. Advances in Applied Probability 24: 727–737.
[78] J. Weishaupt (1994). Optimal myopic policies and index policies for stochastic scheduling problems. Zeitschrift für Operations Research 40: 75–89.
[79] G. Weiss (1982). Multiserver stochastic scheduling. In: M.A.H. Dempster, J.K. Lenstra & A.H.G. Rinnooy Kan (eds.), Deterministic and Stochastic Scheduling. Reidel, Dordrecht.
[80] G. Weiss & M. Pinedo (1980). Scheduling tasks with exponential service times on nonidentical processors to minimize various cost functions. Journal of Applied Probability 17: 187–202.
[81] W. Whitt (1986). Deciding which queue to join: some counterexamples. Operations Research 34: 55–62.
[82] W. Winston (1977). Optimality of the shortest line discipline. Journal of Applied Probability 14: 181–189.
[83] R.W. Wolff (1977). An upper bound for multi-channel queues. Journal of Applied Probability 14: 884–888.
[84] R.W. Wolff (1987). Upper bounds on work in system for multichannel queues. Journal of Applied Probability 24: 547–551.


Index

Bellman equation 108
Cyclic Assignment Policy (CAP) 20
Dependent Markov Decision Arrival Process (DMDAP) 57
DFR 119
discretization 107
Equal Splitting Policy (ESP) 20
Fastest Queue Policy (FQP) 9
FCFS 19
failure rate 119
IFR 119
LAST 27
LEPT 29
list policy 73
majorization 121
Markov Arrival Process (MAP) 5, 113
Markov Decision Arrival Process (MDAP) 33
Markov Modulated Poisson Process (MMPP) 6
MAST 27
partial information 38
partial sum ordering 122
pathwise optimality 13
Phase-type distribution 5, 113
Schur convexity 122
Shorter Faster Queue Policy (SFQP) 59
Shortest Queue Policy (SQP) 7
Shortest Workload Policy (SWP) 17
Smallest Index Policy (SIP) 24
static policy 38
stochastic optimality 8
tandem µc-rule 49
uniformization 6, 105, 107
µc-rule 23

