Information Processing Letters 16 (2) (1983) 99-104
North-Holland Publishing Company
26 February 1983
MINIMIZATION OF DEMAND PAGING FOR THE LRU STACK MODEL OF PROGRAM BEHAVIOR
Christopher WOOD and Eduardo B. FERNANDEZ
Department of Electrical Engineering, University of Miami, Coral Gables, FL 33124, U.S.A.
Tomas LANG
Department of Computer Science, University of California, Los Angeles, CA 90024, U.S.A.
Communicated by Alan Shaw
Received 15 March 1982
Revised 28 September 1982
The minimization of page faults in a demand paging environment where program behavior is described by the LRU stack model is studied. Previous work on this subject considered a specific type of stack probability distribution. As there exist practical situations which do not satisfy this assumption, we extend the analysis to arbitrary distributions. The minimization is stated as an optimization problem under constraints, a method to obtain a class of optimal solutions is presented, and a fixed-space replacement algorithm to implement these solutions is proposed. The parameters of this replacement algorithm can be varied to adapt to specific stack probability distributions and to the number of allocated pages in memory. This paper also determines a necessary and sufficient condition for the optimality of the LRU algorithm.
Keywords: Page replacement algorithms, demand paging, LRU stack model, performance optimization, virtual memory, memory management
1. Introduction
The allocation of storage in virtual memory systems has received a great deal of attention in recent years. A similar problem is the allocation of space in primary-memory buffers in the operation of a database management system. In both cases, allocation policies are necessary to minimize the number of references that cannot be satisfied from primary memory and require, therefore, an access to slower secondary storage.
To analyze the performance of alternative allocation policies it is necessary to postulate a model of the dynamic behavior of the references. Several such models have been proposed and their accuracy validated empirically [2]. In this paper we use the simple least recently used (LRU) stack model, which has been found to provide reasonable approximations to the behavior of real programs.
The advantages and shortcomings of this and other models are presented in detail by Spirn [3]. In the LRU stack model, the total address space is divided into equally sized pages. A stack is defined which at a given time indicates the ordering of all pages according to their last time of reference, the most recently used being on top of the stack. The dynamic behavior is statistically described by assigning to the ith stack page a time-invariant probability of reference p(i).
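The stack update just described can be sketched in a few lines of Python. This is our own illustration, not code from the paper; the function and variable names are hypothetical:

```python
import random

def lru_stack_reference_string(p, length, seed=0):
    """Generate a reference string under the LRU stack model.

    p[i] is the time-invariant probability of referencing the page
    currently at stack position i + 1; the referenced page then moves
    to the top of the stack.  Page labels 0..D-1 are hypothetical.
    """
    rng = random.Random(seed)
    stack = list(range(len(p)))   # stack[0] is the most recently used page
    refs = []
    for _ in range(length):
        i = rng.choices(range(len(p)), weights=p)[0]  # draw a stack position
        page = stack.pop(i)                           # referenced page...
        stack.insert(0, page)                         # ...moves to the top
        refs.append(page)
    return refs
```

Note that the probabilities are attached to stack positions, not to fixed pages: which page is likely to be referenced next changes as the stack reorders itself.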
An allocation policy specifies which part of the total address space is mapped into the primary memory at each instant. We consider here fixed space allocation, in which the number of pages allocated to a program is fixed throughout the operation. This contrasts with variable space allocation, in which the space allocated to each program varies according to their relative needs.
0020-0190/83/0000-0000/$03.00 © 1983 North-Holland

Spirn and Denning [2] have shown that, for a LRU stack model of behavior and with a fixed allocation of M pages, the LRU replacement algorithm is optimal if

min{p(i) | i ≤ M} ≥ max{p(i) | i > M}.

We propose here a replacement algorithm for the LRU stack model that is optimal for arbitrary probability distributions in the reference stack. The LRU algorithm is then a particular case of this optimal replacement algorithm. We also determine a necessary and sufficient condition on the stack probability distributions for which the LRU replacement algorithm is optimal. This condition is weaker than the corresponding condition derived by Spirn and Denning [2]. The problem of minimizing page faults is formally stated in Section 2, and a general form of the solution is derived. Section 3 defines the new condition under which the LRU algorithm is optimal. Section 4 proposes a replacement algorithm which is proved to be optimal.

2. Minimization of page faults for the LRU stack model

In a demand paging environment where program behavior is described by the LRU stack model, the problem of minimizing page faults can be stated as an optimization problem under constraints. We determine here the general form of the solutions for this optimization.

2.1. The optimization problem

Let S be a reference stack of length D which represents the reference string of a program whose size is D pages. When page S(i) is referenced, it goes to the top of the stack, S(1), and all pages S(j) (j < i) shift down one position. In a stack model of program behavior, time-invariant, independent reference probabilities p(i) are assigned to all the positions i of S. We assume that p(D) is not zero, so that it is possible for every page in the stack to shift to the higher positions. M pages of real memory are available to hold the referenced pages.

A page fault occurs when the referenced page is not in main memory. The expected number of page faults per reference, F, is given by

F = Σ_{i=1}^{D} p(i)[1 - q(i)],

where q(i) is the probability that S(i) is in any of the M memory pages. Since Σ p(i) = 1, this can be written as

F = 1 - Σ_{i=1}^{D} p(i)q(i).

The probability q(i) depends on the replacement algorithm. For LRU, q(i) = 1 for i ≤ M and q(i) = 0 for i > M. For this case

F = 1 - Σ_{i=1}^{M} p(i).
Minimization of page faults then implies maximization of the hit ratio

H = Σ_{i=1}^{D} p(i)q(i),

subject to the following constraints:

(i) q(1) = 1, since the first stack position contains the last referenced page, which must always be in memory,

(ii) Σ_{i=1}^{D} q(i) = M, since there are M pages allocated in memory, and

(iii) q(i) is non-increasing, since, in a demand paging system, pages are brought into memory only when they are referenced, i.e., when they are at the top of the stack. Therefore, to be in memory when it is in position k of the stack, a page also has to be in memory for all previous stack positions.
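A small Python sketch of these quantities (helper names are ours, not the paper's) makes the objective and the constraints concrete:

```python
def hit_ratio(p, q):
    """Hit ratio H = sum_i p(i) q(i); the fault rate is F = 1 - H."""
    return sum(pi * qi for pi, qi in zip(p, q))

def lru_q(D, M):
    """Memory-residence probabilities q(i) under LRU with M frames."""
    return [1.0 if i < M else 0.0 for i in range(D)]

def check_constraints(q, M, tol=1e-9):
    """Constraints (i)-(iii): q(1) = 1, sum q(i) = M, q non-increasing."""
    return (abs(q[0] - 1.0) < tol
            and abs(sum(q) - M) < tol
            and all(q[i] >= q[i + 1] - tol for i in range(len(q) - 1)))
```

For the LRU distribution q, the hit ratio reduces to the sum of the first M stack probabilities, as in the last equation above.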
2.2. General form of the solutions to the optimization problem
Theorem 2.1. Let the hit ratio

H = Σ_{i=1}^{D} p(i)q(i)

be the objective function to be maximized subject to the constraints

q(1) = 1,  Σ_{i=1}^{D} q(i) = M,
Fig. 1. General form of the optimal q(i) for r ≤ i ≤ s.

Fig. 2. Illustration of the proof of Theorem 2.1.
and q(i) non-increasing. Then the distribution q that maximizes this objective function for a given distribution p belongs to a family of functions with parameters r and s, which have the following form (Fig. 1): for 1 ≤ r ≤ M ≤ s ≤ D,

q(i) = 1,                 1 ≤ i ≤ r,
q(i) = (M - r)/(s - r),   r + 1 ≤ i ≤ s,
q(i) = 0,                 s + 1 ≤ i ≤ D,

or

q(i) = 1,                 1 ≤ i ≤ M,
q(i) = 0,                 M + 1 ≤ i ≤ D.

(This last solution can be considered as a particular case of the former with a suitable definition of (M - r)/(s - r) when r = s = M.)
That is, the distribution is such that for i ≤ r the corresponding stack pages are always in main memory, for r < i ≤ s the pages are in memory with probability (M - r)/(s - r), and for i > s they are never in main memory. In other words, the distribution is a descending staircase with just one step of value different from 0 or 1.
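The staircase family is easy to construct and check numerically. The sketch below (our naming, not the paper's) builds the distribution of Theorem 2.1; it can be verified against the three constraints of Section 2.1:

```python
def staircase_q(D, M, r, s):
    """The optimal family of Theorem 2.1: q = 1 up to position r, one
    fractional step (M - r)/(s - r) on positions r+1..s, and 0 beyond s.
    The case r = s = M is the LRU distribution."""
    if r == s == M:
        return [1.0] * M + [0.0] * (D - M)
    step = (M - r) / (s - r)
    return [1.0] * r + [step] * (s - r) + [0.0] * (D - s)
```

By construction the result satisfies q(1) = 1, sums to M, and is non-increasing, so it is always a feasible point of the optimization problem.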
Proof of Theorem 2.1. This is a linear programming problem with bounded variables [4]. We prove the theorem by induction on the number of different values of q which are not 1 or 0.
Basis Step: This step follows by contradiction. We assume that the optimal solution has two steps with values different from 1 or 0 (A in Fig. 2) and show that there is always a better solution with just one step different from 0 or 1 (B or C in Fig. 2).
The value of H for the two-step distribution A is

H_A = Σ_{i=1}^{r} p(i) + Σ_{i=r+1}^{s} p(i)q(s) + Σ_{i=s+1}^{t} p(i)q(t),   (1)

and, since

Σ_{i=1}^{D} q(i) = M,

we have the condition

r + (s - r)q(s) + (t - s)q(t) = M,   (2)

where q(s) and q(t) are as illustrated in Fig. 2. For the one-step distribution B we have (q'(t) as shown in Fig. 2)

H_B = Σ_{i=1}^{r} p(i) + Σ_{i=r+1}^{t} p(i)q'(t),   (3)

with the condition

r + (t - r)q'(t) = M.   (4)

Comparing (1) and (3) we have

H_B - H_A = Σ_{i=r+1}^{s} p(i)[q'(t) - q(s)] + Σ_{i=s+1}^{t} p(i)[q'(t) - q(t)];

since q(i) and q'(i) are non-increasing and Σ q(i) = Σ q'(i) = M,

q(s) ≥ q'(t) ≥ q(t).

Therefore,

H_B ≥ H_A if Σ_{i=s+1}^{t} p(i)[q'(t) - q(t)] ≥ Σ_{i=r+1}^{s} p(i)[q(s) - q'(t)].   (5)

Also, from (2) and (4) we obtain

q'(t) - q(t) = [(s - r)/(t - s)][q(s) - q'(t)].

Replacing this last expression in (5) we have

Σ_{i=s+1}^{t} p(i)[(s - r)/(t - s)][q(s) - q'(t)] ≥ Σ_{i=r+1}^{s} p(i)[q(s) - q'(t)],

and

H_B ≥ H_A if Σ_{i=r+1}^{s} p(i) ≤ [(s - r)/(t - s)] Σ_{i=s+1}^{t} p(i),

that is,

H_B ≥ H_A if p̄(r + 1, s) ≤ p̄(s + 1, t),

where p̄(a, b) is the average of the p(i) between i = a and i = b.
This result is intuitively satisfactory because, if the density is higher in the region (s + 1, t), it is advantageous to have these pages in main memory with the highest possible probability.
By a similar argument we have

H_C ≥ H_A if p̄(r + 1, s) ≥ p̄(s + 1, t).

Therefore, either H_B ≥ H_A or H_C ≥ H_A; i.e., given a solution with two different values for q, there always exists a better (or equal) solution with only one value.
Induction Step: By a similar reasoning as in the Basis Step, for any solution with n + 1 different values of q it is possible to find a better (or equal) solution having only n different q values. The case with no value different from 0 or 1 is also a possible optimal solution. It can be shown that this solution is optimal only for particular distributions, as characterized in the next section. □
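The theorem can also be cross-checked numerically for a small case. The sketch below is our code, not the authors', and a grid scan is not a proof; it enumerates feasible distributions q for D = 4, M = 2 on a grid and confirms that none beats the best staircase:

```python
def best_staircase_H(p, M):
    """Best hit ratio over the staircase family of Theorem 2.1."""
    D = len(p)
    best = sum(p[:M])                      # r = s = M (the LRU distribution)
    for r in range(1, M):
        for s in range(M + 1, D + 1):
            best = max(best, sum(p[:r]) + (M - r) / (s - r) * sum(p[r:s]))
    return best

def no_feasible_q_beats_staircase(p, M, step=0.05):
    """Grid-scan every feasible q for D = 4, M = 2 (q(1) = 1, sum q = M,
    q non-increasing) and check that none beats the best staircase."""
    assert len(p) == 4 and M == 2          # hard-coded small instance
    best = best_staircase_H(p, M)
    n = round(1 / step)
    for a in range(n + 1):
        for b in range(a + 1):             # enforce q2 >= q3 on the grid
            q2, q3 = a * step, b * step
            q4 = 1.0 - q2 - q3             # from q(1) = 1 and sum q = 2
            if not (-1e-9 <= q4 <= q3 + 1e-9):
                continue                   # infeasible: needs 0 <= q4 <= q3
            H = p[0] + p[1] * q2 + p[2] * q3 + p[3] * q4
            if H > best + 1e-9:
                return False
    return True
```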
The hit ratio of the class of solutions defined by Theorem 2.1 is (Fig. 1)

H(r, s, M) = Σ_{i=1}^{r} p(i) + [(M - r)/(s - r)] Σ_{i=r+1}^{s} p(i)
           = r p̄(1, r) + (M - r) p̄(r + 1, s),

or, for the special case where r = s = M,

H(M) = Σ_{i=1}^{M} p(i) = M p̄(1, M),   (6)

where p̄(a, b) is the average of the p(i) between i = a and i = b.
The optimization problem has then been transformed into the problem of determining optimal values of r and s (r0 and s0, respectively) for a given distribution p, such that H is a maximum. The optimal solution r0, s0 can be found by evaluating H(r, s, M) for all possible values of r and s. The total number of evaluations is (M - 1) × (D - M) + 1. A heuristic procedure for reducing this number is given in [5].
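The exhaustive evaluation just described is straightforward to implement. A possible sketch (function names ours), using prefix sums so each H(r, s, M) costs O(1):

```python
def optimal_r_s(p, M):
    """Find (r0, s0) maximizing H(r, s, M) by direct enumeration.
    Performs exactly (M - 1)(D - M) + 1 evaluations, as in the text."""
    D = len(p)
    prefix = [0.0]                         # prefix[i] = p(1) + ... + p(i)
    for pi in p:
        prefix.append(prefix[-1] + pi)
    best = (prefix[M], M, M)               # r = s = M is the LRU solution
    for r in range(1, M):
        for s in range(M + 1, D + 1):
            H = prefix[r] + (M - r) / (s - r) * (prefix[s] - prefix[r])
            best = max(best, (H, r, s))
    return best                            # (H(r0, s0, M), r0, s0)
```

For example, with p = (0.4, 0.1, 0.3, 0.2) and M = 2, LRU achieves H = 0.5, while the staircase with r = 1 spreads the second frame over positions 2..s and reaches H = 0.6.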
3. Optimality condition for the LRU memory replacement algorithm
We now determine a necessary and sufficient condition for the optimality of the LRU algorithm. Since the LRU algorithm keeps in memory the pages corresponding to the first M positions of the stack, according to Theorem 2.1 this algorithm is optimal if r0 = s0 = M.
Comparing the value of the hit ratio for LRU with the hit ratio of the other possible optimal solutions, the LRU algorithm is optimal if, for all r and s,

M p̄(1, M) = r p̄(1, r) + (M - r) p̄(r + 1, M)
           ≥ r p̄(1, r) + (M - r) p̄(r + 1, s),

that is,

p̄(r + 1, M) ≥ p̄(r + 1, s).
The following lemma provides an equivalent (and more convenient) condition.
Lemma 3.1.

p̄(r + 1, M) ≥ p̄(r + 1, s)  if and only if  p̄(r + 1, M) ≥ p̄(M + 1, s).

Proof. Decomposing p̄(r + 1, s) over the intervals [r + 1, M] and [M + 1, s], the condition becomes

p̄(r + 1, M) ≥ [(M - r)/(s - r)] p̄(r + 1, M) + [(s - M)/(s - r)] p̄(M + 1, s).

Therefore,

[1 - (M - r)/(s - r)] p̄(r + 1, M) ≥ [(s - M)/(s - r)] p̄(M + 1, s),

that is,

p̄(r + 1, M) ≥ p̄(M + 1, s).

The "only if" part of the lemma can be proved by a similar argument. □
Theorem 3.2 below follows directly from Lemma 3.1.
Theorem 3.2. The LRU replacement algorithm is optimal if

min_{1 ≤ r < M} p̄(r + 1, M) ≥ max_{M < s ≤ D} p̄(M + 1, s).

This is a necessary and sufficient condition for the optimality of the LRU algorithm. It is a weaker condition than the corresponding condition derived by Spirn and Denning [2].
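Assuming 1 < M < D, the condition of Theorem 3.2 can be tested directly. A sketch with our (hypothetical) helper names:

```python
def avg(p, a, b):
    """Average of p(i) for a <= i <= b (1-indexed, as in the paper)."""
    return sum(p[a - 1:b]) / (b - a + 1)

def lru_is_optimal(p, M):
    """Theorem 3.2: LRU is optimal iff
    min over 1 <= r < M of avg(p, r+1, M) is at least
    max over M < s <= D of avg(p, M+1, s).  Assumes 1 < M < D."""
    D = len(p)
    lhs = min(avg(p, r + 1, M) for r in range(1, M))
    rhs = max(avg(p, M + 1, s) for s in range(M + 1, D + 1))
    return lhs >= rhs
```

For a monotone non-increasing p the condition always holds; distributions with a probability "bump" below position M are the ones where LRU loses to the staircase solutions.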
4. Optimal replacement algorithm
We present here a replacement algorithm that implements the solutions defined by Theorem 2.1. This algorithm can be shown to be a stack algorithm.
Definition 4.1. The predecessor of page S^{t+1}(j), for j > 1, is the page S^t(k) that occupies position j at time t + 1. There are two possible values for k:

(i) k = j - 1 if, at time t, a reference is made to a page S^t(i) such that i > j - 1;

(ii) k = j if the reference is to a page S^t(i) such that i ≤ j - 1.
Lemma 4.2. If the predecessor of page S^{t+1}(j) is not replaced, then q(j) = q(j - 1).

Proof. If the predecessor of page S^{t+1}(j) is not replaced, the probability that, at time t + 1, page S^{t+1}(j) is in memory is

q^{t+1}(j) = q^t(j - 1) Σ_{i=j}^{D} p(i) + q^t(j) Σ_{i=1}^{j-1} p(i),   (7)

where q^t(j) is the probability that page S^t(j) is in memory at time t (the two terms correspond to the two cases of Definition 4.1). In equilibrium, q^{t+1}(j) = q^t(j) = q(j), and in combination with (7) we then have

q(j)[1 - Σ_{i=1}^{j-1} p(i)] = q(j - 1) Σ_{i=j}^{D} p(i),   (8)

and since

Σ_{i=1}^{D} p(i) = 1 and p(D) ≠ 0,
we have q(j) = q(j - 1). □

Theorem 4.3. If r0 and s0 are the optimal values for r and s, k is the referenced stack position, and S(k) is not in memory, then the following replacement algorithm is optimal:

If S(s0) is in memory and k > s0 then replace S(s0); else replace S(r0).

(S(r0) will be shown in the proof to be always in memory.)

Proof. For the algorithm we are considering we have the following:

(i) For 1 ≤ j < r0 the predecessor of S(j) is never replaced (only pages S(r0) and S(s0) can be replaced), and since q(1) = 1, we have

q(2) = q(3) = ... = q(r0 - 1) = 1.

(ii) The predecessor of S(r0) can be replaced only when a reference is made to a page S(j) with j < r0. Since by (i) these pages are always in memory, the predecessor of S(r0) is never replaced and

q(r0) = q(r0 - 1) = 1.

(iii) Since S(s0) is always replaced when a reference is made to S(j) with j > s0, the pages S(k), k > s0, can never be in memory and therefore

q(s0 + 1) = ... = q(D) = 0.

(iv) For r0 + 1 ≤ j ≤ s0 the predecessor of S(j) is never replaced, therefore

q(r0 + 1) = ... = q(s0).

Since Σ_{i=1}^{D} q(i) = M,

q(r0 + 1) = ... = q(s0) = (M - r0)/(s0 - r0).

The distribution q is therefore identical to the optimal distribution defined in Theorem 2.1 and consequently the replacement algorithm is optimal. □
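As an illustration (our simulation, not part of the paper), the replacement rule of Theorem 4.3 can be simulated and its observed fault rate compared against F = 1 - H(r0, s0, M):

```python
import random

def simulate_policy(p, M, r0, s0, steps=200_000, seed=1):
    """Monte-Carlo sketch of the Theorem 4.3 replacement rule.  On a
    fault at stack position k: if S(s0) is resident and k > s0, evict
    S(s0); otherwise evict S(r0).  Returns the observed fault rate."""
    rng = random.Random(seed)
    D = len(p)
    stack = list(range(D))               # stack[0] is position 1 (the top)
    resident = set(stack[:M])            # start with the top M pages resident
    faults = 0
    positions = range(1, D + 1)
    for _ in range(steps):
        k = rng.choices(positions, weights=p)[0]
        page = stack[k - 1]
        if page not in resident:
            faults += 1
            if stack[s0 - 1] in resident and k > s0:
                victim = stack[s0 - 1]
            else:
                victim = stack[r0 - 1]   # S(r0) is always resident (see proof)
            resident.remove(victim)
            resident.add(page)
        stack.insert(0, stack.pop(k - 1))  # referenced page moves to the top
    return faults / steps
```

For example, with p = (0.4, 0.1, 0.3, 0.2), M = 2, r0 = 1 and s0 = 3, the predicted fault rate is 1 - H(1, 3, 2) = 1 - 0.6 = 0.4, which the simulation reproduces closely.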
References

[1] A.J. Smith, Analysis of the optimal, look-ahead demand paging algorithms, SIAM J. Comput. 5 (4) (1976) 743-757.
[2] J.R. Spirn and P.J. Denning, Experiments with program locality, Proc. AFIPS 41 (AFIPS Press, Montvale, NJ) pp. 611-621.
[3] J.R. Spirn, Program Behavior: Models and Measurements (North-Holland, Amsterdam, 1977).
[4] V.A. Sposito, Linear and Nonlinear Programming (Iowa State University Press, Ames, IA, 1975) pp. 84-89.
[5] C. Wood, E.B. Fernandez and T. Lang, Minimization of demand paging for the LRU stack model of program behavior, Tech. Rept. G320-7689, IBM Los Angeles Scientific Center, 1977.